I am reading the book "Stochastic Optimal Control: The Discrete Time Case", by Bertsekas and Shreve (hereafter called "the Book"), and I recently noticed that a statement made on page 10 of the Book (Introduction) seems to admit a somewhat more general formulation.
The statement in question is as follows.
Let $\mathscr{B}_\mathbb{R}$, $\mathscr{B}_{\mathbb{R}^2}$ denote the Borel $\sigma$-algebras on $\mathbb{R}$, $\mathbb{R}^2$, and consider a Borel measurable function $g:\mathbb{R}^2\rightarrow\mathbb{R}$, such that
\begin{equation} \inf_{u\in\mathbb{R}}g\left(x,u\right)>-\infty,\quad \forall x\in \mathbb{R}. \end{equation}
Consider now the set of all Borel measurable functions (policies) from $\mathbb{R}$ to $\mathbb{R}$ and denote it by $\cal{P}$.
I claim that, for any $\varepsilon>0$, there exists a Borel measurable policy $\mu_\varepsilon\in\cal{P}$, such that
\begin{equation} g\left(x, \mu_\varepsilon \left(x \right) \right) \le\inf_{u\in\mathbb{R}}g\left(x,u\right) + \varepsilon,\quad \forall x\in \mathbb{R}. \end{equation}
In the Book, on the other hand, it is claimed that the inequality above holds only almost everywhere, with respect to some given probability measure on $\mathscr{B}_\mathbb{R}$.
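In symbols, and writing $p$ for the given probability measure on $\mathscr{B}_\mathbb{R}$ (the symbol $p$ is my own notation here, used only for illustration), I read the Book's version as asserting the existence of a $\mu_\varepsilon\in\cal{P}$ such that
\begin{equation} g\left(x, \mu_\varepsilon \left(x \right) \right) \le\inf_{u\in\mathbb{R}}g\left(x,u\right) + \varepsilon,\quad \text{for } p\text{-almost every } x\in \mathbb{R}, \end{equation}
that is, the inequality is allowed to fail on a Borel set of $p$-measure zero, whereas my claim allows no exceptional set at all.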
The proof of my claim follows.
First, for any measurable policy $\mu\in\cal{P}$, it holds that
\begin{equation} g\left(x,\mu \left( x \right) \right) \ge \inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right) \ge \inf_{u\in\mathbb{R}}g\left(x,u\right)>-\infty,\quad \forall x\in \mathbb{R}. \end{equation}
Fix an $\varepsilon>0$. Then, there exists a Borel measurable policy $\mu_\varepsilon\in\cal{P}$, such that
\begin{equation} g\left(x, \mu_\varepsilon \left(x \right) \right) \le \inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right) + \varepsilon,\quad \forall x\in \mathbb{R}. \end{equation}
Note that such a policy can always be found, since otherwise we would be led to a contradiction: if no such policy existed, then it would be true that
\begin{equation} g\left(x, \mu \left(x \right) \right) > \inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right) + \varepsilon,\quad \forall x\in \mathbb{R}, \end{equation}
for all $\mu\in\cal{P}$, contradicting the fact that $\inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right)$ is the infimum over $\mu\in\cal{P}$.
Now, since $\cal{P}$ is the class of all Borel measurable functions from $\mathbb{R}$ to itself, the set containing all constant policies, defined as
\begin{equation} \mu_u \left( x \right) \triangleq u,\quad \forall x\in\mathbb{R}, \quad\text{for some } u\in\mathbb{R}, \end{equation}
will be a subset of $\cal{P}$, and, therefore,
\begin{equation} \inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right) \le g\left(x, \mu_u \left( x \right) \right) = g\left(x, u \right) \quad\forall x\in\mathbb{R}\quad\text{and}\quad \forall u\in\mathbb{R}. \end{equation}
In particular, taking infima on both sides, it will also be true that
\begin{equation} \inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right) \le \inf_{u\in\mathbb{R}}g\left(x, u \right) \quad\forall x\in\mathbb{R}. \end{equation}
This last inequality implies that there exists $\mu_\varepsilon\in\cal{P}$, such that
\begin{equation} g\left(x, \mu_\varepsilon \left(x \right) \right) \le\inf_{u\in\mathbb{R}}g\left(x,u\right) + \varepsilon,\quad \forall x\in \mathbb{R}, \end{equation}
for any arbitrarily chosen $\varepsilon>0$, which seems to prove my claim.
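Spelled out, and assuming that the policy $\mu_\varepsilon$ from the earlier $\varepsilon$-inequality really exists, the chain of inequalities I am relying on in this last step is
\begin{equation} g\left(x, \mu_\varepsilon \left(x \right) \right) \le \inf_{\mu\in\cal{P}}g\left(x, \mu \left( x \right) \right) + \varepsilon \le \inf_{u\in\mathbb{R}}g\left(x,u\right) + \varepsilon,\quad \forall x\in \mathbb{R}, \end{equation}
where the first inequality is the defining property of $\mu_\varepsilon$ and the second follows from comparing with the constant policies.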
Have I done anything wrong in the above derivation? Why is it claimed in the Book that this result holds only almost everywhere in $x$?
Thanks!