In Shorack's Probability for Statisticians Notation 7.4.1, he notes that the conditional expectation (defined in the measure-theoretic way) $\mathbb E(Y\mid X)$ is $g(X)$ for some measurable $g :(\mathbb R, \mathcal B_\mathbb R) \to (\mathbb R, \mathcal B_\mathbb R)$. He then defines $\mathbb E(Y \mid X=x)$ as simply $g(x)$. For conditional probabilities, I'm pretty sure this means that $P(A \mid X=x)$ will be defined as $\mathbb E(1_A \mid X=x)$. I'm not entirely confident that this measure-theoretic definition of conditional probability given $X=x$ matches the classical notion of conditional probability, so if someone could shed some light on that too, that'd be great.
Here's a picture of the relevant section from the book.

My question is: is there a generalization of this definition to $\mathbb E(Y\mid X\in B)$ for some Borel set $B \in \mathcal B_\mathbb R$?
Looking at this question/answer, When do the measure-theoretic and elementary definitions of conditional probability/expectation coincide?, it seems the generalization would require dividing by $P(X\in B)$, which may not be possible since that probability could be $0$. In that case, why is it that we can have a general definition of $\mathbb E(Y\mid X=x)$ but not of $\mathbb E(Y\mid X\in B)$?
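(For context, when $P(X \in B) > 0$ the elementary definition $\mathbb E(Y \mid X \in B) = \mathbb E(Y : X \in B)/P(X \in B)$ is the one I have in mind. A quick Monte Carlo sketch of it, with a toy model of my own choosing — $X$ standard normal, $Y = X^2 + \text{noise}$, $B = (1,\infty)$ — using only the standard library:)

```python
import math
import random

random.seed(0)

# Toy model: X ~ N(0, 1), Y = X^2 + noise, so E(Y | X) = X^2.
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x + random.gauss(0.0, 1.0) for x in xs]

# Elementary definition E(Y | X in B) = E(Y : X in B) / P(X in B),
# usable here only because B = (1, oo) has P(X in B) > 0.
in_B = [y for x, y in zip(xs, ys) if x > 1.0]
cond_exp = sum(in_B) / len(in_B)

# Analytic value: E(X^2 | X > 1) = 1 + phi(1) / (1 - Phi(1))
# for a standard normal (phi = density, Phi = CDF).
phi1 = math.exp(-0.5) / math.sqrt(2.0 * math.pi)
Phi1 = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))
exact = 1.0 + phi1 / (1.0 - Phi1)

print(cond_exp, exact)  # agree to Monte Carlo accuracy
```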
Why not continue the analogy and set $\mathbb{E}(Y \, \mid \, A) = \mathbb{E}(Y \, \mid \, 1_{A} = 1)$? In this case, you can check that $\mathbb{E}(Y \, \mid \, 1_{A} = 1) = \frac{\mathbb{E}(Y : A)}{\mathbb{P}(A)}$ whenever $\mathbb{P}(A) > 0$, so that it is consistent with the naive approach. Further, it's worth noting that $\mathbb{E}(Y \, \mid \, 1_{A})$ has the form \begin{equation*} \mathbb{E}(Y \, \mid \, 1_{A}) = \left\{ \begin{array}{r l} \frac{\mathbb{E}(Y : A)}{\mathbb{P}(A)}, & \text{on} \, \, A, \\ \frac{\mathbb{E}(Y : A^{c})}{\mathbb{P}(A^{c})}, & \text{on} \, \, A^{c}. \end{array} \right. \end{equation*} I leave the confirmation of these identities as exercises; they are not too hard since $\sigma(1_{A}) = \{\emptyset, A, A^{c}, \Omega\}$.
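You can also watch the two-valued form numerically. A minimal sketch, with a toy choice of my own ($X \sim \mathrm{Unif}[0,1]$, $Y = X^2$, $A = \{X < 1/4\}$), standard library only:

```python
import random

random.seed(1)

# Toy setup: X ~ Unif[0, 1], Y = X^2, A = {X < 1/4}, so P(A) = 1/4.
n = 100_000
on_A, off_A = [], []
for _ in range(n):
    x = random.random()
    (on_A if x < 0.25 else off_A).append(x * x)

# E(Y | 1_A) equals E(Y : A)/P(A) on A and E(Y : A^c)/P(A^c) on A^c.
val_A = sum(on_A) / len(on_A)      # exact value: E(X^2 | X < 1/4)  = 1/48
val_Ac = sum(off_A) / len(off_A)   # exact value: E(X^2 | X >= 1/4) = 7/16

# Tower-property check: averaging the two values recovers E(Y) = 1/3.
total = (len(on_A) * val_A + len(off_A) * val_Ac) / n
print(val_A, val_Ac, total)
```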
An interesting fact worth mentioning here is that if $Y,X$ are two real random variables (you could also use random vectors, but I won't), if $\mu_{X}$ is the law of $X$, and if $Y$ is integrable, then \begin{equation*} \mathbb{E}(Y \, \mid \, X = x) = \lim_{\epsilon \to 0^{+}} \frac{\mathbb{E}(Y : X \in [x - \epsilon, x + \epsilon])}{\mathbb{P}\{X \in [x- \epsilon, x + \epsilon]\}} \quad \text{for} \, \, \mu_{X}\text{-a.e.} \, \, x \in \mathbb{R}. \end{equation*} This follows from the Besicovitch Differentiation Theorem (cf. Chapter 5 of Sets of Finite Perimeter and Geometric Variational Problems by Maggi); I don't know an easier way to prove it in general. Replacing $Y$ by $1_{A}$ for suitable events $A$, we can also use this to compute probabilities conditioned on $\{X = x\}$.
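The shrinking-window limit is easy to see empirically. A hedged sketch with a hypothetical toy pair ($X$ standard normal, $\mathbb{E}(Y \mid X = x) = x^{2}$, small independent noise), narrowing the window around $x = 1$:

```python
import random

random.seed(2)

# Toy model: X ~ N(0, 1), Y = X^2 + small noise, so E(Y | X = x) = x^2.
n = 500_000
data = []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    data.append((x, x * x + random.gauss(0.0, 0.2)))

x0 = 1.0
estimates = {}
for eps in (0.5, 0.1, 0.02):
    # E(Y : X in [x0 - eps, x0 + eps]) / P(X in [x0 - eps, x0 + eps])
    window = [y for x, y in data if abs(x - x0) <= eps]
    estimates[eps] = sum(window) / len(window)

# As eps shrinks, the ratio approaches E(Y | X = 1) = 1.
print(estimates)
```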
On the topic of "conditioning on sets of measure zero," there is yet another fact worth mentioning. Let $U$ be a bounded open subset of $\mathbb{R}^{d}$ with smooth boundary, fix $x \in U$, and let $B^{x}$ be a standard Brownian motion with $B^{x}_{0} = x$. Let $\tau_{U}$ be the first time $B^{x}$ reaches $\partial U$. It turns out that there is another process $\tilde{B}^{x}$ such that, for each $t \geq 0$, $B^{x}_{t}$ conditioned on $\{\tau_{U} \geq T\}$ converges in distribution to $\tilde{B}^{x}_{t}$ as $T \to \infty$. That is, \begin{equation*} \mathbb{E}(f(\tilde{B}^{x}_{t})) = \lim_{T \to \infty} \frac{\mathbb{E}(f(B_{t}^{x}) : \tau_{U} \geq T)}{\mathbb{P}(\tau_{U} \geq T)}. \end{equation*} Note that $\tau_{U} < \infty$ almost surely, so here, as in the last paragraph, we have "(asymptotically) conditioned on a set of measure zero." I don't know if it's possible to interpret the statement "$\tilde{B}^{x}$ equals $B^{x}$ conditioned on $\tau_{U} = \infty$" in a more rigorous way. (More generally, in the theory of stochastic processes, the previous construction is called a Yaglom limit.)
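A toy discrete analogue of this Yaglom limit can be computed exactly (my own hypothetical discretization, not anything from the sources above): replace the killed Brownian motion by a lazy random walk on $\{1,2,3\}$ killed at $0$ and $4$. The conditional law of $X_t$ given $\{\tau \geq t\}$ converges to the quasi-stationary distribution, i.e. the normalized Perron eigenvector of the sub-stochastic transition matrix, which here is proportional to $\sin(\pi k/4)$:

```python
import math

# Lazy random walk on {1, 2, 3}: stay w.p. 1/2, step +/-1 w.p. 1/4 each;
# stepping to 0 or 4 kills the walk, so the matrix is sub-stochastic.
Q = [[0.50, 0.25, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.25, 0.50]]

# v[k] = P(X_t = k + 1, tau >= t), started from the middle state.
v = [0.0, 1.0, 0.0]
for _ in range(200):
    v = [sum(v[i] * Q[i][j] for i in range(3)) for j in range(3)]

survival = sum(v)                    # P(tau >= t): decays geometrically
yaglom = [p / survival for p in v]   # P(X_t = k + 1 | tau >= t)

# Exact quasi-stationary distribution: proportional to sin(pi * k / 4).
w = [math.sin(math.pi * k / 4.0) for k in (1, 2, 3)]
exact = [x / sum(w) for x in w]

print(yaglom, exact)
```

The laziness is just to avoid the period-2 oscillation of the plain walk; the conditioned distribution then converges while the unconditional survival probability goes to $0$, mirroring "conditioning on an asymptotically null event."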