Background
I am working my way through Cedric Villani's Optimal Transport: Old and New and ran into something I've seen before but do not understand one bit: disintegration. (I only know it's some generalization of conditional expectation, and that it's more like a Markov kernel.)
Question
In the proof of convexity of the optimal cost (Theorem 4.8, p. 59 [1]) he mentions $a\in L^{1}(d\mu_{\theta}d\lambda(\theta))$... what does this mean?! For further context:
"""
Theorem 4.8 (Convexity of the optimal cost).
... Let $(\Theta, \lambda)$ be a probability space, and let $\mu_{\theta}$, $\nu_{\theta}$ be two measurable functions defined on $\Theta$, with values in $P(\mathcal{X})$ and $P(\mathcal{Y})$ respectively. Assume $c(x,y) \geqq a(x) + b(y)$, where $a \in L^{1}(d\mu_{\theta}d\lambda(\theta))$ and $b \in L^{1}(d\nu_{\theta}d\lambda(\theta))$.
"""
References
[1] C. Villani, Optimal Transport: Old and New, vol. 338. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. Available: https://ljk.imag.fr/membres/Emmanuel.Maitre/lib/exe/fetch.php?media=b07.stflour.pdf
The notation is pretty intuitive, I think. It means that $a$ is measurable and $$ \int_\Theta\int_\mathcal{X} |a(x)|\,d\mu_\theta(x)\,d\lambda(\theta)<\infty. $$
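To connect this with disintegration, here is a short sketch of an equivalent reading. The symbol $\mu$ is introduced here for the mixture of the family $(\mu_\theta)$; the only assumption is that $\theta\mapsto\mu_\theta(A)$ is measurable for every measurable $A\subset\mathcal{X}$, which is how I read "measurable function with values in $P(\mathcal{X})$". Define
$$ \mu(A) := \int_\Theta \mu_\theta(A)\,d\lambda(\theta), \qquad A\subset\mathcal{X} \text{ measurable}. $$
This is a probability measure on $\mathcal{X}$, and approximating by simple functions (monotone convergence) gives, for every measurable $f\geq 0$,
$$ \int_\mathcal{X} f\,d\mu = \int_\Theta\int_\mathcal{X} f(x)\,d\mu_\theta(x)\,d\lambda(\theta). $$
Taking $f=|a|$, the condition $a\in L^{1}(d\mu_\theta\,d\lambda(\theta))$ is just $a\in L^{1}(\mu)$. In other words, $\mu_\theta$ plays the role of a Markov kernel from $\Theta$ to $\mathcal{X}$, and $d\mu_\theta\,d\lambda(\theta)$ is the measure you get by averaging that kernel against $\lambda$.

As a sanity check, take $\Theta=\{0,1\}$ with $\lambda=(1-t)\delta_0+t\delta_1$ for some $t\in[0,1]$. Then $\mu=(1-t)\mu_0+t\mu_1$, and the condition reads
$$ (1-t)\int_\mathcal{X}|a|\,d\mu_0 + t\int_\mathcal{X}|a|\,d\mu_1<\infty, $$
which is the ordinary convex-combination case.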