Advantages of Riesz theorem over Caratheodory Extension theorem


I apologize in advance if this question seems vague. I'm self studying Real and Complex Analysis by W. Rudin and I've been reading the proof of Riesz theorem in the second chapter which is then used in the construction of the Lebesgue measure.

In an analysis class I took a few years ago as an undergraduate, we took a different approach and used the Caratheodory extension theorem.

I may be biased but I find the second approach much more intuitive in the immediate applications, specifically in the construction of the Lebesgue measure. My question is then the following:

Are there measures that are more easily defined (or that can only be defined) using Riesz theorem?

That is, is there any significant difference in the two approaches?

Best Answer

General comments: Riesz is easier to remember

Arguably, Lebesgue measure is itself an example of a measure where the Riesz construction is easier. The annoying thing about the Caratheodory extension theorem is that you always have to check whether or not a certain class of sets is a semi-algebra, semi-ring, etc.; I personally never could remember the definitions of these objects. Compared to remembering the statement of Caratheodory and checking its assumptions, I would say it is much easier to check that Riemann integration defines a positive linear functional on $C_{c}(\mathbb{R}^{d})$ and then invoke Riesz to get Lebesgue integration for free.
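To make this concrete, here is a minimal numerical sketch (the function names are my own, and the Riemann sum is of course only an approximation) of the linear functional $f \mapsto \int f$ on $C_{c}(\mathbb{R})$ that Riesz turns into Lebesgue measure:

```python
import numpy as np

def riemann_functional(f, support=(-10.0, 10.0), n=100_000):
    """Approximate the positive linear functional L(f) = ∫ f(x) dx
    for f in C_c(R) by a midpoint Riemann sum on an interval
    containing the support of f."""
    a, b = support
    h = (b - a) / n
    x = np.linspace(a, b, n, endpoint=False) + h / 2  # midpoints
    return float(np.sum(f(x)) * h)

# A compactly supported continuous "tent" function: f(x) = max(0, 1 - |x|).
bump = lambda x: np.maximum(0.0, 1.0 - np.abs(x))

# L(bump) should be the area of a triangle with base 2 and height 1.
print(riemann_functional(bump))  # ≈ 1.0
```

Linearity and positivity of this functional are immediate; by Riesz, it is represented by integration against a Borel measure, which is exactly Lebesgue measure.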

I want to emphasize what is nice about this prescription: if I have an idea for a measure $\mu$ that I already more or less understand, and if I can recast it as a (bounded) linear functional on an appropriate space, then I can more-or-less mindlessly apply the Riesz Theorem, no further questions asked. We will see this in action in another example below.

General comments: Riesz Theorem has applications

As for intuition, yes, the Caratheodory approach for Lebesgue measure is more concrete since you are building a measure from a set function --- you start with measures of elementary sets, then extend to "all" sets. However, I do not think Rudin proves the Riesz Theorem just to define Lebesgue measure --- he wants to prove the Riesz theorem for its own sake. It has its own applications, which make it worth including in the book --- see the last part of this answer for a deeper (by now classical) application.

General comments: Intuition

If you think of a measure as describing how large a set is, then Caratheodory is intuitive --- start with elementary sets, then somehow "generate" the sizes of other sets by invoking the theorem. However, this geometric interpretation isn't the only way to think about measures. One of the other major applications of measure theory is in probability, where a probability measure often describes the law of a random variable. The most natural way of thinking about that is: if $X$ is my random variable (say, real-valued) and $f$ is some (say, continuous) function, then the law $\mu_{X}$ of $X$ is a probability measure such that \begin{equation*} \mathbb{E}(f(X)) = \int_{\mathbb{R}} f(y) \, \mu_{X}(dy). \end{equation*} You can think of $f(X)$ as an "observable" --- it gives me a snapshot of $X$. The Riesz Theorem tells us that (if $X$ is real-valued) $\mu_{X}$ is uniquely determined by these "expectation values" (i.e. the integrals $\int f \, d\mu_{X}$ for compactly supported, continuous $f$). This is often useful to know in probability --- you can use it to check that two random variables have the same law, for example. As we will see below, in some cases it is nice to know that I can construct my $X$ through knowledge of these integrals.
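The "expectation values determine the law" idea is easy to see numerically. A hedged Monte Carlo sketch (illustrative only, with test functions of my own choosing): $U$ and $1-U$, for $U$ uniform on $[0,1]$, are constructed differently but have the same law, so their expectation values agree up to sampling error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two differently constructed random variables with the same law:
# U is uniform on [0,1], and so is 1 - U.
u = rng.random(200_000)
x, y = u, 1.0 - u

# Compare "expectation values" E[f(X)] for a few test functions f.
tests = {"x^2": lambda t: t**2, "cos": np.cos, "exp": np.exp}
for name, f in tests.items():
    print(name, f(x).mean(), f(y).mean())  # each pair nearly equal
```

Of course, a finite family of test functions cannot prove equality of laws; the Riesz Theorem is what guarantees that agreement over *all* compactly supported continuous $f$ does.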

General comments: Similarities in the proof

I think it is important to note that many of the arguments appearing in the standard proof of the Caratheodory extension theorem also appear in Rudin's proof of the Riesz representation theorem. The two results are closely related. It is often true (even morally true) that what you can prove with one, you can also prove with the other.

Another example: Volume measures

Given a smooth compact orientable manifold $M$, we know there is a non-vanishing top-dimensional form $\omega$ such that \begin{equation*} \int_{M} \omega > 0. \end{equation*} We can then define integration "$d\text{Vol}$" on $M$ by \begin{equation*} \int_{M} f \, d\text{Vol} = \int_{M} f \omega \quad \text{if} \, \, f \in C^{\infty}(M). \end{equation*} A natural question is: what is "$d\text{Vol}$"?

The Riesz Theorem provides an easy answer. The linear functional $f \mapsto \int_{M} f \, d\text{Vol}$ satisfies the estimate \begin{equation*} \left| \int_{M} f \, d\text{Vol} \right| \leq \|f\|_{C(M)} \int_{M} \omega. \end{equation*} Hence, since $C^{\infty}(M)$ is dense in $C(M)$, it extends to a bounded linear functional on $C(M)$, which induces a measure by the Riesz Theorem. Therefore, there is a Borel measure $\text{Vol}$ on $M$ that gives "$d\text{Vol}$". Since $f \equiv 1$ is a $C^{\infty}$ function, it is not hard to show that $\text{Vol}(M) = \int_{M} \omega$.

Here I do not know how to define $\text{Vol}$ using the Caratheodory theorem. I suspect it is possible to define it using Hausdorff measure. This should give a lot more information than what the Riesz Theorem does, but it is nice to have quick constructions for a first pass.

Another example: invariant measures in ergodic theory

Invariant measures are basic objects in ergodic theory. If $X$ is a compact metric space and $T : X \to X$ a continuous map, we would like to know whether or not there is a Borel probability measure $\mu$ on $X$ such that $T_{*}\mu = \mu$, i.e. \begin{equation*} \mu(A) = T_{*}\mu(A) := \mu(T^{-1}(A)). \end{equation*} One way to think of this: if $Z$ is an $X$-valued random variable with law $\mu$, then $T(Z)$ also has law $\mu$ --- so the sequence $\{T^{n}(Z)\}_{n \in \mathbb{N} \cup \{0\}}$ is statistically "stationary" (identically distributed). You can interpret this as saying that $\mu$ is an "equilibrium measure" for $T$.

Do these things exist? In the setting above, the answer is yes, and there is a very simple proof --- using the Riesz Theorem in conjunction with the Banach-Alaoglu Theorem. Fix $x \in X$ and consider the measures \begin{equation*} \mu_{N} = N^{-1}\sum_{i = 0}^{N-1} \delta_{T^{i}(x)}. \end{equation*} We can think of this as a bounded sequence in $C(X)^{*}$ by the Riesz Theorem, and the Banach-Alaoglu Theorem gives us compactness. In fact, since $C(X)$ is separable, we have sequential compactness, so we can extract a weak-$*$ accumulation point $\mu$. It is possible to show that $T_{*} \mu = \mu$, that is, $\mu$ is an invariant measure for $T$.

This is probably the best example of a result where the implicit nature of the Riesz Theorem wins out over the constructive approach of the Caratheodory Theorem. I have my doubts about doing this with Caratheodory. Anyway, the above existence argument goes by the name Krylov-Bogoliubov Theorem.
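The empirical measures $\mu_{N}$ above are easy to experiment with numerically. A short sketch (my own illustration, not part of the theorem): for an irrational rotation of the circle $[0,1)$, whose unique invariant measure happens to be Lebesgue measure, the integrals $\int f \, d\mu_{N}$ converge to $\int_{0}^{1} f(x) \, dx$:

```python
import math

def empirical_integral(f, T, x0, N):
    """Compute ∫ f dμ_N where μ_N = N^{-1} Σ_{i<N} δ_{T^i(x0)}
    is the empirical measure along the orbit of x0 under T."""
    total, x = 0.0, x0
    for _ in range(N):
        total += f(x)
        x = T(x)
    return total / N

# Irrational rotation of the circle: T(x) = x + alpha (mod 1).
alpha = math.sqrt(2) - 1
T = lambda x: (x + alpha) % 1.0

# Test observable with ∫_0^1 f(x) dx = 0.
f = lambda x: math.cos(2 * math.pi * x)

print(empirical_integral(f, T, x0=0.1, N=100_000))  # ≈ 0
```

For the rotation, any weak-$*$ accumulation point of $(\mu_{N})$ is Lebesgue measure; in general one only gets *some* invariant measure, which is exactly the content of Krylov-Bogoliubov.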

Basically the same idea is used to obtain some fundamental results in the mathematical foundations of statistical mechanics. See the remarks on the Ising model below.

A related example is Haar measure on a locally compact group: at least one way to prove such a measure exists involves the construction of a linear functional first, followed by an invocation of Riesz. (See Folland's book on harmonic analysis for such a proof.)

Another example: finite state Markov chains

Let $S$ be a finite set and let $P : S \times S \to [0,1]$ be a function such that, for each $s \in S$, we have \begin{equation*} \sum_{s' \in S} P(s,s') = 1. \end{equation*}
Given a measure $p$ on $S$, we say that an $S$-valued stochastic process $\{X_{n}\}_{n \in \mathbb{N} \cup \{0\}}$ on a probability space $(\Omega,\mathcal{F},\mathbb{P})$ is a Markov chain with transition matrix $P$ and initial measure $p$ if, for each $s \in S$, \begin{equation*} \mathbb{P}\{X_{0} = s\} = p(s) \end{equation*} and, for each $s_{0},\dots,s_{n},s_{n+1} \in S$, \begin{equation*} \mathbb{P}\{X_{n+1} = s_{n+1} \, \mid \, X_{0} = s_{0}, \dots, X_{n} = s_{n}\} = P(s_{n},s_{n+1}). \end{equation*}

Here is a very easy and "intrinsic" construction of such a process, using the Riesz theorem. Let $\Omega = S^{\mathbb{N} \cup \{0\}}$. If we put the discrete topology on $S$ and then the product topology on $\Omega$, this makes $\Omega$ a compact Hausdorff space by the Tychonoff theorem. Let $\mathcal{C} \subseteq C(\Omega)$ denote the space of "cylindrical" functions that only depend on finitely many coordinates; that is, $f \in \mathcal{C}$ if and only if $f(X) = F(X_{0},X_{1},X_{2},\dots,X_{N})$ for some $F : S^{\{0,1,\dots,N\}} \to \mathbb{R}$ and $N \in \mathbb{N}$. Now define a linear functional $\varphi : \mathcal{C} \to \mathbb{R}$ by \begin{equation*} \varphi(f) = \sum_{s_{0} \in S} \sum_{s_{1} \in S} \dots \sum_{s_{N} \in S} F(s_{0},s_{1},\dots,s_{N}) p(s_{0}) P(s_{0},s_{1}) \dots P(s_{N-1},s_{N}) \quad \text{if} \, \, f(X) = F(X_{0},\dots,X_{N}). \end{equation*} It is not hard to check that this is well-defined (it depends only on $f$, not on the choice of $F$ or $N$) and that \begin{equation*} |\varphi(f)| \leq \|f\|_{C(\Omega)}. \end{equation*} Therefore, by density of $\mathcal{C}$ in $C(\Omega)$, there is a unique extension $\Phi : C(\Omega) \to \mathbb{R}$ of $\varphi$ such that $|\Phi(f)| \leq \|f\|_{C(\Omega)}$. By the Riesz Theorem, there is a measure $\mathbb{P}$ on the Borel $\sigma$-algebra $\mathcal{F}$ of $\Omega$ such that \begin{equation*} \Phi(f) = \int_{\Omega} f(\tilde{X}) \, \mathbb{P}(d\tilde{X}). \end{equation*} If we define $X_{n} : \Omega \to S$ by $X_{n}((\tilde{X}_{0},\tilde{X}_{1},\dots)) = \tilde{X}_{n}$, then the sequence of random variables $\{X_{n}\}_{n \in \mathbb{N} \cup \{0\}}$ on $(\Omega,\mathcal{F},\mathbb{P})$ is a Markov chain with transition matrix $P$ and initial measure $p$, as we sought.
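As a sanity check on the cylinder-set probabilities $p(s_{0}) P(s_{0},s_{1}) \cdots P(s_{N-1},s_{N})$, here is a small simulation sketch (the two-state matrix below is made up for illustration): simulating the chain and comparing an empirical cylinder probability with the product formula:

```python
import numpy as np

def sample_chain(p, P, n_steps, rng):
    """Sample one path X_0, ..., X_{n_steps} of the Markov chain with
    initial distribution p and transition matrix P (rows sum to 1)."""
    states = np.arange(len(p))
    path = [int(rng.choice(states, p=p))]
    for _ in range(n_steps):
        path.append(int(rng.choice(states, p=P[path[-1]])))
    return path

rng = np.random.default_rng(0)
p = np.array([0.5, 0.5])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# The cylinder probability P{X_0 = 0, X_1 = 0} should be
# p(0) * P(0,0) = 0.5 * 0.9 = 0.45; estimate it by simulation.
paths = [sample_chain(p, P, 1, rng) for _ in range(20_000)]
freq = sum(1 for q in paths if q == [0, 0]) / len(paths)
print(freq)  # ≈ 0.45
```

The functional $\varphi$ in the construction above is precisely the expectation of cylindrical observables under these finite-dimensional product weights.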

A probabilist could very well complain that the above construction is silly. In response, I would submit the following special case: if $S = \{0,1\}$, $P : S \times S \to [0,1]$ is the constant function $P \equiv \frac{1}{2}$, and $p : S \to [0,1]$ is also constant, $p \equiv \frac{1}{2}$, then the construction above gives us an i.i.d. sequence of Bernoulli random variables. That is, the process $\{X_{n}\}_{n \in \mathbb{N} \cup \{0\}}$ is determined by the property that, for each $s_{0},\dots,s_{N} \in \{0,1\}$, \begin{equation*} \mathbb{P}\{X_{0} = s_{0},\dots,X_{N} = s_{N}\} = 2^{-(N+1)} = \prod_{i = 0}^{N} \mathbb{P}\{X_{i} = s_{i}\}. \end{equation*} Such Bernoulli sequences are important building blocks for other stochastic processes (or other measures --- another construction of Lebesgue measure uses a Bernoulli sequence), and the proof that such a sequence exists can be a bit of a bugaboo.
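The parenthetical remark about building Lebesgue measure from a Bernoulli sequence can be made concrete. Assuming i.i.d. fair bits $b_{1}, b_{2}, \dots$, the binary expansion $X = \sum_{k \geq 1} b_{k} 2^{-k}$ is uniformly distributed on $[0,1]$, i.e. its law is Lebesgue measure. A short simulation sketch (truncating the expansion at 30 bits):

```python
import numpy as np

rng = np.random.default_rng(0)

# i.i.d. fair Bernoulli bits, 30 bits per sample.
n_samples, n_bits = 100_000, 30
bits = rng.integers(0, 2, size=(n_samples, n_bits))

# X = Σ_k b_k 2^{-k}: map each bit string to a point of [0,1].
weights = 0.5 ** np.arange(1, n_bits + 1)
x = bits @ weights

# Uniformity check: the CDF should satisfy P{X <= t} ≈ t.
for t in (0.25, 0.5, 0.75):
    print(t, (x <= t).mean())
```

This is exactly the coin-flipping construction of Lebesgue measure: push the Bernoulli product measure forward through the binary-expansion map.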

A last note: here it is not very hard to construct the Markov chain above using Caratheodory's extension theorem. I leave it to you to figure out how to do this, and then you can decide which is easier to achieve.

Applications: Ising model

As we saw above, when you combine the Riesz Theorem with the Banach-Alaoglu Theorem, you get powerful existence results. Here is a very nice application: in the Ising model of statistical mechanics, one is interested in studying what physicists think of as "statistical/physical equilibrium." It turns out that such equilibria mathematically correspond to probability measures, which, for the Ising model, are measures on $\{-1,1\}^{\mathbb{Z}^{d}}$. To prove that such measures exist, a by now standard approach is to start with certain simpler measures $(\mu_{N})_{N \in \mathbb{N}}$ --- you can think of $\mu_{N}$ as the probability measure describing physical equilibrium in an "experiment" conducted in $\mathbb{Z}^{d} \cap [-N,N]^{d}$ rather than the whole "universe" $\mathbb{Z}^{d}$ --- and then use compactness (guaranteed by the Banach-Alaoglu Theorem and Riesz Theorem) to obtain a sub-sequential limit $\mu = \lim_{j \to \infty} \mu_{N_{j}}$, which turns out to be exactly the kind of physical equilibrium (in the whole of $\mathbb{Z}^{d}$, not just a box) we seek. In the study of the Ising model, one then goes on to ask what properties $\mu$ has, how many possible equilibria $\mu$ there are, and more. (It is easy to construct the measures $(\mu_{N})_{N \in \mathbb{N}}$, and the limiting measure $\mu$ is not unique in general --- there can be many physical equilibria.)