Variance computation in B. Maurey's 'empirical method' proof of the approximate form of Caratheodory's theorem


I am currently studying the textbook High-Dimensional Probability by Roman Vershynin. Theorem 0.0.2 of the opening chapter Using Probability to Cover a Geometric Set presents the Approximate form of Caratheodory's theorem as follows:

Theorem 0.0.2 (Approximate form of Caratheodory's theorem) Consider a set $T \subset \mathbb{R}^n$ whose diameter$^1$ is bounded by $1$. Then, for every point $x \in \text{conv}(T)$ and every integer $k$, one can find points $x_1, \dots, x_k \in T$ such that $$\left\vert \left\vert x - \dfrac{1}{k} \sum_{j = 1}^k x_j \right\vert \right\vert_2 \le \dfrac{1}{\sqrt{k}}.$$ There are two reasons why this result is surprising. First, the number of points $k$ in convex combinations does not depend on the dimension $n$. Second, the coefficients of convex combinations can be made all equal. (Note, however, that repetitions among the points $x_i$ are allowed.)

$^1$ The diameter of $T$ is defined as $\text{diam}(T) = \text{sup} \{ \vert\vert s - t \vert\vert_2 : s, t \in T \}$. We have assumed that $\text{diam}(T) = 1$ for simplicity. For a general set $T$, the bound in the theorem changes to $\text{diam}(T)/\sqrt{k}$. Check this!

Proof Our argument is known as the empirical method of B. Maurey.
Translating $T$ if necessary, we may assume that not only the diameter but also the radius of $T$ is bounded by $1$, i.e., $$\vert\vert t \vert\vert_2 \le 1 \ \ \ \text{for all $t \in T$}. \tag{0.2}$$ Fix a point $x \in \text{conv}(T)$ and express it as a convex combination of some vectors $z_1, \dots, z_m \in T$ as in (0.1). Now, interpret the definition of the convex combination (0.1) probabilistically, with the $\lambda_i$ taking the roles of probabilities. Specifically, we can define a random vector $Z$ that takes the value $z_i$ with probability $\lambda_i$: $$\mathbb{P} \{ Z = z_i \} = \lambda_i, \ \ \ i = 1, \dots, m.$$ (This is possible by the fact that the weights $\lambda_i$ are non-negative and sum to $1$.) Then $$\mathbb{E} Z = \sum_{i = 1}^m \lambda_i z_i = x .$$ Consider independent copies $Z_1, Z_2, \dots$ of $Z$. by the strong law of large numbers, $$\dfrac{1}{k} \sum_{j = 1}^k Z_j \to x \ \ \ \text{almost surely as $k \to \infty$} .$$ To get a quantitative form of this result, let us compute the variance of $\dfrac{1}{k} \sum_{j = 1}^k Z_j$. (Incidentally, this computation is at the heart of the proof of the weak law of large numbers.) We obtain $$\begin{align} \mathbb{E} \left\vert \left\vert x - \dfrac{1}{k} \sum_{j = 1}^k Z_j \right\vert \right\vert_2^2 &= \dfrac{1}{k^2} \mathbb{E} \left\vert \left\vert \sum_{j = 1}^k (Z_j - x) \right\vert \right\vert_2^2 \ \ \ \text{(since $\mathbb{E} (Z_i - x) = 0)$} \\ &= \dfrac{1}{k^2} \sum_{j = 1}^k \mathbb{E} \vert\vert Z_j - x \vert\vert_2^2 . \end{align}$$

I am unsure about this part:

$$\begin{align} \mathbb{E} \left\vert \left\vert x - \dfrac{1}{k} \sum_{j = 1}^k Z_j \right\vert \right\vert_2^2 &= \dfrac{1}{k^2} \mathbb{E} \left\vert \left\vert \sum_{j = 1}^k (Z_j - x) \right\vert \right\vert_2^2 \ \ \ \text{(since $\mathbb{E} (Z_i - x) = 0)$} \\ &= \dfrac{1}{k^2} \sum_{j = 1}^k \mathbb{E} \vert\vert Z_j - x \vert\vert_2^2 . \end{align}$$

The Wikipedia article for variance derives variance using expected values as follows:

$${\displaystyle {\begin{aligned}\operatorname {Var} (X)&=\operatorname {E} \left[(X-\operatorname {E} [X])^{2}\right]\\[4pt]&=\operatorname {E} \left[X^{2}-2X\operatorname {E} [X]+\operatorname {E} [X]^{2}\right]\\[4pt]&=\operatorname {E} \left[X^{2}\right]-2\operatorname {E} [X]\operatorname {E} [X]+\operatorname {E} [X]^{2}\\[4pt]&=\operatorname {E} \left[X^{2}\right]-\operatorname {E} [X]^{2}\end{aligned}}}$$

Comparing this Wikipedia proof to the textbook proof, I don't see the correspondence. In particular, I don't see how the expressions $\mathbb{E} \left\vert \left\vert x - \dfrac{1}{k} \sum_{j = 1}^k Z_j \right\vert \right\vert_2^2$ and $\dfrac{1}{k^2} \sum_{j = 1}^k \mathbb{E} \vert\vert Z_j - x \vert\vert_2^2$ are analogous to the expressions in the Wikipedia proof of variance. It seems to me that the presence of the $L^2$-norm is the odd one out. So how exactly is this textbook proof analogous to a variance computation of $\dfrac{1}{k} \sum_{j = 1}^k Z_j$?

Answer

By linearity of expectation we have $$\Bbb E\left[\sum_{j=1}^kZ_j\right]=\sum_{j=1}^k\Bbb E[Z_j]=\sum_{j=1}^kx=kx\implies \Bbb E\left[\frac1k\sum_{j=1}^kZ_j\right]=x.$$ The Wikipedia definition of variance generalizes to random vectors by replacing the square with the squared $\ell^2$-norm, since the natural analogue of squaring a vector is taking the dot product of the vector with itself: $$\Bbb V\left[\frac1k\sum_{j=1}^kZ_j\right]=\Bbb E\left[\left\|\frac1k\sum_{j=1}^kZ_j-\Bbb E\left[\frac1k\sum_{j=1}^kZ_j\right]\right\|_2^2\right].$$ Substituting the first equation, $$\Bbb V\left[\frac1k\sum_{j=1}^kZ_j\right]=\Bbb E\left[\left\|x-\frac1k\sum_{j=1}^kZ_j\right\|_2^2\right]=\Bbb E\left[\left\|\frac1k\left(kx-\sum_{j=1}^kZ_j\right)\right\|_2^2\right],$$ and the rest is straightforward: pull the factor $1/k^2$ out of the norm and expand the squared norm of the sum, where the cross terms $\Bbb E\langle Z_i-x,Z_j-x\rangle$ for $i\ne j$ vanish by independence together with $\Bbb E(Z_j-x)=0$.
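To make "the rest is straightforward" concrete, one can also verify numerically that the cross terms really do vanish, i.e. that $\Bbb E\left\|\sum_{j=1}^k(Z_j-x)\right\|_2^2 = \sum_{j=1}^k \Bbb E\|Z_j-x\|_2^2$ for i.i.d. copies with mean $x$. Here is a small Monte Carlo sketch (toy data of my own, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(1)

n, m, k, trials = 5, 4, 3, 200000

# Discrete random vector Z: values z_i in R^n with probabilities lam_i
Z_vals = rng.standard_normal((m, n))
lam = rng.random(m)
lam /= lam.sum()
x = lam @ Z_vals  # x = E[Z]

# Monte Carlo estimate of the left side E||sum_j (Z_j - x)||_2^2
idx = rng.choice(m, size=(trials, k), p=lam)
samples = Z_vals[idx]                       # shape (trials, k, n)
diff = samples.sum(axis=1) - k * x          # sum_j (Z_j - x) per trial
lhs = np.mean(np.sum(diff ** 2, axis=1))

# Exact right side: sum_j E||Z_j - x||_2^2 = k * E||Z - x||_2^2
rhs = k * np.sum(lam * np.sum((Z_vals - x) ** 2, axis=1))

print(lhs, rhs)  # the two sides agree up to Monte Carlo error
```

The agreement of the two sides is exactly the statement that the cross terms contribute nothing in expectation.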