Concentration of measure on sphere: Bounding the probability of a large angle

243 Views Asked by At

Fix any $y$ on the sphere $S^{n-1}:=\{x\in\mathbb{R}^n : \|x\|_2=1\}$. Let $z$ be a random variable, uniformly distributed on $S^{n-1}$. Show that for any $\epsilon\in(0,1/\sqrt{2})$. $$ \mathbb{P}[|y^Tz|> \epsilon]\leq \left(1-\epsilon^2\right)^{n/2} $$ This is an exercise left to the reader in Martin Wainwright's High Dimensional Statistics, p. 69. The author statest that it is a "geometric calculation".

What I tried: The most promising approach I have is the observation that the term $\sqrt{1-\epsilon^2}$ is the length of the leg of a right-angled triangle with hypotenuse of length 1 and leg of length $\epsilon$. Such a triangle indeed readily appears when one draws a two-dimensional sphere, and the region $\{x\in\mathbb{R}^n : |y^Tx|>\epsilon\}$. But I don't understand how to connect this to the probability in question, and consequentially how to get the exponent $n$. Can anyone help?

2

There are 2 best solutions below

4
On

I do not see what was the geometric method mentioned by Wainwright. The probability can be expressed as a one-dimensional integral as follows.

First, by invariance by rotations, one may choose $y$ equal to the first vector of the canonical basis. Moreover, the uniform distribution on $\mathbb{S}_{n-1}$ is the distribution of $U/(|U|)$, where $U=(U_1,\ldots,U_n)$ is a gaussian vector with distribution $\mathcal{N}(0,I_n)$, and $|U|$ its usual Euclidean norm.

Therefore the desired probability is $$\mathbb{P}[U_1^2 \ge \epsilon^2 |U|^2] = \mathbb{P}[(1-\epsilon^2)U_1^2 \ge \epsilon^2 (U_2^2+\cdots+U_n^2)].$$ The random variables $U_i^2$ are independent with distribution Gamma$(1/2,1/2)$, hence the distribution of $U_1^2/(U_2^2+\cdots+U_n^2)$ is Beta$(1/2,(n-1)/2)$. As a result, $$\mathbb{P}[U_1^2 \ge \epsilon^2 |U|^2] = \int_{\epsilon^2}^1\frac{\Gamma(n/2)}{\Gamma(1/2)\Gamma((n-1)/2)} x^{-1/2}(1-x)^{n-3/2}dx.$$ The change of variable $y=x-\epsilon^2$ yields $$\mathbb{P}[U_1^2 \ge \epsilon^2 |U|^2] = \int_0^{1-\epsilon^2}\frac{\Gamma(n/2)}{\Gamma(1/2)\Gamma((n-1)/2)} (\epsilon^2+y)^{-1/2}(1-\epsilon^2-y)^{n-3/2}dy.$$ I use the inequalities $\epsilon^2+y \ge y$ and $0 \le 1-\epsilon^2-y \le (1-\epsilon^2)(1-y)$ for $y \in [0,1-\epsilon^2]$. $$\mathbb{P}[U_1^2 \ge \epsilon^2 |U|^2] \le (1-\epsilon^2)^{n-3/2}\int_0^{1-\epsilon^2}\frac{\Gamma(n/2)}{\Gamma(1/2)\Gamma((n-1)/2)} y^{-1/2}(1-y)^{n-3/2}dy.$$ The last integral over $[0,1-\epsilon]$ is less that the same integral over $[0,1]$, so $$\mathbb{P}[U_1^2 \ge \epsilon^2 |U|^2] \le (1-\epsilon^2)^{n-3/2}.$$

2
On

The only geometric method I think could think of is the one similar to the method mentioned by @stochasticboy321 in the comments. I will add some details and complete the answer.

We begin by computing the surface area of the region $\mathcal{Z}_y = \{ z \in S^{n-1} : \langle y, z \rangle > \varepsilon\} $ for any fixed vector $y$. WLOG, let $y$ be the first vector in the standard basis of $\mathbb{R}^n$. Thus, $\mathcal{Z}_y$ corresponds to the region on the sphere above the plane $x_1 = \varepsilon$. Let $A_{d}$ denote the area of sphere $S^{d-1}$ and $A(\mathcal{Z}_y)$ denote the area of region $\mathcal{Z}_y$. We can calculate $A(\mathcal{Z}_y)$ by integrating area of thin rings (essentially scaled version of $S^{n-2}$) of radius $\sqrt{1 - r^2}$ and height $\mathrm{d}r$ at a height $r$ from $x_1 = 0$. Thus,
\begin{align*} A(\mathcal{Z}_y) & = \int_{\varepsilon}^{1} A_{n-1} \cdot (1 - r^2)^{(n-2)/2} \mathrm{d}r \\ & \leq A_{n-1} (1 - \varepsilon^2)^{n/2} \cdot \frac{1 - \varepsilon}{1 - \varepsilon^2} \\ & \leq A_{n-1} (1 - \varepsilon^2)^{n/2}. \end{align*} The required probability is $\dfrac{2A(\mathcal{Z}_y)}{A_{n}} = \dfrac{2A_{n-1}}{A_n} \cdot (1 - \varepsilon^2)^{n/2}$. Using the fact that $A_{n} = 2\pi^{n/2}/\Gamma(n/2)$, we can conclude that the leading constant is less than $1$ for $n \geq 3$. I think you can conclude the argument by separately showing it for $n = 2$.

On a separate note, I personally feel Christophe's method is more elegant and that is my preferred approach.