Showing that $\cos(u_1,u_3)=\cos(u_1,u_2)\cos(u_2,u_3)$ for nearby $u_i$


Edit: if we replace rotations with "add isotropic noise", this relation can be proven using Chebyshev's inequality, as shown here. The $\pi/4$ angle seems to be connected to forgetting the starting point. In high dimensions, random rotations seem to keep the iterates $u_1,u_2,\ldots$ roughly along the same line (hence the triangle inequality for cosines becomes an equality) until an angle of $\pi/2$ is reached, at which point the process becomes ergodic.


Suppose I start with a vector $u_1$ in $d$ dimensions and obtain $u_{i+1}$ by performing a sequence of $i$ small rotations in $d$ dimensions. For $d=100$, the following gives a good approximation, within 0.1% of the true value in expectation.

$$\cos(u_1,u_4)=\cos(u_1,u_2)\cos(u_2,u_3)\cos(u_3,u_4)$$

where

$$\cos(x,y)=\frac{\langle x, y\rangle}{\|x\| \|y\|}$$

A "small rotation" of $v$ is done by sampling a vector $z$ with entries drawn from the standard normal distribution and rotating $v$ by $\theta$ radians in the plane spanned by $v$ and $z$. The identity works for rotation angles $\theta_i\le\pi/4$ and breaks down for angles slightly above $\pi/4$.

  1. How can this be justified?
  2. Why is $\pi/4$ special?
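
For anyone who wants to experiment, here is a minimal NumPy sketch of the setup described above. It is one possible reading of the procedure, not necessarily the code behind the numbers quoted in the question: the function names (`cosine`, `small_rotation`, `compare`), the Gram-Schmidt step used to build the rotation plane, and the defaults `theta=0.1` and `seed=0` are all assumptions made for illustration.

```python
import numpy as np


def cosine(x, y):
    """Cosine similarity <x, y> / (||x|| ||y||)."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


def small_rotation(v, theta, rng):
    """Rotate v by theta radians in the plane spanned by v and a Gaussian z.

    Assumed interpretation: z has i.i.d. standard normal entries, and the
    rotation moves v toward the component of z that is orthogonal to v.
    """
    norm_v = np.linalg.norm(v)
    v_hat = v / norm_v
    z = rng.standard_normal(v.shape[0])
    # Gram-Schmidt: keep only the part of z orthogonal to v.
    w = z - np.dot(z, v_hat) * v_hat
    w_hat = w / np.linalg.norm(w)
    # Rotation by theta inside span{v_hat, w_hat}; the length of v is preserved.
    return norm_v * (np.cos(theta) * v_hat + np.sin(theta) * w_hat)


def compare(d=100, theta=0.1, steps=3, seed=0):
    """Return cos(u_1, u_{steps+1}), the product of consecutive cosines,
    and the list of consecutive cosines."""
    rng = np.random.default_rng(seed)
    u = [rng.standard_normal(d)]
    for _ in range(steps):
        u.append(small_rotation(u[-1], theta, rng))
    direct = cosine(u[0], u[-1])
    consecutive = [cosine(a, b) for a, b in zip(u, u[1:])]
    product = float(np.prod(consecutive))
    return direct, product, consecutive


direct, product, consecutive = compare()
print(f"cos(u_1, u_4)                  = {direct:.6f}")
print(f"product of consecutive cosines = {product:.6f}")
```

With this construction every consecutive cosine equals $\cos\theta$ up to floating-point error, so the product is $\cos^3\theta$; only the direct cosine $\cos(u_1,u_4)$ is random. Averaging `direct` over many seeds and varying `theta` is one way to probe the behaviour near $\pi/4$.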

There is 1 best solution below


EDIT: It turns out I was misunderstanding. This is not a solution to the problem.

Unless I’m misunderstanding, I think this is a corollary of the following fact:

“Theorem”: Given two uniform random unit vectors $u,v \in \mathbb{R}^d$, for large values of $d$ we usually see that $\langle u, v \rangle \approx 0$.

I put theorem in quotes because I haven’t defined “large $d$” and I haven’t defined “usually”. Another way of stating this theorem is that in high dimensions, almost all vectors are almost orthogonal.

The argument for this “theorem” is pretty intuitive. Given two such unit vectors chosen independently and uniformly at random, the central limit theorem suggests that, approximately for large $d$: $$\langle u, v \rangle \sim \mathcal{N}\left(0, \frac{1}{d}\right)$$ That is, the dot product is approximately distributed as a normal random variable with zero mean and variance $1/d$.

As you can see, as $d \rightarrow \infty$, the variance $\sigma^2 = 1/d \rightarrow 0$. Hence we “usually” see dot products close to zero.
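
The mean and variance claimed above are easy to check numerically. The following is a rough sketch, assuming the usual trick of normalizing Gaussian vectors to get uniform points on the unit sphere; the helper names (`random_unit_vectors`, `dot_product_stats`) and the sample size are illustrative choices, not part of the original argument.

```python
import numpy as np


def random_unit_vectors(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d
    by normalizing standard Gaussian vectors."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def dot_product_stats(d=100, n=20000, seed=0):
    """Empirical mean and variance of <u, v> over n independent pairs."""
    rng = np.random.default_rng(seed)
    u = random_unit_vectors(n, d, rng)
    v = random_unit_vectors(n, d, rng)
    dots = np.sum(u * v, axis=1)  # row-wise dot products
    return float(dots.mean()), float(dots.var())


mean, var = dot_product_stats()
print(f"empirical mean     = {mean:.5f}")
print(f"empirical variance = {var:.5f}  (compare with 1/d = {1 / 100:.5f})")
```

The empirical variance should land close to $1/d$, and repeating the run with larger `d` shows the dot products concentrating around zero.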

It immediately follows that both the right-hand side and the left-hand side of your expression should, on average, be very close to zero for large $d$. As Brian Tung points out, though, the ratio may blow up.