Variance of dot product of two normalized random vectors


Consider two normalized vectors $\mathbf{x}$ and $\mathbf{y}$ of length $N$, each obtained by sampling every entry independently from $\mathcal{N}(0,1)$ and then dividing by the vector's norm.

By running simulations I got this empirical result:

$$var(\mathbf{x}^T\mathbf{y})=\frac{1}{N}$$
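A minimal sketch of such a simulation, assuming NumPy is available. The function name `dot_product_variance`, the trial count, and the seed are illustrative choices, not the code originally used to obtain the result above.

```python
import numpy as np

def dot_product_variance(n, trials=100_000, seed=0):
    """Estimate var(x^T y) for two independent normalized Gaussian vectors of length n."""
    rng = np.random.default_rng(seed)
    # Each row is one sample vector with i.i.d. N(0, 1) entries.
    x = rng.standard_normal((trials, n))
    y = rng.standard_normal((trials, n))
    # Divide every row by its Euclidean norm.
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    y /= np.linalg.norm(y, axis=1, keepdims=True)
    # Row-wise dot products x_k^T y_k.
    dots = np.einsum("ij,ij->i", x, y)
    return dots.var()

for n in (2, 5, 20):
    print(n, dot_product_variance(n), 1 / n)
```

The two printed values in each row should be close, up to sampling noise.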

Can this be proved?


There are 3 best solutions below

5
On BEST ANSWER

This can be treated in analogy to the question "Expected value of inner product of uniformly distributed random unit vectors".

By rotational symmetry, we can assume one of the vectors to be a fixed unit vector, say, $\mathbf e_1$. The dot product is then just the first component of the other vector, and its expected value is $0$ by symmetry. Thus the variance is the expected value of the square of that component. The sum of the squares of the $N$ components is $1$ due to normalization, so by symmetry the expected value of each squared component must be $\frac1N$.
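The reduction to a fixed vector $\mathbf e_1$ can be checked numerically. The sketch below, assuming NumPy, samples only one normalized Gaussian vector and looks at its first component; the helper name `first_component_moments` is hypothetical.

```python
import numpy as np

def first_component_moments(n, trials=100_000, seed=1):
    """Return the sample mean and sample second moment of y . e_1 for normalized Gaussian y."""
    rng = np.random.default_rng(seed)
    y = rng.standard_normal((trials, n))
    y /= np.linalg.norm(y, axis=1, keepdims=True)
    first = y[:, 0]  # dot product with e_1 is the first component
    return first.mean(), (first ** 2).mean()

mean, second_moment = first_component_moments(4)
print(mean, second_moment)  # mean should be near 0, second moment near 1/4
```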

7
On

The solution provided by joriki is great. One sufficient condition is uniformity (as joriki points out, invariance under interchange of the coordinates is the name of the game): normalized Gaussian vectors are uniformly distributed on the sphere. Intuitively, when you generate a Gaussian vector, its probability density is a function of the sum of squares of its coordinates only, so it is exactly the same at all points lying on the same sphere centered at the origin. Normalization projects each such sphere radially onto the unit sphere $\left| r \right|=1$. Since Gaussian vectors are distributed uniformly on each sphere, their projection is uniform, too.

If you want to verify the solution with a little calculus, you can use spherical coordinates. Because the points lie on the sphere, there are $n-1$ degrees of freedom, and with one vector fixed along the polar axis the dot product is $\cos\phi_1$, the cosine of the polar angle.

$$\frac{\Gamma(n/2)}{2\pi^{n/2}}\int_{0}^{2\pi}\int_{0}^{\pi}\cdots\int_{0}^{\pi}\cos^{2}(\phi_{1})\sin^{n-2}(\phi_{1})\sin^{n-3}(\phi_{2})\cdots \sin(\phi_{n-2})\,d\phi_{1}\cdots d\phi_{n-2}\,d\phi_{n-1} \\=\frac{\int_{0}^{\pi}\cos^{2}x\,\sin^{n-2}x\,dx}{\int_{0}^{\pi}\sin^{n-2}x\,dx}=1-\frac{\int_{0}^{\pi}\sin^{n}x\,dx}{\int_{0}^{\pi}\sin^{n-2}x\,dx}=\frac{1}{n}$$
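The last step, that the ratio of the two sine integrals equals $\frac{n-1}{n}$, can be checked numerically. This is a rough sketch assuming NumPy; it uses a plain midpoint rule, and the names `sin_power_integral` and `mean_cos_squared` are made up for illustration.

```python
import numpy as np

def sin_power_integral(k, m=100_000):
    """Midpoint-rule approximation of the integral of sin(x)**k over [0, pi]."""
    h = np.pi / m
    x = (np.arange(m) + 0.5) * h  # midpoints of m equal subintervals
    return np.sum(np.sin(x) ** k) * h

def mean_cos_squared(n):
    """1 minus the ratio of the sin^n and sin^(n-2) integrals, for n >= 2."""
    return 1 - sin_power_integral(n) / sin_power_integral(n - 2)

for n in (2, 3, 10):
    print(n, mean_cos_squared(n), 1 / n)
```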

0
On

With $u_i , v_i\stackrel{iid}{\sim} \mathcal{N}(0,1)$, $x_i = u_i/\sqrt{\sum_j u_j^2}$ and $y_i = v_i/\sqrt{\sum_j v_j^2}$:

\begin{align}
var(\mathbf{x}^T\mathbf{y})&=\mathbb{E}[(\mathbf{x}^T\mathbf{y})^2]-(\mathbb{E}[\mathbf{x}^T\mathbf{y}])^2\\
&=\mathbb{E}\left[\frac{(\sum_i u_iv_i)^2}{(\sum_i u_i^2)(\sum_i v_i^2)}\right]\\
&=\mathbb{E}\left[\frac{\sum_{i}u_i^2v_i^2+\sum_i\sum_{j\neq i}u_i u_j v_i v_j}{(\sum_i u_i^2)(\sum_i v_i^2)}\right]\\
&=N\, \mathbb{E}\left[\frac{u_1^2 v_1^2}{(\sum_i u_i^2)(\sum_i v_i^2)}\right]\\
&=N\, \mathbb{E}\left[\frac{u_1^2}{\sum_{i}u_i^2}\right]\mathbb{E}\left[\frac{v_1^2}{\sum_{i}v_i^2}\right]\\
&=\frac{1}{N}
\end{align}

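The second equality follows from $\mathbb{E}[\mathbf{x}^T\mathbf{y}]=\sum_i E[x_i]E[y_i] = 0$; the fourth equality requires $E[x_i x_j]=0$ for $i\neq j$. Both hold for the standard Gaussian distribution, which is symmetric under flipping the sign of any single coordinate. We also used permutation symmetry and the independence of $x_1^2$ and $y_1^2$, together with $E[x_1^2]=E[y_1^2]=\frac1N$, which follows from normalization.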
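The split into diagonal and cross terms used in the derivation can also be examined by simulation. The following is a sketch under the same sampling assumptions, using NumPy; `split_terms` is a hypothetical helper that estimates the expected diagonal sum $\sum_i x_i^2 y_i^2$ and the expected cross-term sum separately.

```python
import numpy as np

def split_terms(n, trials=100_000, seed=2):
    """Estimate E[sum_i x_i^2 y_i^2] and E[sum_{i != j} x_i x_j y_i y_j]."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((trials, n))
    y = rng.standard_normal((trials, n))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    y /= np.linalg.norm(y, axis=1, keepdims=True)
    prod = x * y                               # entries x_i * y_i
    diag = (prod ** 2).sum(axis=1)             # sum_i x_i^2 y_i^2 per sample
    cross = prod.sum(axis=1) ** 2 - diag       # remaining i != j terms per sample
    return diag.mean(), cross.mean()

diag_mean, cross_mean = split_terms(4)
print(diag_mean, cross_mean)  # diagonal part should be near 1/4, cross part near 0
```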