Spherical trigonometry and statistics in higher dimensions


Until recently I had never fully assimilated the spherical laws of sines and cosines into my understanding, and thinking about them now, I see parallels with some things in statistics.

Segments from the center of a sphere of unit radius to the three vertices of a triangle on the sphere meet pairwise at angles $\alpha,\beta,\gamma.$ The sides of the triangle are arcs of great circles. At the vertex lying on the one segment that is not involved in the angle $\alpha,$ the two arcs meet at angle $\mathbb A;$ and as $\alpha$ is related to $\mathbb A,$ so also are $\beta,\gamma$ related to $\mathbb B,\mathbb C$ respectively. The spherical law of cosines then says $$ \cos\beta = \cos\alpha\cos\gamma+\sin\alpha\sin\gamma\cos\mathbb B. \tag 1 $$
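As a numerical sanity check, one can draw a random spherical triangle and verify $(1)$; here is a minimal numpy sketch (the helper `tangent_at` is my own way of computing the vertex angle, via tangents to the great-circle arcs):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three random unit vectors from the center of the sphere to the vertices.
x, y, z = (v / np.linalg.norm(v) for v in rng.standard_normal((3, 3)))

# Central angles (= arc lengths of the sides, on a unit sphere).
alpha = np.arccos(x @ y)   # side between x and y
beta  = np.arccos(y @ z)   # side between y and z
gamma = np.arccos(z @ x)   # side between z and x

def tangent_at(p, q):
    """Unit tangent at p along the great circle from p toward q."""
    t = q - (q @ p) * p
    return t / np.linalg.norm(t)

# Vertex angle B at x, the vertex not involved in the side beta.
B = np.arccos(tangent_at(x, y) @ tangent_at(x, z))

# Spherical law of cosines, equation (1).
assert np.isclose(np.cos(beta),
                  np.cos(alpha) * np.cos(gamma)
                  + np.sin(alpha) * np.sin(gamma) * np.cos(B))
```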

Now consider vectors $x=(x_1,\ldots,x_n), y=(y_1,\ldots,y_n)\in\mathbb R^n.$ Let $\overline x=(x_1+\cdots+x_n)/n$ and similarly define $\overline y.$ Let $s_x^2=\big((x_1-\overline x)^2 + \cdots + (x_n-\overline x)^2\big)/n,\, s_x>0$ and similarly define $s_y^2.$ (Sometimes you see $n-1$ rather than $n$ in the denominator, but that won't change anything that concerns us here.)

It is often remarked that the correlation $\displaystyle \frac{\sum_{i=1}^n (x_i-\overline x)(y_i-\overline y) }{n s_xs_y} $ is the cosine of the angle between the two vectors $\left( \frac{x_i-\overline x}{s_x} : i=1,\ldots,n \right)$ and $\left( \frac{y_i-\overline y}{s_y} : i=1,\ldots,n \right),$ known as the standardizations of $x$ and $y.$ (Each of these vectors has norm $\sqrt n,$ and the cosine is unaffected by that common scaling.)
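This is easy to check numerically; a minimal sketch with made-up data, where `standardize` is a hypothetical helper following the $/n$ convention above:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(50)
y = x + rng.standard_normal(50)          # correlated with x by construction

def standardize(v):
    # Population (/n) convention: v.std() uses ddof=0 by default.
    return (v - v.mean()) / v.std()

# Correlation as a normalized dot product of the standardizations ...
r = standardize(x) @ standardize(y) / len(x)

# ... equals the cosine of the angle between the centered vectors.
xc, yc = x - x.mean(), y - y.mean()
cos_angle = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))

assert np.isclose(r, cos_angle)
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```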

Now consider a third vector $z=(z_1,\ldots,z_n)$ with $\overline z$ and $s_z$ defined similarly.

Let

  • $\alpha$ be the angle between the standardizations of $x$ and $y$

and similarly

  • $\beta$ between $y$ and $z$ and

  • $\gamma$ between $z$ and $x.$

And let $\mathbb C$ correspond to $\gamma$ as above.

I leave it as an exercise to see that $\mathbb C$ is a certain dihedral angle and that $\cos^2\mathbb C$ is the quantity statisticians know as the coefficient of determination: the proportion of the variability in $z$ that is "explained" by the variability of $x$ and $y$ when $z$ is regressed on $x$ and $y.$ The "unexplained" part is the sum of squares of residuals, which is minimized when the regression is done by the method of ordinary least squares.

A coordinate system in the $3$-dimensional space spanned by the standardizations of $x,y,z$ may be chosen so that the standardization of $x$ becomes $\left[ \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right],$ and that of $y$ is $\left[ \begin{array}{c} \cos\alpha \\ \sin\alpha \\ 0 \end{array} \right],$ and that of $z$ is $\left[ \begin{array}{c} \cos\gamma \\ \sin\gamma\cos\mathbb B \\ \sin\gamma\sin\mathbb B \end{array} \right].$ Then the ordinary dot product of those last two vectors gives the law of cosines of line $(1)$ above.
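This construction can be checked directly; a small numpy sketch with arbitrary angle values, confirming both the dot-product identity and that $\mathbb B$ is recovered as the angle between the tangents at $x$:

```python
import numpy as np

alpha, gamma, B = 0.7, 1.1, 2.0   # arbitrary angles in (0, pi)

# The canonical coordinates from the text.
x = np.array([1.0, 0.0, 0.0])
y = np.array([np.cos(alpha), np.sin(alpha), 0.0])
z = np.array([np.cos(gamma),
              np.sin(gamma) * np.cos(B),
              np.sin(gamma) * np.sin(B)])

# The dot product of y and z reproduces the law of cosines, line (1).
cos_beta = y @ z
assert np.isclose(cos_beta,
                  np.cos(alpha) * np.cos(gamma)
                  + np.sin(alpha) * np.sin(gamma) * np.cos(B))

# B is the angle between the tangents at x toward y and toward z.
t_y = y - (y @ x) * x
t_z = z - (z @ x) * x
B_recovered = np.arccos(t_y @ t_z / (np.linalg.norm(t_y) * np.linalg.norm(t_z)))
assert np.isclose(B_recovered, B)
```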

But coefficients of determination are not used only when regressing one variable on exactly two others; one may regress a variable on $p$ others, and then the coefficient of determination is the square of the cosine of the angle between a line corresponding to the response variable and a $p$-dimensional space corresponding to the predictors.
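The general-$p$ statement can be verified numerically; here is a hedged sketch with $p=2$ and simulated data (the regression coefficients are made up), computing the coefficient of determination both from sums of squares and as the squared cosine of the angle between the centered response and the predictor space:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x = rng.standard_normal(n)
y = rng.standard_normal(n)
z = 1.0 + 2.0 * x - 1.5 * y + rng.standard_normal(n)   # made-up coefficients

# Ordinary least squares of z on x and y, with an intercept.
X = np.column_stack([np.ones(n), x, y])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
z_hat = X @ coef

r_squared = np.sum((z_hat - z.mean())**2) / np.sum((z - z.mean())**2)

# z_hat - z.mean() is the projection of the centered z onto the plane
# spanned by the centered predictors, so the cosine of the angle between
# the centered z and that plane can be read off from the dot products.
zc = z - z.mean()
proj = z_hat - z.mean()
cos_sq = (zc @ proj)**2 / ((zc @ zc) * (proj @ proj))

assert np.isclose(r_squared, cos_sq)
```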

So can the relationship between spherical trigonometry and statistics be fruitfully extended to higher dimensions?

Appendix:

  • The total sum of squares is $\sum_{i=1}^n (z_i-\overline z)^2.$

  • The fitted values are $\widehat z_i = \widehat a + \widehat b x_i + \widehat c y_i$ where $\widehat a,\widehat b, \widehat c$ are the least-squares estimates.

  • The residuals are $z_i - \widehat z_i.$

  • The unexplained sum of squares is the sum of squares of residuals when fitting is done by least squares: $\sum_{i=1}^n (z_i - \widehat z_i)^2.$

  • The explained sum of squares is $\sum_{i=1}^n (\widehat z_i - \overline z)^2.$

  • The coefficient of determination is the proportion of variability in $z$ that is explained by $x$ and $y,$ i.e. it is $$ \frac{\text{explained sum of squares}}{\text{total sum of squares}}. $$

  • Exercise: The total sum of squares is the sum of the explained and unexplained sums of squares. (This reduces to showing that the vector of residuals is orthogonal to the vector with entries $\widehat z_i - \overline z,$ which holds because least-squares residuals are orthogonal to everything in the column space of the design matrix, including the fitted values and the constant vector.) $$ \sum_{i=1}^n (z_i-\overline z)^2 = \sum_{i=1}^n (\widehat z_i - \overline z)^2 + \sum_{i=1}^n (z_i - \widehat z_i)^2 $$
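The exercise checks out numerically; a small sketch with simulated data (coefficients arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = rng.standard_normal(n)
y = rng.standard_normal(n)
z = 0.5 * x + y + rng.standard_normal(n)

# Least-squares fit of z on x and y with an intercept.
X = np.column_stack([np.ones(n), x, y])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
z_hat = X @ coef
resid = z - z_hat

# Residuals are orthogonal to the fitted values and sum to zero
# (they are orthogonal to every column of X, including the ones),
# which is what makes the sums of squares add up.
assert np.isclose(resid @ z_hat, 0.0)
assert np.isclose(resid.sum(), 0.0)

tss = np.sum((z - z.mean())**2)       # total sum of squares
ess = np.sum((z_hat - z.mean())**2)   # explained sum of squares
rss = np.sum(resid**2)                # unexplained (residual) sum of squares
assert np.isclose(tss, ess + rss)
```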