I am trying to understand how the general 2D Gaussian (binormal / bivariate) equation is derived as part of my work, and am having trouble expanding the terms. The article on Wikipedia (https://en.wikipedia.org/wiki/Gaussian_function#Two-dimensional_Gaussian_function) lists the general equation as being:
$$f(x, y) = A\exp\left(-\left(a(x-x_{o})^2 + 2b(x-x_{o})(y-y_{o}) + c(y-y_{o})^2\right)\right)$$
Where: $$a=\frac{\cos^2\theta}{2\sigma_{x}^2} + \frac{\sin^2\theta}{2\sigma_{y}^2}$$ $$b=-\frac{\sin2\theta}{4\sigma_{x}^2} + \frac{\sin2\theta}{4\sigma_{y}^2}$$ $$c=\frac{\sin^2\theta}{2\sigma_{x}^2} + \frac{\cos^2\theta}{2\sigma_{y}^2}$$ $A$ is a scaling factor, and $\theta$ is the angle by which the Gaussian is rotated.
R. J. Barlow's book "Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences" shows that the normalised 2D Gaussian equation can be represented as follows:
$$f(x, y) = \frac{1}{2\pi\sigma_{x}\sigma_{y}\sqrt{(1-\rho^2)}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_{x}}{\sigma_{x}}\right)^2 + \left(\frac{y-\mu_{y}}{\sigma_{y}}\right)^2 - 2\rho\left(\frac{x-\mu_{x}}{\sigma_{x}}\right)\left(\frac{y-\mu_{y}}{\sigma_{y}}\right)\right]\right\}$$
Where $\rho$ is the correlation factor for $x$ and $y$, given as: $$\rho=\frac{\mathrm{cov}(x, y)}{\sigma_{x}\sigma_{y}}$$
In expanding and playing around with the formula for the exponent, I find that the terms can apparently be matched up as follows:
$$a=\frac{\cos^2\theta}{2\sigma_{x}^2} + \frac{\sin^2\theta}{2\sigma_{y}^2} = \frac{1}{2\sigma_{x}^2(1-\rho^2)}$$ $$b=-\frac{\sin2\theta}{4\sigma_{x}^2} + \frac{\sin2\theta}{4\sigma_{y}^2} = -\frac{\rho}{2\sigma_{x}\sigma_{y}(1-\rho^2)}$$ $$c=\frac{\sin^2\theta}{2\sigma_{x}^2} + \frac{\cos^2\theta}{2\sigma_{y}^2} = \frac{1}{2\sigma_{y}^2(1-\rho^2)}$$
Could someone help explain how the $\rho$ terms can be expanded to give said relationships? I have a basic grasp of matrix operations and trigonometry, so simple explanations and/or undergraduate-level resources to help better understand this derivation would be much appreciated.
Thanks!
I managed to figure out what the problem was. It turns out that the notation on the Wikipedia page is inconsistent: its $\sigma_{x}$ and $\sigma_{y}$ actually refer to the widths of the Gaussian along the principal axes of the rotated ellipse, not to the standard deviations along the $x$ and $y$ axes.
Starting with the general equation: $$G(\bar{x}, \bar{\mu}, \Sigma) = \frac{1}{\sqrt{\left(2\pi\right)^{n}\det{\Sigma}}}\exp{\left(-\frac{1}{2}\left(\bar{x}-\bar{\mu}\right)^{T}\Sigma^{-1}\left(\bar{x}-\bar{\mu}\right)\right)}$$
The bivariate case can be represented as follows: $$G(x, y, x_{o}, y_{o}, \Sigma) = \frac{1}{2\pi\sqrt{\det{\Sigma}}}\exp{\left(-\frac{1}{2}\left(\begin{matrix}x-x_{o} & y-y_{o}\end{matrix}\right)\Sigma^{-1}\left(\begin{matrix}x-x_{o} \\ y-y_{o}\end{matrix}\right)\right)}$$
The covariance matrix for a bivariate Gaussian can be represented in terms of its standard deviations along the $x$ and $y$ axes, with a correlation coefficient $\rho$ describing the degree of correlation in the distribution. It can also be represented via eigendecomposition in terms of its eigenvalues and eigenvectors. This can be thought of as introducing a new set of axes $u$ and $v$, rotated relative to the $x$ and $y$ axes by an angle $\theta$. Along this new set of axes, $u$ and $v$ are uncorrelated with standard deviations $\sigma_{u}$ and $\sigma_{v}$, meaning the off-diagonal terms in the covariance matrix are 0 and the matrix ($D$) is diagonal. The two matrices are related to one another by a rotation matrix ($U$), which can be described in terms of $\theta$:
$$\Sigma = \left(\begin{matrix}\sigma_{x}^{2} & \rho\sigma_{x}\sigma_{y} \\ \rho\sigma_{x}\sigma_{y} & \sigma_{y}^{2}\end{matrix}\right) = UDU^{-1} = \left(\begin{matrix}\cos{\theta} & -\sin{\theta} \\ \sin{\theta} & \cos{\theta}\end{matrix}\right) \left(\begin{matrix}\sigma_{u}^{2} & 0 \\ 0 & \sigma_{v}^{2}\end{matrix}\right) \left(\begin{matrix}\cos{\theta} & \sin{\theta} \\ -\sin{\theta} & \cos{\theta}\end{matrix}\right)$$
Expanding the equation, we find that $\Sigma$ as represented using the new set of axes is: $$\Sigma = \left(\begin{matrix}\sigma_{u}^{2}\cos^{2}{\theta} + \sigma_{v}^{2}\sin^{2}{\theta} & \frac{1}{2}\sin{2\theta}\left(\sigma_{u}^{2} - \sigma_{v}^{2}\right) \\ \frac{1}{2}\sin{2\theta}\left(\sigma_{u}^{2} - \sigma_{v}^{2}\right) & \sigma_{u}^{2}\sin^{2}{\theta} + \sigma_{v}^{2}\cos^{2}{\theta}\end{matrix}\right)$$
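As a sanity check on the expansion above, here is a minimal NumPy sketch that builds $\Sigma = UDU^{T}$ from the principal-axis widths and compares it entry by entry with the closed form. The function names and the sample values of $\sigma_{u}$, $\sigma_{v}$ and $\theta$ are my own choices for illustration, not anything taken from Wikipedia or Barlow.

```python
import numpy as np

def covariance_from_rotation(sigma_u, sigma_v, theta):
    """Build Sigma = U D U^T from principal-axis widths and a rotation angle."""
    c, s = np.cos(theta), np.sin(theta)
    U = np.array([[c, -s], [s, c]])           # rotation by theta
    D = np.diag([sigma_u**2, sigma_v**2])     # uncorrelated variances along u, v
    return U @ D @ U.T                        # U is orthogonal, so U^-1 = U^T

def axis_parameters(Sigma):
    """Read sigma_x, sigma_y and rho off a 2x2 covariance matrix."""
    sigma_x = np.sqrt(Sigma[0, 0])
    sigma_y = np.sqrt(Sigma[1, 1])
    rho = Sigma[0, 1] / (sigma_x * sigma_y)
    return sigma_x, sigma_y, rho

# Arbitrary example values (assumptions for illustration only)
sigma_u, sigma_v, theta = 2.0, 0.5, np.deg2rad(30.0)
Sigma = covariance_from_rotation(sigma_u, sigma_v, theta)

# Closed-form entries of the expanded matrix
off_diag = 0.5 * np.sin(2 * theta) * (sigma_u**2 - sigma_v**2)
expected = np.array([
    [sigma_u**2 * np.cos(theta)**2 + sigma_v**2 * np.sin(theta)**2, off_diag],
    [off_diag, sigma_u**2 * np.sin(theta)**2 + sigma_v**2 * np.cos(theta)**2],
])
assert np.allclose(Sigma, expected)

sigma_x, sigma_y, rho = axis_parameters(Sigma)
print("sigma_x, sigma_y, rho =", sigma_x, sigma_y, rho)
```

Note that $\sigma_{x}$ and $\sigma_{y}$ read off this matrix generally differ from $\sigma_{u}$ and $\sigma_{v}$ unless $\theta$ is a multiple of $90°$, which is exactly the source of my original confusion.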
And the inverse of this is: $$ \Sigma^{-1} = \left(\begin{matrix} \frac{\cos^{2}{\theta}}{\sigma_{u}^{2}} + \frac{\sin^{2}{\theta}}{\sigma_{v}^{2}} & \frac{\sin{2\theta}}{2\sigma_{u}^{2}} - \frac{\sin{2\theta}}{2\sigma_{v}^{2}} \\ \frac{\sin{2\theta}}{2\sigma_{u}^{2}} - \frac{\sin{2\theta}}{2\sigma_{v}^{2}} & \frac{\sin^{2}{\theta}}{\sigma_{u}^{2}} + \frac{\cos^{2}{\theta}}{\sigma_{v}^{2}} \end{matrix}\right)$$
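The inverse can be checked numerically in the same spirit. The sketch below (again with hypothetical function names and arbitrary sample values) compares half of the numerically inverted $\Sigma$ with the coefficients $a$, $b$, $c$ of the exponent, and also with the $\rho$-based expressions from the question, using $\sigma_{x}$, $\sigma_{y}$ and $\rho$ read off $\Sigma$ itself.

```python
import numpy as np

def quadratic_coefficients(sigma_u, sigma_v, theta):
    """Return (a, b, c) so that the exponent is -(a dx^2 + 2 b dx dy + c dy^2)."""
    a = np.cos(theta)**2 / (2 * sigma_u**2) + np.sin(theta)**2 / (2 * sigma_v**2)
    b = np.sin(2 * theta) / (4 * sigma_u**2) - np.sin(2 * theta) / (4 * sigma_v**2)
    c = np.sin(theta)**2 / (2 * sigma_u**2) + np.cos(theta)**2 / (2 * sigma_v**2)
    return a, b, c

# Arbitrary example values (assumptions for illustration only)
sigma_u, sigma_v, theta = 2.0, 0.5, np.deg2rad(30.0)

# Rebuild Sigma = U D U^T and invert it numerically
cos_t, sin_t = np.cos(theta), np.sin(theta)
U = np.array([[cos_t, -sin_t], [sin_t, cos_t]])
Sigma = U @ np.diag([sigma_u**2, sigma_v**2]) @ U.T
half_inverse = 0.5 * np.linalg.inv(Sigma)

a, b, c = quadratic_coefficients(sigma_u, sigma_v, theta)
assert np.allclose(half_inverse, [[a, b], [b, c]])

# The same coefficients in terms of sigma_x, sigma_y and rho
sigma_x = np.sqrt(Sigma[0, 0])
sigma_y = np.sqrt(Sigma[1, 1])
rho = Sigma[0, 1] / (sigma_x * sigma_y)
a_rho = 1.0 / (2 * sigma_x**2 * (1 - rho**2))
b_rho = -rho / (2 * sigma_x * sigma_y * (1 - rho**2))
c_rho = 1.0 / (2 * sigma_y**2 * (1 - rho**2))
assert np.allclose([a, b, c], [a_rho, b_rho, c_rho])
```

So the matching attempted in the question does hold, provided the widths on the trigonometric side are $\sigma_{u}$, $\sigma_{v}$ and those on the $\rho$ side are the true $\sigma_{x}$, $\sigma_{y}$.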
When the exponent is then expanded with this, we get the following form, which matches that given on Wikipedia (albeit with a different rotation direction): $$ G(x, y, A, x_{o}, y_{o}, \sigma_{u}, \sigma_{v}, \theta, B)=A \exp{\left( -\left( \left(\frac{\cos^{2}{\theta}}{2\sigma_{u}^{2}} + \frac{\sin^{2}{\theta}}{2\sigma_{v}^{2}} \right)\left(x-x_{o}\right)^{2} + 2\left(\frac{\sin{2\theta}}{4\sigma_{u}^{2}} - \frac{\sin{2\theta}}{4\sigma_{v}^{2}} \right)\left(x-x_{o}\right)\left(y-y_{o}\right) + \left(\frac{\sin^{2}{\theta}}{2\sigma_{u}^{2}} + \frac{\cos^{2}{\theta}}{2\sigma_{v}^{2}}\right)\left(y-y_{o}\right)^{2} \right) \right)} $$
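For anyone wanting to confirm that the two parametrisations describe the same surface, here is a rough sketch that evaluates Barlow's $\rho$ form and the rotated form on a small grid and checks that they agree. It assumes the normalised case, i.e. $A = 1/(2\pi\sigma_{u}\sigma_{v})$, and uses the rotation direction from this derivation rather than Wikipedia's; the function names, grid and parameter values are hypothetical.

```python
import numpy as np

def gaussian_rho_form(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Normalised bivariate Gaussian written with sigma_x, sigma_y and rho."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    norm = 1.0 / (2 * np.pi * sigma_x * sigma_y * np.sqrt(1 - rho**2))
    return norm * np.exp(-(zx**2 + zy**2 - 2 * rho * zx * zy) / (2 * (1 - rho**2)))

def gaussian_rotated_form(x, y, x0, y0, sigma_u, sigma_v, theta):
    """Same density written with principal-axis widths and a rotation angle."""
    a = np.cos(theta)**2 / (2 * sigma_u**2) + np.sin(theta)**2 / (2 * sigma_v**2)
    b = np.sin(2 * theta) / (4 * sigma_u**2) - np.sin(2 * theta) / (4 * sigma_v**2)
    c = np.sin(theta)**2 / (2 * sigma_u**2) + np.cos(theta)**2 / (2 * sigma_v**2)
    dx, dy = x - x0, y - y0
    amplitude = 1.0 / (2 * np.pi * sigma_u * sigma_v)  # assumes normalisation
    return amplitude * np.exp(-(a * dx**2 + 2 * b * dx * dy + c * dy**2))

# Arbitrary example values (assumptions for illustration only)
x0, y0 = 0.3, -0.2
sigma_u, sigma_v, theta = 2.0, 0.5, np.deg2rad(30.0)

# Convert (sigma_u, sigma_v, theta) to (sigma_x, sigma_y, rho) via the entries of Sigma
sigma_x = np.sqrt(sigma_u**2 * np.cos(theta)**2 + sigma_v**2 * np.sin(theta)**2)
sigma_y = np.sqrt(sigma_u**2 * np.sin(theta)**2 + sigma_v**2 * np.cos(theta)**2)
rho = 0.5 * np.sin(2 * theta) * (sigma_u**2 - sigma_v**2) / (sigma_x * sigma_y)

x, y = np.meshgrid(np.linspace(-3, 3, 7), np.linspace(-3, 3, 7))
f_rho = gaussian_rho_form(x, y, x0, y0, sigma_x, sigma_y, rho)
f_rot = gaussian_rotated_form(x, y, x0, y0, sigma_u, sigma_v, theta)
assert np.allclose(f_rho, f_rot)
```

The amplitudes agree because $\det\Sigma = \sigma_{u}^{2}\sigma_{v}^{2} = \sigma_{x}^{2}\sigma_{y}^{2}(1-\rho^{2})$.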