I am wondering why, for a multivariate Gaussian, the geometry looks like an ellipse that has been scaled by $\sqrt{\lambda_i}$. I have seen 3 explanations for the geometry of the Gaussian. The one that makes the most sense to me builds it up through multiple transformations. Here is the explanation.
We take a standard normal random vector $\vec{v}$ and scale every component $v_i$ by some $\sqrt{\lambda_i}$. This gives us independent random variables $y_i$, each with a Gaussian distribution and a standard deviation of $\sqrt{\lambda_i}$. We then rotate this vector space to align with the eigenvectors and shift by $\mu$. This explanation mostly makes sense to me; however, I'm not sure why we stretch by $\sqrt{\lambda_i}$ instead of just $\lambda_i$ along the $i$th eigenvector. I don't have enough reputation to post an image of the explanation.
Another explanation I saw described the Gaussian in terms of the ellipse slices we could form from it.
I do not have enough reputation to post it, but it basically says the half-lengths of the ellipse defined by some slice of our Gaussian will be $l = \sqrt{\lambda_i x_{i}^2}$.
However, this does not explain why our ellipse lengths are $\sqrt{\lambda_i}$.
I would have figured we would scale our vector space by $\lambda_i$ along each eigenvector i, but instead we seem to scale by $\sqrt{\lambda_i}$. I am also not sure why the length of the ellipse is $\sqrt{\lambda_i}$.
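When $X\sim N(a,\sigma^2)$ you can write $X= a+\sigma Z$ with $Z\sim N(0,1)$. Note that the scale factor is the standard deviation $\sigma$, the square root of the variance, because $\operatorname{Var}(\sigma Z)=\sigma^2$. In several dimensions, if $X\sim N(a,\Sigma)$ you write $\Sigma=U^TDU$ with $U$ orthogonal and $D=\mathrm{diag}(\lambda_1,\ldots,\lambda_n)$, so that $\sqrt{\Sigma}=U^T\sqrt{D}U$. Therefore if $Z\sim N(0,I_n)$ then $Z_1=UZ\sim N(0,I_n)$ and we have $$X= a+\sqrt{\Sigma}Z= a+ U^T\sqrt{D}UZ= a+ U^T\sqrt{D}Z_1.$$ The covariance of the right-hand side is $U^T\sqrt{D}\sqrt{D}U=\Sigma$, so $\lambda_i$ is the variance along the $i$th eigenvector and the stretch is the standard deviation $\sqrt{\lambda_i}$. Stretching by $\lambda_i$ instead would give covariance $U^TD^2U$, which is not $\Sigma$.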
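If it helps, here is a rough numerical check of this construction in NumPy. It is only a sketch: the mean `a`, the covariance `Sigma`, the seed and the sample size are arbitrary values picked for illustration. `np.linalg.eigh` returns the eigenvectors as the columns of `V`, so `Sigma = V diag(lam) V^T` and `V` plays the role of $U^T$ in the notation above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) mean and covariance; any symmetric
# positive-definite Sigma would do.
a = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

# Eigendecomposition: Sigma = V @ diag(lam) @ V.T, with the
# orthonormal eigenvectors as the columns of V (V corresponds to U^T).
lam, V = np.linalg.eigh(Sigma)

# Symmetric square root of Sigma: scale by sqrt(lam), not lam.
sqrt_Sigma = V @ np.diag(np.sqrt(lam)) @ V.T

# Draw standard normal vectors Z (one per column) and map them to X.
Z = rng.standard_normal((2, 200_000))
X = a[:, None] + sqrt_Sigma @ Z

# Sample covariance of X should be close to Sigma.
sample_cov = np.cov(X)

# Coordinates of the centered samples along each eigenvector;
# their standard deviations should be close to sqrt(lam).
Y = V.T @ (X - a[:, None])
sample_std = Y.std(axis=1)

print("sample covariance:\n", sample_cov)
print("std along eigenvectors:", sample_std)
print("sqrt of eigenvalues:   ", np.sqrt(lam))
```

Replacing `np.sqrt(lam)` with `lam` in `sqrt_Sigma` should make the sample covariance come out near `Sigma @ Sigma` rather than `Sigma`, which is the numerical version of why the stretch is $\sqrt{\lambda_i}$ and not $\lambda_i$.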