To give you some context first:
Let the sample variance $S^2$ be $S^2 = \frac{\sum_i (\textbf{Y}_i - \overline{\textbf{Y}})^2}{n-1}$, with $\textbf{Y}\sim N_n(\mu \textbf{1}_n,\sigma^2\textbf{I}_n)$. From a previous exercise it follows that the expression $\frac{(n-1)S^2}{\sigma ^2}$ can be written as $\frac{\textbf{Y}'(\textbf{I}_n - \frac{\textbf{J}_n}{n}) \textbf{Y}}{\sigma ^2}$, where $\textbf{J}_n$ is the $n \times n$ matrix of ones and $\textbf{I}_n$ is the identity matrix. I already know that the matrix $A = \textbf{I}_n - \frac{\textbf{J}_n}{n}$ is symmetric and idempotent.
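Both properties are easy to confirm numerically; here is a quick pure-Python sketch (toy choice $n = 4$, no libraries assumed):

```python
# Check that A = I_n - J_n/n is symmetric and idempotent, with n = 4
# chosen arbitrarily for illustration, using plain Python lists.
n = 4

# Build A entrywise: A[i][j] = (1 if i == j else 0) - 1/n.
A = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]

# Symmetric: A[i][j] == A[j][i] for all i, j.
assert all(abs(A[i][j] - A[j][i]) < 1e-12 for i in range(n) for j in range(n))

# Idempotent: (A A)[i][j] == A[i][j] for all i, j.
AA = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
assert all(abs(AA[i][j] - A[i][j]) < 1e-12 for i in range(n) for j in range(n))
```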
Here comes the part that's confusing to me: after defining the random vector $\textbf{Z} = \sigma ^{-1}(\textbf{Y}-\mu \textbf{1}_n)$, the author proceeds to state "Hence it follows that $\frac{(n-1)S^2}{\sigma ^2} = \textbf{Z}'(\textbf{I}_n - \frac{\textbf{J}_n}{n}) \textbf{Z}$." I'm pretty sure it's mainly my lack of linear algebra knowledge here, but my two main questions are:
- Why is this possible? I suspect that it is possible because I can decompose $A$ into an orthogonal and diagonal matrix, right?
- (Somewhat related to question 1) Would this not imply that random vector $\textbf{Y}$ is equal to random vector $\textbf{Z}$? More generally: if I have two quadratic forms $p = \textbf{X}'A\textbf{X}$ and $q = \textbf{Y}'A\textbf{Y}$ with $p = q$ and the matrix $A$ being the same for both, can I conclude that $\textbf{X} = \textbf{Y}$? My instinct tells me no, but for some reason it seems to work for the example I gave above...
Any advice would be greatly appreciated! Source: Seber, Linear Regression Analysis
The book does this because we want a standard normal vector to be involved in the quadratic form. Honestly, it would never have occurred to me to do what the book does, but here I show you why it works.
We are interested in $\boldsymbol{Y} \sim \mathcal{N}_{n}(\mu \boldsymbol{1}, \sigma^2 \boldsymbol{I}_n)$, and the book suggests $$ \frac{(n-1)S^2}{\sigma^2} = \boldsymbol{Z}^T A \boldsymbol{Z} $$ with $A = \boldsymbol{I}_n- \frac{1}{n} \boldsymbol{J}_n$ and $\boldsymbol{Z} = \sigma^{-1}(\boldsymbol{Y} - \mu \boldsymbol{1})$. Let's develop what the book suggests. Here $\mu \in \mathbb{R}$ and $\boldsymbol{1} \in \mathbb{R}^{n \times 1}$.
\begin{align*} \boldsymbol{Z}^T A \boldsymbol{Z} & = \sigma^{-2} (\boldsymbol{Y} - \mu \boldsymbol{1})^T A (\boldsymbol{Y} - \mu \boldsymbol{1}) \\ & = \sigma^{-2} (\boldsymbol{Y}^T - \mu \boldsymbol{1}^T) (A \boldsymbol{Y} - \mu A \boldsymbol{1}) \\ & = \sigma^{-2} \left( \boldsymbol{Y}^T A \boldsymbol{Y} - \mu \boldsymbol{Y}^T A \boldsymbol{1} - \mu \boldsymbol{1}^T A \boldsymbol{Y} + \mu^2 \boldsymbol{1}^T A \boldsymbol{1} \right) \end{align*}
Here is the trick: it is possible to show that $$ A \boldsymbol{1} = \boldsymbol{0}, \qquad \boldsymbol{1}^{T} A = \boldsymbol{0}^T, \qquad \boldsymbol{1}^{T} A \boldsymbol{1} = 0. $$ Indeed, since $\boldsymbol{J}_n \boldsymbol{1} = n \boldsymbol{1}$, we get $A \boldsymbol{1} = \boldsymbol{1} - \frac{1}{n} \boldsymbol{J}_n \boldsymbol{1} = \boldsymbol{1} - \boldsymbol{1} = \boldsymbol{0}$; the other two identities follow because $A$ is symmetric. Then
\begin{align*} \sigma^{-2} \left( \boldsymbol{Y}^T A \boldsymbol{Y} - \mu \boldsymbol{Y}^T A \boldsymbol{1} - \mu \boldsymbol{1}^T A \boldsymbol{Y} + \mu^2 \boldsymbol{1}^T A \boldsymbol{1} \right) & = \sigma^{-2} \left( \boldsymbol{Y}^T A \boldsymbol{Y} - 0 - 0 + 0 \right) \\ & = \sigma^{-2} \boldsymbol{Y}^T A \boldsymbol{Y} \end{align*}
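The three annihilation identities used above can also be checked mechanically; a small pure-Python sketch (toy $n = 5$):

```python
# Check A·1 = 0, 1ᵀA = 0, and 1ᵀA·1 = 0 for A = I_n - J_n/n,
# with n = 5 chosen arbitrarily, using plain Python lists.
n = 5
A = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
ones = [1.0] * n

A1 = [sum(A[i][k] * ones[k] for k in range(n)) for i in range(n)]  # A·1
tA = [sum(ones[k] * A[k][j] for k in range(n)) for j in range(n)]  # 1ᵀA
quad = sum(ones[i] * A1[i] for i in range(n))                      # 1ᵀA·1

assert all(abs(x) < 1e-12 for x in A1)
assert all(abs(x) < 1e-12 for x in tA)
assert abs(quad) < 1e-12
```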
That is, we have
\begin{align*} \frac{(n-1)S^2}{\sigma^2} & = \boldsymbol{Z}^T A \boldsymbol{Z} \\ & = \sigma^{-2} \boldsymbol{Y}^T A \boldsymbol{Y} \end{align*} So, do we really gain anything by expressing $S^2$ in such a way? Yes. If we want to find the distribution of $S^2$ with $\boldsymbol{Y} \sim \mathcal{N}_{n}(\mu \boldsymbol{1}, \sigma^2 \boldsymbol{I}_n)$, we cannot apply Theorem 2.7 to $\boldsymbol{Y}$ directly; but after standardizing $\boldsymbol{Y}$ in the way shown, $\boldsymbol{Z} \sim \mathcal{N}_{n}(\boldsymbol{0}, \boldsymbol{I}_n)$ and the theorem can be applied.
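As a sanity check, the whole chain $\boldsymbol{Z}^T A \boldsymbol{Z} = \sigma^{-2} \boldsymbol{Y}^T A \boldsymbol{Y} = (n-1)S^2/\sigma^2$ can be verified for one concrete sample; the values of $Y$, $\mu$, $\sigma$ below are arbitrary illustrative choices:

```python
# Verify Zᵀ A Z = Yᵀ A Y / σ² = (n-1)S²/σ² for one concrete sample.
# Y, mu, sigma are arbitrary values picked for illustration.
n, mu, sigma = 4, 2.0, 3.0
Y = [1.5, -0.3, 4.2, 2.0]

# A = I - J/n, built entrywise.
A = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]

def quad_form(v):
    # Computes vᵀ A v.
    return sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

Z = [(y - mu) / sigma for y in Y]  # standardized vector

ybar = sum(Y) / n
s2 = sum((y - ybar) ** 2 for y in Y) / (n - 1)  # sample variance S²

assert abs(quad_form(Z) - quad_form(Y) / sigma**2) < 1e-12
assert abs(quad_form(Z) - (n - 1) * s2 / sigma**2) < 1e-12
```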
For question 2: no. Take $X = (-1/\sqrt{2}, -1/\sqrt{2})^T$ and $Y = (1/\sqrt{2}, 1/\sqrt{2})^T$ with $A = I_{2 \times 2}$; then $X^T A X = Y^T A Y = 1$, yet $X \neq Y$.
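Spelled out in code, the counterexample:

```python
# Two distinct vectors with the same quadratic form X ≠ Y, XᵀAX = YᵀAY.
from math import sqrt

X = [-1 / sqrt(2), -1 / sqrt(2)]
Y = [1 / sqrt(2), 1 / sqrt(2)]
A = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 identity

def quad(v):
    # Computes vᵀ A v.
    return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

assert X != Y
assert abs(quad(X) - quad(Y)) < 1e-12  # both equal 1
```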