Recently I have been having some trouble with one-way ANOVA. Before raising my doubts, I think it is necessary to set up some notation.
Suppose that the form of One-way analysis of variance is as follows: \begin{cases} y_{ij}&=\mu_j+\epsilon_{ij}\quad i\in\{1,2,\cdots,n_j\},\ j\in\{1,2,\cdots,g\}\\ \epsilon_{ij}&\sim N(0,\sigma^2)\qquad\rm i.i.d \end{cases}
$y_{ij}$: the observations
$\mu_j$: the mean of the observations in the $j$th treatment group
$i$: the index over experimental units (for a fixed group $j$, $i$ ranges from $1$ to $n_j$)
$j$: the index over treatment groups (from $1$ to $g$)
If we regard the above model as a regression model, we can use dummy variables to construct the explanatory variables, like this: \begin{cases} \vec{y}=X\vec{\mu}+\vec{\epsilon}\\ \vec{\epsilon}\sim N(\vec{0},\sigma^2I_n) \end{cases}
$\vec{y}=\begin{pmatrix} \vec{y}_1\\ \vec{y}_2\\ \vdots\\ \vec{y}_g \end{pmatrix}$,$\quad$ $X=\begin{pmatrix} 1_{n_1}& & &\\ &1_{n_2}& &\\ & &\ddots&\\ & & &1_{n_g} \end{pmatrix}$,$\quad$$\vec{\mu}=\begin{pmatrix} \mu_1\\ \mu_2\\ \vdots\\ \mu_g \end{pmatrix}$.
$1_{n_j}$: an $n_j$-dimensional column vector of all 1s.
The OLS estimation of the parameters in this form is very simple because the design matrix $X$ is block diagonal, so $X^{T}X=\mathrm{diag}(n_1,n_2,\cdots,n_g)$ is a diagonal matrix. We obtain $\hat{\mu}=(X^{T}X)^{-1}X^{T}\vec{y}=\left(\frac{1}{n_1}1_{n_1}^T\vec{y}_1,\frac{1}{n_2}1_{n_2}^T\vec{y}_2,\cdots,\frac{1}{n_g}1_{n_g}^T\vec{y}_g\right)^T=(\overline{y}_1,\overline{y}_2,\cdots,\overline{y}_g)^T$.
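To convince myself, I checked this numerically (a quick sketch with simulated data; the group sizes and means below are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = [4, 6, 5]                      # arbitrary group sizes n_j
g = len(n)
mu_true = [1.0, 2.5, -0.5]         # arbitrary group means

# simulate y, stacked group by group
y = np.concatenate([mu_true[j] + rng.normal(0.0, 1.0, n[j]) for j in range(g)])

# block-diagonal design matrix of 1-vectors (cell-means coding)
X = np.zeros((sum(n), g))
start = 0
for j in range(g):
    X[start:start + n[j], j] = 1.0
    start += n[j]

# OLS: X^T X = diag(n_1, ..., n_g), so mu_hat_j is just the j-th group mean
mu_hat = np.linalg.solve(X.T @ X, X.T @ y)
group_means = np.array([y[sum(n[:j]):sum(n[:j + 1])].mean() for j in range(g)])
print(np.allclose(mu_hat, group_means))  # True
```

The OLS solution and the vector of group means agree, as the closed form above predicts.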
Since the one-way ANOVA model can also be written in the form of an effects model, it can be considered a regression model with an intercept term. My question is how to estimate the parameters of this model; in particular, I find it too hard to derive the inverse $(X^{T}X)^{-1}$. I hope someone can help me out.
Here is my thought:
The form of effects model is just like this:
\begin{cases} y_{ij}&=\mu+\alpha_j+\epsilon_{ij}\quad i\in\{1,2,\cdots,n_j\},\ j\in\{1,2,\cdots,g\}\\ \epsilon_{ij}&\sim N(0,\sigma^2)\qquad\rm i.i.d \end{cases}
$\mu$: the grand mean of observations
$\alpha_j=\mu_j-\mu$: the $j$th treatment effect, a deviation from the grand mean
To avoid the collinearity problem of the design matrix, the $\alpha_j$'s are usually constrained by $\sum_{j=1}^{g}\alpha_j=0$. We can also describe this with a regression model. Note that $\alpha_g$ can then be written as $-\sum_{j=1}^{g-1}\alpha_j$. \begin{cases} \vec{y}=X\vec{\alpha}+\vec{\epsilon}\\ \vec{\epsilon}\sim N(\vec{0},\sigma^2I_n) \end{cases}
$\vec{y}=\begin{pmatrix} \vec{y}_1\\ \vec{y}_2\\ \vdots\\ \vec{y}_g \end{pmatrix}$,$\quad$ $X=\begin{pmatrix} 1_{n_1}&1_{n_1}& & &\\ 1_{n_2}& &1_{n_2}& &\\ \vdots& & &\ddots&\\ 1_{n_{g-1}}& & & &1_{n_{g-1}}\\ 1_{n_g}&-1_{n_g}&-1_{n_g}&\cdots&-1_{n_g} \end{pmatrix}$,$\quad$$\vec{\alpha}=\begin{pmatrix} \mu\\ \alpha_1\\ \vdots\\ \alpha_{g-1} \end{pmatrix}$.
In this way, we can also estimate $\vec{\alpha}$ theoretically, but how can I derive the inverse $(X^TX)^{-1}$?
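While I cannot get the closed form, I did verify numerically what the estimates should come out to (a sketch with simulated data and arbitrary group sizes, assuming the sum-to-zero design matrix above): $\hat{\mu}$ is the unweighted average of the group means, and $\hat{\mu}+\hat{\alpha}_j$ recovers each group mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = [4, 6, 5]                        # arbitrary group sizes n_j
g = len(n)
y = np.concatenate([m + rng.normal(0.0, 1.0, nj)
                    for m, nj in zip([1.0, 2.5, -0.5], n)])

# sum-to-zero (effects) coding: intercept column, then g-1 contrast columns
N = sum(n)
X = np.ones((N, g))                  # column 0 is the intercept
start = 0
for j in range(g):
    block = slice(start, start + n[j])
    if j < g - 1:
        X[block, 1:] = 0.0
        X[block, 1 + j] = 1.0        # group j gets a 1 in contrast column j
    else:
        X[block, 1:] = -1.0          # last group gets -1 in every contrast column
    start += n[j]

# beta = (mu_hat, alpha_1_hat, ..., alpha_{g-1}_hat)
beta = np.linalg.solve(X.T @ X, X.T @ y)
group_means = np.array([y[sum(n[:j]):sum(n[:j + 1])].mean() for j in range(g)])

print(np.allclose(beta[0], group_means.mean()))        # mu_hat = unweighted mean of group means
print(np.allclose(beta[0] + beta[1:], group_means[:-1]))
```

Both checks print `True`, so the fitted group means are the same as in the cell-means form; only the parameterization differs.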