Expected value of MSB in one-way ANOVA


Suppose we have $k$ machines, each producing iron balls, and we are interested in the weights of the balls. Let $y_{ij}$ be the weight of the $j$th ball produced by the $i$th machine, where $i=1,2,\ldots,k$ and $j=1,2,\ldots,n_i$. Let $\mu_i$ be the average weight of a ball produced by machine $i$. Now consider the following hypothesis.

We know that if $H_0: \mu_1=\dots=\mu_k$ is assumed to be true, then we have $E(MSB)=\sigma^2$.

where $$ MSB = \frac{1}{k-1}\sum_i n_i (\overline y_{i0}-\overline y_{00})^2$$ is the mean square between groups.

Now, when $H_0$ is true, the expected value of MSB is $\sigma^2$. But when $H_0$ is false, $$ E(MSB)=\sigma^2+\frac{1}{k-1} \sum_i n_i \alpha^2_i, $$ where $\alpha_i=\mu_i-\mu$. I am a newcomer to statistics and I know only the result, not its proof. How can this equation be proved using basic statistics? Please note that I am considering one-way ANOVA.

Answer:

I assume the model is

$$y_{ij}=\mu_i+\varepsilon_{ij} \quad\small,\,i=1,\ldots,k;j=1,\ldots,n_i$$

where $\mu_i$ is a fixed effect and $\varepsilon_{ij}\stackrel{\text{i.i.d}}\sim N(0,\sigma^2)$ is the random error.

Taking $\mu_i=\mu+\alpha_i$, this can be reparameterized as

$$y_{ij}=\mu+\alpha_i+\varepsilon_{ij} \quad\small,\,i=1,\ldots,k;j=1,\ldots,n_i \tag{$\star$}$$

Here $\mu$ is a general effect and $\alpha_i$ is an additional fixed effect subject to $\sum\limits_{i=1}^k n_i\alpha_i=0$; this constraint is necessary for unique estimation of all parameters in $(\star)$. Writing $n=\sum\limits_{i=1}^k n_i$, the constraint also makes $\mu=\frac1n\sum\limits_{i=1}^k n_i\mu_i$ the weighted average of the group means.
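As a quick numerical illustration of the reparameterization (the group sizes and means below are made-up values), taking $\mu$ to be the weighted average of the $\mu_i$ makes the constraint $\sum_i n_i\alpha_i=0$ hold automatically:

```python
import numpy as np

# Hypothetical group sizes and group means, chosen only for illustration.
n = np.array([5, 7, 10])
mu_i = np.array([10.0, 10.5, 9.5])

mu = (n * mu_i).sum() / n.sum()   # weighted grand mean of the mu_i
alpha = mu_i - mu                 # alpha_i = mu_i - mu

# The constraint sum_i n_i * alpha_i = 0 holds by construction
# (up to floating-point rounding).
print((n * alpha).sum())
```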

The hypothesis $H_0:\mu_1=\mu_2=\cdots=\mu_k$ is now equivalent to $H_0:\alpha_1=\alpha_2=\cdots=\alpha_k=0$.

We have

$$\overline y_{i0}=\mu+\alpha_i+\overline\varepsilon_{i0}$$ and $$\overline y_{00}=\mu+\overline\varepsilon_{00}\,,$$ the $\alpha_i$ dropping out of the grand mean because $\sum\limits_{i=1}^k n_i\alpha_i=0$.

The between sum of squares is therefore

$$\text{SSB}=\sum_{i=1}^k n_i(\overline y_{i0}-\overline y_{00})^2=\sum_{i=1}^k n_i(u_i-\overline u)^2\,,$$ where $u_i=\alpha_i+\overline\varepsilon_{i0}$ and $\overline u=\frac1n\sum\limits_{i=1}^k n_i u_i=\overline\varepsilon_{00}$.

Observe that the $u_i$'s are independent normal variables:

$$u_i\stackrel{\text{ind.}}\sim N\left(\alpha_i,\frac{\sigma^2}{n_i}\right)\quad ,\,i=1,\ldots,k$$

This implies $$\overline u \sim N\left(0,\frac{\sigma^2}{n}\right)$$

Therefore,

\begin{align} \operatorname E\left[\text{SSB}\right]&=\operatorname E\left[\sum_{i=1}^k n_i(u_i-\overline u)^2\right] \\&=\operatorname E\left[\sum_{i=1}^k n_i u_i^2-n\,\overline u^2\right] \quad\text{(expanding the square and using } \textstyle\sum_i n_i u_i=n\overline u) \\&=\sum_{i=1}^k n_i\operatorname E\left[u_i^2\right]-n\operatorname E\left[\overline u^2\right] \\&=\sum_{i=1}^k n_i\left\{\operatorname{Var}(u_i)+(\operatorname E(u_i))^2\right\}-n\operatorname{Var}(\overline u) \\&=\sum_{i=1}^k n_i\left(\frac{\sigma^2}{n_i}+\alpha_i^2\right)-n\cdot\frac{\sigma^2}{n} \\&=\sigma^2(k-1)+\sum_{i=1}^k n_i\alpha_i^2 \end{align}

Hence the required expectation is

$$\operatorname E\left[\text{MSB}\right]=\operatorname E\left[\frac{\text{SSB}}{k-1}\right]=\sigma^2+\frac1{k-1}\sum_{i=1}^k n_i\alpha_i^2$$
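The result can also be checked by simulation (all numbers below are hypothetical choices, not from the question): drawing many replications of model $(\star)$ and averaging the observed MSB should reproduce $\sigma^2+\frac1{k-1}\sum_i n_i\alpha_i^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: k = 3 machines, unequal group sizes, H0 false.
n = np.array([5, 7, 10])
mu_i = np.array([10.0, 10.5, 9.5])
sigma = 2.0
k, N = len(n), n.sum()

mu = (n * mu_i).sum() / N            # weighted grand mean
alpha = mu_i - mu                    # satisfies sum(n_i * alpha_i) = 0
theory = sigma**2 + (n * alpha**2).sum() / (k - 1)

reps = 50_000
msb = np.empty(reps)
for r in range(reps):
    # Draw each group from N(mu_i, sigma^2) with its own sample size.
    groups = [mu_i[i] + sigma * rng.standard_normal(n[i]) for i in range(k)]
    group_means = np.array([g.mean() for g in groups])
    grand_mean = np.concatenate(groups).mean()
    msb[r] = (n * (group_means - grand_mean) ** 2).sum() / (k - 1)

print(f"simulated E[MSB] ~ {msb.mean():.3f}, theory = {theory:.3f}")
```

With this many replications the Monte Carlo average and the theoretical value should agree to roughly two decimal places.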