I am trying to do an ANOVA by hand on the following data.
Summer 83 85 85 87 90 88 88 84 91 90
Shoulder 91 87 84 87 85 86 83
Winter 94 91 87 85 87 91 92 86
I have calculated
$y_{i.} = 871, 603, 713$ and $y_{..} = 2187$, $N = 25$, $n = 3$.
From here I was able to calculate $SS_T = 220.24$, but I keep getting an unexpected value for $SS_E$.
I am doing
$\sum_i \frac{y_{i.}^2}{n} - \frac{y_{..}^2}{N} = \frac{871^2 + 603^2 + 713^2}{3} - \frac{2187^2}{25} \approx 352220.9$
and I am very puzzled why I get such a big number (according to SAS, I am supposed to get 35.61). I have done similar problems before, and the exact same method worked.
May I have some help, please?
I suspect that the unbalanced group sizes are the issue...
Let $n$ be the number of groups. In your case, $n = 3$.
Let $n_i$ be the number of observations in group $i$. In your case, $n_1 = 10, n_2 = 7, n_3 = 8.$
Notice that the group totals and the grand total are:
$$y_{i.} := \sum_{j=1}^{n_i} y_{ij}$$ and $$y_{..} := \sum_{i=1}^{n}\sum_{j=1}^{n_i} y_{ij},$$
where the inner sum runs to $n_i$, not to a common value. Below I write $y_i$ for $y_{i.}$ and $y$ for $y_{..}$ for short.
Let $\mu_i = \frac{y_i}{n_i}$ be the mean of group $i$.
Let $\mu = \frac{y}{N}$ be the mean of the whole sample. Notice that $N = \sum_{i=1}^n n_i = n_1 + n_2 + n_3 = 25$.
By definition, $SS_E$ (keeping your notation; this quantity is usually called the between-groups or treatment sum of squares) is the sum of squared deviations of each group mean $\mu_i$ from the grand mean $\mu$ of the sample, weighted by the group size $n_i$. Formally:
$$SS_E = \sum_{i=1}^n n_i(\mu_i - \mu)^2.$$
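As a quick check, the definition above can be evaluated directly on the data from the question. This is only a minimal sketch in plain Python; the names `groups` and `between_group_ss` are made up for illustration.

```python
# Data from the question, one list of observations per group.
groups = {
    "Summer": [83, 85, 85, 87, 90, 88, 88, 84, 91, 90],
    "Shoulder": [91, 87, 84, 87, 85, 86, 83],
    "Winter": [94, 91, 87, 85, 87, 91, 92, 86],
}


def between_group_ss(groups):
    """Return sum_i n_i * (mu_i - mu)^2 over all groups."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)  # mu = y / N
    return sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2  # n_i * (mu_i - mu)^2
        for ys in groups.values()
    )


ss = between_group_ss(groups)
print(round(ss, 2))  # 35.61
```

Each group is weighted by its own size `len(ys)`, so the unbalanced design needs no special handling.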
We can work on this expression:
$$\begin{aligned}
SS_E &= \sum_{i=1}^n n_i\left(\mu_i^2 - 2 \mu_i \mu + \mu^2\right) \\
&= \sum_{i=1}^n n_i \mu_i^2 - 2\sum_{i=1}^n n_i \mu \mu_i + \sum_{i=1}^n n_i \mu^2 \\
&= \sum_{i=1}^n n_i \frac{y_i^2}{n_i^2} - 2 \mu \sum_{i=1}^n n_i \frac{y_i}{n_i} + \mu^2 \sum_{i=1}^n n_i \\
&= \sum_{i=1}^n \frac{y_i^2}{n_i} - 2 \mu \sum_{i=1}^n y_i + \mu^2 N .
\end{aligned}$$
Notice that:
$$\sum_{i=1}^n y_i = y.$$
Therefore:
$$\begin{aligned}
SS_E &= \sum_{i=1}^n \frac{y_i^2}{n_i} - 2 \mu y + \mu^2 N \\
&= \sum_{i=1}^n \frac{y_i^2}{n_i} - 2 \frac{y}{N}y + \frac{y^2}{N^2}N \\
&= \sum_{i=1}^n \frac{y_i^2}{n_i} - \frac{y^2}{N},
\end{aligned}$$
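which differs from your formula: each squared group total $y_i^2$ is divided by its own group size $n_i$, not by a single common value.
Summing up, your problem is that you are using $n$ (which you defined as the number of groups) instead of the number of samples in each group, i.e., $n_i$. So your suspicion about the unbalanced design is right: in a balanced design all $n_i$ equal one common per-group size and dividing by that single value works, but that value is the number of observations per group, never the number of groups.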
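As a check, plugging your totals into the formula with $n_i$ in the denominators (intermediate values rounded to three decimals) gives:
$$\frac{871^2}{10} + \frac{603^2}{7} + \frac{713^2}{8} - \frac{2187^2}{25} \approx 75864.100 + 51944.143 + 63546.125 - 191318.760 \approx 35.61,$$
which agrees with the value you quote from SAS.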