As part of a proof I'd like to show that if $R_1 \cup R_2 \subseteq R_3$, then $SSE(R_1) + SSE(R_2) \leq SSE(R_3)$, where SSE is the sum of squared deviations from the mean of each group. I'm assuming the values in each group are real or rational numbers, if that makes a difference.
I know that, since the mean minimizes the sum of squared errors, the SSE of each group is already as small as it can be, but I'm stuck on how to leverage the fact that the groups are subsets to show that $SSE(R_1 \cup R_2) \geq SSE(R_1) + SSE(R_2)$. Is there a basic piece of theory I am missing? Any pointers would be appreciated!
Using your insight that the mean is the value that minimizes the sum of squared errors, we can write:
$$SSE(R_1 \cup R_2) = \min_\gamma \sum_{R_1 \cup R_2}(x_i - \gamma)^2$$
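Assuming $R_1$ and $R_2$ are disjoint (otherwise the shared points would be counted twice and the inequality can fail), we can split this sum into two components: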
$$= \min_\gamma \left[\sum_{R_1}(x_i - \gamma)^2 + \sum_{R_2}(x_i - \gamma)^2\right]$$
But of course, minimizing each sum separately, with its own free parameter, always gives a total at most as large as minimizing both sums together with a single shared parameter:
$$\geq \min_\alpha \sum_{R_1}(x_i - \alpha)^2 + \min_\beta\sum_{R_2}(x_i - \beta)^2$$
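But this is exactly $SSE(R_1) + SSE(R_2)$. To get from $R_1 \cup R_2$ to $R_3$, apply the same argument to $R_3$ split into $R_1 \cup R_2$ and the remaining points $R_3 \setminus (R_1 \cup R_2)$: this gives $SSE(R_3) \geq SSE(R_1 \cup R_2) + SSE(R_3 \setminus (R_1 \cup R_2)) \geq SSE(R_1 \cup R_2)$, since an SSE is never negative.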
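If you want to sanity-check the inequality numerically, here is a minimal sketch in Python. The helper names `sse` and `check_once` are made up for illustration, the groups are drawn at random, and $R_1$ and $R_2$ are assumed to be disjoint subsets of $R_3$. It is a spot check, not a substitute for the proof.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def check_once(rng, n=20, tol=1e-9):
    """Draw R3 at random, carve out disjoint R1 and R2, test both inequalities."""
    r3 = [rng.uniform(-10, 10) for _ in range(n)]
    # Each point goes to R1 (label 1), R2 (label 2), or neither (label 0),
    # so R1 and R2 are disjoint and their union is a subset of R3.
    labels = [rng.choice((0, 1, 2)) for _ in r3]
    r1 = [x for x, lab in zip(r3, labels) if lab == 1]
    r2 = [x for x, lab in zip(r3, labels) if lab == 2]
    union = r1 + r2
    # tol absorbs floating-point rounding only.
    return (sse(r1) + sse(r2) <= sse(union) + tol
            and sse(union) <= sse(r3) + tol)

rng = random.Random(0)  # fixed seed so the run is reproducible
all_hold = all(check_once(rng) for _ in range(1000))
print("inequality held in every trial:", all_hold)

# Overlapping groups break the claim: with R1 = R2 = R3 = {0, 1},
# SSE(R1) + SSE(R2) = 1.0 while SSE(R3) = 0.5.
overlap = [0.0, 1.0]
print(sse(overlap) + sse(overlap), ">", sse(overlap))
```

The last two lines illustrate why the disjointness assumption matters: when the groups share points, those points are counted twice on the left-hand side.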