A few days ago, I tried solving the following question...

... and by now, I managed to solve it (with some help though). However, with answers came even more questions.
To be specific, when trying to solve exercise 3b), I asked the Stack Exchange community how to do so in this question. One of the answers (highlighted link above) that really helped me out made use of this identity:
$\sum_i w_i (X_i - \bar{X}_w)^2 = \sum_i w_i X_i^2 - \bar{X}_w^2$
and stated that this follows from the fact that $\sum_{i}w_i(X_i - \bar{X}_w)^2$ is the variance of $(X_1, \dots, X_n)$ with respect to the probability mass function $(w_1, \dots, w_n)$. My problem is just that I don't know any measure theory so I have pretty much no idea what exactly he is talking about.
Well, actually I do know a few things about measure theory. Here are the things I do know:
- I know what a $\sigma$-algebra is
- I know what measurable functions and measures are
- I know what a probability space and a probability measure are
- I know what the image measure theorem states:
(I only read up on this one because I felt it might help me out, but ultimately I didn't get very far.)

I just don't know if this is enough to understand why I can simply deduce the equation I stated previously. So my question is: what knowledge am I lacking? I would greatly appreciate someone walking me through a measure-theoretic proof of the identity so that I can do the necessary reading afterwards. Of course I don't expect anyone to prove every theorem they use in such a proof, but a simple proof (maybe a general one that I could apply to various cases of this type) would make my life much easier. I just want to understand why he was able to conclude so quickly that this identity holds.
EDIT: I just noticed that I didn't say where exactly the answering person made use of measure theory. I asked him this:

Sources: Image 1 - Linear Regression Analysis by Seber, Image 2 - Probability Theory and Stochastic Processes by Bremaud


$\newcommand{\var}{\operatorname{Var}}$It's not as deep as you might think. It's fairly straightforward algebra. But, if you insist on a measure-theoretic interpretation, there is one:
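Here is a sketch of how that interpretation can be spelled out, assuming $w_i \ge 0$ and $\sum_i w_i = 1$, and treating the $X_i$ as fixed real numbers. The names $\Omega$, $P$ and $X$ are introduced here purely for illustration. Take $\Omega = \{1, \dots, n\}$ with the $\sigma$-algebra of all subsets, let $P(\{i\}) = w_i$, and let $X \colon \Omega \to \mathbb{R}$ be the function $X(i) = X_i$. Expectations under $P$ are then just weighted sums:

$$
\begin{aligned}
E[X] &= \sum_i w_i X_i = \bar{X}_w, \\
\operatorname{Var}(X) &= E\big[(X - E[X])^2\big] = \sum_i w_i (X_i - \bar{X}_w)^2, \\
E[X^2] - (E[X])^2 &= \sum_i w_i X_i^2 - \bar{X}_w^2 .
\end{aligned}
$$

With this dictionary, the identity is the usual formula $\operatorname{Var}(X) = E[X^2] - (E[X])^2$, which follows from linearity of expectation.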
But I must emphasise that there is nothing special about the $X_i$, or about the fact that they are random variables. Provided the weights sum to 1, this is an algebraic identity about tuples of real numbers (it is even true in any ring).
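For completeness, the algebra can be written out directly. This is a sketch that assumes only $\sum_i w_i = 1$ and $\bar{X}_w = \sum_i w_i X_i$, which is how the weighted mean appears to be defined in the question:

$$
\begin{aligned}
\sum_i w_i (X_i - \bar{X}_w)^2
&= \sum_i w_i X_i^2 - 2\bar{X}_w \sum_i w_i X_i + \bar{X}_w^2 \sum_i w_i \\
&= \sum_i w_i X_i^2 - 2\bar{X}_w^2 + \bar{X}_w^2 \\
&= \sum_i w_i X_i^2 - \bar{X}_w^2 .
\end{aligned}
$$

The second line uses $\sum_i w_i X_i = \bar{X}_w$ and $\sum_i w_i = 1$. No probability or measure theory enters at any step.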