Below is a proof, together with the causal graph, that I copied from a textbook on causal inference by Brady Neal:
Claim. Given the causal graph in Figure A.1, $P(m \mid \mathrm{do}(t))=P(m \mid t)$.
START OF THE PROOF
Proof. We first apply the Bayesian network factorization (Definition 3.1): $$ P(w, t, m, y)=P(w) P(t \mid w) P(m \mid t) P(y \mid w, m) $$
Next, we apply the truncated factorization (Proposition 4.1): $$ P(w, m, y \mid \mathrm{do}(t))=P(w) P(m \mid t) P(y \mid w, m) $$
Finally, we marginalize out $w$ and $y$: $$ \begin{aligned} \sum_w \sum_y P(w, m, y \mid \mathrm{do}(t)) & =\sum_w \sum_y P(w) P(m \mid t) P(y \mid w, m) \\ P(m \mid \mathrm{do}(t)) & =\left(\sum_w P(w)\right) P(m \mid t)\left(\sum_y P(y \mid w, m)\right) \\ & =P(m \mid t) \end{aligned} $$
END OF THE PROOF
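To check that the claimed identity itself is true, I verified it numerically. This is only a sanity check with made-up conditional tables over binary variables (none of the numbers come from the book), built directly from the truncated factorization:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up conditional tables over binary variables (assumptions, not from the book).
# P_w[w], P_m_t[m, t], P_y_wm[y, w, m]; each column (axis 0) sums to 1.
def random_dist(shape):
    p = rng.random(shape)
    return p / p.sum(axis=0, keepdims=True)

P_w = random_dist((2,))
P_m_t = random_dist((2, 2))
P_y_wm = random_dist((2, 2, 2))

t = 1  # the intervention do(T = t)

# Truncated factorization: P(w, m, y | do(t)) = P(w) P(m | t) P(y | w, m)
P_wmy_do = np.einsum('w,m,ywm->wmy', P_w, P_m_t[:, t], P_y_wm)

# Marginalize out w and y.
P_m_do = P_wmy_do.sum(axis=(0, 2))

print(np.allclose(P_m_do, P_m_t[:, t]))  # True
```

So the end result matches $P(m \mid t)$; what I cannot follow is the algebra in between.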
What I do not understand is how $$ \begin{aligned} \sum_w \sum_y P(w) P(m \mid t) P(y \mid w, m) \end{aligned} $$
becomes
$$ \begin{aligned} \left(\sum_w P(w)\right) P(m \mid t)\left(\sum_y P(y \mid w, m)\right) \end{aligned}. $$
My understanding is that $P(w)$ and $P(m\mid t)$ do not depend on $y$, so we can pull them out of $\sum_y$ and rearrange $$ \begin{aligned} \sum_w \sum_y P(w) P(m \mid t) P(y \mid w, m) \end{aligned} $$
into
$$ \begin{aligned} \sum_w P(w)P(m \mid t)\left(\sum_y P(y \mid w, m)\right) \end{aligned}. $$
But why can $\sum_w P(w)$ be separated out from the rest of the expression, even though $P(y \mid w, m)$ still depends on $w$? Is this some general rule for double summations, or does it only apply in this specific case?
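To make my question concrete, here is the only rearrangement I believe is valid in general, checked numerically (array sizes and values are made up; `f` plays the role of $P(w)$ and `g` the role of $P(y \mid w, m)$ for a fixed $m$):

```python
import numpy as np

rng = np.random.default_rng(1)

# f[w] stands in for P(w); g[y, w] for P(y | w, m) at a fixed m (made-up numbers).
f = rng.random(4)
g = rng.random((3, 4))
g = g / g.sum(axis=0, keepdims=True)  # each column sums to 1, like a conditional distribution

# The double sum, term by term.
lhs = sum(f[w] * g[y, w] for w in range(4) for y in range(3))

# Pulling f(w) out of the inner sum only: sum_w f(w) * (sum_y g(y, w)).
mid = sum(f[w] * g[:, w].sum() for w in range(4))

# Because each inner sum over y equals 1, the whole thing collapses to sum_w f(w).
rhs = f.sum()

print(np.allclose(lhs, mid), np.allclose(lhs, rhs))  # True True
```

So numerically everything agrees, but I would like to understand why the textbook can write $\sum_w P(w)$ as a standalone factor.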
Sorry for my bad English.