I'm reading the proof of "total variation function is positive measure" on p.261 in Measure, Integration and Real Analysis by Axler. (The book pdf: https://measure.axler.net/MIRA.pdf)
Let $\nu$ be a complex measure on a measurable space $(X,\mathscr S)$, and let $|\nu|:\mathscr S\to[0,\infty]$ be the total variation of $\nu$, i.e., $$|\nu|(E) =\sup\{|\nu(E_1)|+\cdots+|\nu(E_n)| \ ;\ n\in\mathbb N,\ E_1,\dots,E_n \in\mathscr S \ \text{are disjoint},\ E_1\cup\cdots\cup E_n\subset E \}.$$ Then $|\nu|$ is a positive measure on $(X,\mathscr S)$.
What I have to see is:
(i) $|\nu|(\emptyset)=0$
(ii) $|\nu|(\cup_{k=1}^\infty A_k)=\sum_{k=1}^\infty|\nu|(A_k)$ for disjoint $\{A_k\}_k\subset\mathscr S.$
I understand (i), and for (ii), I understand $|\nu|(\cup_{k=1}^\infty A_k)\leqq\sum_{k=1}^\infty|\nu|(A_k)$ but I don't see the proof of the other direction.
The written proof is as follows.
Fix $m\in\mathbb N.$
For each $k\in\{1,\dots,m\}$, suppose $n_k\in\mathbb N$ and $E_{1,k},\dots,E_{n_k,k}\in\mathscr S$ are disjoint sets such that $$E_{1,k}\cup\cdots\cup E_{n_k,k}\subset A_k.$$
Then, $\{E_{j,k}\ ; 1\leq k\leq m, 1\leq j\leq n_k\}$ is a disjoint collection of sets in $\mathscr S$ that are contained in $\cup_{k=1}^\infty A_k.$
Thus $$\sum_{k=1}^m\sum_{j=1}^{n_k}|\nu(E_{j,k})|\leqq |\nu|(\cup_{k=1}^\infty A_k).$$
Taking the supremum of the left side over all choices of $\{E_{j,k}\}$, we get $$\sum_{k=1}^m|\nu|(A_k)\leqq |\nu|(\cup_{k=1}^\infty A_k).$$
Letting $m\to\infty$, we get the result.
I don't understand the part
Taking the supremum of the left side over all choices of $\{E_{j,k}\}$, we get $$\sum_{k=1}^m|\nu|(A_k)\leqq |\nu|(\cup_{k=1}^\infty A_k).$$
Now, for each $k$, we have, by definition, $$|\nu|(A_k)=\sup\{|\nu(E_{1,k})|+\cdots+|\nu(E_{n_k,k})| \ ;\ n_k\in\mathbb N,\ E_{1,k},\dots,E_{n_k,k} \in\mathscr S \ \text{are disjoint},\ E_{1,k}\cup\cdots\cup E_{n_k,k}\subset A_k \}.$$
In the first step, we picked $n_k\in\mathbb N$ and $E_{1,k},\dots,E_{n_k,k}\in\mathscr S$ arbitrarily, subject to their being disjoint and satisfying $$E_{1,k}\cup\dots\cup E_{n_k,k}\subset A_k,$$ so I think it is appropriate to take the supremum over all choices of $E_{1,k},\dots,E_{n_k,k}$, rather than over $\{E_{j,k}\}$. To take the supremum over the choices of $\{E_{j,k}\}$, it seems we would have had to pick $\{E_{j,k}\}$ arbitrarily in the first place.
What's going on here? How did he take the supremum and get $\displaystyle\sum_{k=1}^m|\nu|(A_k)\leqq |\nu|(\cup_{k=1}^\infty A_k)$?
Maybe I'm missing something easy, but I can't see it.
Thanks for the explanation.
Here is a model problem that illustrates the same idea. Suppose that $E = \{a_i\}$ and $F = \{b_j\}$ are two sets of numbers with $\alpha = \sup E$ and $\beta = \sup F$, such that for some $x$, $a_i + b_j \le x$ for all $i$ and $j$. Then I claim $\alpha + \beta \le x$.
You can prove the claim by fixing $j$ and then moving $b_j$ to the other side to get $a_i \le x - b_j$. Since this inequality holds for all $i$, $\alpha \le x-b_j$ by definition of $\alpha$ as a least upper bound. Then moving $b_j$ to the left again and moving $\alpha$ to the right shows $\beta\le x-\alpha$, since the inequality $b_j\le x-\alpha$ holds for all $j$ ($j$ was arbitrary).
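Written out as a chain, the two steps of this argument read as follows. This is only a sketch, and it assumes that $E$ and $F$ are nonempty and that $x$ is finite, so that the subtractions make sense (if $x=\infty$ there is nothing to prove):

$$\begin{aligned}
a_i + b_j \le x \ \text{ for all } i, j
&\implies a_i \le x - b_j \ \text{ for all } i \quad (j \text{ fixed}) \\
&\implies \alpha \le x - b_j \quad (\text{least upper bound of } E;\ j \text{ was arbitrary}) \\
&\implies b_j \le x - \alpha \ \text{ for all } j \\
&\implies \beta \le x - \alpha \quad (\text{least upper bound of } F) \\
&\implies \alpha + \beta \le x.
\end{aligned}$$

The point is that the right-hand side $x - b_j$ does not depend on $i$, so it is an upper bound for all of $E$ at once, and likewise $x-\alpha$ does not depend on $j$.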
You can use this idea with the sum over $k=1,\dots,m$, fixing $k_0\in\{1,\dots,m\}$ and one choice of $\{E_{1,k},\dots,E_{n_k,k}\}$ for all $k\ne k_0$, and applying the definition of $|\nu|(A_{k_0})$, one term at a time. The key is that the inequality holds for all choices of $\{E_{1,k},\dots,E_{n_k,k}\}$ for all $k = 1,\dots,m$ and $n_k$ simultaneously.
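To make the "one term at a time" step concrete, here is a sketch of the first pass with $k_0 = 1$. It assumes $|\nu|(\cup_{k=1}^\infty A_k) < \infty$, since otherwise the target inequality is trivial. Fix the collections $E_{1,k},\dots,E_{n_k,k}$ for $k = 2,\dots,m$. For every admissible choice of $n_1$ and $E_{1,1},\dots,E_{n_1,1}$, the inequality from the proof gives

$$\sum_{j=1}^{n_1}|\nu(E_{j,1})| \le |\nu|(\cup_{k=1}^\infty A_k) - \sum_{k=2}^m\sum_{j=1}^{n_k}|\nu(E_{j,k})|.$$

The right-hand side does not depend on the choice made for $k=1$, so it is an upper bound for every sum appearing in the definition of $|\nu|(A_1)$. Taking the supremum over those choices and moving the double sum back to the left,

$$|\nu|(A_1) + \sum_{k=2}^m\sum_{j=1}^{n_k}|\nu(E_{j,k})| \le |\nu|(\cup_{k=1}^\infty A_k).$$

This holds for every choice of the collections for $k=2,\dots,m$, so the same step can be repeated with $k_0 = 2$, then $k_0=3$, and so on. After $m$ passes every inner sum has been replaced by $|\nu|(A_k)$, which gives

$$\sum_{k=1}^m|\nu|(A_k) \le |\nu|(\cup_{k=1}^\infty A_k).$$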