I am struggling to understand some parts of the proof of Strassen's theorem in Klenke's textbook.
Theorem. Let $$ L = \{(x_1,x_2)\in \mathbb{R}^2 \mid x_1\le x_2\}. $$ Then, a probability measure $\mu_2$ is stochastically larger than $\mu_1$ if and only if there is a coupling $\varphi$ of $\mu_1$ and $\mu_2$ with $\varphi(L)=1$.
The text says that $\mu_2$ is stochastically larger than $\mu_1$ if $\int f d\mu_1 \le \int f d\mu_2$ for every monotone increasing bounded function $f:\mathbb{R}\to\mathbb{R}$. Here is his proof, where $F_i$ denotes the distribution function of $\mu_i$.
Proof. (If) Let $\varphi$ be such a coupling. For monotone increasing bounded $f:\mathbb{R}\to\mathbb{R}$, we have $f(x_1)-f(x_2)\le 0$ for every $x=(x_1,x_2)\in L$; hence $$ \int f d\mu_1 - \int f d\mu_2 = \int_L(f(x_1)-f(x_2))\varphi(dx)\le 0.$$ (Only if) Here $F((x_1,x_2)):=\min(F_1(x_1),F_2(x_2))$ defines a distribution function on $\mathbb{R}^2$ that corresponds to a coupling $\varphi$ with $\varphi(L)=1$.
My questions:
Why is $\int f d\mu_1 - \int f d\mu_2=\int_L(f(x_1)-f(x_2))\varphi(dx)$? I'm not sure how to prove this because the set $L$ depends on both $x_1$ and $x_2$. If the integral were over $\mathbb{R}^2$, then I understand that, from the definition of a coupling, $\int_{\mathbb{R}^2} f(x_1) \varphi(dx)=\int f d\mu_1$.
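A possible way to bridge the two integrals, sketched under the assumption that the proof silently uses $\varphi(\mathbb{R}^2\setminus L)=0$ (which follows from $\varphi(L)=1$) together with the boundedness of $f$: split the integral over $\mathbb{R}^2$ into the part on $L$ and the part on its complement, which is a $\varphi$-null set.

$$ \int f d\mu_1 - \int f d\mu_2 = \int_{\mathbb{R}^2}(f(x_1)-f(x_2))\varphi(dx) = \int_L(f(x_1)-f(x_2))\varphi(dx) + \int_{\mathbb{R}^2\setminus L}(f(x_1)-f(x_2))\varphi(dx) = \int_L(f(x_1)-f(x_2))\varphi(dx). $$

The first equality is the marginal property of the coupling applied to each term; the last holds because a bounded integrand integrates to $0$ over a null set.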
Why is $F$ a distribution function that corresponds to such a $\varphi$? I can see that the marginals of $\varphi$ are indeed $\mu_i$, from $\lim_{x_j\to\infty}F(x_1,x_2)=F_i(x_i)$ for $j\ne i$. However, I am not sure how to prove $\varphi(L)=1$, since $L$ is not of the "half-open box" form. I tried to think graphically and reached a wrong conclusion: take $x=(x_1,x_2)$ and $y=(y_1,y_2)$ such that $y_1=y_2\le x_1=x_2$. Since $F(x)=F_2(x_2)$ for $x$ on the 45-degree line, the $\varphi$-measure of the box $[y_1,x_1]\times [y_2,x_2]$ is $F_2(x_2)-F_2(y_2)$. Since half of the box is in $L$, taking $x_2\to\infty$ and $y_2\to-\infty$ would give $\varphi(L)=(1-0)/2$?
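A small numerical sanity check may help with intuition here. The sketch below is not from the textbook: it assumes the concrete example $\mu_1=\mathrm{Exp}(2)$, $\mu_2=\mathrm{Exp}(1)$ (so $F_2\le F_1$), and it samples the pair $(F_1^{-1}(U),F_2^{-1}(U))$ from a single uniform $U$, which is a standard way to realize the joint distribution function $\min(F_1(x_1),F_2(x_2))$. All function names are hypothetical helpers chosen for illustration.

```python
import math
import random

# Assumed example (not from the text): mu_1 = Exp(rate 2), mu_2 = Exp(rate 1).
RATE_1, RATE_2 = 2.0, 1.0

def cdf_exp(x, rate):
    """Distribution function of Exp(rate)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def quantile_exp(u, rate):
    """Inverse distribution function of Exp(rate) for u in [0, 1)."""
    return -math.log1p(-u) / rate

def sample_coupling(n, seed=0):
    """Draw n pairs (F_1^{-1}(U), F_2^{-1}(U)) driven by one shared uniform U."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        pairs.append((quantile_exp(u, RATE_1), quantile_exp(u, RATE_2)))
    return pairs

def empirical_joint_cdf(pairs, a, b):
    """Fraction of sampled pairs with x1 <= a and x2 <= b."""
    return sum(1 for x1, x2 in pairs if x1 <= a and x2 <= b) / len(pairs)

pairs = sample_coupling(100_000)

# Fraction of the sample that lies in L = {x1 <= x2}.
frac_in_L = sum(1 for x1, x2 in pairs if x1 <= x2) / len(pairs)

# Compare the empirical joint CDF with min(F_1(a), F_2(b)) at one test point.
a, b = 0.3, 0.5
empirical = empirical_joint_cdf(pairs, a, b)
theoretical = min(cdf_exp(a, RATE_1), cdf_exp(b, RATE_2))

print("fraction in L:", frac_in_L)
print("empirical joint CDF:", empirical, "min(F1, F2):", theoretical)
```

In this example every sampled point lies on the curve $x_2 = 2x_1$, so the mass is concentrated on a one-dimensional set inside $L$ rather than spread evenly over boxes around the diagonal, which may be where the "half of the box" picture breaks down.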