The following theorem on stochastic processes is from Pinsky's *An Introduction to Stochastic Modeling*:

The proof starts as follows:

Here is my question:
Could anybody explain why and how one can freely specify the joint distribution of each $\epsilon(p_k)$ and $X(p_k)$ in order to prove (5.11)?
The proof continues as follows, but I don't see how the joint distribution is specified.

The logic here is as follows. Suppose $X$ and $Y$ are random variables for which it is possible to place an upper bound on $|P_X(A)-P_Y(B)|$ that involves the joint distribution of $(X,Y)$:
$$|P_X(A)-P_Y(B)|\le P_{(X,Y)}(C).\tag1$$
The left-hand side depends only on the marginal distributions of $X$ and $Y$. So if (1) holds no matter which joint distribution (with those marginals) we use, then the inequality continues to hold when we take the infimum of the RHS over all choices of $P_{(X,Y)}$:
$$|P_X(A)-P_Y(B)|\le \inf_{P_{(X,Y)}}P_{(X,Y)}(C).\tag2$$
Finally we pick a particular joint distribution for $(X,Y)$, say $P^*_{(X,Y)}$. Since the inf of a set is no bigger than any element of the set, we can then assert
$$|P_X(A)-P_Y(B)|\le P^*_{(X,Y)}(C).\tag3$$
The name of the game is to pick a $P^*_{(X,Y)}$ that makes the inequality (3) as sharp as possible.
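To make this concrete, here is a small numerical sketch. It rests on illustrative assumptions that are not taken from the text of the question: $X$ is Bernoulli($p$), $Y$ is Poisson($p$), $A=B=\{k\}$ and $C=\{X\ne Y\}$, for which (1) holds under any joint distribution because $P(X\in A)-P(Y\in A)\le P(X\in A,\,Y\notin A)\le P(X\ne Y)$. The function names (`coupled_joint`, `independent_joint`, `mismatch`) are hypothetical. The code builds two joint distributions with the same marginals and compares the resulting bounds.

```python
import math

def poisson_pmf(k, p):
    """P(Y = k) for Y ~ Poisson(p)."""
    return math.exp(-p) * p**k / math.factorial(k)

def coupled_joint(p, kmax=30):
    """One hand-picked joint pmf of (X, Y), truncated at Y <= kmax.

    Assumed marginals: X ~ Bernoulli(p), Y ~ Poisson(p).
    All mass with X = 0 sits at Y = 0, so X and Y agree as often as possible.
    """
    joint = {(0, 0): 1.0 - p,
             (1, 0): math.exp(-p) - (1.0 - p)}  # nonnegative since e^{-p} >= 1 - p
    for k in range(1, kmax + 1):
        joint[(1, k)] = poisson_pmf(k, p)
    return joint

def independent_joint(p, kmax=30):
    """Joint pmf with the same marginals, but X and Y independent."""
    joint = {}
    for k in range(kmax + 1):
        joint[(0, k)] = (1.0 - p) * poisson_pmf(k, p)
        joint[(1, k)] = p * poisson_pmf(k, p)
    return joint

def mismatch(joint):
    """P(X != Y) under the given joint pmf: the right-hand side of (1)."""
    return sum(pr for (x, k), pr in joint.items() if x != k)

p = 0.1
print("independent bound:", mismatch(independent_joint(p)))
print("coupled bound:    ", mismatch(coupled_joint(p)))
print("p^2:              ", p * p)
```

Both joint distributions give a valid bound in (3), but the hand-picked one is much sharper: under it $P(X\ne Y)=p(1-e^{-p})\le p^2$, whereas the independent choice gives a bound that does not even vanish at rate $p^2$. That freedom to choose is exactly what the step from (2) to (3) exploits.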