I am currently working on understanding the Kantorovich-Rubinstein duality and the Wasserstein loss. I am stuck on the following part of these class notes:
Collecting the terms algebraically, we can rewrite the Lagrangian as:
$L(\pi, f, g) = \underset{x\sim p}{\mathbb{E}}[f(x)] + \underset{y\sim p_g}{\mathbb{E}}[g(y)] + {\displaystyle\int_{X \times X}\Big(||x-y|| - f(x) - g(y)\Big)\pi(x,y)\,dy\,dx}$
And we appeal to strong duality to write
$W(p, p_g) = \underset{\pi}{\inf}\underset{f, g}{\sup}L(\pi, f, g) = \underset{f, g}{\sup}\underset{\pi}{\inf}L(\pi, f, g)$
Note that if $||x-y|| \leq f(x) + g(y)$ for some $x, y \in X$ then we can concentrate the mass of $\pi$ at $(x,y)$ and send $L(\pi, f, g)$ to $-\infty$
(Kantorovich-Rubinstein Duality, John Thickstun, p. 1)
I understand why the author wants to show that it goes to $-\infty$: that is what turns the condition into a constraint. I also think I understand how to concentrate the mass at $(x, y)$.
The part I don't get is the last sentence of the quoted passage: why would the Lagrangian go to $-\infty$ when we concentrate the mass on a point $(x, y)$ such that $||x-y|| \leq f(x) + g(y)$?
I really can't see it. I could not find any explanation anywhere.
When working with the Lagrangian, some of the constraints on the variables are dropped. In this case that includes the constraint that $\pi$ must have total mass equal to 1, so $\pi$ only needs to be a nonnegative measure. For example, $\pi$ can be chosen as $\alpha\delta_{(x,y)}$ for arbitrarily large $\alpha > 0$, where $\delta_{(x,y)}$ is the Dirac distribution at $(x,y)$.
Now suppose there is a pair $(x_0,y_0)$ such that $||x_0-y_0|| < f(x_0) + g(y_0)$ and let $D \triangleq ||x_0-y_0|| - f(x_0) - g(y_0) < 0$. Writing $q$ for the distribution called $p_g$ in the notes, we have the following chain
\begin{align}
\inf_\pi L(\pi,f,g) &= \inf_\pi \left[ \mathbb{E}_{x\sim p}[f(x)] + \mathbb{E}_{y \sim q}[g(y)] + \int_{X \times X} \left ( ||x - y|| - f(x) - g(y) \right ) d\pi(x,y) \right] \\
&= \mathbb{E}_{x\sim p}[f(x)] + \mathbb{E}_{y \sim q}[g(y)] + \inf_\pi \int_{X \times X} \left ( ||x - y|| - f(x) - g(y) \right ) d\pi(x,y) \\
&\leq C + \alpha\left ( ||x_0 - y_0|| - f(x_0) - g(y_0) \right ) \\
&= C + \alpha D
\end{align}
where $C \triangleq \mathbb{E}_{x\sim p}[f(x)] + \mathbb{E}_{y \sim q}[g(y)]$ is finite since we assumed $f,g$ are bounded. The inequality comes from choosing $\pi = \alpha\delta_{(x_0,y_0)}$. Since $D$ is strictly negative, taking $\alpha \rightarrow \infty$ sends the last expression to $-\infty$. Note that the inequality at $(x_0,y_0)$ must be strict: if $D = 0$, the term $\alpha D$ vanishes and nothing diverges.
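To spell out the inequality step, here is the integral evaluated for the particular choice $\pi = \alpha\delta_{(x_0,y_0)}$. Integrating against a scaled Dirac mass just evaluates the integrand at that point and multiplies by the scale:
$$\int_{X \times X} \left( ||x - y|| - f(x) - g(y) \right) d\big(\alpha\delta_{(x_0,y_0)}\big)(x,y) = \alpha\left( ||x_0 - y_0|| - f(x_0) - g(y_0) \right) = \alpha D.$$
The infimum over all admissible $\pi$ can be no larger than the value at this one $\pi$, which gives the bound $C + \alpha D$.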
This is why $\inf_\pi L(\pi, f, g)$ is $-\infty$ whenever some pair $(x, y)$ violates the constraint $||x-y|| - f(x) - g(y) \geq 0$, i.e. $f(x) + g(y) \leq ||x-y||$.
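If it helps to see the divergence numerically, below is a minimal sketch on a toy discrete space. Everything in it is an assumption made for illustration and does not come from the notes: the three points, the distributions `p` and `q`, and the potentials `f` and `g`, which are picked so that the pair $(x_0, y_0) = (0, 1)$ violates the constraint. On a finite space $\pi$ is just a nonnegative matrix, and $\alpha\delta_{(x_0,y_0)}$ is a matrix with a single entry equal to $\alpha$.

```python
import numpy as np

# Hypothetical toy setup: X = {0, 1, 2} on the real line.
points = np.array([0.0, 1.0, 2.0])
p = np.array([0.5, 0.3, 0.2])  # distribution of x (illustrative)
q = np.array([0.2, 0.3, 0.5])  # distribution of y (illustrative)

# Potentials chosen so that (x0, y0) = (0, 1) violates f(x) + g(y) <= |x - y|.
f = np.array([1.0, 0.0, 0.0])
g = np.array([0.0, 0.5, 0.0])

# slack[i, j] = |x_i - y_j| - f(x_i) - g(y_j), the integrand of the Lagrangian.
cost = np.abs(points[:, None] - points[None, :])
slack = cost - f[:, None] - g[None, :]


def lagrangian(pi):
    """L(pi, f, g) for a nonnegative matrix pi (total mass NOT constrained)."""
    return p @ f + q @ g + np.sum(slack * pi)


i0, j0 = 0, 1          # indices of the violating pair
D = slack[i0, j0]      # strictly negative by construction
C = p @ f + q @ g      # the finite constant from the derivation

alphas = [1.0, 10.0, 100.0, 1000.0]
values = []
for alpha in alphas:
    pi = np.zeros((3, 3))
    pi[i0, j0] = alpha  # discrete analogue of alpha * delta_{(x0, y0)}
    values.append(lagrangian(pi))
    print(f"alpha = {alpha:7.1f}   L = {values[-1]:.2f}   C + alpha*D = {C + alpha * D:.2f}")
```

Each printed value of $L$ agrees with $C + \alpha D$ and decreases without bound as $\alpha$ grows, which is the behaviour described in the chain of inequalities.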