$$\min_{\theta \in \Theta} \sup_{P: D(P, P_0) \leq \rho} \mathbb{E}_P\left[ \ell(\theta;(X,Y)) \right]$$
The above equation is from this ML paper (https://arxiv.org/pdf/1805.12018.pdf), at the top of page 2. I'm having a hard time understanding what the equation is even saying, specifically the part in the middle. Is it correct to say that there is a distribution $P$ within a distance of $\rho$ from $P_0$, and we want to minimize the expected value under that distribution? And where does the support come into play?
My understanding is that there is a fixed distribution $P_0$, and the inner supremum is taken over the set of distributions $P$ that are at most $\rho$ away from $P_0$ as measured by $D$. The outer minimization chooses $\theta$ in some set $\Theta$ so that this supremum is as small as possible.
In words: find $\theta \in \Theta$ such that the worst expected loss over all $P$ that are at most $\rho$ away from $P_0$ is minimized. So this is a kind of minimax problem, in which you minimize the worst-case expected loss.
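To make the min-sup structure concrete, here is a minimal numerical sketch. It rests on assumptions that are not taken from the paper: $D$ is replaced by the total variation distance, $P_0$ and every admissible $P$ live on the same three points, the loss is the squared error $(\theta - x)^2$, and the outer minimization is a plain grid search. The function names are hypothetical. Under those assumptions, the inner supremum is attained by moving up to $\rho$ of probability mass from the lowest-loss points onto the highest-loss point.

```python
import numpy as np

def worst_case_expected_loss(losses, p0, rho):
    """Inner sup: largest expected loss over P with TV(P, p0) <= rho.

    Assumes a finite support shared by P and p0 (an illustrative choice).
    """
    losses = np.asarray(losses, dtype=float)
    p = np.asarray(p0, dtype=float).copy()
    worst = int(np.argmax(losses))      # point with the largest loss
    budget = rho                        # mass we are still allowed to move
    for i in np.argsort(losses):        # visit points from lowest loss upward
        if budget <= 0:
            break
        if i == worst:
            continue
        moved = min(p[i], budget)
        p[i] -= moved                   # take mass from a low-loss point
        p[worst] += moved               # put it on the highest-loss point
        budget -= moved
    return float(p @ losses)

def robust_objective(theta, xs, p0, rho):
    """Worst-case expected squared error of theta (hypothetical toy loss)."""
    return worst_case_expected_loss((theta - xs) ** 2, p0, rho)

xs = np.array([0.0, 1.0, 10.0])         # support points of P_0 (made-up data)
p0 = np.full(3, 1.0 / 3.0)              # P_0 is uniform on those points
thetas = np.linspace(0.0, 10.0, 1001)   # grid standing in for Theta
rho = 0.2

# Outer min: evaluate the inner sup for every theta and keep the smallest.
nominal = np.array([robust_objective(t, xs, p0, 0.0) for t in thetas])
robust = np.array([robust_objective(t, xs, p0, rho) for t in thetas])
theta_nominal = thetas[int(np.argmin(nominal))]
theta_robust = thetas[int(np.argmin(robust))]
print("nominal minimizer:", theta_nominal)
print("robust minimizer: ", theta_robust)
```

With $\rho = 0$ the ball contains only $P_0$, so the problem reduces to ordinary expected-loss minimization; for $\rho > 0$ the objective can only increase, since $P_0$ itself is always among the admissible distributions.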
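If "support" refers to the $\sup$ in the formula, note that $\sup$ denotes the supremum (least upper bound) over the admissible distributions $P$, not the support. The support of $P$ enters only through the expectation operator: the expectation is taken with respect to $P$, which means summing (or integrating, if $P$ is continuous) over the values in the support of $P$.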
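For a discrete $P$, the expectation can be written out as follows, where $p(x,y)$ is assumed to denote the probability mass function of $P$ (a symbol introduced here for illustration, not taken from the paper):

$$\mathbb{E}_P\left[ \ell(\theta;(X,Y)) \right] = \sum_{(x,y) \in \operatorname{supp}(P)} \ell(\theta;(x,y)) \, p(x,y)$$

When $P$ is continuous, the sum is replaced by an integral against its density.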