In a paper on distributionally robust optimization, one step of a derivation invokes a "standard duality argument". Starting from $\sup _{\mathbb{Q}_{i} \in \mathcal{M}(\Xi)} \frac{1}{N} \sum_{i=1}^{N} \int_{\Xi} \ell(\xi) \, \mathbb{Q}_{i}(\mathrm{d} \xi)$ s.t. $\frac{1}{N} \sum_{i=1}^{N} \int_{\Xi}\left\|\xi-\widehat{\xi}_{i}\right\| \mathbb{Q}_{i}(\mathrm{d} \xi) \leq \varepsilon$, the authors obtain $\sup _{\mathbb{Q}_{i} \in \mathcal{M}(\Xi)} \inf _{\lambda \geq 0} \frac{1}{N} \sum_{i=1}^{N} \int_{\Xi} \ell(\xi) \, \mathbb{Q}_{i}(\mathrm{d} \xi) +\lambda\left(\varepsilon-\frac{1}{N} \sum_{i=1}^{N} \int_{\Xi}\left\|\xi-\widehat{\xi}_{i}\right\| \mathbb{Q}_{i}(\mathrm{d} \xi)\right)$.
This confuses me. If I apply strong duality, the optimal value of the dual problem comes from the min-max problem $\inf_{\lambda\ge 0}\sup_{\mathbb{Q}_{i} \in \mathcal{M}(\Xi) } \dots$, with the infimum on the outside. How is the reversed order $\sup_{\mathbb{Q}_{i} \in \mathcal{M}(\Xi) } \inf_{\lambda\ge 0}$ derived in the paper?
Any hint would be appreciated.
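They do not use strong duality in that step. The two expressions are equal because, for fixed $\mathbb{Q}_{i}$, the inner infimum over $\lambda \geq 0$ is $-\infty$ whenever the constraint of the original problem is violated (let $\lambda \to \infty$), and it is attained at $\lambda = 0$, where it equals the original objective, whenever the constraint holds. The outer supremum therefore never benefits from an infeasible $\mathbb{Q}_{i}$, so the sup-inf problem enforces the constraint implicitly and has the same optimal value as the constrained problem.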
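
To spell this out, here is a short sketch of the computation. The shorthand $f(\mathbb{Q})$ and $c(\mathbb{Q})$ is introduced here for readability and is not the paper's notation, and $f(\mathbb{Q})$ is assumed to be finite:

$$
f(\mathbb{Q}) = \frac{1}{N} \sum_{i=1}^{N} \int_{\Xi} \ell(\xi) \, \mathbb{Q}_{i}(\mathrm{d} \xi),
\qquad
c(\mathbb{Q}) = \frac{1}{N} \sum_{i=1}^{N} \int_{\Xi}\left\|\xi-\widehat{\xi}_{i}\right\| \mathbb{Q}_{i}(\mathrm{d} \xi).
$$

For fixed $\mathbb{Q}$, the term $\lambda\left(\varepsilon - c(\mathbb{Q})\right)$ is linear in $\lambda$, so

$$
\inf_{\lambda \geq 0} \Big\{ f(\mathbb{Q}) + \lambda\big(\varepsilon - c(\mathbb{Q})\big) \Big\}
=
\begin{cases}
f(\mathbb{Q}) & \text{if } c(\mathbb{Q}) \leq \varepsilon \quad (\text{take } \lambda = 0), \\
-\infty & \text{if } c(\mathbb{Q}) > \varepsilon \quad (\text{let } \lambda \to \infty),
\end{cases}
$$

and taking the supremum over $\mathbb{Q}$ gives

$$
\sup_{\mathbb{Q}_{i} \in \mathcal{M}(\Xi)} \inf_{\lambda \geq 0} \Big\{ f(\mathbb{Q}) + \lambda\big(\varepsilon - c(\mathbb{Q})\big) \Big\}
=
\sup_{\substack{\mathbb{Q}_{i} \in \mathcal{M}(\Xi) \\ c(\mathbb{Q}) \leq \varepsilon}} f(\mathbb{Q}).
$$

This identity is just a reformulation of the constrained problem. A genuine duality result is only needed afterwards, if one wants to exchange the order and pass to $\inf_{\lambda \geq 0} \sup_{\mathbb{Q}_{i} \in \mathcal{M}(\Xi)}$; in general only $\sup \inf \leq \inf \sup$ holds.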