Norm equal to infimum with constraint on dual norm


I am attempting to prove the following:

$$ -\lambda \lVert \mathbf{x}\rVert = \inf_{\lVert \mathbf{y}\rVert_\ast \leq \lambda} \mathbf{x}^\top \mathbf{y} \; , $$

where $\lambda \geq 0$. I am not sure how to approach it. A proof I am trying to follow states that I can use the definition of a dual norm:

$$ \lVert \mathbf{a}\rVert_\ast = \sup_{\lVert \mathbf{b}\rVert \leq 1} \mathbf{a}^\top \mathbf{b} $$

but I can't figure it out. Any suggestions?

Thank you for your help!
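(Not a proof, but the identity is easy to sanity-check numerically for a concrete norm. A minimal sketch, assuming the $\ell_1$ norm for $\lVert\cdot\rVert$, whose dual is $\ell_\infty$; the vectors and $\lambda$ below are arbitrary choices of mine:)

```python
import random

def l1(x):
    return sum(abs(v) for v in x)

lam = 0.7
x = [1.5, -2.0, 0.3]

# Over the ball {y : ||y||_inf <= lam}, x^T y is minimized coordinatewise
# at y_i = -lam * sign(x_i), which gives exactly -lam * ||x||_1.
y_star = [-lam * (1 if v > 0 else -1 if v < 0 else 0) for v in x]
inner = sum(a * b for a, b in zip(x, y_star))
assert abs(inner - (-lam * l1(x))) < 1e-12

# Random feasible y never beat the minimizer.
random.seed(0)
for _ in range(1000):
    y = [random.uniform(-lam, lam) for _ in x]
    assert sum(a * b for a, b in zip(x, y)) >= inner - 1e-12
```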


My initial guess is to formulate the norm as [Ref. here]:

$$ \begin{aligned} \lambda \lVert\mathbf{x} \rVert &= \inf_{\mathbf{w}}\ \lambda \lVert\mathbf{w} \rVert \\ & \quad \mathrm{s.t.} \ \mathbf{w} = \mathbf{x}\;. \end{aligned} $$

This has the corresponding Lagrangian:

$$ L(\mathbf{w}, \mathbf{r}) = \lambda\lVert\mathbf{w}\rVert - \mathbf{r}^\top\mathbf{w} + \mathbf{r}^\top\mathbf{x} \;, $$

and the corresponding dual problem:

$$ \sup_{\mathbf{r}} \inf_{\mathbf{w}} \ \lambda\lVert\mathbf{w} \rVert - \mathbf{r}^\top\mathbf{w} + \mathbf{r}^\top\mathbf{x} \;. $$

From the definition of the dual norm, if $\lVert\mathbf{r}\rVert_\ast > \lambda$, then $\inf_{\mathbf{w}} \left(\lambda\lVert \mathbf{w} \rVert - \mathbf{r}^\top \mathbf{w}\right) = -\infty$, while if $\lVert\mathbf{r}\rVert_\ast \leq \lambda$, then $\inf_{\mathbf{w}} \left(\lambda \lVert \mathbf{w} \rVert - \mathbf{r}^\top \mathbf{w}\right) = 0$. Therefore the dual problem becomes:

$$ \begin{aligned} &\ \ \sup_{\mathbf{r}}\ \mathbf{r}^\top\mathbf{x} \\ & \ \ \ \ \mathrm{s.t.} \ \ \lVert\mathbf{r}\rVert_\ast \leq \lambda\;. \end{aligned} $$

Since the only constraint is affine, Slater's condition is satisfied, so strong duality holds and the primal optimal value equals the dual optimal value.

We then use the property $-\inf_{\mathbf{x}} \left( -f(\mathbf{x}) \right) = \sup_{\mathbf{x}} f(\mathbf{x})$ to get: $$ -\lambda \lVert \mathbf{x}\rVert = -\sup_{\lVert \mathbf{r}\rVert_\ast \leq \lambda} \mathbf{x}^\top \mathbf{r} = \inf_{\lVert \mathbf{r}\rVert_\ast \leq \lambda} \mathbf{x}^\top \left(-\mathbf{r}\right) = \inf_{\lVert \mathbf{y}\rVert_\ast \leq \lambda} \mathbf{x}^\top \mathbf{y} \; , $$ where $\mathbf{y} = - \mathbf{r}$ and therefore $\lVert \mathbf{y}\rVert_\ast = \lVert \mathbf{r}\rVert_\ast$, since every norm satisfies $\lVert -\mathbf{r}\rVert_\ast = \lVert \mathbf{r}\rVert_\ast$.
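(As a concrete check of this sup/inf flip on a symmetric ball, here is a sketch assuming the $\ell_\infty$ norm for $\lVert\cdot\rVert$, so the dual ball is the $\ell_1$ ball; the vector and $\lambda$ are my own choices:)

```python
# sup of x^T y over the l1-ball of radius lam is lam * ||x||_inf,
# attained at a signed scaled basis vector; negating that y stays in
# the (symmetric) ball and shows inf = -sup.
lam = 2.0
x = [0.4, -3.0, 1.1]
i = max(range(len(x)), key=lambda j: abs(x[j]))
y_sup = [0.0] * len(x)
y_sup[i] = lam * (1 if x[i] > 0 else -1)
sup_val = sum(a * b for a, b in zip(x, y_sup))
y_inf = [-v for v in y_sup]            # ||y_inf||_1 = ||y_sup||_1, still feasible
inf_val = sum(a * b for a, b in zip(x, y_inf))
assert sup_val == lam * max(abs(v) for v in x) == 6.0
assert inf_val == -sup_val == -6.0
```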


The proof is for Theorem 3.3.3 in: Ruidi Chen and Ioannis Ch. Paschalidis (2020), “Distributionally Robust Learning”, Vol. 4, No. 1–2, pp. 1–243. DOI: 10.1561/2400000026


Best answer:

If you're not familiar with the Hahn-Banach theorem, then you probably don't want to use it to solve this problem; still, it is fundamental to normed spaces, so here is how it applies. If $E$ is a Banach space (which every finite-dimensional normed space is), $F$ is a subspace, $R>0$, and $g:F\to \mathbb{R}$ is a linear functional such that $|g(x)|\leqslant R\|x\|$ for all $x\in F$, then there exists a linear extension $G:E\to \mathbb{R}$ that satisfies the same bound: $|G(x)|\leqslant R\|x\|$ for all $x\in E$.

Let $E$ be your original space and let $x$ be any fixed non-zero vector. Let $F=\{a x:a\in \mathbb{R}\}$ be the span of $x$, and define $g:F\to \mathbb{R}$ by $g(ax)=a\|x\|$. Then for any $ax\in F$, $|g(ax)|=|a|\|x\|=\|ax\|$, so the desired inequality holds with $R=1$. We apply the Hahn-Banach theorem to find a linear extension $G:E\to \mathbb{R}$ such that $|G(z)|\leqslant \|z\|$ for all $z$. Since $G$ is a linear functional on a finite-dimensional space, it has the form $G(z)=z^Ty$ for some $y$. Then $|z^Ty|=|G(z)|\leqslant \|z\|\leqslant 1$ for any $z$ with $\|z\|\leqslant 1$, from which it follows that $\|y\|_*\leqslant 1$. Moreover, $x^Ty=G(x)=g(x)=\|x\|$. So $$\|x\|_{**} =\sup_{\|y'\|_*\leqslant 1}x^Ty' \geqslant x^Ty=\|x\|.$$
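(In the Euclidean case the Hahn-Banach extension can be written down explicitly, which makes the construction concrete: $G(z)=z^Ty$ with $y=x/\|x\|_2$, so $\|y\|_*=\|y\|_2=1$ and $G(x)=\|x\|_2$. A sketch with a vector of my choosing; the bound $|G(z)|\leqslant\|z\|$ is just Cauchy-Schwarz here:)

```python
import math
import random

def l2(v):
    return math.sqrt(sum(t * t for t in v))

x = [3.0, -4.0]
y = [v / l2(x) for v in x]                          # the norming functional

assert abs(l2(y) - 1.0) < 1e-12                     # ||y||_* = ||y||_2 = 1
assert abs(sum(a * b for a, b in zip(x, y)) - l2(x)) < 1e-12   # G(x) = ||x||

# |G(z)| <= ||z|| for arbitrary z (Cauchy-Schwarz).
random.seed(2)
for _ in range(1000):
    z = [random.uniform(-1, 1) for _ in x]
    assert abs(sum(a * b for a, b in zip(z, y))) <= l2(z) + 1e-12
```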

We now show the other inequality, $\|x\|_{**}\leqslant \|x\|$, which is elementary. If $x=0$, then $\|0\|_{**}=\|0\|=0$. Assume $x\neq 0$ and set $u=x/\|x\|$, so $\|u\|=1$. For any $y$ with $\|y\|_*\leqslant 1$, $$x^Ty=(\|x\|u)^Ty=\|x\|u^Ty\leqslant \|x\|\sup_{\|x'\|\leqslant 1}x'^Ty=\|x\|\|y\|_*\leqslant \|x\|.$$ Since this holds for any $y$ with $\|y\|_*\leqslant 1$, it holds that $$\|x\|_{**}=\sup_{\|y\|_*\leqslant 1}x^Ty\leqslant \|x\|.$$

Thus $\|x\|_{**}=\|x\|$ for any $x$.
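(The biduality $\|x\|_{**}=\|x\|$ can be made concrete for a specific norm. A sketch assuming the $\ell_1$ norm, whose dual is $\ell_\infty$: the supremum defining $\|x\|_{**}$ is attained at $y=\operatorname{sign}(x)$, recovering $\|x\|_1$ exactly. The test vector is my own choice:)

```python
# Bidual check for the l1 norm: sup_{||y||_inf <= 1} x^T y is attained
# at y = sign(x), and the value is exactly ||x||_1.
x = [2.0, -0.5, 1.25]
sign = [1.0 if v > 0 else -1.0 if v < 0 else 0.0 for v in x]
assert max(abs(s) for s in sign) <= 1.0           # y = sign(x) is feasible
bidual = sum(a * b for a, b in zip(x, sign))
assert bidual == sum(abs(v) for v in x) == 3.75
```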

Let $B=\{y:\|y\|_*\leqslant 1\}$ and note that $y\in B$ if and only if $-y\in B$. Therefore $$\{x^Ty:y\in B\}=\{x^T(-y):y\in B\}=\{-x^Ty:y\in B\}.$$ Thus the set of values $\{x^Ty:y\in B\}$ is symmetric about the origin, and its infimum is the negative of its supremum. That is, $$\inf_{\|y\|_*\leqslant 1}x^Ty=\inf\{x^Ty:y\in B\} = -\sup\{x^Ty:y\in B\}=-\|x\|_{**}=-\|x\|.$$

Last, for $\lambda>0$, $\{y:\|y\|_*\leqslant \lambda\}=\{\lambda y:\|y\|_*\leqslant 1\}$. Thus $$\inf_{\|y\|_*\leqslant \lambda}x^Ty=\inf_{\|y\|_*\leqslant 1}x^T(\lambda y)=\lambda \inf_{\|y\|_*\leqslant 1}x^Ty=-\lambda \|x\|,$$ which is the claimed identity. (The case $\lambda=0$ is trivial: the only feasible point is $y=0$, and both sides equal $0$.)
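(An end-to-end numerical check of the target identity, assuming the $\ell_1$/$\ell_\infty$ pair and including the degenerate case $\lambda=0$; the test vector is my own choice:)

```python
def l1(x):
    return sum(abs(v) for v in x)

x = [1.0, -2.0]
for lam in (0.0, 0.5, 3.0):
    # The minimizer of x^T y over {y : ||y||_inf <= lam} is y_i = -lam*sign(x_i),
    # so the infimum equals -lam * ||x||_1 for every lam >= 0.
    y = [-lam * (1 if v > 0 else -1 if v < 0 else 0) for v in x]
    assert sum(a * b for a, b in zip(x, y)) == -lam * l1(x)
```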