Fenchel-Young inequality for matrices

368 Views Asked by At

Background: For $f:D\subset\mathbb{R}^d \to \mathbb{R}$ we call $$f^*:D^*\subset\mathbb{R}^d \to \mathbb{R}, ~ y\mapsto \underset{x\in D}{\sup}(\langle y, x \rangle - f(x))$$ the conjugate of $f$, where $D^*:=\{y\in\mathbb{R}^d | \underset{x\in D}{\sup}(\langle y, x \rangle - f(x)) < \infty \}$. From the definition we directly get the Fenchel inequality: $$ \langle y, x \rangle \leq f(x) + f^*(y) $$ If we set $D=\mathbb{R}^d, f(x)=\lVert x \rVert$ for some vector norm $\lVert \cdot \rVert$ and consider the dual norm $ \lVert y \rVert^* = \underset{\lVert x \rVert=1}{\sup}\langle y, x \rangle$ (which in fact is a dual norm thanks to Riesz' representation theorem) we get $$ f^*(y) = \begin{cases} 0, &\lVert y \rVert^* \leq 1 \\ \infty, &\mathrm{otherwise}, \end{cases} $$ and therefore $\langle y, x \rangle \leq \lVert x \rVert$ whenever $\lVert y \rVert^* \leq 1$.

My questions: The author of a paper I'm reading defines something he calls a matrix dual norm by $$ \lVert W \rVert^* := \underset{\lVert u \rVert = \lVert v \rVert = 1}{\sup} u^TWv $$ for some vector norm $\lVert \cdot \rVert$ and $W\in\mathbb{R}^{d\times d}$. He then claims that $$ \langle A, B \rangle_{\mathrm{F}} \leq \lVert A \rVert\qquad (*) $$ by the Fenchel-Young inequality whenever $\lVert B \rVert^* \leq 1$, where $\langle \cdot, \cdot \rangle_{\mathrm{F}}$ is the Frobenius SP and I'm guessing (this is not clearly defined) $\lVert \cdot \rVert$ is the matrix norm induced by the aforementioned vector norm $\Vert \cdot \Vert$.

  1. How is the defined matrix norm $\lVert \cdot \rVert^*$ a dual norm? I'm only familar with $\mathbb{R}^{d\times d}$ as a Hilbert space when equipped with the Frob. SP, but the dual norm defined here does not seem to coincide with the dual norm induced by the Frobenius SP.
  2. Since the author does not use the Frobenius norm as dual norm, can we still apply Fenchel-Young to achieve the inequality (*)?

Thanks a lot!

Edit: I just realized that it might be somewhat exaggerated to use Fenchel-Young here, since for all $x \neq 0, y \in \mathbb{R}^d$ with $\lVert y \Vert^* \leq 1$ we have $$ \langle x, y \rangle = \langle x/\lVert x \rVert, y \rangle \lVert x \rVert \leq \lVert y \Vert^* \lVert x \rVert \leq \lVert x \rVert. $$ My second question therefore reduces to whether the following is true: $$ \langle A, B \rangle_F = \langle A/\lVert A \rVert, B \rangle_F \lVert A \rVert \overset{\mathrm{?}}{\leq} \lVert B \Vert^* \lVert A \rVert. $$ I'm pretty sure it is not since by choosing $\lVert \cdot \lVert = \lVert \cdot \lVert_1$ we have $$ \lVert B \Vert^* = \underset{\lVert u \rVert_1 = \lVert v \rVert_1 = 1}{\sup} u^TBv = \underset{i,j}{\max}|b_{i,j}| $$ but choosing $A$ appropriately (one entry per column +/-1, otherwise zero s.t. $\lVert A \rVert_1$ = 1) $$ \langle A, B \rangle_F = \sum_{j=1}^d \underset{i=1, ..., d}{\max} |b_{ij}| > \lVert B \Vert^* $$ Could someone confirm?

1

There are 1 best solutions below

9
On

If we define the dual norm on $\mathbb{R}^{m\times n}$ as $$ \lVert X \rVert^* = \sup\{\operatorname{Tr}(X^\top Y)~\vert~\lVert Y \rVert \leq 1\} = \sup\{\lvert\operatorname{Tr}(X^\top Y)\rvert~\vert~\lVert Y \rVert \leq 1\}, $$ thus $$ \langle X,Y\rangle_F = \langle X,Y/\lVert Y\rVert\rangle_F\lVert Y\rVert\leq \lVert X \rVert^*\lVert Y\rVert $$

Furthermore, we have an induced dual matrix norm $\lVert A \rVert^*$ equivalently defined as $$ \lVert A \rVert^* = \sup_{\lVert x\rVert^* = 1,\lVert y\rVert=1}x^\top Ay. $$

Proof: From the definition of dual norm on vectors: $$ \lVert z\rVert^* = \sup_{\lVert y \rVert=1}\lvert y^\top z\rvert. $$

Thus $$ \lVert A\rVert^* = \sup_{\lVert x\rVert^*=1}\lVert Ax\rVert^* = \sup_{\lVert x\rVert^*=1,\lVert y\rVert=1}\lvert y^\top Ax\rvert. $$

EDIT 1: To complement the answer a bit, we also have from Fenchel-Young on vectors $$ x^\top A^\top By\leq \lVert Ax\rVert^*\lVert By\rVert \leq \lVert A\rVert^*\lVert B\rVert \lVert x\rVert^*\lVert y\rVert. $$

The dual norm of the spectral norm $\lVert A \rVert_2$ is not itself. It is in fact the so-called nuclear norm, and a proof is available here: https://math.stackexchange.com/a/1145246/443030

$\lVert A\rVert_1$ is the dual of $\lVert A\rVert_\infty$, for matrices, and the inequality still holds.