When is a matrix-valued function convex?


How is convexity defined for a matrix-valued function $f:\mathbb{R}^{m\times n} \to\mathbb{R}$?

  • If $f:X\subset\mathbb{R}\to\mathbb{R}$ then it is convex iff $f(tx_1 + (1-t)x_2) \leq tf(x_1) + (1-t)f(x_2)$ for any $x_1, x_2\in X$ and $t\in(0, 1)$.
  • If $f:X\subset \mathbb{R}^m\to \mathbb{R}$ then it is convex iff $\Gamma = \{(x, r)\in X\times\mathbb{R}\,:\, r \geq f(x)\}$ is a convex set. That is, if for any $\gamma_1, \gamma_2\in\Gamma$ we have $t\gamma_1 + (1-t)\gamma_2\in\Gamma$ for any $t\in(0, 1)$.
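The scalar inequality in the first bullet can be spot-checked numerically. The sketch below (illustrative helper names, not from the original post) samples point pairs and convex weights $t\in(0,1)$ and tests $f(tx_1+(1-t)x_2)\le tf(x_1)+(1-t)f(x_2)$:

```python
import random

def is_convex_on_samples(f, xs, trials=1000, tol=1e-12):
    """Sample pairs (x1, x2) from xs and t in (0, 1); report False on any
    violation of f(t*x1 + (1-t)*x2) <= t*f(x1) + (1-t)*f(x2)."""
    random.seed(0)  # fixed seed so the check is reproducible
    for _ in range(trials):
        x1, x2 = random.choice(xs), random.choice(xs)
        t = random.random()
        if f(t * x1 + (1 - t) * x2) > t * f(x1) + (1 - t) * f(x2) + tol:
            return False
    return True

xs = [i / 10 for i in range(-50, 51)]
print(is_convex_on_samples(lambda x: x * x, xs))     # x^2 is convex
print(is_convex_on_samples(lambda x: -(x * x), xs))  # -x^2 is not
```

This is only a sampling heuristic, of course; it can refute convexity but never prove it.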

How is it defined for a matrix-valued function?

Best answer:

Well, this is not a one-answer question. Here is one possibility. Define $\preccurlyeq$ to be the relation on $\mathbb R^{n}$ given by $x\preccurlyeq y$ if and only if $x_i \leq y_i$ for all $i\in\{1,\dots,n\}$ (more generally, you could take $\preccurlyeq$ to be any partial vector ordering on any vector space). Let $f:X\to \mathbb R^{n}$ (or again, into any partially ordered vector space). Some authors (for instance A. Löhne, *Vector Optimization with Infimum and Supremum*) say that $f$ is convex if and only if $\Gamma=\{ (x,y) : f(x)\preccurlyeq y \}$ is a convex set.

You can show that this is equivalent to the following definition: $f$ is convex if and only if for all $x,y\in X$ and $\lambda\in[0,1]$, $f(\lambda x+(1-\lambda)y) \preccurlyeq \lambda f(x) +(1-\lambda) f(y)$. This works only when $\preccurlyeq$ is a vector ordering.
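This componentwise inequality is easy to spot-check numerically. The sketch below (hypothetical helper names) implements the componentwise ordering $\preccurlyeq$ and samples the inequality $f(tx+(1-t)y)\preccurlyeq tf(x)+(1-t)f(y)$ for a map into $\mathbb R^2$:

```python
import math
import random

def preceq(u, v):
    """The componentwise partial order: u <= v in every coordinate."""
    return all(ui <= vi for ui, vi in zip(u, v))

def vector_convex_on_samples(f, xs, trials=500, tol=1e-12):
    """Sample x, y from xs and t in (0, 1); report False if the convexity
    inequality fails in some coordinate."""
    random.seed(0)
    for _ in range(trials):
        x, y = random.choice(xs), random.choice(xs)
        t = random.random()
        lhs = f(t * x + (1 - t) * y)
        rhs = [t * a + (1 - t) * b for a, b in zip(f(x), f(y))]
        if not preceq(lhs, [r + tol for r in rhs]):
            return False
    return True

# f(x) = (x^2, e^x): both components are convex, so f is preceq-convex.
f = lambda x: (x * x, math.exp(x))
print(vector_convex_on_samples(f, [i / 10 for i in range(-30, 31)]))
```

Consistent with the componentwise criterion discussed below, replacing either component with a non-convex function makes the check fail.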

Here is the proof. First suppose $\Gamma$ is convex. Since $(x_1,f(x_1))$ and $(x_2,f(x_2))$ are both in $\Gamma$, for any $\lambda\in[0,1]$ the point $\lambda(x_1,f(x_1))+(1-\lambda)(x_2,f(x_2))$ is in $\Gamma$, and so $f(\lambda x_1+(1-\lambda)x_2)\preccurlyeq \lambda f(x_1)+(1-\lambda)f(x_2)$. Conversely, suppose the second property holds, and let $(x_1,y_1)$ and $(x_2,y_2)$ both be in $\Gamma$ and $\lambda\in[0,1]$. Then $f(\lambda x_1+(1-\lambda)x_2)\preccurlyeq \lambda f(x_1)+ (1-\lambda)f(x_2)\preccurlyeq \lambda y_1+(1-\lambda)y_2$ (this last step uses the axioms of a vector ordering). This says that $\lambda(x_1,y_1)+(1-\lambda)(x_2,y_2)$ is in $\Gamma$, which proves its convexity according to the first criterion.

For the vector ordering proposed above ($x\preccurlyeq y$ if $x_i\leq y_i$ for all $i$), this is equivalent to convexity of every $f_i$, where $f(x)=[f_1(x),\dots,f_n(x)]^T$. Note also that there are notions of subgradients, and almost everything we have in convex optimization carries over. We do, however, lose the notion of a minimum in general, but you can define other types of minimality, such as: $x$ attains a minimum if for all $y$ such that $f(y)\preccurlyeq f(x)$, we also have $f(x)\preccurlyeq f(y)$. You can then do things like gradient descent to reach such a minimum.
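The minimality notion just described can be sketched on a finite candidate set. The example below (illustrative names, not from the answer) uses the toy bi-objective $f(x)=(x^2,(x-1)^2)$, whose minimal points in this Pareto sense form the interval $[0,1]$:

```python
def preceq(u, v):
    """The componentwise partial order: u <= v in every coordinate."""
    return all(a <= b for a, b in zip(u, v))

def is_minimal(x, f, candidates):
    """x is minimal when every candidate y with f(y) preceq f(x) also
    satisfies f(x) preceq f(y), i.e. nothing strictly improves on x."""
    fx = f(x)
    return all(preceq(fx, f(y)) for y in candidates if preceq(f(y), fx))

# Toy bi-objective: minimizing both x^2 and (x-1)^2 over a grid.
f = lambda x: (x * x, (x - 1) ** 2)
pts = [i / 10 for i in range(-10, 21)]
print(is_minimal(0.5, f, pts))   # True: 0.5 lies in the minimal set [0, 1]
print(is_minimal(-0.5, f, pts))  # False: x = 0 is better in both objectives
```

Such minimal points are exactly the Pareto-optimal solutions one seeks in multiobjective optimization.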

This finds some nice applications in multiobjective optimization; on my side, I am very interested in such optimization when the objective space has an infinite number of dimensions (but is still partially ordered).