Convexity of matrix trace functions $A\mapsto \operatorname{Tr} (Bf(A))$

1.4k Views Asked by At

Any function $f:\mathbb{R}\rightarrow\mathbb{R}$ is can be extended to a function on self-adjoint $n\times n$ matrices by $$ f(A) = \sum_{i=1}^n f(a_i)v_iv_i^* $$ where $A = \sum_{i=1}^n a_iv_iv_i^*$ is the spectral decomposition of $A$.

If $f:\mathbb{R}\rightarrow\mathbb{R}$ is convex, then the function $A\mapsto \operatorname{Tr}(f(A))$ is also convex as a function of matrices. (See e.g. Theorem 3.27 in Introduction to Matrix Analysis and Applications, or see proof below).

My question: If $f:\mathbb{R}\rightarrow\mathbb{R}$ is convex, is the function $$\tag{1} A\mapsto \operatorname{Tr}(Bf(A)) $$ convex for any self-adjoint positive matrix $B\geq 0$?

I'm wondering if the following proofs might be adapted to show convexity of (1). Here I show that $A\mapsto\operatorname{Tr}f(A)$ is a convex function as long as $f$ is convex. I can't get a similar proof to work to show convexity of $A\mapsto\operatorname{Tr}(Bf(A))$, nor can I find a counterexample.

Lemma. Suppose $f:\mathbb{R}\rightarrow\mathbb{R}$ is convex and let $A$ be a self-adjoint $n\times n$ matrix. For any orthonormal basis $\{u_1,\dots,u_n\}$ of $\mathbb{C}^n$, it holds that $$ \operatorname{Tr}f(A)\geq \sum_{j=1}^n f(u_j^*Au_j). $$ Proof. Let $A = \sum_{i=1}^n a_iv_iv_i^*$ be the spectral decomposition of $A$. Then $\sum_{j=1}^{n}|u_j^*v_i|^2=1$ for each $j$ and

\begin{align*} \operatorname{Tr}f(A) &= \sum_{j=1}^n\sum_{i=1}^n f(a_i) |u_j^*v_i|^2\\ & \geq \sum_{j=1} f \left(\sum_{i=1}^na_i |u_j^*v_i|^2\right)= \sum_{j=1}f (u_j^*Au_j) \end{align*} where the inequality is due to the convexity of $f$. $\Box$

Theorem. Let $f:\mathbb{R}\rightarrow\mathbb{R}$ be convex. Then $A\mapsto \operatorname{Tr}f(A)$ is convex as a function of matrices.

Proof. Let $A$ and $B$ be self-adjoint $n\times n$ matrices and let $t\in(0,1)$. We will show that $$t\operatorname{Tr}f(A)+(1-t)\operatorname{Tr}f(B)\geq \operatorname{Tr}(f(tA+(1-t)B)).$$ Let $\{u_1,\dots,u_n\}$ be the eigenbasis of $tA+(1-t)B$. By the Lemma, \begin{align*} t\operatorname{Tr}f(A)+(1-t)\operatorname{Tr}f(B) & \geq t\sum_{j=1}^n f(u_j^*Au_j) + (1-t)\sum_{j=1}^nf(u_j^*Bu_j)\\ & = \sum_{j=1}^n\bigl(tf(u_j^*Au_j) + (1-t)f(u_j^*Bu_j)\bigr)\\ & \geq \sum_{j=1}^nf\bigl(tu_j^*Au_j+ (1-t)u_j^*Bu_j)\\ & = \sum_{j=1}^nf\bigl(u_j^*(tA+(1-t))Bu_j)\\ & = \operatorname{Tr}(f(tA+(1-t)B)), \end{align*} as desired, where the second inequality is due to convexity of $f$. $\Box$

2

There are 2 best solutions below

1
On BEST ANSWER

The answer is no. Since every positive semidefinite matrix is a nonnegatively weighted sum of orthogonal projections, we may assume that $B$ is an orthogonal projection. With this simplification, it is not hard to see that the inequality in question must hold if $f$ is operator convex.

So, to find a counterexample to the inequality, we should look for a function $f$ that is convex but not operator convex. Take $f(x)=x^4$. Let $n=2$ and $B=\operatorname{diag}(1,0)$, so that $\operatorname{tr}(Bf(A))$ is just the $(1,1)$-th entry of $f(A)$. We use the following Octave/Matlab script to see if the first entry of $f(A)$ is mid-point convex:

[U,S,V]=svd(rand(2,2)); % this generates two orthogonal matrices U,V
D1=diag(rand(2,1)); % random positive diagonal matrix
D2=diag(rand(2,1)); % random positive diagonal matrix
A1=U*D1*U';      % random positive definite matrix
A2=V*D2*V';      % random positive definite matrix
F1=A1^4;         % f(A1)
F2=A2^4;         % f(A2)
A3=(A1+A2)./2;   % mean of A1 and A2
F3=A3^4;         % f(A3)
% If the (1,1)-th entry of f(A) is convex, we will have d>=0:
d = (F1(1,1)+F2(1,1))/2 - F3(1,1)

The answer turns out to be "no":

d = -0.0029889

A1 =
   0.44099  -0.21308
  -0.21308   0.91124

A2 =
   0.39698  -0.27903
  -0.27903   0.50098

A3 =
   0.41898  -0.24606
  -0.24606   0.70611

F1 =
   0.14056  -0.32144
  -0.32144   0.84997

F2 =
   0.11822  -0.14139
  -0.14139   0.17092

F3 =
   0.13238  -0.22015
  -0.22015   0.38927
2
On

After reading user1551's answer, I have subsequently realized that my question can be answered in the negative as follows.

For $n\times n$ matrices, we write $A\geq 0$ if $A$ is positive semidefinite, and $A\geq B$ if $A-B\geq 0$. We say a function $f:\mathbb{R}\rightarrow\mathbb{R}$ is matrix convex if, for all integers $n$, it holds that $$ f(tA+(1-t)B)\leq tf(A)+(1-t)f(B) $$ for all $n\times n$ hermitian matrices $A,B$ and all $t\in(0,1)$, where the inequality is respect to the ordering of positive semidefinite matrices.

Theorem Let $f:\mathbb{R}\rightarrow\mathbb{R}$ be a function. The following are equivalent:

  1. $f$ is matrix convex
  2. For all integers $n$ and all $n\times n$ matrices $B\geq 0$, the function $A\mapsto \operatorname{Tr}(Bf(A))$ is convex.

Proof. If $f$ is matrix convex then the result clearly follows. So suppose that $f$ is not matrix convex. Then there exists an $n$ and $n\times n$ hermitian matrices $A$ and $B$ and $t\in(0,1)$ such that $$ f(tA+(1-t)B)\not\leq tf(A)+(1-t)f(B). $$ Thus there is a vector $v\in\mathbb{C}^n$ such that $$ v^*\bigl(f(tA+(1-t)B)\bigr)v >tv^*f(A)v+(1-t)v^*f(B)v $$ which is equivalent to $$ \operatorname{Tr}\bigl(vv^*f(tA+(1-t)B)\bigr) >t \operatorname{Tr}\bigl(vv^*f(A)\bigr) + (1-t)\operatorname{Tr}\bigl(vv^*f(B)\bigr), $$ hence the function $A\mapsto \operatorname{Tr}(vv^*f(A))$ is not convex.