Maximize expectation of concave function with respect to unitary matrix


Let $\mathbf{x} \sim \mathcal{N}(\mathbf{m},\mathbf{C})$ and let $\mathbf{D}$ be a diagonal matrix with positive entries and of the same dimension as $\mathbf{C}$. Let $f(z)$ be a strictly increasing and concave function in $z$.

Considering all possible unitary matrices $\mathbf{U}$ (i.e., $\mathbf{U} \mathbf{U}^{\mathrm{T}}=\mathbf{I}$; since everything here is real, these are the orthogonal matrices), I suspect the following holds:

\begin{align} \mathrm{argmax}_{\mathbf{U}} \mathbb{E}_{\mathbf{x}} \big[ f \big( \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big) \big] = \mathrm{argmax}_{\mathbf{U}} f \big( \mathbb{E}_{\mathbf{x}} \big[ \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big] \big). \end{align}

However, I don't know how to prove it formally.

  • My intuition is that, by optimizing over unitary matrices $\mathbf{U}$, we are just playing with the "directions" (note that $\mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}}$ can be seen as the eigendecomposition of a symmetric and positive definite matrix), and I suspect that the directions that maximize $\mathbb{E}_{\mathbf{x}} \big[ f \big( \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big) \big]$ should be the ones maximizing $\mathbb{E}_{\mathbf{x}} \big[ \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big]$. It would be great if someone could confirm this.

  • Note: to make things easier, we can consider $f(z) = \log(z)$ (which is strictly increasing and concave).
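One quick way to probe this numerically (a sketch only: the dimension, the covariance, and the 20 random orthogonal candidates below are arbitrary choices, and Monte Carlo over a finite set of $\mathbf{U}$'s proves nothing) is to estimate both objectives for $f = \log$ and compare which $\mathbf{U}$ wins under each:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
m = rng.normal(size=n)
A = rng.normal(size=(n, n))
C = A @ A.T + n * np.eye(n)        # a positive definite covariance
D = np.diag([1.0, 2.0, 5.0])       # diagonal with positive entries

x = rng.multivariate_normal(m, C, size=100_000)

lhs, rhs = [], []                  # estimates of E[log z]  vs  log E[z]
for _ in range(20):
    U, _ = np.linalg.qr(rng.normal(size=(n, n)))     # a random orthogonal matrix
    z = np.einsum('ij,jk,ik->i', x, U @ D @ U.T, x)  # z > 0 a.s. since U D U^T is PD
    lhs.append(np.log(z).mean())   # Monte Carlo estimate of E[f(z)] with f = log
    rhs.append(np.log(z.mean()))   # f applied to the Monte Carlo mean
print(int(np.argmax(lhs)), int(np.argmax(rhs)))      # indices of the two winners
```

If the two printed indices agree across many runs and problem instances, that is weak evidence for the conjecture; a single disagreement, once Monte Carlo noise is ruled out, would refute it.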


There are 2 best solutions below


I derived the following "partial" result, though still far from answering your question.

If $\mathbf{x} \sim \mathcal{N}(\mathbf{m},\mathbf{C})$ and $\mathbf{D}$ is a diagonal matrix, then given an orthogonal matrix $\mathbf{U}$, we have

$\mathbb{E}_{\mathbf{x}} \big[ \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big]=\sum\limits_{j} d_j \big( \mathbf{u}_j^{\mathrm{T}} \mathbf{C}\, \mathbf{u}_j + (\mathbf{u}_j^{\mathrm{T}} \mathbf{m})^2 \big) = \mathrm{tr}\big( \mathbf{D}\, \mathbf{U}^{\mathrm{T}} (\mathbf{C} + \mathbf{m}\mathbf{m}^{\mathrm{T}}) \mathbf{U} \big)$

where $\mathbf{u}_j$ is the $j$th column of $\mathbf{U}$,

and $d_{j}$ is the $j$th diagonal element of $\mathbf{D}$.

Proof:

Note that $\mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}}$ is in general not diagonal, so we cannot simply read off its diagonal entries; instead, we diagonalize by a change of variables.

Let $\mathbf{y} = \mathbf{U}^{\mathrm{T}} \mathbf{x}$. Since $\mathbf{U}$ is orthogonal, $\mathbf{y} \sim \mathcal{N}(\mathbf{U}^{\mathrm{T}} \mathbf{m},\, \mathbf{U}^{\mathrm{T}} \mathbf{C} \mathbf{U})$.

So $\mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} = \mathbf{y}^{\mathrm{T}} \mathbf{D} \mathbf{y} = \sum_{j} d_j\, y_j^2$.

Since $\mathbb{E} \big[ y_j^2 \big] = \mathrm{Var}(y_j) + \mathbb{E}[y_j]^2 = \mathbf{u}_j^{\mathrm{T}} \mathbf{C}\, \mathbf{u}_j + (\mathbf{u}_j^{\mathrm{T}} \mathbf{m})^2$,

$\mathbb{E}_{\mathbf{x}} \big[ \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big]=\sum_{j} d_j\, \mathbb{E} \big[ y_j^2 \big] = \sum\limits_{j} d_j \big( \mathbf{u}_j^{\mathrm{T}} \mathbf{C}\, \mathbf{u}_j + (\mathbf{u}_j^{\mathrm{T}} \mathbf{m})^2 \big)$

From here we can find an optimal solution $\mathbf{U}^\star=\mathrm{argmax}_{\mathbf{U}} \mathbb{E}_{\mathbf{x}} \big[ \mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x} \big]$: writing the objective as $\mathrm{tr}\big( \mathbf{D}\, \mathbf{U}^{\mathrm{T}} (\mathbf{C} + \mathbf{m}\mathbf{m}^{\mathrm{T}}) \mathbf{U} \big)$, the von Neumann trace inequality shows the maximum is attained by taking the columns of $\mathbf{U}$ to be eigenvectors of $\mathbf{C} + \mathbf{m}\mathbf{m}^{\mathrm{T}}$, ordered so that the largest $d_j$ is paired with the largest eigenvalue.
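As a numerical sanity check (a sketch; the dimension, covariance, and random-matrix choices below are arbitrary), the script verifies the standard Gaussian quadratic-form identity $\mathbb{E}[\mathbf{x}^{\mathrm{T}}\mathbf{A}\mathbf{x}] = \mathrm{tr}(\mathbf{A}\mathbf{C}) + \mathbf{m}^{\mathrm{T}}\mathbf{A}\mathbf{m}$ with $\mathbf{A}=\mathbf{U}\mathbf{D}\mathbf{U}^{\mathrm{T}}$ by Monte Carlo, and checks that taking the columns of $\mathbf{U}$ to be eigenvectors of $\mathbf{C}+\mathbf{m}\mathbf{m}^{\mathrm{T}}$ (largest $d_j$ paired with largest eigenvalue) scores at least as high as random orthogonal matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
m = rng.normal(size=n)
A0 = rng.normal(size=(n, n))
C = A0 @ A0.T + n * np.eye(n)                 # a positive definite covariance
d = np.sort(rng.uniform(1.0, 5.0, size=n))    # positive diagonal of D, ascending
D = np.diag(d)

def objective(U):
    """Closed form of E[x^T U D U^T x] = tr(D U^T (C + m m^T) U)."""
    M = C + np.outer(m, m)
    return np.trace(D @ U.T @ M @ U)

# 1) Monte Carlo check of the closed form for one random orthogonal U
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
x = rng.multivariate_normal(m, C, size=200_000)
mc = np.einsum('ij,jk,ik->i', x, U @ D @ U.T, x).mean()
print(objective(U), mc)                       # the two numbers should be close

# 2) Candidate maximizer: eigenvectors of C + m m^T, ordered so that
#    the largest d_j is paired with the largest eigenvalue
w, V = np.linalg.eigh(C + np.outer(m, m))     # eigh returns ascending eigenvalues
best = objective(V)                           # d ascending matches w ascending
randoms = [objective(np.linalg.qr(rng.normal(size=(n, n)))[0]) for _ in range(50)]
print(best >= max(randoms))
```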


Let $z=\mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x}$, and denote the probability density function (PDF) of $z$ as $P_{U}(z)$.

$P_U(z)$ should be a function of $\mathbf{U,m,C,D}$ (here we only consider $\mathbf{U}$ as a variable).

With this random variable transformation $z=\mathbf{x}^{\mathrm{T}} \mathbf{U} \mathbf{D} \mathbf{U}^{\mathrm{T}} \mathbf{x}$, your question can be restated as follows:

\begin{equation} \mathrm{argmax}_{\mathbf{U}}\mathbb{E}_{z\sim P_U(z)}\big[f(z)\big]=\mathrm{argmax}_{\mathbf{U}}f\big(\mathbb{E}_{z\sim P_U(z)}\big[z\big]\big) \end{equation}

Since $f$ is assumed monotonically increasing, maximizing $f\big(\mathbb{E}[z]\big)$ is the same as maximizing $\mathbb{E}[z]$, so we just need to show: \begin{equation} \mathrm{argmax}_{\mathbf{U}}\mathbb{E}_{z\sim P_U(z)}\big[f(z)\big]=\mathrm{argmax}_{\mathbf{U}}\mathbb{E}_{z\sim P_U(z)}\big[z\big] \end{equation}

In general, this is not necessarily true. As we can see: \begin{equation} \mathbb{E}_{z\sim P_U(z)}\big[z\big]=\int_{z}z\cdot P_U(z)dz \end{equation}

\begin{equation} \mathbb{E}_{z\sim P_U(z)}\big[f(z)\big]=\int_{z}f(z)\cdot P_U(z)dz \end{equation}

Both $\mathbb{E}_{z\sim P_U(z)}\big[z\big]$ and $\mathbb{E}_{z\sim P_U(z)}\big[f(z)\big]$ are functions of $\mathbf{U}$. If their derivatives exist, they must vanish at any local maximum (ignoring, for this informal argument, the orthogonality constraint on $\mathbf{U}$). That is to say: \begin{equation}\label{eq:cond1} \int_{z}z\cdot \frac{\partial P_U(z)}{\partial \mathbf{U}}dz=0 \end{equation} \begin{equation}\label{eq:cond2} \int_{z}f(z)\cdot \frac{\partial P_U(z)}{\partial \mathbf{U}}dz=0 \end{equation}

We may be able to find some $\mathbf{U}$ that satisfies both equations above. But the solution sets for the above two equations may probably differ unless there are some unique characteristics of $P_U(z)$ and $f$.
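To see concretely that an increasing concave $f$ can reverse an ordering of means, here is a toy illustration outside the quadratic-form setting, with two hypothetical distributions standing in for $P_U$ at two different choices of $\mathbf{U}$ (the specific values are arbitrary):

```python
import math

# Two candidate distributions for z:
# z1 takes 0.01 or 2.1 with probability 1/2 each  ->  E[z1] = 1.055
# z2 is constant 1.0                              ->  E[z2] = 1.0
E_z1 = 0.5 * 0.01 + 0.5 * 2.1
E_z2 = 1.0

# With f = log (strictly increasing and concave):
E_f_z1 = 0.5 * math.log(0.01) + 0.5 * math.log(2.1)   # ~ -1.93
E_f_z2 = math.log(1.0)                                # = 0.0

print(E_z1 > E_z2)      # True: maximizing E[z] selects z1
print(E_f_z1 > E_f_z2)  # False: maximizing E[log z] selects z2 instead
```

So the two argmaxes differ here because the concave $f$ penalizes the spread of $z_1$; whether the family $\{P_U\}$ arising from Gaussian quadratic forms permits such a reversal is exactly the open part of the question.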

By the way, the above two equations also lead to: $ \int_{z}\big(f(z)-z\big)\cdot \frac{\partial P_U(z)}{\partial \mathbf{U}}dz=0 $

$P_U(z)$, $f$, and the optimal $\mathbf{U}^{\star}$ must satisfy this condition, too.

If there is no error in the above argument, then I would reject the original hypothesis.

Your hypothesis might be true if $P_U(z)$ has some very special structure (which I was not able to derive). Otherwise, only certain $f$ (NOT the entire class of increasing concave functions) may satisfy your hypothesis.