How to maximize the norm?

492 Views Asked by At

$$\begin{array}{ll} \text{maximize} & \|\pmb A \pmb x + \pmb h\|_2^2\\ \text{subject to} & |x_i| = 1\end{array}$$

where $x_i$ is the $i$-th element of $\pmb x$, $\pmb A \in \Bbb C^{N \times K}$ is full column rank, $\pmb x \in \Bbb C^{K \times 1}$,$\pmb h \in \Bbb C^{N \times 1}$. I tried directly make derivatives of this object function, yet it won't satisify the constraint. It must be wrong. Any comments would help! Thanks!

1

There are 1 best solutions below

5
On BEST ANSWER

We turn the problem into a problem of real variables and a real output to determine the optimal value. For convenience, we will split the argument of ${\bf x}$ into its real and imaginary components as ${\bf x}={\bf w}+i{\bf y}$. We now seek to maximize the real valued objective function

$$f({\bf x})\equiv f({\bf w},{\bf y})=||{\bf A}({\bf w}+i{\bf y})+{\bf h}||_2^2$$

The constraint functions can be viewed as

$$g_i({\bf x})\equiv g_i({\bf w},{\bf y})=w_i^2+y_i^2-1=0$$

and there is a constraint function for each $i=1,...,K$. We can set this up as a Lagrange multiplier problem by defining the auxiliary function

$$L({\bf x},\lambda)\equiv L({\bf w},{\bf y},\pmb{\lambda})=f({\bf w},{\bf y})-\sum_{i=1}^K\lambda_ig_i({\bf w},{\bf y})$$

$$=||{\bf A}({\bf w}+i{\bf y})+{\bf h}||_2^2-\sum_{i=1}^K\lambda_i(w_i^2+y_i^2-1)$$

$$=({\bf w}+i{\bf y})^*{\bf A}^*{\bf A}({\bf w}+i{\bf y})+{\bf h}^*{\bf A}({\bf w}+i{\bf y})+({\bf w}+i{\bf y})^*{\bf A}^*{\bf h}+{\bf h}^*{\bf h}-\sum_{i=1}^K\lambda_i(w_i^2+y_i^2-1)$$

$$={\bf w}^*{\bf A}^*{\bf Aw}+i{\bf w}^*{\bf A}^*{\bf A}{\bf y}-i{\bf y}^*{\bf A}^*{\bf A}{\bf w}+{\bf y}^*{\bf A}^*{\bf Ay}+{\bf h}^*{\bf Aw}+i{\bf h^*}{\bf Ay}$$

$$+{\bf w^*}{\bf A^*}{\bf h}-i{\bf y}^*{\bf A}^*{\bf h}+|{\bf h}|^2-\sum_{i=1}^K\lambda_i(w_i^2+y_i^2-1)$$

This is a real valued auxiliary function of three real vectors ${\bf w}$, ${\bf y}$, and $\pmb{\lambda}$. The gradient of this function is

$$\nabla_{\bf w}L=\left({\bf A^*}{\bf A}+({\bf A^*}{\bf A})^T\right){\bf w}+i{\bf A}^*{\bf Ay}-i\left({\bf y}^*{\bf A}^*{\bf A}\right)^T+\left({\bf h}^*{\bf A}\right)^T+{\bf A}^*{\bf h}-2\pmb{\Lambda}{\bf w}=0$$

$$\nabla_{\bf y}L=i\left({\bf w}^*{\bf A^*}{\bf A}\right)^T-i{\bf A^*}{\bf Aw}+\left({\bf A}^*{\bf A}+({\bf A^*}{\bf A})^T\right){\bf y}+i\left({\bf h}^*{\bf A}\right)^T-i{\bf A}^*{\bf h}-2\pmb{\Lambda}{\bf y}=0$$

Where we have defined $\pmb{\Lambda}=diag(\pmb{\lambda})$. We can also use the fact that $\bf w$ and $\bf y$ are both real to simplify these expressions as

$$\nabla_{\bf w}L=2Re\left({\bf A}^*{\bf A}\right){\bf w}-2Im\left({\bf A^*}{\bf A}\right){\bf y}+2Re({\bf A}^*{\bf h})-2\pmb{\Lambda}{\bf w}=0$$

$$\nabla_{\bf y}L=2Im\left({\bf A^*}{\bf A}\right){\bf w}+2Re\left({\bf A}^*{\bf A}\right){\bf y}+2Im({\bf A}^*{\bf h})-2\pmb{\Lambda}{\bf y}=0$$

Lastly, dividing by $2$ gives

$$\begin{split} Re\left({\bf A}^*{\bf A}\right){\bf w}-Im\left({\bf A^*}{\bf A}\right){\bf y}+Re({\bf A}^*{\bf h})-\pmb{\Lambda}{\bf w}=0 \\ Im\left({\bf A^*}{\bf A}\right){\bf w}+Re\left({\bf A}^*{\bf A}\right){\bf y}+Im({\bf A}^*{\bf h})-\pmb{\Lambda}{\bf y}=0 \end{split}$$

From here, we can list the individual rows of these equations to further solve for each $\lambda_i$. Row $i$ of the first equation yields

$$\left[Re\left({\bf A}^*{\bf A}\right){\bf w}-Im\left({\bf A^*}{\bf A}\right){\bf y}+Re({\bf A}^*{\bf h})\right]_i=\lambda_iw_i$$

and row $i$ of the second equation is

$$\left[Im\left({\bf A^*}{\bf A}\right){\bf w}+Re\left({\bf A}^*{\bf A}\right){\bf y}+Im({\bf A}^*{\bf h})\right]_i=\lambda_iy_i$$

If we multiply the first equation by $w_i$ and the second by $y_i$ and add them together, we have created a condition for each $\lambda_i$ as

$$\lambda_i=\left[Re\left({\bf A}^*{\bf A}\right){\bf w}-Im\left({\bf A^*}{\bf A}\right){\bf y}+Re({\bf A}^*{\bf h})\right]_iw_i+\left[Im\left({\bf A^*}{\bf A}\right){\bf w}+Re\left({\bf A}^*{\bf A}\right){\bf y}+Im({\bf A}^*{\bf h})\right]_iy_i$$

Now, we have a system of $2K$ equations without the $\lambda_i$'s of

$$\boxed{\begin{split} \left[Re\left({\bf A}^*{\bf A}\right){\bf w}-Im\left({\bf A^*}{\bf A}\right){\bf y}+Re({\bf A}^*{\bf h})\right]_i=\left[Re\left({\bf A}^*{\bf A}\right){\bf w}-Im\left({\bf A^*}{\bf A}\right){\bf y}+Re({\bf A}^*{\bf h})\right]_iw_i^2+\left[Im\left({\bf A^*}{\bf A}\right){\bf w}+Re\left({\bf A}^*{\bf A}\right){\bf y}+Im({\bf A}^*{\bf h})\right]_iy_iw_i \\ \left[Im\left({\bf A^*}{\bf A}\right){\bf w}+Re\left({\bf A}^*{\bf A}\right){\bf y}+Im({\bf A}^*{\bf h})\right]_i=\left[Re\left({\bf A}^*{\bf A}\right){\bf w}-Im\left({\bf A^*}{\bf A}\right){\bf y}+Re({\bf A}^*{\bf h})\right]_iw_iy_i+\left[Im\left({\bf A^*}{\bf A}\right){\bf w}+Re\left({\bf A}^*{\bf A}\right){\bf y}+Im({\bf A}^*{\bf h})\right]_iy_i^2 \end{split}}$$

We can solve these equations for $\bf w$ and $\bf y$ to recover $\bf x$.