Confusion about the solution of a convex problem

55 Views Asked by At

I require the solution of the following problem:

$$ \hat{W} = \arg \min_{W} \frac{1}{2} {\left\| A W x - y \right\|}_{2}^{2} $$

Here, $A$ is a $10 \times 4$ sized matrix; $W$ is a $4 \times 32$ sized matrix and $x$ is a $32$ dimensional vector. $y$ is a $10$ dimensional vector.

The problem is convex; so I need to have at least one solution for $\hat{W}$.

I do the following; I calculate the derivative of $W$ and set it equal to zero:

$$\begin{align*} \hat{W} = \arg \min_{W} \frac{1}{2} {\left\| A W x - y \right\|}_{2}^{2} & \Leftrightarrow \frac{\partial \frac{1}{2} {\left\| A \hat{W} x - y \right\|}_{2}^{2} }{\partial \hat{W}} = 0 \\ & \Leftrightarrow {A}^{T} \left( A \hat{W} x - y \right) {x}^{T} = 0 \\ & \Leftrightarrow {A}^{T} A \hat{W} x {x}^{T} = {A}^{T} y {x}^{T} \\ \end{align*}$$

At this stage; in order to solve for $\hat{W}$; I use the following equation, involving Kronecker Produts and vectorization; from $vec(AXB)=(B^T\otimes A)vec(X)$:

$$vec({A}^{T} A \hat{W} x {x}^{T}) = (xx^T \otimes A^TA) vec(\hat{W}) = vec(A^Tyx^T)$$.

Now, it becomes a linear system which can be solved for $\hat{W}$.

But the thing is; $xx^T$ is a one rank matrix. And it is $\text{rank} \left(A \otimes B \right) = \text{rank}A\cdot \text{rank}B$, which means $(xx^T \otimes A^TA)$ has at most a rank of $rank (A^T A)=4$. Since in the linear system of $Ax=b$, $A$ is rank-deficient; $x$ can only have a solution either which meets the least squares criterion or it can have infinitely many exact solutions. If, let's say; we don't have an exact solution for $vec(\hat{W})$ then this means we don't have a solution which sets the derivative $\frac{\partial \frac{1}{2} {\left\| A \hat{W} x - y \right\|}_{2}^{2} }{\partial \hat{W}}$ to zero, which must not be possible, since the original problem is convex (it has to have a global minimum). I need some directions here: I am sure that I am mixing up something in this thought process but I can't point it out.

1

There are 1 best solutions below

1
On BEST ANSWER

You do not have a generic linear system, but a linear system where the right hand side depends on $A$ and $x$. The system always has at least one solution, as you concluded from convexity.

The kronecker product just creates copies of $A^TA$: $$\begin{pmatrix} x_1 x_1 A^TA & \cdots & x_n x_1 A^TA \\ x_1 x_2 A^TA & \ddots \\ \vdots \\ x_1 x_n A^TA & & x_n x_n A^TA \end{pmatrix}\text{vec}(\hat{W}) = \begin{pmatrix}x_1A^Ty \\ \vdots \\ x_n A^T y\end{pmatrix}.$$ This is similar to a system $A^TAx = A^Tb$ for regular least squares, which always has a solution. The $x_i$ do not ruin the party: if all $x_i$ are $0$, any $\hat{W}$ solves this system. Otherwise, there is at least one $i$ for which $x_i \neq 0$, and the system above can be solved using only the column $i$ of $\hat{W}$ (setting the other columns to $0$).