Least squares problem where the matrix is the unknown


Consider a function of the form

$$ f(x) = \frac{1}{2}\left\lVert Ax - y \right\rVert_2^2 $$

With $A \in \mathbb{R}^{n \times m}$, $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$, we can find the minimizer by differentiating with respect to $x$:

$$ \nabla_x f = A^TAx - A^Ty $$

Setting this to $0$ leads to the linear system (the normal equations)

$$ A^TAx = A^Ty. $$
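As a sanity check on the system above, here is a minimal sketch in plain Python that solves $A^TAx = A^Ty$ for a small example and confirms that the gradient vanishes at the solution. The data, the helper names and the restriction to $m = 2$ (so that Cramer's rule suffices) are assumptions made only for illustration.

```python
# Sketch: solve the normal equations A^T A x = A^T y for a matrix with
# two columns and check that the gradient A^T (A x - y) is zero there.
# All names and data below are illustrative assumptions.

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def gradient(A, x, y):
    """grad_x f = A^T (A x - y) for f(x) = 0.5 * ||A x - y||_2^2."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return matvec(transpose(A), residual)

def solve_normal_equations_2(A, y):
    """Solve A^T A x = A^T y when A has exactly two columns (Cramer's rule)."""
    At = transpose(A)
    g11 = sum(v * v for v in At[0])
    g12 = sum(u * v for u, v in zip(At[0], At[1]))
    g22 = sum(v * v for v in At[1])
    b1, b2 = matvec(At, y)
    det = g11 * g22 - g12 * g12  # zero when the columns are linearly dependent
    if det == 0:
        raise ValueError("columns of A are linearly dependent")
    return [(b1 * g22 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det]

A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0, 4.0]

x_star = solve_normal_equations_2(A, y)
grad_at_min = gradient(A, x_star, y)
print(x_star, grad_at_min)
```

For larger $m$ one would normally call a linear algebra library instead of Cramer's rule; the point here is only that the stationarity condition and the linear system agree.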

Suppose now that $A$ is the unknown instead of $x$. In this case we have

$$ f(A) = \frac{1}{2} \left\lVert Ax - y \right\rVert_2^2 $$

To calculate the derivative with respect to $A$, I look for a linear functional $T(A)$ on $\mathbb{R}^{n \times m}$ such that the following limit is $0$ (using the 2-norm as the norm for $A$):

$$ \lim_{E \to 0} \frac{\left| f(A + E) - f(A) - T(A)E \right|}{\left\lVert E \right\rVert_2} = 0 $$

Expanding $f(A + E) = \frac{1}{2}\left\lVert (Ax - y) + Ex \right\rVert_2^2$, we can show that $$ f(A + E) - f(A) = \left(x^TA^T - y^T\right)Ex + \frac{1}{2}\left\lVert Ex \right\rVert_2^2 $$

Substituting this into the limit and using the squeeze theorem I get

$$
\begin{aligned}
0 &\leq \lim_{E \to 0} \frac{\left| \left(x^TA^T - y^T\right)Ex + \frac{1}{2}\left\lVert Ex \right\rVert_2^2 - T(A)E \right|}{\left\lVert E \right\rVert_2} \\
&\leq \lim_{E \to 0} \frac{\left| \left(x^TA^T - y^T\right)Ex - T(A)E \right| + \frac{1}{2}\left\lVert Ex \right\rVert_2^2 }{\left\lVert E \right\rVert_2} \\
&\leq \lim_{E \to 0} \frac{\left\lVert \left(x^TA^T - y^T\right)(\cdot)x - T(A) \right\rVert_{(\mathbb{R}^{n \times m})^*} \left\lVert E \right\rVert_2 + \frac{1}{2}\left\lVert Ex \right\rVert_2^2 }{\left\lVert E \right\rVert_2} \\
&= \lim_{E \to 0} \left\lVert \left(x^TA^T - y^T\right)(\cdot)x - T(A) \right\rVert_{(\mathbb{R}^{n \times m})^*}
\end{aligned}
$$

where the last equality uses $\left\lVert Ex \right\rVert_2^2 \leq \left\lVert E \right\rVert_2^2 \left\lVert x \right\rVert_2^2$, so the quadratic term divided by $\left\lVert E \right\rVert_2$ tends to $0$.

The last limit is equal to 0 iff $$ T(A) = \left(x^TA^T - y^T\right)(\cdot)x $$
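One way to test the candidate differential numerically is to compare $f(A + tE) - f(A)$ with $t\,T(A)E = t\left(x^TA^T - y^T\right)Ex$ for shrinking $t$. The sketch below does this in plain Python; the matrices, vectors and helper names are arbitrary choices made for illustration. If $T(A)$ is the differential, the mismatch should shrink faster than $t$.

```python
# Sketch: finite-difference check of T(A)E = (A x - y)^T E x.
# Data and helper names are illustrative assumptions.

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def f(A, x, y):
    """f(A) = 0.5 * ||A x - y||_2^2."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(v * v for v in residual)

def differential(A, x, y, E):
    """Candidate differential applied to a direction E: (A x - y)^T (E x)."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return sum(p * q for p, q in zip(residual, matvec(E, x)))

def add_scaled(A, E, t):
    """Return A + t * E."""
    return [[a + t * e for a, e in zip(ra, re)] for ra, re in zip(A, E)]

A = [[1.0, 2.0, 0.5], [-1.0, 0.0, 3.0]]
x = [0.5, -1.0, 2.0]
y = [1.0, -2.0]
E = [[0.3, -0.2, 0.1], [0.4, 0.5, -0.6]]

errors = []
for t in (1e-1, 1e-2, 1e-3):
    increment = f(add_scaled(A, E, t), x, y) - f(A, x, y)
    errors.append(abs(increment - t * differential(A, x, y, E)))

# Each tenfold reduction of t should reduce the error roughly a hundredfold.
print(errors)
```

The error behaving like $t^2$ is what one expects when the only term left over after subtracting $T(A)E$ is quadratic in $E$.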

Question 1: Is my calculation of the differential correct?

Assuming it is, I tried to characterize $T(A)$ on the standard basis $E_{ij} = e_i e_j^T$ of $\mathbb{R}^{n \times m}$ (a $1$ in entry $(i,j)$ and zeros elsewhere). Doing this I get

$$
\begin{aligned}
T(A)E_{ij} &= \left(x^TA^T - y^T\right)E_{ij}x \\
&= \left(x^TA^T - y^T\right)x_j e_i \\
&= x_j \left(x^TA^T e_i - y^T e_i \right) \\
&= x_j \left(x^TA^T e_i - y_i \right) \\
&= x_j \left((Ax)_i - y_i \right)
\end{aligned}
$$
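The components above can be checked numerically. Since $x^TA^Te_i = (Ax)_i$, the value on $E_{ij}$ is $x_j\left((Ax)_i - y_i\right)$, and collecting these in an $n \times m$ array gives the outer product $(Ax - y)x^T$. The sketch below, in plain Python with made-up data and hypothetical helper names, compares three things: the functional evaluated on each basis matrix, the outer product, and a central finite difference of $f$ in each entry of $A$.

```python
# Sketch: components of T(A) on the basis matrices E_ij versus the outer
# product (A x - y) x^T and versus finite differences of f.
# Data and helper names are illustrative assumptions.

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def f(A, x, y):
    """f(A) = 0.5 * ||A x - y||_2^2."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(v * v for v in residual)

def differential(A, x, y, E):
    """T(A)E = (A x - y)^T (E x)."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return sum(p * q for p, q in zip(residual, matvec(E, x)))

def basis_matrix(n, m, i, j):
    """E_ij: an n-by-m matrix with a 1 in entry (i, j) and zeros elsewhere."""
    E = [[0.0] * m for _ in range(n)]
    E[i][j] = 1.0
    return E

def central_difference(A, x, y, i, j, h=1e-4):
    """Approximate the partial derivative of f with respect to A[i][j]."""
    plus = [row[:] for row in A]
    minus = [row[:] for row in A]
    plus[i][j] += h
    minus[i][j] -= h
    return (f(plus, x, y) - f(minus, x, y)) / (2 * h)

A = [[1.0, 2.0, 0.5], [-1.0, 0.0, 3.0]]
x = [0.5, -1.0, 2.0]
y = [1.0, -2.0]
n, m = len(A), len(x)

residual = [p - q for p, q in zip(matvec(A, x), y)]
components = [[differential(A, x, y, basis_matrix(n, m, i, j))
               for j in range(m)] for i in range(n)]
outer = [[residual[i] * x[j] for j in range(m)] for i in range(n)]
numeric = [[central_difference(A, x, y, i, j)
            for j in range(m)] for i in range(n)]

print(components)
print(outer)
print(numeric)
```

Agreement of the three arrays on one example is evidence for the formula, not a proof of it.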

Question 2: Is this correct? Does setting these components to zero for all $i, j$ give a set of equations that I can solve for $A$?