Derive the least-squares equation $A^TA\hat{x} = A^Tb$ from $||A\hat{x} - b||_2^2$


I know how to derive the least-squares equation $A^TA\hat{x} = A^Tb$ geometrically, from the orthogonality between the error vector and the column space of $A$. I'm curious whether it can also be derived by minimizing $||A\hat{x} - b||_2^2$. The following steps are my derivation, but I can't get the expected result:

\begin{align} ||A\hat{x} - b||_2^2 &= (A\hat{x} - b)^T(A\hat{x} - b) \\ &= \hat{x}^TA^TA\hat{x} - 2\hat{x}^TA^Tb + b^Tb \end{align}

Taking the first derivative of the above result with respect to $\hat{x}$ and setting it equal to 0 gives:

$$A^TA\hat{x} = 2 A^Tb$$

The above result is obviously wrong, but I can't figure out what's wrong with my derivation. Am I missing some concept from vector calculus?


Update:

The first term of my expansion contains $\hat{x}$ twice, and I should have taken that into account. Taking the first derivative of the above result with respect to $\hat{x}$ and setting it equal to 0 gives:

$$2 A^TA\hat{x} = 2 A^Tb$$

$$A^TA\hat{x} = A^Tb$$


There is 1 best solution below


If we were working with numbers instead of matrices, it would be obvious that $d(xA^2x)/dx=2A^2x$. That already shows where you miscalculated: the first term picks up a factor of 2. To wit, with summation over repeated indices: $$\frac{\partial}{\partial x_i}(x_jA^T_{jk}A_{kl}x_l)=A_{kj}A_{kl}(\delta_{ij}x_l+x_j\delta_{il})=A_{ki}A_{kl}x_l+A_{kj}A_{ki}x_j=2(A^TAx)_i.$$
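
As a sanity check on the factor of 2, the gradient $2A^TA\hat{x} - 2A^Tb$ can be compared against a central finite difference of $||A\hat{x} - b||_2^2$. The following is only a minimal sketch in plain Python: the matrix `A`, the vectors `b` and `x`, the step size `h`, and all helper function names are arbitrary choices made for illustration, not anything taken from the question.

```python
# Finite-difference check of the gradient of f(x) = ||A x - b||_2^2.
# A, b, x and the step h are arbitrary illustrative values (assumptions).

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def objective(A, b, x):
    """Return ||A x - b||_2^2."""
    residual = [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]
    return sum(r * r for r in residual)

def analytic_gradient(A, b, x):
    """Return 2 A^T (A x - b), i.e. 2 A^T A x - 2 A^T b."""
    residual = [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]
    return [2.0 * g for g in matvec(transpose(A), residual)]

def numeric_gradient(A, b, x, h=1e-6):
    """Approximate the gradient with central differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((objective(A, b, x_plus) - objective(A, b, x_minus)) / (2 * h))
    return grad

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
b = [1.0, 0.0, 2.0]
x = [0.5, -1.0]

g_analytic = analytic_gradient(A, b, x)
g_numeric = numeric_gradient(A, b, x)
max_diff = max(abs(a - n) for a, n in zip(g_analytic, g_numeric))

print("analytic:", g_analytic)
print("numeric: ", g_numeric)
print("max abs difference:", max_diff)
```

If the factor of 2 were dropped from the first term, the two gradients would disagree by $A^TA\hat{x}$, which is far larger than the finite-difference error for values like these.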