Help me to derive the derivative for $(Ax-b)^T(Ax-b)$


I know the derivative of $(Ax-b)^T(Ax-b)$ is $A^TAx-A^Tb$, but something seems to be wrong with my own derivation. Please help me find the error. Thanks.

$$ (Ax-b)^T(Ax-b)=(x^TA^T-b^T)(Ax-b)=x^TA^TAx-x^TA^Tb-b^TAx+b^Tb$$

So, taking the derivative of all four terms with respect to $x$, we get

$$ (A^TA+(A^TA)^T)x-A^Tb-b^TA+0 $$

What's wrong here?


Edit: with Siong Thye Goh and Bernard's help I found the error, and would like to record the correct derivation here for my own future reference. $$ \begin{align} (Ax-b)^T(Ax-b)&=(x^TA^T-b^T)(Ax-b)\\ &=x^TA^TAx-x^TA^Tb-b^TAx+b^Tb \\ &=x^T(A^TA)x-x^T(A^Tb)-(b^TA)x+b^Tb \end{align} $$

We need to recall two rules:

$$\frac {\partial(a^Tx)} {\partial x}=\frac {\partial(x^Ta)} {\partial x}=a$$

$$\frac {\partial(x^TAx)} {\partial x}=(A+A^T)x$$
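The second rule can be checked entry by entry. This is only a sketch, assuming $A$ here stands for any square $n\times n$ matrix and writing $A_{ij}$ for its entries (the index notation is not in the original post):

$$ \begin{align} x^TAx &= \sum_{i=1}^n\sum_{j=1}^n A_{ij}x_ix_j,\\ \frac{\partial (x^TAx)}{\partial x_k} &= \sum_{j=1}^n A_{kj}x_j+\sum_{i=1}^n A_{ik}x_i = (Ax)_k+(A^Tx)_k. \end{align} $$

Stacking these entries over $k$ gives $(A+A^T)x$, which reduces to $2Ax$ when $A$ is symmetric.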

Then, taking the derivative with respect to $x$, we have

$$ \begin{align} (A^TA+(A^TA)^T)x-2A^Tb&=2A^TAx-2A^Tb\\ &=2A^T(Ax-b) \end{align} $$
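A quick numerical sanity check of the result $2A^T(Ax-b)$ can be done with central finite differences. The sketch below is illustrative only: the helper names (`matvec`, `objective`, `analytic_gradient`, `numeric_gradient`) and the sample values of $A$, $x$, $b$ are made up for this check, and plain Python lists are used so that nothing beyond the standard library is needed.

```python
def matvec(A, x):
    """Return the matrix-vector product A x, with A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def objective(A, x, b):
    """f(x) = (Ax - b)^T (Ax - b), the squared norm of the residual."""
    r = [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]
    return sum(r_i * r_i for r_i in r)


def analytic_gradient(A, x, b):
    """Closed form 2 A^T (Ax - b) from the derivation above."""
    r = [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]
    A_t = [list(col) for col in zip(*A)]  # transpose of A
    return [2.0 * g for g in matvec(A_t, r)]


def numeric_gradient(A, x, b, h=1e-6):
    """Central finite differences, perturbing one coordinate at a time."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((objective(A, x_plus, b) - objective(A, x_minus, b)) / (2 * h))
    return grad


# Arbitrary illustrative data: A is 3x2, x has 2 entries, b has 3 entries.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [0.5, -1.0]
b = [1.0, 0.0, 2.0]

g_exact = analytic_gradient(A, x, b)
g_approx = numeric_gradient(A, x, b)
max_gap = max(abs(e - a) for e, a in zip(g_exact, g_approx))

print("analytic:", g_exact)
print("numeric: ", g_approx)
print("max gap: ", max_gap)
```

If the factor of 2 were missing, or if a term such as $b^TA$ had the wrong shape, the two gradients would disagree by much more than rounding error.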


There are 2 best solutions below

On BEST ANSWER

The expected answer is wrong to begin with; it should be $$2(A^TAx-A^Tb)$$

Also, the sizes of $A^Tb$ and $b^TA$ are different, aren't they?
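To make the size mismatch concrete, here is a short sketch assuming $A$ is $m\times n$, $x$ is $n\times 1$ and $b$ is $m\times 1$ (these dimensions are an assumption, as the question does not state them):

$$ \begin{align} A^Tb &\in \mathbb{R}^{n\times 1}, \qquad b^TA \in \mathbb{R}^{1\times n},\\ b^TAx &= (A^Tb)^Tx \quad\Rightarrow\quad \frac{\partial (b^TAx)}{\partial x} = A^Tb. \end{align} $$

So if the gradient is written as a column vector, the term $-b^TAx$ contributes $-A^Tb$ rather than $-b^TA$, and the two linear terms combine to $-2A^Tb$.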

On

Note that the derivative of $Ax-b$ is $A$, and the derivative of $\;{}^{\mathrm t\mkern-1.5mu}(Ax-b)$ is $\;{}^{\mathrm t\mkern-1.5mu}A$, since it is the composition of $Ax-b$ with the linear operator of transposition. Thus, by the product rule, the derivative of the given expression is \begin{align*} {}^{\mathrm t\!}A(Ax-b)+{}^{\mathrm t\mkern-2mu}(Ax-b)A&= ({}^{\mathrm t\!}A A)x-{}^{\mathrm t\!}Ab+{}^{\mathrm t\mkern-2mu}x({}^{\mathrm t\!}AA)-{}^{\mathrm t\mkern-1mu}bA. \end{align*} Now observe that, since ${}^{\mathrm t\mkern-3mu}AA$ is symmetric, $\;{}^{\mathrm t\mkern-2mu}x({}^{\mathrm t\!}AA)=({}^{\mathrm t\!}AA)x$ once row vectors are identified with column vectors. Thus the derivative is $$2{\,}^{\mathrm t\!}AAx-{}^{\mathrm t\!}Ab-{}^{\mathrm t\mkern-1mu}bA.$$ Under the same identification, ${}^{\mathrm t\mkern-1mu}bA$ is the row form of ${}^{\mathrm t\!}Ab$, so this equals $2{\,}^{\mathrm t\!}A(Ax-b)$.