How to get gradient of linear equations?

I am new to derivatives of linear equation systems, and I would like to differentiate

$$\|Ax-b\|^2 = (Ax-b)^T(Ax-b).$$

How do I arrive at the gradient $\nabla_x \|Ax-b\|^2 = 2(Ax-b)^TA$?

If I expand the expression I get $(Ax)^TAx-b^TAx-(Ax)^Tb+b^Tb$. But how do I get from there to the gradient?

There are 2 best solutions below

Let $f(x)=\|Ax-b\|^2$.

Then
$$\begin{aligned}
f(x+h)-f(x) &= \|Ax-b+Ah\|^2 - \|Ax-b\|^2 \\
&= (Ah)^T(Ax-b) + (Ax-b)^TAh + (Ah)^T(Ah) \\
&\approx (Ah)^T(Ax-b) + (Ax-b)^TAh && \text{neglecting the $O(\|h\|^2)$ term $(Ah)^T(Ah)$} \\
&= (Ax-b)^TAh + (Ax-b)^TAh && \text{since $(Ah)^T(Ax-b)$ is $1\times 1$ and so equals its transpose} \\
&= 2(Ax-b)^TAh.
\end{aligned}$$

That is, $$ f(x+h)-f(x)= 2(Ax-b)^T A h+ O(\|h\|^2), $$ so the gradient, written as a row vector, is $2(Ax-b)^T A$. Written as a column vector it is the transpose, $2A^T(Ax-b)$.
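A quick numerical sanity check of this formula can be reassuring. The sketch below is only an illustration: it assumes NumPy is available, and the helper names `f`, `grad_f` and `numerical_grad`, the random test data, and the step size `eps` are all choices made here, not part of the derivation. It compares $2(Ax-b)^TA$ with central finite differences of $f$.

```python
import numpy as np


def f(A, b, x):
    """Squared residual norm ||Ax - b||^2."""
    r = A @ x - b
    return float(r @ r)


def grad_f(A, b, x):
    """Derived formula 2 (Ax - b)^T A, returned as a 1-D array of length n."""
    return 2.0 * (A @ x - b) @ A


def numerical_grad(A, b, x, eps=1e-6):
    """Central finite differences of f, one coordinate at a time."""
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = eps
        g[j] = (f(A, b, x + e) - f(A, b, x - e)) / (2 * eps)
    return g


# Arbitrary test data (assumed sizes: A is 5 x 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

analytic = grad_f(A, b, x)
numeric = numerical_grad(A, b, x)
print("analytic:", analytic)
print("numeric: ", numeric)
print("agree:", np.allclose(analytic, numeric, atol=1e-5))
```

Because $f$ is quadratic in $x$, central differences should match the formula up to rounding error, so the two printed vectors are expected to agree closely.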

This question comes up often, and the most elegant solution uses the chain rule. Let $f(x) = \| Ax - b \|_2^2$. Note that $f(x) = g(h(x))$, where $h(x) = Ax - b$ and $g(u) = \| u \|_2^2$. The derivatives of $h$ and $g$ are $h'(x) = A$ and $g'(u) = 2u^T$ (a row vector). By the chain rule, $$ f'(x) = g'(h(x)) h'(x) = 2(Ax - b)^T A. $$
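To see where the two derivatives come from, it may help to write the same chain rule in components. This is only a sketch, assuming $A$ is $m \times n$ with entries $a_{ij}$ (the index names are introduced here for illustration):

$$ g(u) = \sum_{i=1}^m u_i^2 \;\Longrightarrow\; \frac{\partial g}{\partial u_i} = 2u_i, \qquad h_i(x) = \sum_{j=1}^n a_{ij}x_j - b_i \;\Longrightarrow\; \frac{\partial h_i}{\partial x_j} = a_{ij}, $$

$$ \frac{\partial f}{\partial x_j} = \sum_{i=1}^m \frac{\partial g}{\partial u_i}\bigg|_{u = Ax-b} \frac{\partial h_i}{\partial x_j} = \sum_{i=1}^m 2\,(Ax-b)_i\, a_{ij} = \bigl[\,2(Ax-b)^T A\,\bigr]_j. $$

The last sum is exactly the $j$-th entry of the row vector $2(Ax-b)^TA$, which matches the matrix form of the chain rule.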