How does $(Ax - b)^T(Ax - b)$ reduce to $x^TA^TAx - 2b^TAx + b^Tb$?


I am not sure whether I can write $(Ax - b)^T$ as $(A^Tx^T - b^T)$. Even if I can, I cannot reproduce the result above. Please help, and please point me to any sources that give good insight into matrix derivatives.


There are 2 best solutions below

On BEST ANSWER

$$
\begin{aligned}
(A x - b)^T (A x - b) &= \left[(A x)^T - b^T \right] (A x - b) \\
&= ( x^T A^T - b^T ) (A x - b) \\
&= x^T A^T (A x - b) - b^T (A x - b) \\
&= x^T A^T A x - x^T A^T b - b^T A x + b^T b \\
&= x^T A^T A x - (b^T A x)^T - b^T A x + b^T b
\end{aligned}
$$

EDIT: As pointed out in the comments, $b^T A x$ is the dot product of the two vectors $b$ and $A x$, so it is a scalar. A scalar equals its own transpose, therefore $(b^T A x)^T = b^T A x$ and $$ (A x - b)^T (A x - b) = x^T A^T A x - 2 b^T A x + b^T b. $$
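The identity above can be sanity-checked numerically. The following is only a minimal sketch, assuming NumPy is available; the function names `squared_residual` and `expanded_form`, the shapes, and the random seed are illustrative choices and not part of the original answer.

```python
import numpy as np


def squared_residual(A, x, b):
    """Left-hand side: (Ax - b)^T (Ax - b)."""
    r = A @ x - b
    return r @ r


def expanded_form(A, x, b):
    """Right-hand side: x^T A^T A x - 2 b^T A x + b^T b."""
    return x @ A.T @ A @ x - 2 * (b @ A @ x) + b @ b


# Arbitrary test data (assumed shapes: A is m x n, x has n entries, b has m entries).
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
b = rng.standard_normal(m)

lhs = squared_residual(A, x, b)
rhs = expanded_form(A, x, b)

# Both sides agree up to floating-point rounding.
print(np.isclose(lhs, rhs))
# The two cross terms are the same scalar.
print(np.isclose(b @ A @ x, x @ A.T @ b))
```

A numerical check like this is not a proof, but it is a quick way to catch a sign error or a misplaced transpose when expanding such expressions by hand.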


Let $A\in \mathbb{R}^{m\times n}$, $x\in \mathbb{R}^{n}$ and $b\in \mathbb{R}^{m}$. Then $$\|Ax-b\|^2=\left(Ax-b\right)^T\left(Ax-b\right)=\left(\left(Ax\right)^T-b^T\right)\left(Ax-b\right)=\left(x^TA^T-b^T\right)\left(Ax-b\right)=x^TA^TAx-b^TAx-x^TA^Tb+b^Tb$$ The two cross terms are equal, as writing them out componentwise shows:

$$b^TAx = \sum_{i=1}^m b_i(Ax)_i=\sum_{i=1}^m\sum_{j=1}^n b_i(a_{ij}x_j)=\sum_{j=1}^n \sum_{i=1}^m x_j(a_{ij}b_i)=\sum_{j=1}^n x_j(A^Tb)_j= x^TA^Tb$$

So, finally: $$\left(Ax-b\right)^T\left(Ax-b\right)=x^TA^TAx-2b^TAx+b^Tb$$
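The index manipulation in this answer (swapping the order of the two sums) can be mirrored with explicit loops. This is just an illustrative sketch in plain Python; the helper names `b_T_A_x` and `x_T_AT_b` and the small example matrix are made up for the demonstration.

```python
def b_T_A_x(A, x, b):
    """Sum over i of b_i * (Ax)_i, with (Ax)_i = sum over j of a_ij * x_j."""
    m, n = len(A), len(x)
    return sum(b[i] * sum(A[i][j] * x[j] for j in range(n)) for i in range(m))


def x_T_AT_b(A, x, b):
    """Sum over j of x_j * (A^T b)_j, with (A^T b)_j = sum over i of a_ij * b_i."""
    m, n = len(A), len(x)
    return sum(x[j] * sum(A[i][j] * b[i] for i in range(m)) for j in range(n))


# Small hand-checkable example: A is 3 x 2, x has 2 entries, b has 3 entries.
A = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
x = [1.0, -1.0]
b = [2.0, 0.0, 1.0]

# Ax = [-1, -1, -1], so b^T A x = -3; A^T b = [7, 10], so x^T A^T b = -3.
print(b_T_A_x(A, x, b), x_T_AT_b(A, x, b))  # prints: -3.0 -3.0
```

Both functions add up the same products $b_i a_{ij} x_j$, only in a different order, which is exactly why the two cross terms coincide.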