Where is my calculation wrong?

65 Views Asked by At

For the following quadratic form $$f(\mathbf{x})=\mathbf{x^TA^TWAx}-2\mathbf{b^TWAx}+\mathbf{b^TWb}$$ where $\mathbf{x}\in R^m$ $\mathbf{A}\in R^{n\times m}$ $\mathbf{b}\in R^n$ and $\mathbf{W}\in R^{n\times n}$ ($\mathbf{W}$ is a diagonal matrix) the value of $\mathbf{x}$ that minimizes $f(\mathbf{x})$ is $\mathbf{x}^*=\left(\mathbf{A^TWA}\right)^{-1}\mathbf{b^TWA}$. When I put this value I get $$f(\mathbf{x^*})=\left[\left(\mathbf{A^TWA}\right)^{-1}\mathbf{b^TWA}\right]^T\mathbf{b^TWA}-2\mathbf{b^TWA\left(A^TWA\right)^{-1}b^TWA+b^TWb}.$$ But this is not what is written in Convex optimization book by Stephen Boyd on page 82. The expression provided by the book is as follows $$f(\mathbf{x^*})=\mathbf{b^TWb-b^TWA\left(A^TWA\right)^{-1}A^TWb}$$ where is the mistake in my answer?

2

There are 2 best solutions below

2
On BEST ANSWER

The mistake is that the minimizing value for $x$ is $$x=(A^{T}WA)^{-1}A^{T}Wb$$ not $x=(A^{T}WA)^{-1}b^{T}WA$.

You can see this easily if you differentiate $f(x)$ with respect to $x^{T}$

$$f(x) = x^{T}A^{T}WAx - 2b^{T}WAx + b^{T}Wb$$ $$\frac{\partial f}{\partial x^{T}}(x) = \frac{\partial x^{T}}{\partial x^{T}}A^{T}WAx + x^{T}A^{T}WA\frac{\partial x}{\partial x^{T}}- 2b^{T}WA\frac{\partial x}{\partial x^{T}}\\ = A^{T}WAx + (x^{T}A^{T}WA)^{T} - (2b^{T}WA)^{T}\\ = A^{T}WAx + A^{T}WAx - 2A^{T}Wb\\ = 2A^{T}WAx - 2A^{T}Wb \overset{!}{=}0$$ $$\implies A^{T}WAx = A^{T}Wb$$ $$\implies x = (A^{T}WA)^{-1}A^{T}Wb$$

Now if you substitute that in $f$ you get the desired result.

4
On

The first two terms in $f(\mathbf{x^*}) =\left[\left(\mathbf{A^TWA}\right)^{-1}\mathbf{b^TWA}\right]^T\mathbf{b^TWA} -2\mathbf{b^TWA\left(A^TWA\right)^{-1}b^TWA +b^TWb}$ are substantially the same once the transpose and inverse are done.

For example,

$\mathbf{b^TWA\left(A^TWA\right)^{-1}b^TWA}\\ =\mathbf{b^TWA\left(A^{-1}W^{-1}(A^T)^{-1}\right)b^TWA}\\ =\mathbf{b^T(A^T)^{-1}b^TWA} $