Find a (nontrivial) linear system of equations satisfied by any vector minimizing the energy


Here is exercise 1.5 from the book Numerical Algorithms: Methods for Computer Vision, Machine Learning, and Graphics (by J. Solomon):

Suppose $A,B \in R^{n \times n}$ and $\vec{a},\vec{b} \in R^{n}$. Find a (nontrivial) linear system of equations satisfied by any $\vec{x}$ minimizing the energy $||A\vec{x}-\vec{a}||_{2}^2 + ||B\vec{x}-\vec{b}||_{2}^2$

As I understand it, the question asks for a system $C\vec{x}=\vec{c}$ whose solutions are exactly the vectors $\vec{x}_{opt}$ minimizing the aforementioned function:

$$f(\vec{x})=||A\vec{x}-\vec{a}||_{2}^2 + ||B\vec{x}-\vec{b}||_{2}^2$$

But I can't figure out how to approach this problem. I tried computing the gradient of this function and using the chapter's material on residuals and Lagrange multipliers, but I don't know whether I'm heading in the right direction.

Could someone give me a tip on how to approach this problem?


Update #1

Using hints given in the comments, I've come up with something like this: $$ f(\vec{x})=||Ax-a||_{2}^2 + ||Bx-b||_{2}^2 $$ Expanding norms: $$ f(\vec{x})=||Ax||_2^2 + ||Bx||_2^2 - 2a^TAx - 2b^TBx + ||a||_2^2 + ||b||_2^2 $$ Taking the gradient and setting it to zero: $$ \nabla f(\vec{x})=2A\vec{x} + 2B\vec{x} - 2a^TA - 2b^TB = 0 $$ $$ 2(A + B)\vec{x} - 2(a^TA + b^TB) = 0 $$ $$ (A + B)\vec{x} - (a^TA + b^TB)=0 $$ $$ \vec{x}_{opt}=(A + B)^{-1}(a^TA + b^TB) $$

Is it correct?


Update #2

Oh, I see. The derivative was taken the wrong way. Here is how it should go (as noted in Walter's answer): $$ f(\vec{x})=x^{\top}A^{\top}Ax + x^{\top}B^{\top}Bx - 2a^{\top}Ax - 2b^{\top}Bx + a^{\top}a + b^{\top}b $$ $$ \nabla f(\vec{x})=2A^{\top}Ax + 2B^{\top}Bx - 2A^{\top}a - 2B^{\top}b = 0 $$ $$ \vec{x}_{opt}=(A^{\top}A + B^{\top}B)^{-1}(A^{\top}a + B^{\top}b) $$
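A quick numerical sanity check of these normal equations, sketched with numpy and random data (the matrix names and sizes here are my own choices, not from the book): solving $(A^{\top}A + B^{\top}B)\vec{x} = A^{\top}a + B^{\top}b$ should give a point where the energy cannot be decreased by small perturbations.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
a, b = rng.standard_normal(n), rng.standard_normal(n)

# Solve the normal equations (A^T A + B^T B) x = A^T a + B^T b.
x_opt = np.linalg.solve(A.T @ A + B.T @ B, A.T @ a + B.T @ b)

def f(x):
    """The energy ||Ax - a||^2 + ||Bx - b||^2."""
    return np.linalg.norm(A @ x - a) ** 2 + np.linalg.norm(B @ x - b) ** 2

# Random small perturbations should never beat the candidate minimizer.
for _ in range(100):
    assert f(x_opt) <= f(x_opt + 1e-3 * rng.standard_normal(n)) + 1e-12
```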


There are 2 answers below.

Accepted Answer

$\bf{Hint}$:

If you know the general least-squares solution to minimizing $\|Ax-b\|_2^2$, which (when $A$ has full column rank) is \begin{equation} \hat{x}=(A^{\top}A)^{-1}A^{\top}b, \end{equation} then maybe you can rewrite your problem as a single-objective least-squares problem. Remember that the matrix $A$ does not need to be square; it can also be tall ("skinny"), i.e. $A\in \mathbb{R}^{m\times n}$ with $m>n$.
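As a sketch of this formula on a tall matrix (dimensions and data are arbitrary assumptions for illustration), the normal-equation solution should agree with a library least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 3                      # tall ("skinny") matrix: m > n
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Normal-equation solution (A^T A) x = A^T b, valid for full column rank A.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# numpy's least-squares solver for comparison.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_ne, x_ls)
```

In practice `lstsq` (or a QR factorization) is preferred over forming $A^{\top}A$ explicitly, since squaring the matrix squares its condition number.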

Do you have a clue what I am hinting at? Otherwise, just let me know.

Edit:

If we go back to your problem of minimizing $\|Ax-a\|_2^2+\|Bx-b\|_2^2$, then see that we can rewrite this as \begin{equation} \left\Vert\begin{bmatrix} A\\ B \end{bmatrix}x - \begin{bmatrix} a\\ b \end{bmatrix}\right\Vert_2^2=\|Cx-c\|_2^2 \end{equation} Now our least-squares solution becomes (assuming $C$ has full column rank) \begin{equation} \hat{x}=(C^{\top}C)^{-1}C^{\top}c= (A^{\top}A+B^{\top}B)^{-1}(A^{\top}a+B^{\top}b) \end{equation}
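The stacking trick can be sketched directly in numpy (random square $A$, $B$ here are my own test data): solving the stacked problem $\|Cx-c\|_2^2$ with a least-squares solver should match the normal-equation formula above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
a, b = rng.standard_normal(n), rng.standard_normal(n)

# Stack the two objectives into one least-squares problem ||Cx - c||^2.
C = np.vstack([A, B])            # shape (2n, n)
c = np.concatenate([a, b])       # length 2n

x_stacked, *_ = np.linalg.lstsq(C, c, rcond=None)
x_normal = np.linalg.solve(A.T @ A + B.T @ B, A.T @ a + B.T @ b)

assert np.allclose(x_stacked, x_normal)
```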

Answer

Just set the gradient equal to $0$, which immediately yields $$ 2A^{\top}(A\vec{x} - \vec{a}) + 2B^{\top}(B\vec{x} - \vec{b}) = 0. $$ Done!

Here are some hints to take the gradient easily. If $g(x) = \| x \|^2$, then $\nabla g(x) = 2x$. Also, if $h(x) = g(Mx)$, then $\nabla h(x) = M^T \nabla g(Mx)$. (This follows from the chain rule.)
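These two identities can be checked numerically; here is a sketch (random $M$ and $x$ are my own assumptions) comparing the chain-rule gradient $M^{\top}\,2(Mx)$ of $h(x)=\|Mx\|^2$ against central finite differences:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Analytic gradient of h(x) = ||Mx||^2 via the chain rule: M^T * 2(Mx).
grad = 2 * M.T @ (M @ x)

# Central-difference approximation of the same gradient, coordinate by coordinate.
eps = 1e-6
h = lambda v: np.linalg.norm(M @ v) ** 2
num = np.array([(h(x + eps * e) - h(x - eps * e)) / (2 * eps) for e in np.eye(n)])

assert np.allclose(grad, num, atol=1e-5)
```

Since $h$ is quadratic, the central difference is exact up to floating-point roundoff, so the two agree tightly.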