Minimize $\sum\limits_i (t_i - x)^T A_i\,(t_i - x)$ wrt $x$


I would like to find, in closed form, the $x$ that minimizes $$ f(x) = \sum\limits_i (t_i - x)^T A_i (t_i - x),\\ t_i, x \in \mathbb{R}^3,\\ A_i \in \mathbb{R}^{3\times3},\; A_i\,\text{positive semi-definite (a covariance matrix)}. $$ Note: the dimension of 3 for the vectors is arbitrary; I believe a solution will generalize to any dimensionality.

Since each $A_i$ is positive semidefinite, we can write it as $A_i = M_i^T M_i = M_i M_i$, where $M_i = A_i^{1/2}$ is symmetric. Then we have the following: $$ f(x) = \sum_i (M_it_i - M_ix)^{T} (M_it_i - M_ix) $$ Letting $y_i = M_it_i$, we are minimizing over $x$: $$f(x) = \sum_i (y_i - M_ix)^T (y_i - M_ix)$$ which is equivalent to minimizing: $$f(x) = \sum_i || y_i - M_ix ||^2$$
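This reformulation is easy to check numerically. A minimal sketch with numpy, using randomly generated $A_i$ and $t_i$ (all names here are illustrative, not from the original post); the symmetric square root is computed via an eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n):
    # Random symmetric positive semidefinite matrix (illustrative data).
    G = rng.standard_normal((n, n))
    return G @ G.T

def sym_sqrt(S):
    # Symmetric square root M with M @ M = S, via eigendecomposition.
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T

A = [random_psd(3) for _ in range(4)]
t = [rng.standard_normal(3) for _ in range(4)]
x = rng.standard_normal(3)

# Quadratic form vs. least-squares form: they should agree for any x.
f_quadratic = sum((ti - x) @ Ai @ (ti - x) for Ai, ti in zip(A, t))
f_leastsq = sum(np.sum((sym_sqrt(Ai) @ (ti - x)) ** 2) for Ai, ti in zip(A, t))

print(np.isclose(f_quadratic, f_leastsq))  # True: the two forms agree
```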

This is as far as I have gotten. Perhaps the next step is to take the derivative with respect to $x$ and set it to zero, but I'm not sure how to work out the derivative in vector/matrix form. And even if that step were carried out, would it yield a closed-form solution?


The trick is to write explicitly $f(x)$ in terms of the components of $x$ and $y$. It is just a little tedious (but not hard), because of the matrix $M_i$. For example $(M_ix)_1=M_{i,11}x_1+M_{i,12}x_2+M_{i,13}x_3$. You will get $f(x)$ of the form $$f(x)=\sum_i\sum_{j=1}^3\sum_{k=1}^3C_{ijk}x_jx_k+\sum_i\sum_{j=1}^3D_{i,j} x_j+\sum_iE_i$$ Now take the derivatives with respect to the components of $x$, and equate them to $0$. You should get a system of three linear equations in $x_1$, $x_2$, and $x_3$.
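In practice one need not expand the components by hand: for symmetric $A_i$ the three linear equations assemble into the single system $(\sum_i A_i)\,x = \sum_i A_i t_i$. A numpy sketch on random illustrative data (variable names are my own, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_psd(n):
    # Random symmetric positive semidefinite matrix (illustrative data).
    G = rng.standard_normal((n, n))
    return G @ G.T

A = [random_psd(3) for _ in range(5)]
t = [rng.standard_normal(3) for _ in range(5)]

# For symmetric A_i, the stationarity conditions collapse to the
# 3x3 linear system (sum_i A_i) x = sum_i A_i t_i.
B = sum(A)
b = sum(Ai @ ti for Ai, ti in zip(A, t))
x_star = np.linalg.solve(B, b)

print(np.allclose(B @ x_star, b))  # True: x_star satisfies the system
```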


Let's use a colon to denote the trace/Frobenius product, i.e. $$\eqalign{ &X:Y &= {\,\rm tr}(X^TY) &\,\,\text{(matrix version)}\cr &x:y &= {\,\rm tr}(x^Ty) = x^Ty &\,\,\text{(vector version)} \cr }$$ Then the function can be written as $$\eqalign{ f &= \sum_k A_k:(x-t_k)(x-t_k)^T \cr &= \sum_k A_k:xx^T - 2\sum_k A_kt_k:x + \sum_k A_k:t_kt_k^T \cr &= B:xx^T - 2b:x + \beta \cr }$$ In the last line, all of the subscripted variables have been collected into three matrix/vector/scalar sums
$$\eqalign{ B = \sum_kA_k\,,\,\,\,\,\,\,\,b = \sum_kA_kt_k\,,\,\,\,\,\,\,\,\beta = \sum_k A_k:t_kt_k^T \cr }$$ Now let's find the differential and gradient of the function $$\eqalign{ df &= B:d(xx^T) - 2b:dx + 0 \cr &= (B+B^T):dx\,x^T - 2b:dx \cr &= 2(Bx-b):dx \cr \frac{\partial f}{\partial x} &= 2(Bx-b) \cr }$$ Now set the gradient to zero and solve for the optimal $x$-vector $$\eqalign{ Bx &= b \cr x &= B^{-1}b \cr }$$ Note that $B$ is a sum of positive semidefinite matrices, so the solution exists and is unique exactly when $B$ is positive definite (hence invertible).
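The closed form $x = B^{-1}b$ can be sanity-checked numerically: on random data it should beat every nearby perturbed point. A minimal numpy sketch (illustrative data and names, not part of the answer):

```python
import numpy as np

rng = np.random.default_rng(2)

def random_psd(n):
    # Random symmetric positive semidefinite matrix (illustrative data).
    G = rng.standard_normal((n, n))
    return G @ G.T

A = [random_psd(3) for _ in range(4)]
t = [rng.standard_normal(3) for _ in range(4)]

def f(x):
    return sum((ti - x) @ Ai @ (ti - x) for Ai, ti in zip(A, t))

B = sum(A)                                 # B = sum_k A_k
b = sum(Ai @ ti for Ai, ti in zip(A, t))   # b = sum_k A_k t_k
x_star = np.linalg.solve(B, b)             # x = B^{-1} b (B positive definite here)

# The closed form should beat every randomly perturbed nearby point.
nearby = [f(x_star + 0.1 * rng.standard_normal(3)) for _ in range(200)]
print(f(x_star) <= min(nearby))  # True: x_star is the minimizer
```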