What is the solution to this matrix optimization problem $A^* = \text{argmin}_{A} \sum_{i=1}^{r-1}|Ax_i-x_{i+1}|^2$?


The motivation for asking this question is that if the vectors represent word embeddings in natural language processing, then the matrix should represent a document consisting of a sequence of words / word embeddings.

I have a sequence of vectors over the real numbers $x_1,\ldots,x_r$ and I am searching for a matrix $A$ which will minimize the following error:

$$A^* = \operatorname{argmin}_{A} \sum_{i=1}^{r-1}|Ax_i-x_{i+1}|^2$$

Here $|\cdot|$ denotes the Euclidean norm of the vector $Ax_i - x_{i+1}$.

Thanks for your help.

(My calculations in matrix analysis are a bit rusty, and I have computed the following "solution":

$$A = \frac{1}{2} ( \sum_{i=1}^{r-1} x_{i+1}x_i^T) ( \sum_{i=1}^{r-1} x_{i}x_i^T)^{-1}$$

but this solution might be wrong.)

2 Answers

BEST ANSWER

Consider the matrices $\mathbf{R}_1= \left[ \mathbf{x}_1,\ldots,\mathbf{x}_{r-1} \right]$ and $\mathbf{R}_2= \left[ \mathbf{x}_2,\ldots,\mathbf{x}_{r} \right]$.

The objective function can be written as $\phi(\mathbf{A}) = \| \mathbf{A} \mathbf{R}_1 - \mathbf{R}_2 \|^2_F$. Provided $\mathbf{R}_1 \mathbf{R}_1^T$ is invertible, the closed-form solution is $$ \mathbf{A} = \mathbf{R}_2 \mathbf{R}_1^T \left( \mathbf{R}_1 \mathbf{R}_1^T \right)^{-1} $$
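A quick NumPy sketch of this closed-form solution, using made-up random vectors (the dimension $n$ and sequence length $r$ are arbitrary choices for illustration). It also checks the formula against `np.linalg.lstsq`, which is the numerically safer way to solve the same least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 10                      # embedding dimension and sequence length (illustrative)
X = rng.standard_normal((n, r))   # columns are the vectors x_1, ..., x_r

R1 = X[:, :-1]                    # [x_1, ..., x_{r-1}]
R2 = X[:, 1:]                     # [x_2, ..., x_r]

# Closed-form least-squares solution: A = R2 R1^T (R1 R1^T)^{-1}
A = R2 @ R1.T @ np.linalg.inv(R1 @ R1.T)

# Equivalent via lstsq: min_A ||A R1 - R2||_F is the same as
# solving R1^T A^T = R2^T for A^T in the least-squares sense.
A_lstsq = np.linalg.lstsq(R1.T, R2.T, rcond=None)[0].T

print(np.allclose(A, A_lstsq))    # the two solutions agree
```

In practice, preferring `lstsq` (or `np.linalg.solve` on the normal equations) over an explicit matrix inverse avoids numerical trouble when $\mathbf{R}_1 \mathbf{R}_1^T$ is ill-conditioned.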

Second answer:

Isn't this separable by rows of $A$? Because the squared norm is just a sum of squares. Assume $A$ is $n\times n$. Row $j$ of $A$ is given by $R_j$ (an $n$-vector). Then your objective is: $$\sum_{j=1}^n \sum_{i=1}^{r-1} \Big( R_j \cdot x_i -x_{i+1,j}\Big)^2.$$ So you can solve for each best row vector $R_j$ on its own, say, using linear regression.
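A small NumPy check of this row-wise view, again on made-up random data: each row of $A$ is found by an ordinary least-squares regression of the corresponding component of $x_{i+1}$ onto $x_i$, and the stacked rows reproduce the full closed-form matrix from the accepted answer:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 3, 8
X = rng.standard_normal((n, r))   # columns are x_1, ..., x_r (illustrative data)
R1, R2 = X[:, :-1], X[:, 1:]

# Full closed-form solution, for comparison
A_full = R2 @ R1.T @ np.linalg.inv(R1 @ R1.T)

# Row by row: row j solves min_{R_j} sum_i (R_j . x_i - x_{i+1,j})^2,
# i.e. a linear regression with design matrix R1^T and targets R2[j].
A_rows = np.empty((n, n))
for j in range(n):
    A_rows[j], *_ = np.linalg.lstsq(R1.T, R2[j], rcond=None)

print(np.allclose(A_full, A_rows))  # row-wise regressions match the full solution
```

The equivalence holds because the Frobenius-norm objective decouples into $n$ independent least-squares problems, one per row, all sharing the same design matrix $\mathbf{R}_1^T$.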