using the Kronecker product and vec operators to write the following least squares problem in standard matrix form

I have a least squares problem with the following form:

$$ \min_\mathbf{X} ~ \sum_{i=1}^n | \mathbf{u}_i^\top \mathbf{X} \mathbf{v}_i - b_i |^2 $$

where $\{\mathbf{u}_i\}_{i=1}^n$ and $\{\mathbf{v}_i\}_{i=1}^n$ are vectors, the $\{b_i\}_{i=1}^n$ are scalars, and $\mathbf{X}$ is a matrix.

I would like to rewrite it in the more standard form, which I can hand off to a solver in MATLAB or NumPy:

$$ \min_\mathbf{X} ~ \left\| \mathbf{A} \mathbf{x} - \mathbf{b} \right\|^2 $$

where $\mathbf{x} = \text{vec}(\mathbf{X})$ and $\mathbf{b}$ is a vector whose $i$'th component is the scalar $b_i$ from the first equation.

I've figured out how to do this for a single term of the sum:

$$ \mathbf{u}^\top \mathbf{X} \mathbf{v} = \sum_{ij} \mathbf{X}_{ij} u_i v_j = \left(\text{vec}(\mathbf{u}\mathbf{v}^\top)\right)^\top \text{vec}(\mathbf{X}) = (\mathbf{v} \otimes \mathbf{u})^\top\, \text{vec}(\mathbf{X}).$$
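
This identity is easy to sanity-check numerically. A minimal NumPy sketch (note that $\text{vec}(\cdot)$ stacks columns, which corresponds to `order="F"` when flattening in NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
m, p = 4, 3
u = rng.standard_normal(m)
v = rng.standard_normal(p)
X = rng.standard_normal((m, p))

lhs = u @ X @ v
# vec() stacks columns, so flatten X in column-major (Fortran) order.
rhs = np.kron(v, u) @ X.flatten(order="F")
assert np.allclose(lhs, rhs)
```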

But my actual problem doesn't just have a single $\mathbf{u}$ and $\mathbf{v}$, but instead, whole matrices $\mathbf{U}$ and $\mathbf{V}$ whose columns are the $\{\mathbf{u}_i\}$ and $\{\mathbf{v}_i\}$.

I can't go from the single term case to the $n$-term case by letting

$$A = \mathbf{V} \otimes \mathbf{U}$$

analogous to $\mathbf{v} \otimes \mathbf{u}$, because this produces a matrix of the wrong shape for $\mathbf{A}$. What I need is a way to write $\mathbf{A}$ such that its $i$'th row contains $\text{vec}(\mathbf{u}_i \mathbf{v}_i^\top)$, which suggests that I need to write $\mathbf{A}$ as a flattened cube whose $i$'th level contains the matrix $\mathbf{u}_i \mathbf{v}_i^\top$.

What is the way of writing this, both in standard mathematical notation and in numeric programming languages like MATLAB or NumPy?

Any help here is appreciated. Many thanks.

2 Answers

Accepted answer:

The answer is to use the Khatri-Rao product "$*$", which computes the column-wise Kronecker products of the columns of two matrices: the $i$'th column of $\mathbf{V} * \mathbf{U}$ is $\mathbf{v}_i \otimes \mathbf{u}_i$. Since you need these as the rows of $\mathbf{A}$, take $\mathbf{A} = (\mathbf{V} * \mathbf{U})^\top$.
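
A short NumPy sketch of this construction (the Khatri-Rao product is built here with an explicit loop over `np.kron`; dimensions and data are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(1)
m, p, n = 4, 3, 20
U = rng.standard_normal((m, n))   # columns are the u_i
V = rng.standard_normal((p, n))   # columns are the v_i
b = rng.standard_normal(n)

# Khatri-Rao product V * U: the i'th column is kron(V[:, i], U[:, i]).
# Stacking those as rows gives A, whose i'th row is vec(u_i v_i^T).
A = np.stack([np.kron(V[:, i], U[:, i]) for i in range(n)])

x, *_ = np.linalg.lstsq(A, b, rcond=None)
X = x.reshape((m, p), order="F")  # undo vec() (column-major)

# Check: the original objective equals the standard-form residual.
r1 = sum((U[:, i] @ X @ V[:, i] - b[i]) ** 2 for i in range(n))
r2 = np.sum((A @ x - b) ** 2)
assert np.allclose(r1, r2)
```

SciPy also ships this operation as `scipy.linalg.khatri_rao(V, U)`, which returns the same column-wise Kronecker product without the explicit loop.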

A second answer:

Note that $$\eqalign{ \sum_k u_k^TXv_k &= \sum_k Xv_k:u_k \cr &= \sum_k X:u_kv_k^T \cr &= X:\sum_k u_kv_k^T \cr &= X:UV^T \cr }$$ where the colon denotes the Frobenius Inner Product for matrices.

For convenience, define a new variable $$\eqalign{ h &= (UV^T:X) - \sum_k b_k \cr dh &= UV^T:dX \cr\cr }$$

Use this new variable to write the function and find its differential $$\eqalign{ f &= h^2 \cr df &= 2h\,dh \cr &= 2h\,\,UV^T:dX }$$ Since $df=\big(\frac{\partial f}{\partial X}:dX\big),\,$ the gradient is $$\eqalign{ \frac{\partial f}{\partial X} &= 2h\,UV^T \cr\cr }$$ Set the gradient to zero and solve for the optimal $X$.
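
This gradient can be verified with a finite-difference sketch. Note that $f$ here is the square of the *summed* residual, per this answer's definition of $h$; the dimensions and data below are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(2)
m, p, n = 4, 3, 5
U = rng.standard_normal((m, n))
V = rng.standard_normal((p, n))
b = rng.standard_normal(n)
X = rng.standard_normal((m, p))

def f(X):
    # h = (UV^T : X) - sum_k b_k, f = h^2, as defined in this answer;
    # the Frobenius inner product A:B is the elementwise-product sum.
    h = np.sum((U @ V.T) * X) - b.sum()
    return h ** 2

h = np.sum((U @ V.T) * X) - b.sum()
grad = 2 * h * (U @ V.T)          # the gradient derived above

# Central-difference check of one entry of the gradient.
eps = 1e-6
E = np.zeros((m, p)); E[1, 2] = 1.0
fd = (f(X + eps * E) - f(X - eps * E)) / (2 * eps)
assert np.isclose(fd, grad[1, 2], rtol=1e-4)
```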