Matrix Linear Least Squares Problem with Diagonal Matrix Constraint

Question

3k Views Asked by Bumbble Comm At 31 Mar 2026 - 5:59

How could one solve the following least-squares problem with Frobenius Norm and diagonal matrix constraint?

$$\hat{S} := \arg \min_{S} \left\| Y - XUSV^T \right\|_{F}^{2}$$

where $S$ is a diagonal matrix and $U, V$ are column-orthogonal matrices. Is there a fast algorithm?

Bumbble Comm On 09 Sep 2017 - 12:26

Vectorize $S$ and express the multiplication by $XU$ from the left and by $V^T$ from the right as a Kronecker product. This gives an ordinary linear least squares problem, to which you can add the extra term $\left\| D \text{vec}(S) \right\|_F^2$, where $D$ is a diagonal weight matrix with large positive values at the positions corresponding to off-diagonal elements of $S$ and zeros elsewhere. This regularizes away any non-diagonal solution $\hat S$.

EDIT:

From Wikipedia, you can use the following relation to build the objective term:

$${\bf AXB = C} \Leftrightarrow ( {\bf B^T \otimes A} )\text{vec}({\bf X}) =\text{vec}({\bf C})$$

Then, to build the regularizing term that enforces a diagonal matrix, add $+\lambda\|{\bf D}\text{vec}({\bf X})\|_F^2$, where ${\bf D}$ is the diagonal matrix with

$$D_{ii} = \begin{cases} 0, & \text{index } i \text{ corresponds to a diagonal element of } {\bf X} \\ 1, & \text{index } i \text{ corresponds to an off-diagonal element of } {\bf X} \end{cases} \qquad D_{ij} = 0 \text{ for } i \neq j$$
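The recipe above can be sketched in MATLAB as follows. This is a minimal illustration on random data, not the answerer's code: `mA` stands for $XU$, `mB` for $V$, and the names `mK`, `vD`, `paramLambda` and all sizes are assumptions made for the example. It also shows an exact alternative to the penalty, which keeps only the columns of the Kronecker matrix that multiply diagonal entries of $S$.

```matlab
% Hypothetical sizes and random data, for illustration only
rng(0);
numRows = 8; numCols = 6; n = 4;
mA = randn(numRows, n);       %<! Plays the role of X * U
mB = randn(numCols, n);       %<! Plays the role of V
mC = randn(numRows, numCols); %<! Plays the role of Y

% vec(A * S * B.') = kron(B, A) * vec(S)
mK = kron(mB, mA);
vC = mC(:);

% Penalty weights: 1 on off-diagonal entries of S, 0 on diagonal entries
vDiagIdx      = 1:(n + 1):(n * n); %<! Linear indices of the diagonal of S
vD            = ones(n * n, 1);
vD(vDiagIdx)  = 0;
paramLambda   = 1e8;               %<! Assumed large penalty weight

% Regularized normal equations (D.' * D = diag(vD) since entries are 0 / 1)
vX        = ((mK.' * mK) + (paramLambda * diag(vD))) \ (mK.' * vC);
mSPenalty = reshape(vX, n, n);

% Exact alternative: solve only for the diagonal entries
vS = mK(:, vDiagIdx) \ vC;

disp(norm(diag(mSPenalty) - vS)); %<! Small when paramLambda is large
```

The penalty version only approaches a diagonal matrix as `paramLambda` grows, while the column-selection version enforces the constraint exactly and solves a much smaller system.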

**Bumbble Comm** · Accepted Answer

The Problem

Stating the problem in more general form:

$$ \arg \min_{S} f \left( S \right) = \arg \min_{S} \frac{1}{2} \left\| A S {B}^{T} - C \right\|_{F}^{2} $$

The derivative is given by:

$$ \frac{d}{d S} \frac{1}{2} \left\| A S {B}^{T} - C \right\|_{F}^{2} = A^{T} \left( A S {B}^{T} - C \right) B $$

Solution to General Form

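Setting the derivative to zero gives $ {A}^{T} A S {B}^{T} B = {A}^{T} C B $. When $ A $ and $ B $ have full column rank, so that $ {A}^{T} A $ and $ {B}^{T} B $ are invertible, the solution is: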

$$ \hat{S} = \left( {A}^{T} A \right)^{-1} {A}^{T} C B \left( {B}^{T} B \right)^{-1} $$

Solution with Diagonal Matrix

The set of diagonal matrices $ \mathcal{D} = \left\{ D \in \mathbb{R}^{m \times n} \mid D = \operatorname{diag} \left( D \right) \right\} $ is a convex set (easy to prove from the definition, as any linear combination of diagonal matrices is diagonal).

Moreover, the projection of a given matrix $ Y \in \mathbb{R}^{m \times n} $ onto $ \mathcal{D} $ is easy:

$$ X = \operatorname{Proj}_{\mathcal{D}} \left( Y \right) = \operatorname{diag} \left( Y \right) $$

Namely, just zeroing all off diagonal elements of $ Y $.

Hence one could solve the above problem with Projected Gradient Descent, projecting the result of each gradient step onto the set of diagonal matrices.

The algorithm is:

$$ \begin{align*} {S}^{k + \frac{1}{2}} & = {S}^{k} - \alpha A^{T} \left( A {S}^{k} {B}^{T} - C \right) B \\ {S}^{k + 1} & = \operatorname{Proj}_{\mathcal{D}} \left( {S}^{k + \frac{1}{2}} \right)\\ \end{align*} $$

The code:

% Products reused in every iteration
mAA     = mA.' * mA;
mBB     = mB.' * mB;
mAyb    = mA.' * mC * mB;

mS          = mAA \ mAyb / mBB; %<! Initialization by the Least Squares Solution
vS          = diag(mS);
mS          = diag(vS); %<! Project the initial point onto the diagonal matrices
vObjVal(1)  = hObjFun(vS);

for ii = 2:numIterations

    % Gradient Step: A.' * (A * S * B.' - C) * B
    mG = (mAA * mS * mBB) - mAyb;
    mS = mS - (stepSize * mG);

    % Projection Step
    vS          = diag(mS);
    mS          = diag(vS);

    vObjVal(ii) = hObjFun(vS);
end
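The loop above uses `hObjFun`, `stepSize` and `numIterations` without defining them. One possible setup, to be placed before the loop, is sketched below. The random data and the iteration count are assumptions for illustration. The step size relies on the fact that the gradient $ A^{T} \left( A S {B}^{T} - C \right) B $ is Lipschitz with constant $ \left\| A \right\|_{2}^{2} \left\| B \right\|_{2}^{2} $, so its reciprocal is a safe constant step.

```matlab
% Hypothetical problem data, for illustration only
rng(0);
mA = randn(8, 4);
mB = randn(6, 4);
mC = randn(8, 6);

% Objective value as a function of the diagonal vector vS
hObjFun = @(vS) 0.5 * (norm((mA * diag(vS) * mB.') - mC, 'fro') ^ 2);

% norm() of a matrix is its largest singular value, so this is 1 / L
stepSize      = 1 / ((norm(mA) ^ 2) * (norm(mB) ^ 2));
numIterations = 500; %<! Assumed; tune to the accuracy required
```

A larger step may converge faster on a given problem, but this choice guarantees a monotone decrease of the objective.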

Solution with Diagonal Structure

The problem can be written as:

$$ \arg \min_{s} f \left( s \right) = \arg \min_{s} \frac{1}{2} \left\| A \operatorname{diag} \left( s \right) {B}^{T} - C \right\|_{F}^{2} = \arg \min_{s} \frac{1}{2} \left\| \sum_{i} {s}_{i} {a}_{i} {b}_{i}^{T} - C \right\|_{F}^{2} $$

Where $ {a}_{i} $ and $ {b}_{i} $ are the $ i $ -th columns of $ A $ and $ B $ respectively, and $ {s}_{i} $ is the $ i $ -th element of the vector $ s $.

The derivative is given by:

$$ \frac{d}{d {s}_{j}} f \left( s \right) = {a}_{j}^{T} \left( \sum_{i} {s}_{i} {a}_{i} {b}_{i}^{T} - C \right) {b}_{j} $$

Note to Readers: If you know how to vectorize this structure, namely how to write the derivative so that the output is a vector of the same size as $ s $, please add it.

Setting the derivative to zero, or running Gradient Descent, yields the optimal solution.
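One way to answer the note above, offered as a sketch rather than as part of the original answer: the $ j $ -th component of the gradient is the $ j $ -th diagonal entry of $ A^{T} \left( A \operatorname{diag} \left( s \right) {B}^{T} - C \right) B $. Writing $ \circ $ for the elementwise (Hadamard) product, a symbol not used elsewhere in this answer, and letting $ \operatorname{diag} \left( \cdot \right) $ of a matrix return the vector of its diagonal entries, the whole gradient becomes:

$$ \nabla f \left( s \right) = \operatorname{diag} \left( A^{T} \left( A \operatorname{diag} \left( s \right) {B}^{T} - C \right) B \right) = \left( \left( {A}^{T} A \right) \circ \left( {B}^{T} B \right) \right) s - \operatorname{diag} \left( {A}^{T} C B \right) $$

The second equality follows from $ \left[ {A}^{T} A \operatorname{diag} \left( s \right) {B}^{T} B \right]_{jj} = \sum_{i} \left( {A}^{T} A \right)_{ji} \left( {B}^{T} B \right)_{ji} {s}_{i} $, using the symmetry of $ {B}^{T} B $. Setting the gradient to zero therefore reduces the problem to a linear system in $ s $ with one unknown per diagonal entry.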

The code:

mS          = mAA \ (mA.' * mC * mB) / mBB; %<! Initialization by the Least Squares Solution
vS          = diag(mS);
vObjVal(1)  = hObjFun(vS);

vG = zeros([numColsA, 1]);

for ii = 2:numIterations

    mR = (mA * diag(vS) * mB.') - mC; %<! Residual, shared by all components
    for jj = 1:numColsA
        vG(jj) = mA(:, jj).' * mR * mB(:, jj);
    end

    vS = vS - (stepSize * vG);

    vObjVal(ii) = hObjFun(vS);
end
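The inner loop over `jj` can be avoided. A minimal sketch, assuming random data and the hypothetical names `mH` and `vP`: the gradient equals `mH * vS - vP`, where `mH` is the elementwise product of `mA.' * mA` and `mB.' * mB`, and `vP` holds the values $ {a}_{j}^{T} C {b}_{j} $. The same two quantities give the minimizer directly by solving one small linear system.

```matlab
% Hypothetical problem data, for illustration only
rng(0);
mA = randn(8, 4); mB = randn(6, 4); mC = randn(8, 6);
vS = randn(4, 1);

mH = (mA.' * mA) .* (mB.' * mB);   %<! Hadamard product of the Gram matrices
vP = sum(mA .* (mC * mB), 1).';    %<! vP(jj) = mA(:, jj).' * mC * mB(:, jj)

% Vectorized gradient
vGVec = (mH * vS) - vP;

% Reference: component-wise gradient as in the loop above
vGLoop = zeros(4, 1);
for jj = 1:4
    vGLoop(jj) = mA(:, jj).' * ((mA * diag(vS) * mB.') - mC) * mB(:, jj);
end
disp(norm(vGVec - vGLoop)); %<! Should be at rounding level

% Direct solution: the gradient vanishes where mH * vS = vP
vSOpt = mH \ vP;
```

The direct solve assumes `mH` is invertible, which holds when the matrices $ {a}_{i} {b}_{i}^{T} $ are linearly independent.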

Remark
Setting the derivative with respect to $ {s}_{j} $ to zero and solving for $ {s}_{j} $, with the other elements held fixed, gives:

$$ {s}_{j} = \frac{ {a}_{j}^{T} C {b}_{j} - {a}_{j}^{T} \left( \sum_{i \neq j} {s}_{i} {a}_{i} {b}_{i}^{T} \right) {b}_{j} }{ { \left\| {a}_{j} \right\| }_{2}^{2} { \left\| {b}_{j} \right\| }_{2}^{2} } $$
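For the original question, where $ A = X U $, $ B = V $ and $ C = Y $, the coupling between components disappears if $ V $ has orthonormal columns ($ {V}^{T} V = I $, which is how "column-orthogonal" is read here, an assumption). Then $ {b}_{i}^{T} {b}_{j} = 0 $ for $ i \neq j $ and $ \left\| {b}_{j} \right\|_{2} = 1 $, so each element has the closed form $ {s}_{j} = {a}_{j}^{T} Y {v}_{j} / \left\| {a}_{j} \right\|_{2}^{2} $ with $ {a}_{j} = X {u}_{j} $. A minimal sketch on random data, with all sizes chosen for illustration:

```matlab
% Hypothetical data with orthonormal columns in mU and mV
rng(0);
mX = randn(10, 5);
[mU, ~] = qr(randn(5, 3), 0); %<! 5 x 3, mU.' * mU = I
[mV, ~] = qr(randn(7, 3), 0); %<! 7 x 3, mV.' * mV = I
mY = randn(10, 7);

mA = mX * mU;

% Closed form: vS(jj) = mA(:, jj).' * mY * mV(:, jj) / norm(mA(:, jj)) ^ 2
vS = sum(mA .* (mY * mV), 1).' ./ sum(mA .^ 2, 1).';

% Reference: least squares over the diagonal entries via the Kronecker form
mK    = kron(mV, mA);
vSRef = mK(:, 1:4:9) \ mY(:); %<! Columns 1, 5, 9 are the diagonal of a 3 x 3 matrix

disp(norm(vS - vSRef)); %<! Should be at rounding level
```

This needs no iterations and no linear solve; it only requires that no column of $ X U $ is zero.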

Summary

Both methods work and converge to the optimal value (validated against CVX), as the problem above is convex.

The full MATLAB code with CVX validation is available in my StackExchange Mathematics Q2421545 GitHub Repository.
