Gradients for partially symmetric CP decomposition of 3rd order tensor


I am interested in computing a rank-$R$ CP decomposition of a 3rd-order tensor that is partially symmetric in the first two modes. A vanilla CP decomposition is obtained by solving the least squares minimization problem below:

$$ \min_{A,B,C} || \mathcal{X} - [[A,B,C]]||^{2} $$

where $\mathcal{X} \in \mathbb{R}^{L \times M \times N}$ is the tensor to approximate, and $A \in \mathbb{R}^{L \times R}$, $B \in \mathbb{R}^{M \times R}$, and $C \in \mathbb{R}^{N \times R}$ are the factor matrices to estimate. $[[\cdot]]$ denotes the sum of outer products of the corresponding factor matrix columns:

$$ [[A,B,C]] = \sum_{r=1}^{R} a_{r} \otimes b_{r} \otimes c_{r} $$

Where $a_r$ is the $r$-th column of $A$, etc.
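As a concrete illustration, here is a minimal NumPy sketch (dimensions are made up) that forms $[[A,B,C]]$ both as the sum of rank-one outer products above and as a single `einsum` contraction:

```python
import numpy as np

# Minimal sketch with arbitrary small dimensions: build [[A, B, C]] as a
# sum of rank-one outer products, then check it against the equivalent
# single contraction T_ijk = sum_r A_ir * B_jr * C_kr.
rng = np.random.default_rng(0)
L, M, N, R = 4, 5, 6, 3
A = rng.standard_normal((L, R))
B = rng.standard_normal((M, R))
C = rng.standard_normal((N, R))

# Sum over r of the outer products a_r (x) b_r (x) c_r
T_sum = sum(np.einsum('i,j,k->ijk', A[:, r], B[:, r], C[:, r])
            for r in range(R))

# Same tensor in one einsum call
T = np.einsum('ir,jr,kr->ijk', A, B, C)

assert np.allclose(T, T_sum)
assert T.shape == (L, M, N)
```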

This objective can be rewritten as sub-problems corresponding to each mode:

$$ \min_{A} ||\mathcal{X}_{(1)} - A(C \odot B)^{T}||^{2} \\ \min_{B} ||\mathcal{X}_{(2)} - B(C \odot A)^{T}||^{2} \\ \min_{C} ||\mathcal{X}_{(3)} - C(B \odot A)^{T}||^{2} $$

where the subscript $\mathcal{X}_{(i)}$ denotes the $i$-th mode unfolding of the tensor (the matricization in which the mode-$i$ fibers become the columns of the resulting matrix) and $\odot$ is the Khatri-Rao product. The partial derivatives with respect to each factor matrix (dropping the constant factor of $2$ from differentiating the squared norm) are as follows:

$$ \frac{\partial f}{\partial A} = (\mathcal{T}_{(1)} - \mathcal{X}_{(1)})(C \odot B) \\ \frac{\partial f}{\partial B} = (\mathcal{T}_{(2)} - \mathcal{X}_{(2)})(C \odot A) \\ \frac{\partial f}{\partial C} = (\mathcal{T}_{(3)} - \mathcal{X}_{(3)})(B \odot A) $$

where $\mathcal{T}=[[A,B,C]]$. See here and here for details regarding all of the above derivation/implementation.
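For reference, a hedged NumPy sketch of the mode-1 gradient formula above. The `unfold` and `khatri_rao` helpers are my own illustrative implementations following the Kolda–Bader unfolding convention, and, as in the formulas above, the constant factor of $2$ is dropped:

```python
import numpy as np

def unfold(X, mode):
    # Mode-n unfolding: mode-n fibers become columns (Kolda-Bader convention).
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order='F')

def khatri_rao(U, V):
    # Column-wise Kronecker product: column r is kron(u_r, v_r).
    (I, R), J = U.shape, V.shape[0]
    return np.einsum('ir,jr->ijr', U, V).reshape(I * J, R)

rng = np.random.default_rng(1)
L, M, N, R = 4, 5, 6, 3
A = rng.standard_normal((L, R))
B = rng.standard_normal((M, R))
C = rng.standard_normal((N, R))
X = rng.standard_normal((L, M, N))

T = np.einsum('ir,jr,kr->ijk', A, B, C)   # [[A, B, C]]

# Gradient wrt A via the unfolding formula (factor of 2 dropped, as above)
grad_A = (unfold(T, 0) - unfold(X, 0)) @ khatri_rao(C, B)

# Same quantity written as a direct contraction over the residual tensor
grad_A_direct = np.einsum('ijk,jr,kr->ir', T - X, B, C)

assert np.allclose(grad_A, grad_A_direct)
```

The gradients with respect to $B$ and $C$ follow the same pattern with the mode-2 and mode-3 unfoldings.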

In the case of partial symmetry about the first two modes, my problem becomes:

$$ \min_{A,C} || \mathcal{X} - [[A,A,C]]||^{2} $$

where $\mathcal{X} \in \mathbb{R}^{M \times M \times N}$ is the tensor to approximate, $A \in \mathbb{R}^{M \times R}$, and $C \in \mathbb{R}^{N \times R}$.

So, my question is: what are the partial derivatives with respect to $A$ and $C$ in the partially symmetric case above? My actual optimization problem is more complex than this, so I want the gradients for non-linear optimization, and this is the one part of the objective whose gradients I have been unable to compute. Any advice is greatly appreciated; I've been banging my head trying to figure this out. Thank you!


BEST ANSWER

$ \def\d{\delta} \def\o{{\tt1}} \def\E{{\cal E}} \def\T{{\cal T}} \def\H{{\cal H}} \def\X{{\cal X}} \def\Y{{\cal M}} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\vecc#1{\op{vec}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $First, define two extremely useful zero-one (aka indicator) tensors $$\eqalign{ \H_{ijk} &= \begin{cases} \o\quad{\rm if}\;\;i=j=k \\ 0\quad{\rm otherwise} \\ \end{cases} \\ \E_{ijkl} &= \begin{cases} \o\quad{\rm if}\;\;i=k\;\;{\rm and}\;\;j=l \\ 0\quad{\rm otherwise} \\ \end{cases} \\ }$$ The latter is often written in terms of Kronecker delta symbols as $$ \E_{ijkl} = \d_{ik}\,\d_{jl} $$ Now we can write matrix self-gradients and $\T$ using the Einstein summation convention $$\eqalign{ &\grad{A_{ij}}{A_{kl}} \;=\; \E_{ijkl} = \d_{ik}\,\d_{jl} \\ &\T = [[A,B,C]]\qiq\T_{ijk} = \H_{pqr}\:A_{ip}\,B_{jq}\,C_{kr} \\ }$$ For typing convenience, define the tensor $$ \Y=\T-\X \qiq d\Y=d\T $$ Write the objective function as a triple contraction product and calculate its gradient wrt $A$ $$\eqalign{ f &= \Y_{ijk} \:\: \Y_{ijk} \\ df &= 2\,\Y_{ijk} \:\: d\T_{ijk} \\ &= 2\,\Y_{ijk} \LR{\H_{pqr}\:\c{dA_{ip}}\,B_{jq}\,C_{kr}} \\ \grad{f}{A_{uv}} &= 2\,\Y_{ijk}\H_{pqr}\,\c{\d_{iu}\d_{pv}}\,B_{jq}\,C_{kr} \\ &= 2\,\Y_{ujk}\H_{vqr}\,B_{jq}\,C_{kr} \\ }$$ Suspending the Einstein convention, you would code this as $$\eqalign{ \grad{f}{A_{uv}} &= 2 \sum_{j=\o}^M\sum_{k=\o}^N \Y_{ujk}\,B_{jv}\,C_{kv} \\ }$$ Change the definition of $\T$ as proposed in your post and re-calculate the gradient $$\eqalign{ \T &= [[A,A,C]]\qiq\T_{ijk} = \H_{pqr}\:A_{ip}\,A_{jq}\,C_{kr} \\ df &= 2\,\Y_{ijk} \:\: d\T_{ijk} \\ &= 2\,\Y_{ijk} \LR{\H_{pqr}\:\c{dA_{ip}}\,A_{jq}\,C_{kr}} + 2\,\Y_{ijk} \LR{\H_{pqr}\:A_{ip}\:\c{dA_{jq}}\,C_{kr}} \\ &= 
2\,\Y_{ijk}\H_{pqr}C_{kr} \LR{\:\c{dA_{ip}}\,A_{jq} + A_{ip}\:\c{dA_{jq}}} \\ \grad{f}{A_{uv}} &= 2\,\Y_{ijk}\H_{pqr}C_{kr}\LR{\:\c{\d_{iu}\d_{pv}}A_{jq}+A_{ip}\c{\d_{ju}\d_{qv}}} \\ &= 2\,\Y_{ujk}\H_{vqr}C_{kr}A_{jq} \;+\; 2\,\Y_{iuk}\H_{pvr}C_{kr}A_{ip} \\ }$$ And this would be coded as $$\eqalign{ \grad{f}{A_{uv}} &= 2 \sum_{j=\o}^M\sum_{k=\o}^N \LR{\Y_{ujk}+\Y_{juk}}A_{jv}C_{kv} \\ }$$
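A quick numerical sanity check of this final expression (my own NumPy sketch with arbitrary small dimensions), comparing it against a central finite-difference gradient of $f(A)=\|\mathcal{X}-[[A,A,C]]\|^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, R = 4, 3, 2
A = rng.standard_normal((M, R))
C = rng.standard_normal((N, R))
X = rng.standard_normal((M, M, N))
X = 0.5 * (X + X.transpose(1, 0, 2))  # partially symmetric in modes 1 and 2

def f(A):
    T = np.einsum('ip,jp,kp->ijk', A, A, C)  # [[A, A, C]]
    return np.sum((X - T) ** 2)

# Gradient from the answer: 2 * sum_{j,k} (Y_ujk + Y_juk) A_jv C_kv
T = np.einsum('ip,jp,kp->ijk', A, A, C)
Y = T - X
grad = 2 * np.einsum('ujk,jv,kv->uv', Y + Y.transpose(1, 0, 2), A, C)

# Central finite differences, entry by entry
eps = 1e-6
fd = np.zeros_like(A)
for u in range(M):
    for v in range(R):
        E = np.zeros_like(A)
        E[u, v] = eps
        fd[u, v] = (f(A + E) - f(A - E)) / (2 * eps)

assert np.allclose(grad, fd, atol=1e-5)
```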

ANSWER

The most straightforward way is to use the gradients you have already calculated in the general case. Doing so, you have $$ \frac{\partial \phi}{\partial \mathbf{A}} = \left(\mathcal{T}_{(1)}-\mathcal{X}_{(1)}\right) (\mathbf{C}\odot \mathbf{A}) + \left(\mathcal{T}_{(2)}-\mathcal{X}_{(2)}\right) (\mathbf{C}\odot \mathbf{A}) $$ where $\odot$ again denotes the Khatri-Rao product.

We can check that this result agrees with Greg's answer, up to the constant factor of $2$ that was likewise dropped from the general-case gradients, with $M_{ijk} = \mathcal{T}_{ijk} - \mathcal{X}_{ijk}$: $$ \frac{\partial \phi}{\partial A_{uv}} = \sum_{jk} M_{ujk} (A_{jv}C_{kv})+ \sum_{jk} M_{juk} (A_{jv}C_{kv}) $$
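This agreement can also be verified numerically. Below is a hedged NumPy sketch (my own `unfold` and `khatri_rao` helpers, following the Kolda–Bader conventions) comparing the matricized expression with the element-wise form:

```python
import numpy as np

def unfold(X, mode):
    # Mode-n unfolding: mode-n fibers become columns (Kolda-Bader convention).
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order='F')

def khatri_rao(U, V):
    # Column-wise Kronecker product: column r is kron(u_r, v_r).
    (I, R), J = U.shape, V.shape[0]
    return np.einsum('ir,jr->ijr', U, V).reshape(I * J, R)

rng = np.random.default_rng(3)
M, N, R = 4, 3, 2
A = rng.standard_normal((M, R))
C = rng.standard_normal((N, R))
X = rng.standard_normal((M, M, N))

T = np.einsum('ip,jp,kp->ijk', A, A, C)   # [[A, A, C]]

# Matricized form: (T(1) - X(1))(C kr A) + (T(2) - X(2))(C kr A)
KR = khatri_rao(C, A)
grad_matricized = ((unfold(T, 0) - unfold(X, 0)) @ KR
                   + (unfold(T, 1) - unfold(X, 1)) @ KR)

# Element-wise form, same constant-factor convention (no factor of 2)
Y = T - X
grad_elementwise = np.einsum('ujk,jv,kv->uv', Y + Y.transpose(1, 0, 2), A, C)

assert np.allclose(grad_matricized, grad_elementwise)
```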