Computing the dual matrix trace norm and a tensor gradient in Python

131 Views Asked by Bumbble Comm At 22 Feb 2026 - 8:36

I'm trying to write the following function in Python: $$ f_\mu(\mathcal X) = f_0(\mathcal X) + \sum_{i = 1}^n \max_{\|\mathcal Y_{i(i)}\| \leq 1} \left( \alpha_i\langle \mathcal X_{(i)},\mathcal Y_{i(i)} \rangle - \frac{\mu_i}{2}\|\mathcal Y_{i(i)}\|_F^2 \right) $$ where $\mathcal X$ and each $\mathcal Y_i$ are tensors of size $M \times N \times 3$, $\mathcal X_{(i)}$ is the tensor unfolded along mode $i$ (a matrix), $\|\cdot\|_F$ is the Frobenius norm, $\alpha_i, \mu_i \in \mathbb R$, and $n = 3$.

What I have so far:

import tensorly as tl
import numpy as np
LA = np.linalg

def f_m(a, X, Y, m):
    # a, m: per-mode weights alpha_i and mu_i; Y: one tensor per mode
    result = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            yy = tl.unfold(Y[j], j)
            xx = a[j] * tl.unfold(X, j)
            zz = np.multiply(yy, xx)
            zz = LA.norm(zz, 'nuc')
            result[i, j] = zz - 0.5 * m[j] * LA.norm(tl.unfold(Y[j], j), 'fro') ** 2
    F1 = max(result[0, :])
    F2 = max(result[1, :])
    F3 = max(result[2, :])
    F = max(F1, F2, F3)
    return F

def grad_f_m(m, a, X):
    grad = a * truncate_op(a / m * X, 1)
    return grad

def truncate_op(X, tau):
    # clip the singular values of X at tau and rebuild the matrix
    u, sig, v = LA.svd(X, full_matrices=False)
    t = np.zeros(len(sig)) + tau
    np.minimum(sig, t, sig)  # in-place: sig = min(sig, tau)
    trunk = np.dot(u, np.dot(np.diag(sig), v))
    return trunk

First question:

What does $f_\mu(\mathcal X)$ return? A scalar? A matrix? A tensor? I think it's a scalar.
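One way to sanity-check the scalar reading is to evaluate a single summand directly. The sketch below assumes the constraint norm is the spectral norm and that the maximizer shares singular vectors with $\mathcal X_{(i)}$ (which follows from von Neumann's trace inequality), so the maximum reduces to a sum over singular values. The names `smoothed_term` and `X_i` are hypothetical and not part of the code above.

```python
import numpy as np

def smoothed_term(X_i, alpha, mu):
    """max over ||Y||_2 <= 1 of alpha*<X_i, Y> - mu/2 * ||Y||_F^2.

    Assumes the optimal Y has the same singular vectors as X_i, so only
    its singular values y_j = min(alpha * s_j / mu, 1) need to be chosen.
    """
    s = np.linalg.svd(X_i, compute_uv=False)   # singular values of X_i
    y = np.minimum(alpha * s / mu, 1.0)        # singular values of the maximizer
    return float(np.sum(alpha * s * y - 0.5 * mu * y**2))

rng = np.random.default_rng(0)
X_i = rng.standard_normal((5, 12))             # stand-in for one unfolding
val = smoothed_term(X_i, alpha=1.0, mu=0.5)
print(type(val), val)
```

Each summand is a single number, so if $f_0$ is scalar-valued the whole of $f_\mu(\mathcal X)$ is a scalar as well.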

Second question: $$ g(X) := \|X\|_* = \max_{\|Y\| \leq 1} \langle X, Y \rangle $$ where $\|\cdot\|_*$ is the trace norm. If I compute the trace norm like this:

numpy.linalg.norm(zz,'nuc') (which computes the nuclear norm according to the documentation)

is that equivalent to $$ \max_{\|\mathcal Y_{(i)}\| \leq 1} \alpha_i\langle \mathcal X_{(i)},\mathcal Y_{(i)} \rangle $$?
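A quick numerical check of this duality (a sketch, not a proof), assuming the constraint norm is the spectral norm and $\alpha_i \geq 0$: the matrix $Y^\ast = UV^\top$ built from the thin SVD of $X$ has spectral norm 1, and its inner product with $X$ is the sum of the singular values. The variable names and the value of `alpha` are made up for illustration. Note that the norm here is applied to the unfolding itself; as far as I can tell, the nuclear norm of an elementwise product of $X$ and $Y$ is a different quantity.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))      # stand-in for one unfolding X_(i)
alpha = 0.7                          # hypothetical weight alpha_i

# Thin SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Candidate maximizer: all singular values set to 1
Y_star = U @ Vt
inner = np.sum(X * Y_star)           # Frobenius inner product <X, Y*>

nuc = np.linalg.norm(X, 'nuc')       # sum of singular values of X
print(np.isclose(inner, nuc))
print(np.isclose(alpha * inner, alpha * nuc))
```

With a nonnegative weight the factor simply scales the maximum, so under these assumptions the expression equals `alpha * np.linalg.norm(X, 'nuc')`.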

Third question:

$$ \nabla f_\mu(\mathcal X) = \nabla f_0(\mathcal X) + \sum_{i = 1}^n \alpha_i T_1\left(\frac{\alpha_i}{\mu_i}\mathcal X_{(i)}\right) $$ $$ T_{\tau} = U\sigma_{\overline \tau}V $$

$$ \sigma_{\overline \tau} = \operatorname{diag}(\min(\sigma,\tau)) $$

where

$$ U,\sigma, V = \operatorname{SVD}(\mathcal X_{(i)}) $$

Since $T_{\tau}$ results in a differently sized matrix for each $\mathcal X_{(i)}$, how can I sum them over $i = 1, \dots, n$?
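One possible reading, sketched below under that assumption, is that each truncated matrix is folded back into the original tensor shape before summing, so every term has shape $M \times N \times 3$. The helpers `unfold`, `fold`, `truncate`, and `grad_smooth` are hypothetical NumPy stand-ins written for this sketch; `tl.unfold` and `tl.fold` from tensorly play the same roles. The $\nabla f_0$ term is left out.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: move that axis to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold: rebuild a tensor of `shape` from its unfolding."""
    full = [shape[mode]] + [d for k, d in enumerate(shape) if k != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def truncate(M, tau):
    """Clip the singular values of M at tau and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.minimum(s, tau)) @ Vt

def grad_smooth(X, alpha, mu):
    """Sum over modes of alpha_i * fold(T_1(alpha_i / mu_i * X_(i)))."""
    G = np.zeros_like(X)
    for i in range(X.ndim):
        T = truncate(alpha[i] / mu[i] * unfold(X, i), 1.0)
        G += alpha[i] * fold(T, i, X.shape)   # back to tensor shape, then add
    return G

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 3))
G = grad_smooth(X, alpha=[1/3, 1/3, 1/3], mu=[0.5, 0.5, 0.5])
print(G.shape)   # (4, 5, 3)
```

Folding is only a reshape, so it does not change the entries of each term; it just puts them back in the positions they had in the tensor.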
