I'm trying to write the following function in Python: $$ f_\mu(\mathcal X) = f_0(\mathcal X) + \sum_{i = 1}^n \max_{||\mathcal Y_{i(i)}|| \leq1} \alpha_i\langle \mathcal X_{(i)},\mathcal Y_{i(i)} \rangle - \frac{\mu_i}{2}||\mathcal Y_{i(i)}||_F^2 \\ \mathcal X, \mathcal Y \ \text{are tensors (MxNx3), } \\ \mathcal X_{(i)} \ \text{is the tensor unfolded along mode i (a matrix), } \\||.||_F \ \text{ is the Frobenius Norm}, \\ \alpha, \mu \in {\mathbb R}, \\ n = 3 $$
What I have so far:
import tensorly as tl
import numpy as np
LA = np.linalg

def f_m(a,X,Y,m):
    result = np.zeros((3,3))
    zz = []
    for i in range(3):
        for j in range(3):
            yy = tl.unfold(Y[j],j)
            xx = a[j]*tl.unfold(X,j)
            zz = np.multiply(yy,xx)
            zz = LA.norm(zz, 'nuc')
            result[i,j] = zz-0.5*m[j] * LA.norm(tl.unfold(Y[j], j), 'fro') ** 2
    F1 = max(result[0,:])
    F2 = max(result[1,:])
    F3 = max(result[2,:])
    F = max(F1,F2,F3)
    return F

def grad_f_m(m,a,X):
    grad = a*truncate_op(a/m * X, 1)
    return grad

def truncate_op(X,tau):
    u , sig, v = LA.svd(X, full_matrices=False)
    t = np.zeros(len(sig))+tau
    np.minimum(sig,t,sig)
    trunk = np.dot(u[:,:], np.dot(np.diag(sig[:]), v[:,:]))
    return trunk
First question:
What does $ f_\mu(\mathcal X)$ return? A scalar? A matrix? A tensor? I think it's a scalar.
Second question: $$ g(X) := ||X||_* = \max_{||Y||\leq1} \langle X,Y \rangle \\||.||_* \ \text{ is the Trace Norm} $$ If I compute the trace norm like this:
numpy.linalg.norm(zz,'nuc')
(which computes the nuclear norm according to the documentation)
is that equivalent to $$ \max_{||\mathcal Y_{(i)}||\leq1}\alpha_i\langle \mathcal X_{(i)},\mathcal Y_{(i)} \rangle $$?
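A quick numerical probe of this identity, as a sketch only. It assumes that the norm in the constraint is the spectral norm (largest singular value), that $\langle X, Y \rangle$ is the Frobenius inner product $\mathrm{trace}(X^T Y)$, and that $\alpha_i \geq 0$. The matrix `X` below is a random stand-in for one unfolding $\mathcal X_{(i)}$, and `alpha` and `random_feasible` are illustrative names, not part of my code above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))   # stand-in for one unfolding X_(i)
alpha = 0.5                       # assumed non-negative weight

U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuc = np.linalg.norm(X, 'nuc')    # sum of singular values

# Candidate maximiser: Y = U V^T has all singular values equal to 1
Y_star = U @ Vt
inner_star = alpha * np.sum(X * Y_star)   # alpha * <X, Y_star>

def random_feasible(shape):
    """Random matrix rescaled to have spectral norm exactly 1."""
    Y = rng.standard_normal(shape)
    return Y / np.linalg.norm(Y, 2)

# No random feasible Y should beat the candidate maximiser
best_random = max(alpha * np.sum(X * random_feasible(X.shape))
                  for _ in range(1000))

print(inner_star, alpha * nuc, best_random)
```

Under these assumptions the first two printed values agree, which suggests the maximum equals $\alpha_i ||\mathcal X_{(i)}||_*$, i.e. the nuclear norm of the unfolding itself rather than of an elementwise product with $\mathcal Y$.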
Third question:
$$ \nabla f_\mu(\mathcal X) = \nabla f_0(\mathcal X) + \sum_{i = 1}^n \alpha_i T_1(\frac{\alpha_i}{\mu_i}\mathcal X_{(i)}) $$ $$ T_{\tau} = U\sigma_{\overline \tau}V $$
$$ \sigma_{\overline \tau} = diag(\min(\sigma,\tau)) $$
Where
$$ U,\sigma, V = SVD(\mathcal X_{(i)}) $$
Since $T_{\tau}$ yields a differently sized matrix for each $\mathcal X_{(i)}$, how can I sum them over $i = 1, \dots, n$?
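One possible way to make the shapes compatible, sketched here under assumptions: fold each truncated mode-$i$ matrix back into the original tensor shape before adding it to the sum. The helpers `unfold`, `fold`, `truncate` and `smooth_grad_terms` are hypothetical names written in plain NumPy for this sketch; `unfold` is meant to follow the same moveaxis-then-reshape convention as `tl.unfold`, and TensorLy's `tl.fold(unfolded, mode, shape)` should play the role of `fold`. I also assume the SVD is taken of the scaled argument $\frac{\alpha_i}{\mu_i}\mathcal X_{(i)}$, and the $\nabla f_0$ term is left out.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: move that axis to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold: reshape, then move the leading axis back to `mode`."""
    full = [shape[mode]] + [s for k, s in enumerate(shape) if k != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def truncate(M, tau):
    """Clip the singular values of M at tau and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.minimum(s, tau)) @ Vt

def smooth_grad_terms(X, alpha, mu):
    """Sum over modes of alpha_i * fold(T_1(alpha_i / mu_i * X_(i)))."""
    grad = np.zeros_like(X)
    for i in range(X.ndim):
        Ti = truncate(alpha[i] / mu[i] * unfold(X, i), 1.0)
        grad += alpha[i] * fold(Ti, i, X.shape)   # back to tensor shape
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4, 3))      # small stand-in for an M x N x 3 tensor
alpha = np.array([1 / 3, 1 / 3, 1 / 3])  # illustrative weights
mu = np.array([0.1, 0.1, 0.1])           # illustrative smoothing parameters
G = smooth_grad_terms(X, alpha, mu)
print(G.shape)
```

With this folding step every term has the shape of $\mathcal X$, so the sum is an ordinary elementwise tensor sum.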