Gradient of Predictive Sparse Decomposition Cost function


I am trying to minimize the following cost function with respect to the matrices $X_m$.

$$ Energy = f(X) = \frac{1}{2}||I-\sum_{m=1}^{M}{C_m * X_m}||_2^2+\sum_{m=1}^{M}{||X_m-\phi(W_m * I)||_2^2}+\lambda|X|_1 $$

$$ X_{\min}=\arg{ \min_{X}{f(X)}} $$

with

$I:$ Input image (size: w x h)

$C_1 ... C_M:$ Decoder matrices (size: s x s)

$W_1 ... W_M:$ Encoder matrices (size: s x s)

$X_1 ... X_M:$ Sparse matrices (size: (w+s-1) x (h+s-1))

$C_m * X_m:$ The 2D convolution between $C_m$ and $X_m$. (In matlab: conv2(X,C,'valid'))

$W_m * I:$ The 2D convolution between $W_m$ and $I$. (In matlab: conv2(I,W,'full'))

$\phi(...):$ An activation function.

$||...||_2$: The L2-norm of a matrix.

$|...|_1:$ The L1-norm of a matrix.
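For concreteness, the energy above can be sketched in Python, with `scipy.signal.convolve2d` standing in for MATLAB's `conv2` (the activation $\phi$ is left unspecified in the post; `tanh` is used here purely as a placeholder assumption):

```python
import numpy as np
from scipy.signal import convolve2d

def phi(z):
    # Placeholder activation; the post does not specify phi.
    return np.tanh(z)

def energy(I, C, W, X, lam):
    """Evaluate f(X) for lists C, W, X of M matrices each.

    I : (w, h) image; C[m], W[m] : (s, s) filters;
    X[m] : (w+s-1, h+s-1) sparse code maps.
    """
    # Reconstruction residual: I - sum_m conv2(X_m, C_m, 'valid')
    recon = sum(convolve2d(Xm, Cm, mode='valid') for Xm, Cm in zip(X, C))
    residual = I - recon
    # Prediction term: sum_m ||X_m - phi(conv2(I, W_m, 'full'))||_2^2
    pred = sum(np.sum((Xm - phi(convolve2d(I, Wm, mode='full'))) ** 2)
               for Xm, Wm in zip(X, W))
    # Sparsity term: lambda * |X|_1
    sparsity = lam * sum(np.sum(np.abs(Xm)) for Xm in X)
    return 0.5 * np.sum(residual ** 2) + pred + sparsity
```

Note how the shapes line up: a valid convolution of an (s x s) filter with a ((w+s-1) x (h+s-1)) map yields a (w x h) image, matching $I$.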

I am trying to minimize the energy using the gradient descent algorithm:

$$ X_n=X_n-\nabla{f(X_n)} $$ where $$ \nabla{f(X_n)} = C_n^{'} * (I-\sum_{m=1}^M C_m * X_m) + (X_n - \phi(W_n * I)) $$

with

$C_n^{'} * z:$ Convolution of the $180^{\circ}$ rotation of $C_n$ with $z$ (In matlab: conv2(z,rot90(C,2),'full'))
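The update can be sketched as follows. This transcribes the rule exactly as written above (no step size, and the $\lambda|X|_1$ term is left out of the gradient, as in the formula), with `scipy.signal.convolve2d` standing in for `conv2`:

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_step(I, C, W, X, phi):
    """One update X_n <- X_n - grad f(X_n), transcribing the posted formula.

    Note: this is the gradient exactly as written in the question; whether
    it is correct is precisely what question 1 asks.
    """
    # Shared residual: I - sum_m conv2(X_m, C_m, 'valid')
    residual = I - sum(convolve2d(Xm, Cm, mode='valid') for Xm, Cm in zip(X, C))
    X_new = []
    for Xn, Cn, Wn in zip(X, C, W):
        # C_n' * z : full convolution with the 180-degree rotation of C_n
        grad = (convolve2d(residual, np.rot90(Cn, 2), mode='full')
                + (Xn - phi(convolve2d(I, Wn, mode='full'))))
        X_new.append(Xn - grad)
    return X_new
```

The full convolution maps the (w x h) residual back to the ((w+s-1) x (h+s-1)) shape of $X_n$, so the update is well-defined elementwise.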

When I run the algorithm, the energy does not decrease; it only grows. So I have two questions:

  1. Is the gradient $\nabla{f(X_n)}$ of the energy function correct?
  2. If the gradient is correct, what else could prevent the energy from decreasing?
Answer:

The problem seems to lie at the borders of the $X$ matrices. Because a full convolution (conv2(I,W,'full')) is used, the $X$ matrices get extra entries from convolving over the zero-padded borders. These border entries, and the corners in particular, depend on fewer values in the image, which produces larger gradients at those locations. The next iteration then tries to rectify the overly large update, resulting in a kind of oscillating behavior that increases the energy rather than minimizing it.

One solution is to change the full convolution (conv2(I,W,'full')) into a valid convolution. Another is to use a mask that reduces the impact of the border entries on the gradient.
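The mask idea could be sketched as follows; `border_mask` is a hypothetical helper, under the assumption that each entry of $X$ is weighted by the fraction of the $s \times s$ window that actually overlaps the image:

```python
import numpy as np
from scipy.signal import convolve2d

def border_mask(img_shape, s):
    """Down-weight border entries of the (w+s-1, h+s-1) code maps.

    Each entry of X is influenced by conv2(I, W, 'full'); near the border
    fewer image pixels contribute, so we scale by the fraction that do.
    """
    counts = convolve2d(np.ones(img_shape), np.ones((s, s)), mode='full')
    return counts / (s * s)  # 1.0 in the interior, < 1 near borders/corners
```

Multiplying the gradient elementwise by this mask (`grad *= border_mask(I.shape, s)`) damps exactly the corner and edge entries whose oversized updates cause the oscillation.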