A few questions about KSVD algorithm (dictionary learning) in a paper


To understand dictionary learning in detail, I have been reading the following paper, which I found quite informative:

KSVD: an algorithm for designing overcomplete dictionaries for sparse representation

UPDATE1 (a short explanation of the whole concept):

For most signals, a sparse representation exists over a known set of basis elements. These basis elements, stacked as columns, form a matrix called the dictionary. In the linear equation $Y=DX$: $Y$ is our data/signals, $D$ is the dictionary, and $X$ is the sparse representation of the signals over the dictionary.
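To make this concrete, here is a small numerical sketch of $Y=DX$ (all dimensions and values are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

n, K, N = 8, 16, 50                 # signal dim, number of atoms, number of signals
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)      # dictionary atoms (columns) normalized to unit norm

# Sparse codes: each signal is a combination of only 3 of the 16 atoms
X = np.zeros((K, N))
for i in range(N):
    support = rng.choice(K, size=3, replace=False)
    X[support, i] = rng.standard_normal(3)

Y = D @ X                           # signals as sparse combinations of atoms
print(Y.shape)                      # (8, 50)
```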


I have a few questions:

  • Does the $j$th column of $D$ (the overcomplete dictionary) correspond to the $j$th row of $X$, or of its transpose? That is, do we work with the rows of $X$ rather than with the matrix $X$ itself? And why is the notation $x_T^k$ used rather than $x_k^T$?
  • Somewhere in the middle, the authors suggest initializing $\omega_k$, defined as $$\omega_k=\{\,i \mid 1 \leq i \leq N,\ x_T^k(i) \neq 0\,\}.$$ Is defining this variable necessary, given that another variable is defined right afterwards? ($\Omega_k$ is a matrix of size $N \times |\omega_k|$, where $N$ is the number of signals, with ones at the entries $(\omega_k(i), i)$ and zeros elsewhere.) Why do we need to define $\omega_k$?

  • According to the authors, the error matrix is:

$$E_k=Y-\sum_{j \neq k} d_jx_T^{j}$$

($Y$ is an $n \times N$ matrix (the signals), $D$ is an $n \times K$ matrix (the overcomplete dictionary), and $X$ is $K \times N$ (the sparse representation of the signals using the dictionary $D$).)

I am not sure about the summation: how is it computed? By multiplying each column of $D$ with the corresponding row of $X$? Please shed some light on the problem!
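Here is how I currently understand the summation, sketched numerically (dimensions and data are made up; each term $d_j x_T^j$ is the outer product of column $j$ of $D$ with row $j$ of $X$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, N = 8, 16, 50
Y = rng.standard_normal((n, N))
D = rng.standard_normal((n, K))
X = rng.standard_normal((K, N))

k = 5
# Sum of rank-one terms d_j x_T^j over all j != k
E_k = Y - sum(np.outer(D[:, j], X[j, :]) for j in range(K) if j != k)

# Equivalent shortcut: full product DX minus the k-th rank-one term
E_k2 = Y - D @ X + np.outer(D[:, k], X[k, :])
print(np.allclose(E_k, E_k2))   # True
```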

  • $x_{T}^{k}$ denotes the $k$th row of $X$ (the subscript $T$ marks it as a row vector, i.e. the transpose of a column), whereas $x_k$ denotes the $k$th column. So yes, the $k$th column of $D$ is paired with the $k$th row of $X$.
  • The main idea is that the algorithm alternates between two stages: first, with the dictionary fixed, the sparse codes are computed by a pursuit algorithm such as OMP (Orthogonal Matching Pursuit); then the dictionary atoms are updated one at a time via an SVD of the restricted error matrix. This is repeated until a convergence condition is satisfied.
  • K-SVD is one of the original papers on dictionary learning, and the authors do initially define $\omega_k$. However, follow-up work by other researchers has shown that this is not always the best choice, and many papers have since tried to improve on K-SVD's performance. For better performance, you may read some papers that do not require an initially defined dictionary (the dictionary is learned from the noisy data itself):
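The two-stage loop can be sketched as follows. This is a minimal illustration, not the authors' code: `omp` is a bare-bones pursuit, and the function names, parameters, and iteration counts are my own choices for the example.

```python
import numpy as np

def omp(D, y, s):
    """Greedy orthogonal matching pursuit: select up to s atoms for signal y."""
    residual, support = y.copy(), []
    for _ in range(s):
        # Pick the atom most correlated with the current residual
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Re-fit coefficients on the selected atoms by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def ksvd(Y, K, s, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    n, N = Y.shape
    D = rng.standard_normal((n, K))
    D /= np.linalg.norm(D, axis=0)          # unit-norm initial atoms
    for _ in range(iters):
        # Stage 1: sparse coding with D fixed
        X = np.column_stack([omp(D, Y[:, i], s) for i in range(N)])
        # Stage 2: update atoms one at a time via SVD of the restricted error
        for k in range(K):
            omega = np.nonzero(X[k, :])[0]  # signals that actually use atom k
            if omega.size == 0:
                continue
            E = Y[:, omega] - D @ X[:, omega] + np.outer(D[:, k], X[k, omega])
            U, S, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]               # best rank-one fit: new atom
            X[k, omega] = S[0] * Vt[0, :]   # and its coefficients
    return D, X
```

Note that restricting the error to the columns in $\omega_k$ before the SVD is what keeps the sparsity pattern of $X$ intact during the update.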

[1] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image Denoising by Sparse 3D Transform-Domain Collaborative Filtering," IEEE Trans. on Image Process., vol. 16, no. 8, pp. 2080-2095, Aug. 2007.

[2] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "BM3D Image Denoising with Shape-Adaptive Principal Component Analysis," Proc. of Workshop on Signal Processing with Adaptive Sparse Structured Representations, 2009.

[3] P. Chatterjee and P. Milanfar, "Patch-Based Near-Optimal Image Denoising," IEEE Trans. on Image Process., vol. 21, no. 4, pp. 1635-1649, Apr. 2012.