To learn more about dictionary learning, I am currently trying to understand the concept in detail and to do so, I've found the following paper quite informative:
KSVD: an algorithm for designing overcomplete dictionaries for sparse representation
UPDATE1 (a short explanation of the whole concept):
For most signals, there exists a sparse representation, considering known bases. The bases we use to represent data forms a matrix, called dictionary. In the linear equation: $Y=DX$, $Y$ is our data/signals; $D$ is the dictionary; $X$ is the sparse representation of signals using the dictionary
I have a few questions:
- does the jth column of D (the overcomplete dictionary) correspond to the jth row of X or its transpose? So we do work on the transpose of X and not the matrix X itself? Why the notation $x_T ^K$ is used and not $x_K ^T$ ?
somewhere in the middle, they suggest to initialize $w_k$ which defines as follows: $$w_k=\{i|1 \leq k \leq K: w_k(i) \neq 0\}$$ is defining this variable necessary in spite of the fact that we will define another variable afterwards. $\Omega _k$ is a matrix of size $N$ (number of signals (Y)) $\times |w_k|$ where it is one for $w_k(i)$ and zero otherwise. why do we need to define $w_k$?
According to the authors we have the following matrix for error matrix:
$$E_k=Y-\sum_{k \neq j} d_jx_T^{j}$$
(Y is an $n \times N$ matrix (signals), D is a $n \times K$ matrix (the overcomplete dictionary) and $X$ is $K \times N$ (the sparse representation of signals using the dictionary D)
I am not sure about the summation, how is it computed? multiplying each column of D by the corresponding row in X? Please shed some light on the problem!
[1] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image Denoising by Sparse 3D Transform-domain Collaborative Filtering,” IEEE Trans. on Image Process., vol. 16, no. 8, pp. 2080-2095, Aug. 2007
[2] K. Dabov, A. Foi, V. Katkvnik, K. Egiazarian, “BM3D Image Denoising with Shape-Adaptive Principal Component Analysis,” Proc. of Workshop on Signal Process. With Adaptive Sparse Structure Representations, 2009
[3] P. Chatterjee, and P. Milanfar, “Patch-Based Near-Optimal Image Denoising,” IEEE Trans. on Image Process., vol. 21, no. 4, pp. 1635-1649, Apr. 2012