## Dictionary learning by K-SVD

The K-SVD method (Aharon et al., 2006) is one approach that solves equation 4 with good performance. The dictionaries in are not obtained at a time. Instead, columns in are updated one by one while fixing . In order to update the th column, one can first write the objective function in equation 4 as

 (5)

where is the th column vector in , is the th row vector in . Here, simply indicates a row vector. For simplicity, in equation 5 and the following derivations, I omit the superscript shown in equation 4. is the fitting error using all column vectors other than the th dictionary and their corresponding coefficients row vectors. Note that in equation 5, and are of size , is of size , and is of size . Here, is the length of each training signal, is the number of training signals, and is the number of atoms in the dictionary.

It is now obvious that the th dictionary in is updated by minimizing the misfit between the rank-1 approximation of and the term. The rank-1 approximation is then solved using the singular value decomposition (SVD).

A problem in the direct use of SVD for rank-1 approximation of is the loss of sparsity in . After SVD on , is likely to be filled. In order to solve such problem, K-SVD restricts minimization of equation 5 to a small set of training signals . To achieve this goal, one can define a transformation matrix to shrink and by rejecting the zero columns in and zero entries in . First one can define a set , which selects the entries in that are non-zero. One then constructs as a matrix of size with ones on the entries and zeros otherwise.

Applying to both and , then the objective function in equation 5 becomes

 (6)

and can be minimized by directly using SVD:

 (7)

is then set as the first column in , the coefficient vector as the first column of multiplied by first diagonal entry . After columns are all updated, one turns to solve equation 3 for .

2020-04-03