The sparse coding problem is:
$\underset{\boldsymbol{r}}{\text{min}} \left\Vert \boldsymbol{r} \right\Vert_0 ~s.t.~\textbf{x}=\text{D}\boldsymbol{r}$
Why do sparse dictionary learning algorithms such as MOD and K-SVD solve
$\underset{\text{D},\left\{ \boldsymbol{r}_i \right\}_{i=1}^{s}}{\text{min}}~~\overset{s}{\underset{i=1}{\sum}} \left\Vert \textbf{x}_i - \text{D}\boldsymbol{r}_i \right\Vert_2^2 ~s.t.~\left\Vert \boldsymbol{r}_i \right\Vert_0 \leq k~,~1 \leq i \leq s$
rather than the following problem?
$\underset{\text{D},\left\{ \boldsymbol{r}_i \right\}_{i=1}^{s}}{\text{min}}~~ \left\Vert \boldsymbol{r}_i \right\Vert_0 ~s.t.~ \overset{s}{\underset{i=1}{\sum}} \left\Vert \textbf{x}_i - \text{D}\boldsymbol{r}_i \right\Vert_2^2 \leq \epsilon~,~1 \leq i \leq s$