In a convex optimization problem, I am looking for a smart way to write a penalty term $\mathrm{Reg}(A)$ (where $A$ is the coefficient matrix of the data $X$ w.r.t. the learned dictionary $D$, $A = D^{T}X$) that enforces constant column sparsity on $A$: \begin{equation} \operatorname*{arg\,min}_{A} \; \|X - DA\|^{2}_{F} + \lambda\, \mathrm{Reg}(A) \end{equation}
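For concreteness, here is a minimal numpy sketch of the objective with one candidate penalty: the entrywise $\ell_1$ norm (the sum of column-wise $\ell_1$ norms), a standard convex surrogate for sparsity. Note this is only an illustration of the problem setup, not a penalty that by itself forces every column to have the *same* number of nonzeros; the dimensions and data below are hypothetical.

```python
import numpy as np

# Hypothetical dimensions: X is n x m, D is n x k, A is k x m
rng = np.random.default_rng(0)
n, m, k = 10, 20, 5
X = rng.standard_normal((n, m))
D = rng.standard_normal((n, k))
A = rng.standard_normal((k, m))
lam = 0.1

def objective(X, D, A, lam):
    # Data-fit term: squared Frobenius norm of the residual
    fit = np.linalg.norm(X - D @ A, 'fro') ** 2
    # Candidate Reg(A): sum over columns of the l1 norm,
    # i.e. the entrywise l1 norm, a convex sparsity surrogate
    reg = np.abs(A).sum(axis=0).sum()
    return fit + lam * reg
```

A penalty that treats all columns uniformly (as the sum of identical column-wise norms does) at least keeps the problem convex and column-separable, which is what the formulation above requires.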
Any idea on how to write $Reg(A)$?
Thank you!