In machine learning we are often faced with optimization problems where we want to minimize some energy function with L1 regularization over some of the parameters, e.g.: $$ E(a,w) = [\text{sum of square errors}] + \lambda\|a\|_1, $$ where $a$ and $w$ are vectors of parameters. (The penalty is added, since minimizing $E$ should discourage large $\|a\|_1$.)
If we take the standard L1 norm definition $\|a\|_1=\sum_i|a_i|$ then the optimization is complicated, because this norm is not differentiable wherever any coordinate $a_i=0$.
Is there a differentiable replacement for the L1 norm?
I would disagree that the optimization is complicated: many off-the-shelf routines exist to solve $\ell^1$-regularized least squares. To name just one, take a look at FISTA (or just search for 'iterative shrinkage-thresholding').
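The key ingredient in these shrinkage methods is that the non-differentiable term is handled by a closed-form proximal step (soft-thresholding) rather than a gradient. A minimal ISTA sketch in NumPy, assuming the objective $\tfrac12\|Aa-b\|_2^2 + \lambda\|a\|_1$ (the function names and step-size choice here are my own, not from a particular library):

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t*||.||_1: shrink each coordinate toward zero by t.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, b, lam, n_iters=2000):
    """Minimize 0.5*||A a - b||^2 + lam*||a||_1 by iterative shrinkage-thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part's gradient
    a = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ a - b)           # gradient of the least-squares term only
        a = soft_threshold(a - grad / L, lam / L)  # gradient step, then shrinkage
    return a
```

FISTA adds a momentum term on top of this loop to accelerate convergence, but the per-iteration shrinkage step is identical.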
If you're really looking for something differentiable, just consider
$$ \|a\|_{1+\epsilon} = \left(\sum_i \vert a_i\vert^{1+\epsilon}\right)^{\frac{1}{1+\epsilon}} $$ This is a slightly smoothed version of $\|a\|_1$, differentiable everywhere except at $a=0$. It offers no advantage over standard shrinkage methods, though.
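To see that this surrogate is indeed differentiable at points where individual coordinates vanish, one can check the analytic gradient of the $p$-norm ($p = 1+\epsilon$) against finite differences. A small NumPy sketch (the helper names are mine; the gradient formula is valid for $a \neq 0$):

```python
import numpy as np

def p_norm(a, p):
    # ||a||_p = (sum_i |a_i|^p)^(1/p)
    return np.sum(np.abs(a) ** p) ** (1.0 / p)

def p_norm_grad(a, p):
    # d/da_i ||a||_p = sign(a_i) * |a_i|^(p-1) / ||a||_p^(p-1), for a != 0.
    # For p > 1 this is continuous even at coordinates with a_i = 0.
    nrm = p_norm(a, p)
    return np.sign(a) * np.abs(a) ** (p - 1) / nrm ** (p - 1)
```

As $\epsilon \to 0$ the surrogate approaches $\|a\|_1$, but the gradient becomes increasingly steep near zero coordinates, which is one reason the smoothing buys little in practice.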