To remove noise from an image, we rely on a mathematical model (minimizing an energy functional).
Q: For two different models A and B (i.e. different functionals), how can we say in mathematical language which one is better?
E.g. in the R.O.F. model, people minimize the functional $\int_\Omega\sqrt{u_x^2+u_y^2}$ (the $L^1$ norm of the gradient). Why is this better than a model that minimizes $\int_\Omega(u_{xx}+u_{yy})^2$ (the squared $L^2$ norm of the Laplacian), under the same constraints?
Is this because $L^2\subset L^1$?
To justify an objective function, you need a model for the data, for example a probabilistic model. In that case, you can say that the maximum likelihood estimator (i.e. the minimizer of the objective function $-L(\theta \mid x)$, where $L$ is the likelihood function) is optimal for estimating the parameters of this model.
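As a minimal worked case (the i.i.d. Gaussian noise model and the symbols $\varepsilon_i$, $\sigma$, $n$ are assumptions made for illustration, they are not part of the question): suppose the observed pixels are $x_i = \theta_i + \varepsilon_i$ with independent $\varepsilon_i \sim \mathcal{N}(0,\sigma^2)$. Since the logarithm is increasing, maximizing $L$ is the same as minimizing $-\log L$, and

$$-\log L(\theta \mid x) = \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\theta_i)^2 + \frac{n}{2}\log(2\pi\sigma^2).$$

So under this particular model, the maximum likelihood objective is, up to constants, the least-squares data term. A different noise model would give a different objective.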
Now the main idea in signal processing is that choosing a model for the data is equivalent to choosing an objective function for estimating the parameters. And in most cases:
- even for the simplest models, the maximum likelihood objective is very complicated (and hard to minimize);
- even for the simplest objective functions, the underlying model is very complicated.
So really there is no solution: you can't choose a good model and a good objective function at the same time, so you have to choose the objective function without any rigorous justification other than "it works in those specific cases".
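The equivalence between choosing a model and choosing an objective can be checked numerically in one of the rare cases where both sides are simple. The sketch below is only an illustration under stated assumptions: the signal is a single constant `theta`, the noise is either Gaussian or Laplace, and the names `neg_log_lik` and `ml_estimate` are hypothetical helpers, not anything from the R.O.F. model. Under these assumptions, the Gaussian model leads to a squared-error objective (minimized by the mean) and the Laplace model to an absolute-error objective (minimized by the median).

```python
import numpy as np


def neg_log_lik(theta_grid, x, noise):
    """Negative log-likelihood of a constant signal theta on a grid of candidates.

    Additive constants and positive scale factors are dropped, since they do
    not change where the minimum is.
    """
    residuals = x[None, :] - theta_grid[:, None]
    if noise == "gaussian":
        # Gaussian noise model  <->  sum of squared residuals
        return np.sum(residuals ** 2, axis=1)
    if noise == "laplace":
        # Laplace noise model   <->  sum of absolute residuals
        return np.sum(np.abs(residuals), axis=1)
    raise ValueError(f"unknown noise model: {noise}")


def ml_estimate(x, noise, n_grid=20001):
    """Brute-force maximum likelihood estimate of theta by grid search."""
    grid = np.linspace(x.min(), x.max(), n_grid)
    return grid[np.argmin(neg_log_lik(grid, x, noise))]


# Synthetic data: a constant signal plus noise, with a few gross outliers.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=101)
x[:5] += 50.0

theta_gauss = ml_estimate(x, "gaussian")
theta_laplace = ml_estimate(x, "laplace")

print("Gaussian model:", theta_gauss, "sample mean:  ", x.mean())
print("Laplace model: ", theta_laplace, "sample median:", np.median(x))
```

The two estimates differ noticeably on data with outliers, and which one is "better" depends entirely on which noise model you believe. For image functionals the same question arises, but the corresponding models are much harder to write down.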