I have two loss functions, $\mathcal{L}_1$ and $\mathcal{L}_2$, for training my model, which is predominantly a classification model. $\mathcal{L}_1$ and $\mathcal{L}_2$ are two variants of the focal loss [reference]. Both take as input the same class probabilities and a hyper-parameter $\gamma$, but their formulations are distinct. It is possible to show mathematically that
\begin{equation} \mathcal{L}_1\geq\mathcal{L}_2. \end{equation}
Based on this information, can we comment on the usefulness of $\mathcal{L}_1$ and $\mathcal{L}_2$, i.e., which of the two is more useful for training the model?
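
To make the setup concrete, here is a minimal sketch of the kind of ordering described above. The two variants below are purely hypothetical stand-ins, not the actual $\mathcal{L}_1$ and $\mathcal{L}_2$: `loss_1` is assumed to be the standard focal loss $-(1-p_t)^{\gamma}\log p_t$, and `loss_2` is an assumed variant with exponent $\gamma+1$. Since $0 \le 1-p_t \le 1$, this pair satisfies $\mathcal{L}_1 \geq \mathcal{L}_2$ for every true-class probability $p_t$, which the code checks numerically on a grid.

```python
import numpy as np

def focal_loss(p_t, gamma):
    """Standard focal loss on the true-class probability: -(1 - p_t)^gamma * log(p_t)."""
    p_t = np.clip(p_t, 1e-12, 1.0)  # guard against log(0)
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

def loss_1(p_t, gamma):
    # Hypothetical L1: the standard focal loss (an assumption for illustration).
    return focal_loss(p_t, gamma)

def loss_2(p_t, gamma):
    # Hypothetical L2: same form with exponent gamma + 1 (an assumption for illustration).
    return focal_loss(p_t, gamma + 1.0)

# Same class probabilities and the same gamma are fed to both losses.
p_t = np.linspace(0.01, 0.99, 99)
gamma = 2.0

gap = loss_1(p_t, gamma) - loss_2(p_t, gamma)
ordering_holds = bool(np.all(gap >= 0.0))
print("L1 >= L2 on the whole grid:", ordering_holds)
```

For this assumed pair the gap simplifies to $-p_t(1-p_t)^{\gamma}\log p_t$, so the check only confirms the pointwise inequality; it says nothing by itself about which loss trains the model better.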