Assume that $I_j \in \mathcal{J}$, where $\mathcal{J}$ is the set of correctly classified images and $p(I)$ is the output probability distribution of the underlying model. From $\mathcal{J}$, we select $\hat{I}_{\!j^*}$ according to a well-defined criterion, namely the inequality shown in (2).
Suppose we are given that $$\log\left(\sum p(\hat{I}_{\!j})^2 \right) \leq \sum p(\hat{I}_{\!j}) \log p(\hat{I}_{\!j}) \leq \sum \log p(\hat{I}_{\!j})$$ and $$ p(\hat{I}_{\!j^*} \mid y) \leq p(\hat{I}_{\!j} \mid y), $$ where $y$ is the ground-truth label.
How could we show that inequality (1) below is equivalent to inequality (2)?
$$ \sum p(\hat{I}_{\!j^*})\log p(\hat{I}_{\!j^*}) \leq \sum p(\hat{I}_{\!j}) \log p(\hat{I}_{\!j}) \tag{1} $$ and $$ \sum \log p(\hat{I}_{\!j^*}) \leq \sum \log p(\hat{I}_{\!j}) \tag{2} $$
Could we make our argument stronger if we mentioned the following?
Both inequalities are equivalent because the log function is monotonically increasing, and inequality (1) acts as a scaled version of (2), in which each logarithm is weighted by its corresponding probability.
I have tested this numerically, and the two inequalities agreed in every trial, but I cannot prove the equivalence mathematically.
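For concreteness, the numerical test I ran can be sketched roughly as follows. Here `random_distribution` is a hypothetical stand-in for the model's softmax outputs (the actual distributions in my experiments came from the classifier); agreement over random pairs supports the conjecture but, of course, does not prove it.

```python
import math
import random

def sum_p_log_p(p):
    """Quantity compared in inequality (1): sum_i p_i * log(p_i)."""
    return sum(pi * math.log(pi) for pi in p)

def sum_log_p(p):
    """Quantity compared in inequality (2): sum_i log(p_i)."""
    return sum(math.log(pi) for pi in p)

def random_distribution(k, rng):
    """A hypothetical softmax-style output over k classes (all entries > 0)."""
    raw = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(raw)
    return [x / total for x in raw]

rng = random.Random(0)
trials = 10_000
agree = 0
for _ in range(trials):
    a = random_distribution(5, rng)  # plays the role of p(I_{j*})
    b = random_distribution(5, rng)  # plays the role of p(I_j)
    holds_1 = sum_p_log_p(a) <= sum_p_log_p(b)  # inequality (1)
    holds_2 = sum_log_p(a) <= sum_log_p(b)      # inequality (2)
    agree += (holds_1 == holds_2)

print(f"agreement over {trials} random pairs: {agree / trials:.3f}")
```

An agreement rate below 1.0 would immediately produce a counterexample pair, so the script doubles as a counterexample search.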
Is this enough? I plan to build my proof on this.