Let $p$ be an arbitrary discrete probability distribution on $\{1, \dots, 10\}$. We can represent such a distribution by a vector indexed by $i$, subject to the constraints $p_i \ge 0$ and $\sum^{10}_{i=1}p_i = 1$. I want to find analytically the $p$ with maximum entropy, which is given by
$$ H(p) = - \sum_{i=1}^{10} p_i \log(p_i) $$
My question consists of two parts:
1. How do I show, using the Lagrange method, that the optimal $p$ is given by the uniform distribution?
2. Why can't the Lagrange method be used to find the $p$ with minimal entropy (which I suppose is given by a Dirac distribution)?
For 1. I know that the Lagrangian function associated with this constrained problem is given by
$$ \mathcal{L}(p,\lambda) = - \sum_{i=1}^{10} p_i \log(p_i) + \lambda(1 - \sum^{10}_{i=1}p_i) $$
and the respective gradient by
$$ \frac{\partial \mathcal{L}}{\partial p_i} = - \log(p_i) - 1 - \lambda $$
and
$$ \frac{\partial \mathcal{L}}{\partial \lambda} = 1 - \sum^{10}_{i=1}p_i $$
but I don't know how to continue from there.
The KKT conditions, which can incorporate the nonnegativity constraints, are necessary for optimality only when the objective function is continuously differentiable. Here the objective is not differentiable at points where some $p_i = 0$, so those points can never be found by this method: the stationarity condition is $\log(p_i)=\mu_i-\lambda-1$, which cannot have $p_i=0$ for any $i$ as a solution.
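For completeness, here is a sketch of the KKT system (the multipliers $\mu_i \ge 0$ for the constraints $p_i \ge 0$ are an addition to the notation in the question); note that in the interior it recovers the uniform distribution asked about in part 1:

```latex
% Lagrangian with multipliers \mu_i \ge 0 for the constraints p_i \ge 0:
\mathcal{L}(p,\lambda,\mu)
  = -\sum_{i=1}^{10} p_i \log(p_i)
    + \lambda\Bigl(1 - \sum_{i=1}^{10} p_i\Bigr)
    + \sum_{i=1}^{10} \mu_i p_i
% Stationarity: -\log(p_i) - 1 - \lambda + \mu_i = 0
%   \iff \log(p_i) = \mu_i - \lambda - 1
%   \iff p_i = e^{\mu_i - \lambda - 1} > 0.
% Complementary slackness \mu_i p_i = 0 then forces \mu_i = 0 for all i,
% so p_i = e^{-\lambda - 1} is the same for every i, and the constraint
% \sum_i p_i = 1 gives p_i = 1/10: the uniform distribution.
```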
You can use the fact that the minimum of a concave function over a compact convex set is attained at an extreme point of the set; this is a form of the maximum principle. The extreme points of the simplex are the vectors with $p_i=1$ for some $i$ and $p_j=0$ for $j \ne i$. Comparing the function values at those points, you find that they are all globally optimal: each has entropy $0$, the smallest possible value.
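As a quick numerical check of this argument (a minimal sketch; the helper `entropy` and the convention $0\log 0 = 0$ are mine):

```python
import math

def entropy(p):
    # Shannon entropy with the convention 0 * log(0) = 0.
    return sum(-x * math.log(x) for x in p if x > 0)

n = 10
# Extreme points of the simplex: the standard basis vectors.
vertices = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
uniform = [1.0 / n] * n

print([entropy(v) for v in vertices])  # all 0.0: the global minimum value
print(entropy(uniform))                # log(10): the maximum value
```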
You should do what Nown did to find potential local optima. Then you should compare the objective value there with the values at all points where the objective is not differentiable (or its derivative is not continuous), because the Lagrange method cannot find those points. That is a lot of work here, because the derivative fails to exist whenever $p_i=0$ for at least one $i$. I would argue that the Lagrange method is simply not suited to this problem.
Maximizing a symmetric concave function over the standard simplex is an easy task if you follow this answer.
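To convince yourself numerically that the uniform distribution is the maximizer (a sketch, not a proof; the sampling scheme is my own choice), you can sample random points of the simplex and check that none beats $\log 10$:

```python
import math
import random

def entropy(p):
    # Shannon entropy with the convention 0 * log(0) = 0.
    return sum(-x * math.log(x) for x in p if x > 0)

n = 10
random.seed(0)
best = 0.0
for _ in range(100_000):
    # Normalized i.i.d. exponentials are uniform on the simplex.
    w = [random.expovariate(1.0) for _ in range(n)]
    s = sum(w)
    best = max(best, entropy([x / s for x in w]))

print(best, "<=", math.log(n))  # no sample exceeds the entropy of uniform p
```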