I am reading *Science and Information Theory* by Brillouin. He states that the information, defined as $$ I = - K \sum_{j=1}^m p_j \ln p_j, $$ is maximum when all the probabilities are equal, i.e.: $$ p_1=p_2=\dots=p_m, \quad \text{where}\quad \sum_j p_j =1.$$
He states two conditions. The first one is:
First-order partial derivatives are zero.
Which I completely understand.
However, I don't know why he establishes the second condition:
That $I_{11}$ and the determinants of orders $2, 3, \dots, m-1$ (each obtained by adding the next row and column) have alternating signs, where the matrix of second partial derivatives is:
$$ I = \begin{pmatrix} I_{11} & I_{12} & \cdots & I_{1m} \\ I_{21} & \ddots & & I_{2m} \\ \vdots & & \ddots & \vdots \\ I_{m1} & I_{m2} & \cdots & I_{mm} \end{pmatrix}$$
with $I_{ij}=\partial^2 I/\partial p_i\, \partial p_j$.
Why is this last condition needed? Isn't the first condition enough?
This has nothing to do with entropies, it's a general property of multivariate functions.
Consider a scalar real twice-differentiable function $f(x)$: it has a local maximum at $x_0$ if $f'(x_0)=0$ and $f''(x_0)<0$. And no, the first condition alone is not enough: a zero derivative only means $x_0$ is a critical point, which could just as well be a minimum or an inflection point.
This generalizes to functions of several variables. Now we must have a null gradient (all first-order partial derivatives are zero) and a negative definite Hessian (the matrix of second partial derivatives).
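As a concrete sketch (my own illustration, not from the book): for $I = -K\sum_j p_j \ln p_j$ the second partial derivatives are $\partial^2 I/\partial p_i \partial p_j = -K\,\delta_{ij}/p_j$, so, ignoring the normalization constraint, the Hessian is diagonal with strictly negative entries, and its eigenvalues confirm negative definiteness numerically:

```python
import numpy as np

K = 1.0
m = 4
p = np.full(m, 1.0 / m)   # uniform distribution p_j = 1/m
# Second partials of I = -K * sum_j p_j ln p_j (constraint ignored):
# d^2 I / dp_i dp_j = -K/p_j if i == j, else 0  ->  diagonal Hessian.
H = np.diag(-K / p)

eigvals = np.linalg.eigvalsh(H)
print(eigvals)            # every eigenvalue is negative: negative definite
assert np.all(eigvals < 0)
```

At the uniform point every eigenvalue equals $-Km$, so the quadratic form is negative in every direction.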
One criterion for a matrix to be negative definite is the one given in your textbook: the leading principal minors must alternate in sign, starting negative - see for example here, theorem 5.
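The minor criterion is easy to check numerically. Here is a small sketch (the helper `leading_minors` is my own) verifying the alternating signs $-, +, -, +, \dots$ for the diagonal Hessian of $I$ at an arbitrary probability vector:

```python
import numpy as np

def leading_minors(H):
    """Determinants of the leading principal submatrices H[:k, :k]."""
    return [np.linalg.det(H[:k, :k]) for k in range(1, H.shape[0] + 1)]

rng = np.random.default_rng(0)
K, m = 1.0, 5
p = rng.dirichlet(np.ones(m))   # an arbitrary probability vector
H = np.diag(-K / p)             # Hessian of I = -K sum_j p_j ln p_j (diagonal)

minors = leading_minors(H)
signs = [np.sign(d) for d in minors]
# Negative definite <=> the minors alternate in sign: -, +, -, +, ...
assert all(signs[k] == (-1) ** (k + 1) for k in range(m))
```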
All this said, it seems a clumsy way to prove the desired property. For one thing, you need to show the maximum is global, not merely local. For another, you have to take the constraint $\sum_i p_i = 1$ into account. The standard proof using Jensen's inequality looks much more elegant to me.
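For what it's worth, the global statement is easy to probe numerically: among randomly sampled probability vectors, none attains an information larger than the uniform one, whose value is $K\ln m$. A minimal sketch (the sampling scheme is my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
K, m = 1.0, 6

def info(p):
    """Brillouin's information I = -K * sum_j p_j ln p_j."""
    return -K * np.sum(p * np.log(p))

uniform_info = info(np.full(m, 1.0 / m))          # equals K * ln(m)
samples = rng.dirichlet(np.ones(m), size=10_000)  # random distributions
assert np.all([info(p) <= uniform_info + 1e-12 for p in samples])
```

Of course this is only evidence, not a proof; Jensen's inequality delivers the inequality $I \le K \ln m$ in one line.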