In Principal Component Analysis, we start with $m$ observations $x_1,\dots,x_m$, each of which is an $n$-dimensional vector. Assume we have centered the data; that is, we have subtracted the variable means from each observation. To project each observation into a $1$-dimensional subspace while maximizing sample variance, we want to compute a unit length basis vector, call it $v$. So we get the constrained problem
$$ \mathop{\arg\,\max}\limits_{v}\,\frac{1}{m}\sum_{i=1}^{m}\left\lVert\langle x_i,v\rangle v\right\rVert^2 $$
$$ \text{subject to }v^Tv=1 $$
Since $v^Tv=1$, we have $\left\lVert\langle x_i,v\rangle v\right\rVert^2=(x_i^Tv)^2=v^Tx_ix_i^Tv$, so this becomes
$$ \mathop{\arg\,\max}\limits_{v}\,v^TCv $$
$$ \text{subject to }v^Tv=1 $$
where $C:=\frac{1}{m}\sum_{i=1}^{m}x_ix_i^T$ is the sample covariance matrix of the (centered) data. Then the Lagrangian objective function is
$$ \mathcal{L}(v,\lambda)=v^TCv-\lambda(v^Tv-1) $$
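As a numeric sanity check on the reduction of the variance objective to $v^TCv$, here is a short numpy sketch with made-up data (the dimensions and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: m = 500 observations in n = 4 dimensions.
X = rng.normal(size=(500, 4))
X = X - X.mean(axis=0)          # center: subtract the variable means

m = X.shape[0]
C = (X.T @ X) / m               # sample covariance of the centered data

# An arbitrary unit-length vector v.
v = rng.normal(size=4)
v /= np.linalg.norm(v)

# The mean squared projection (1/m) sum_i <x_i, v>^2 equals v^T C v.
proj_var = np.mean((X @ v) ** 2)
print(np.allclose(proj_var, v @ C @ v))   # True
```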
If we start taking partial derivatives of $\mathcal{L}$, setting these to zero, etc., how do we know that we are maximizing $\mathcal{L}$? Do we need to check that the Hessian is negative definite? Even then, I think that only guarantees a local maximum?
Define $K:=\frac{C+C^T}{2}-\lambda I$, so that $\mathcal{L}(v,\lambda)=v^TKv+\lambda$ (using $v^TCv=v^T\frac{C+C^T}{2}v$). Since $\mathcal{L}$ is quadratic in $v$, if $K$ is negative semidefinite then any stationary point, i.e. any solution of $Kv=0$ coming from the first-order condition, is a global maximum of $\mathcal{L}$, not merely a local one.

Because $K$ is a real symmetric matrix, it is orthogonally diagonalisable, and in its eigenbasis $\mathcal{L}=\sum_i K_{ii}v_i^2+\lambda$. In that basis, $Kv=0$ reduces to $K_{ii}\ne 0\implies v_i=0$, and such a $v$ maximises $\mathcal{L}$ precisely when every eigenvalue $K_{ii}$ is at most $0$. Indeed, the solution is to take $\lambda$ to be the largest eigenvalue of $\frac{C+C^T}{2}$, with $v$ a unit eigenvector corresponding to $\lambda$; then $K$ has $0$ as an eigenvalue, but no positive eigenvalues.
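This conclusion can be sketched numerically with made-up data (the covariance here is already symmetric, so $\frac{C+C^T}{2}=C$; `numpy.linalg.eigh` returns eigenvalues in ascending order):

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up centered data and its sample covariance matrix.
X = rng.normal(size=(300, 5))
X = X - X.mean(axis=0)
C = (X.T @ X) / X.shape[0]             # symmetric, so (C + C^T)/2 == C

# Largest eigenvalue lambda and a corresponding unit eigenvector v.
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
lam, v = eigvals[-1], eigvecs[:, -1]

# K = C - lambda*I has 0 as an eigenvalue and no positive eigenvalues.
K = C - lam * np.eye(C.shape[0])
print(np.all(np.linalg.eigvalsh(K) <= 1e-12))   # True

# v attains the maximum of u^T C u among random unit vectors u.
U = rng.normal(size=(5, 1000))
U /= np.linalg.norm(U, axis=0)
print(v @ C @ v >= np.max(np.einsum('ij,ik,kj->j', U, C, U)))  # True
```

The second check only samples random unit vectors, of course; the guarantee that no unit vector beats $v$ is exactly the eigenvalue argument above.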