I'm currently learning about local extrema in several variables and have come across the second derivative test for classifying critical points of multivariable functions.
I have read and understood the test (see link below); however, I don't understand the idea behind it. Why is a critical point of a function a minimum if the eigenvalues of the Hessian matrix are all positive? I understand the idea behind the single-variable case, but I am confused about the role of eigenvalues in the case of several variables.
http://en.wikipedia.org/wiki/Second_partial_derivative_test
Any insight into this would be much appreciated.
Thanks.
Let $U \subseteq \def\R{\mathbf R}\R^d$ be open, $f \colon U \to \R$ twice continuously differentiable, and $x \in U$ a critical point of $f$, i.e. $Df(x) = 0$. Recall that the Hessian $D^2f(x)$ is represented by a symmetric matrix $H$. We know from linear algebra that there is an orthonormal basis of eigenvectors of $H$: there are $v_1, \ldots, v_d \in \R^d$ and $\lambda_1, \ldots, \lambda_d \in \R$ such that $$ (v_i, v_j) = \delta_{ij}, \quad Hv_i = \lambda_i v_i. $$

Taylor's theorem says that $$ f(x+h) = f(x) + Df(x)h + \frac 12 D^2f(x)[h,h] + o(\|h\|^2), $$ which in our case (since $Df(x) = 0$) simplifies to $$ f(x+h) = f(x) + \frac 12 h^tHh + o(\|h\|^2). $$

Writing $H = V\Lambda V^t$ with $V = (v_1, \ldots, v_d)$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$, we have $h^tHh = (V^th)^t\Lambda(V^th)$, hence $$ f(x + h) = f(x) + \frac 12 \sum_{i=1}^d \lambda_i \cdot (V^th)_i^2 + o(\|h\|^2). $$

Now — just as for $d = 1$ — we see that $x$ is a local minimum if all $\lambda_i > 0$, and only if all $\lambda_i \ge 0$: near $x$, the function behaves like a sum of one-dimensional parabolas along the eigenvector directions, with the eigenvalues as their curvatures.
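As a concrete sanity check (my own example, not from the post above), here is a small NumPy sketch for the function $f(x,y) = x^2 + 3y^2 - xy$, which has a critical point at the origin with Hessian $H = \begin{pmatrix}2 & -1\\ -1 & 6\end{pmatrix}$. Since $f$ is exactly quadratic, the second-order Taylor expansion is exact and the remainder term vanishes up to floating-point error:

```python
import numpy as np

# Example function: f(x, y) = x^2 + 3y^2 - xy.
# Its gradient vanishes at the origin, so (0, 0) is a critical point.
def f(p):
    x, y = p
    return x**2 + 3*y**2 - x*y

# Hessian of f at the origin (constant, since f is quadratic).
H = np.array([[2.0, -1.0],
              [-1.0, 6.0]])

# Orthonormal eigendecomposition H = V @ diag(lam) @ V.T
# (eigh is for symmetric matrices and returns orthonormal eigenvectors).
lam, V = np.linalg.eigh(H)
print(lam)  # both eigenvalues are positive, so the origin is a minimum

# Verify the expansion f(0 + h) = f(0) + (1/2) * sum_i lam_i * (V^t h)_i^2
# for a small displacement h.
rng = np.random.default_rng(0)
h = 1e-4 * rng.standard_normal(2)
quad = 0.5 * np.sum(lam * (V.T @ h)**2)
print(abs(f(h) - f(np.zeros(2)) - quad))  # remainder is ~ machine epsilon
```

The key step is the same change of coordinates as in the answer: in the eigenbasis, the quadratic form $h^tHh$ decouples into independent one-dimensional terms $\lambda_i (V^th)_i^2$, so positivity of every $\lambda_i$ forces $f$ to increase in every direction away from the critical point.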