The Idea behind the Second Partial Derivative Test

I'm currently learning about local extrema in several variables and have come across the second derivative test for classifying critical points of multivariable functions.

I have read and understood the test (see link below), however I don't understand the idea behind it. Why is the critical point of a function a minimum if the eigenvalues of the Hessian matrix are all positive? I understand the idea behind the single variable case, however I am confused about the role of eigenvalues in the case of several variables.

http://en.wikipedia.org/wiki/Second_partial_derivative_test

Any insight into this would be much appreciated.

Thanks.

Let $U \subseteq \def\R{\mathbf R}\R^d$ be open, let $f \colon U \to \R$ be twice continuously differentiable, and let $x \in U$ be a critical point of $f$, i.e. $Df(x) = 0$. Recall that the Hessian $D^2f(x)$ is represented by a symmetric matrix $H$. We know from linear algebra that there is an orthonormal basis of eigenvectors of $H$: there are $v_1, \ldots, v_d \in \R^d$ and $\lambda_1, \ldots, \lambda_d \in \R$ such that $$ (v_i, v_j) = \delta_{ij}, \quad Hv_i = \lambda_i v_i. $$

Taylor's theorem says that $$ f(x+h) = f(x) + Df(x)h + \frac 12 D^2f(x)[h,h] + o(\|h\|^2), $$ which in our case (since $Df(x) = 0$) simplifies to $$ f(x+h) = f(x) + \frac 12 h^tHh + o(\|h\|^2). $$

Writing $H = V\Lambda V^t$ with $V = (v_1, \ldots, v_d)$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$, and using $(V^th)_i = v_i^th$, we get $$ f(x + h) = f(x) + \frac 12 \sum_{i=1}^d \lambda_i \cdot (V^th)_i^2 + o(\|h\|^2). $$

Now, just as for $d = 1$: the quadratic term is strictly positive for every $h \ne 0$ precisely when all $\lambda_i > 0$, so $x$ is a local minimum if all $\lambda_i > 0$, and only if all $\lambda_i \ge 0$.
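To see the argument in action, here is a small numerical sanity check (a sketch using NumPy; the function $f(x,y) = x^2 + 3y^2 - 2xy$ and its critical point at the origin are made-up examples, not from the answer above). We approximate the Hessian by central differences, compute its eigenvalues, and confirm they are all positive, so the origin is a local minimum:

```python
import numpy as np

# Illustrative example: f(x, y) = x^2 + 3y^2 - 2xy has a critical
# point at the origin; its exact Hessian there is [[2, -2], [-2, 6]].
def f(p):
    x, y = p
    return x**2 + 3 * y**2 - 2 * x * y

def hessian(f, p, eps=1e-5):
    """Central-difference approximation of the Hessian of f at p."""
    p = np.asarray(p, dtype=float)
    d = len(p)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            pp = p.copy(); pp[i] += eps; pp[j] += eps
            pm = p.copy(); pm[i] += eps; pm[j] -= eps
            mp = p.copy(); mp[i] -= eps; mp[j] += eps
            mm = p.copy(); mm[i] -= eps; mm[j] -= eps
            H[i, j] = (f(pp) - f(pm) - f(mp) + f(mm)) / (4 * eps**2)
    return H

H = hessian(f, [0.0, 0.0])
# H is symmetric, so eigvalsh applies and the eigenvalues are real.
eigvals = np.linalg.eigvalsh(H)
print(eigvals)  # both eigenvalues are positive => local minimum
```

For this $f$ the eigenvalues come out to $4 \pm 2\sqrt 2$, both positive, matching the claim that positive eigenvalues of the Hessian force a local minimum.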