Geometric Intuition of Eigenvalues of Hessian Matrix


I have a very simple question, which I suspect speaks more to my lack of intuitive understanding of parts of linear algebra than anything calculus related.

I have come across this statement (or variants thereof) in the context of the Morse Index of a critical point on various occasions:

Statement: The number of negative eigenvalues of the Hessian matrix of a function $F:M\to N$ at a point $p$ is equal to the dimension of the maximal subspace of the tangent space $TM_p$ of $M$ at $p$ on which the Hessian of $F$ is negative definite.

My question is:

Question: Why do the eigenvalues of the Hessian encode this information?

As I stated above, this almost certainly is the result of a lack of comfort in dealing with eigenvalues (rather than anything differential) that I have put off confronting far too long into my mathematical education, but it is not clear to me how to make the jump from one side of the statement to the other.

There are 2 best solutions below

Best answer

We consider a smooth manifold $M$ of dimension $d$, a $C^{\infty}$ function $f:M\rightarrow \mathbb{R}$ and a point $p\in M$. Take two charts on a neighborhood $Z$ of $p$, $\phi:U\subset \mathbb{R}^d\rightarrow Z$ with $\phi(u)=p$ and $\psi:V\subset \mathbb{R}^d\rightarrow Z$ with $\psi(v)=p$, such that the transition map $\tau=\phi^{-1}\circ \psi$ is a diffeomorphism (so $\tau(v)=u$). Note that $f$ is smooth iff $g=f\circ \phi$ is smooth, iff $g\circ\tau=f\circ\psi$ is smooth. By the chain rule, $D(g\circ\tau)_v=Dg_u\circ D\tau_v$, and since $D\tau_v$ is invertible, $Dg_u=0$ is equivalent to $D(g\circ \tau)_v=0$; in that case, one says that $p$ is a critical point of $f$.

Proposition 1. The signature of the Hessian of $f$ is well defined at $p\in M$ when $p$ is a critical point of $f$.

Proof. $D^2(g\circ\tau)_v(h,k)=D^2g_u(D\tau_v(h),D\tau_v(k))+Dg_u(D^2\tau_v(h,k))$. Since $Dg_u=0$, $D^2(g\circ\tau)_v(h,k)=D^2g_u(D\tau_v(h),D\tau_v(k))$. Let $K$ be the symmetric matrix associated with $D^2g_u$ and $P$ be the invertible matrix of the linear isomorphism $D\tau_v$; then the symmetric matrix associated with $D^2(g\circ\tau)_v$ is $P^TKP$. By Sylvester's law of inertia, $K$ and $P^TKP$ have the same signature, and we are done.

Note that, in general, the Hessian of $f$ depends on the chosen chart!! In particular, its eigenvalues vary with the chosen chart.
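To make the chart dependence concrete, here is a minimal Python sketch using only the standard library. The matrices `K` and `P` and all helper names are made up for illustration: `K` is taken to be the Hessian of $x^2-y^2$ at the origin, and `P` plays the role of an arbitrary invertible (non-orthogonal) $D\tau_v$.

```python
import math

def eig_sym_2x2(A):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]], smallest first."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return (mean - radius, mean + radius)

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def congruence(K, P):
    """Return P^T K P, the matrix of the same bilinear form after a change of chart."""
    PT = [[P[j][i] for j in range(2)] for i in range(2)]
    return matmul(matmul(PT, K), P)

def signature(eigs, tol=1e-12):
    """(number of positive eigenvalues, number of negative eigenvalues)."""
    pos = sum(1 for x in eigs if x > tol)
    neg = sum(1 for x in eigs if x < -tol)
    return (pos, neg)

K = [[2.0, 0.0], [0.0, -2.0]]   # assumed example: Hessian of x^2 - y^2 at the origin
P = [[2.0, 1.0], [0.0, 3.0]]    # assumed example: invertible, not orthogonal

M = congruence(K, P)
eigs_K = eig_sym_2x2(K)
eigs_M = eig_sym_2x2(M)

print("eigenvalues of K:      ", eigs_K, "signature", signature(eigs_K))
print("eigenvalues of P^T K P:", eigs_M, "signature", signature(eigs_M))
```

The eigenvalues of the two matrices differ, but the counts of positive and negative eigenvalues agree, which is all that Proposition 1 claims.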

According to Morse theory,

(*) We can choose a transition map $\tau$ such that $D\tau_v$ diagonalizes $K$, that is, such that $P$ is orthogonal and $P^TKP=\operatorname{diag}(\lambda_1,\cdots,\lambda_q,\lambda_{q+1},\cdots,\lambda_d)$ where $\lambda_i<0$ for $i\leq q$ and $\lambda_i\geq 0$ otherwise.

Recall that $D\tau_v$ is an isomorphism between two representations of $TM_p$ and that $D^2(g\circ\tau)_v$ is a symmetric bilinear form defined on a representation of $TM_p$.

EDIT. I write the details of the second part. An element $h\in TM_p$ admits, as representative, a smooth curve $\gamma$ such that $\gamma(0)=p$; modulo the chart $\phi$, $h$ is identified with the unique vector $(\phi^{-1}\circ \gamma)'(0)\in\mathbb{R}^d$.

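Proposition 2. The maximal dimension of the subspaces of the tangent space $TM_p$ of $M$ at $p$ on which $D^2g_u$ is negative definite is $q$. This result does not depend on the chosen chart.

Proof. According to (*), the maximum is $\geq q$: in the chart given by (*), the form is negative definite on the span of the first $q$ basis vectors. Now, let $E$ be a subspace of $TM_p$ of dimension $r$ on which $D^2g_u$ is negative definite. There is a transition map $\tau$ associated with the decomposition $E\oplus E^{\perp}$ (here $E^{\perp}$ is the orthogonal of $E$ with respect to the form $D^2g_u$, which is a complement of $E$ because the form is nondegenerate on $E$); then $P^TKP$ is of the form $\operatorname{diag}(X_r,Y_{d-r})$ where $X_r$ is symmetric negative definite. Note that $X_r$ has $r$ negative eigenvalues, so $P^TKP$ has at least $r$ of them; since $K$ and $P^TKP$ have the same signature, $r\leq q$.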

Second answer

The comments to your question cover it pretty well, but the key piece that I think you might be missing is that (assuming appropriate conditions on the derivatives) the Hessian matrix $H$ is symmetric.

At a critical point $\mathbf P_0$, the first-order derivatives vanish, so we have the Taylor expansion $$F(\mathbf P_0+\mathbf v)=F(\mathbf P_0)+\frac12\mathbf v^TH\mathbf v+O(\|\mathbf v\|^3),$$ that is, the behavior of $F$ near $\mathbf P_0$ depends largely on the quadratic form $Q(\mathbf v)=\mathbf v^TH\mathbf v$. If this form is positive-definite, $F$ increases in every direction, i.e., it has a local minimum at $\mathbf P_0$; if it’s negative-definite, $F$ decreases in every direction and has a local maximum; if $Q$ takes both signs, then $F$ increases in some directions and decreases in others. This is where the Morse Index and Sylvester’s Law of Inertia come in. Because $H$ is symmetric, one can find an orthonormal basis of eigenvectors of $H$ in which $Q(\mathbf v)=\sum\lambda_ix_i'^2$, where the $\lambda_i$ are the corresponding eigenvalues and the $x_i'$ are the coordinates of $\mathbf v$ in that basis. The overall shape of the quadric surface $Q(\mathbf v)=0$ is entirely determined by the signs of these eigenvalues: where $\lambda_i$ is positive, $Q$ increases in that direction; where it is negative, $Q$ decreases.
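As a rough numerical check of the expansion, here is a small Python sketch using only the standard library. The function `F`, its hand-computed Hessian `H`, and the helper names are assumptions chosen for illustration; they are not taken from the question.

```python
import math

def F(x, y):
    # Illustrative function with a critical point at the origin:
    # its gradient (-sin x, 2y) vanishes at (0, 0).
    return math.cos(x) + y * y

# Hessian of F at the origin, computed by hand: diag(-1, 2).
H = [[-1.0, 0.0], [0.0, 2.0]]

def Q(v):
    """Quadratic form v^T H v."""
    return sum(v[i] * H[i][j] * v[j] for i in range(2) for j in range(2))

def taylor_error(v):
    """|F(P0 + v) - F(P0) - (1/2) Q(v)| with P0 = (0, 0)."""
    return abs(F(v[0], v[1]) - F(0.0, 0.0) - 0.5 * Q(v))

# H is diagonal here, so the coordinate axes are eigenvector directions
# and the diagonal entries are the eigenvalues.
eigenvalues = [H[0][0], H[1][1]]
morse_index = sum(1 for lam in eigenvalues if lam < 0)

for v in [(0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]:
    norm = math.hypot(v[0], v[1])
    print(v, "Q =", Q(v), "error =", taylor_error(v), "|v|^3 =", norm ** 3)
print("Morse index:", morse_index)
```

In this example $F$ drops below its critical value along the eigenvector with negative eigenvalue and rises along the other one, and the quadratic form accounts for the change up to a small remainder.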

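A subspace spanned by eigenvectors with negative eigenvalues is clearly one on which $Q$ is negative-definite, since we’re only including directions in which it decreases, and if you take all of those eigenvectors, you’ll get a maximal subspace with this property. To see that nothing bigger works, suppose $H$ is $n\times n$ with $q$ negative eigenvalues: the remaining eigenvectors span a subspace of dimension $n-q$ on which $Q\geq 0$, and any subspace of dimension greater than $q$ must meet it in a nonzero vector, so $Q$ cannot be negative-definite there.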
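As a concrete illustration, here is a small worked example; the matrix $H$ and the parameter $\varepsilon$ below are chosen purely for illustration and do not come from the question. Take $$H=\operatorname{diag}(-1,-2,3),\qquad Q(\mathbf v)=-x_1^2-2x_2^2+3x_3^2.$$ On $E=\operatorname{span}(\mathbf e_1,\mathbf e_2)$ we have $Q(a\mathbf e_1+b\mathbf e_2)=-a^2-2b^2<0$ unless $a=b=0$, so $Q$ is negative-definite on $E$, and $\dim E=2$ is the number of negative eigenvalues. The only subspace of larger dimension is $\mathbb R^3$ itself, which contains $\mathbf e_3$ with $Q(\mathbf e_3)=3>0$, so $2$ is the maximal dimension. Only the dimension is unique, not the subspace: for $E_\varepsilon=\operatorname{span}(\mathbf e_1+\varepsilon\mathbf e_3,\mathbf e_2)$, $$Q\big(a(\mathbf e_1+\varepsilon\mathbf e_3)+b\mathbf e_2\big)=-(1-3\varepsilon^2)a^2-2b^2,$$ which is negative-definite on $E_\varepsilon$ whenever $\varepsilon^2<\tfrac13$.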