I am reading a book that claims (without proof) that the curvature of a function $f(x)$ along a direction $p$ is obtained by projecting the product of the Hessian $H$ and $p$ onto the direction $p$ i.e.
$$ p^T H p $$
I'd like to understand why.
Recall the idea of extrinsic scalar curvature for plane curves. A smaller circle means tighter turns (if you're go-karting around it, say), so we define the curvature of a circle to be inversely proportional to its radius, indeed just the reciprocal $1/r$. For a unit-speed parametrization of a circle, this means the curvature will be the angular speed, in other words $\mathrm{d}\theta/\mathrm{d}s$, where $\theta$ is the angle the unit tangent $\mathbf{T}(s)$ makes with any chosen direction and $s$ is the natural arclength parameter. But $|\mathrm{d}\theta/\mathrm{d}s|=\|\mathrm{d}\mathbf{T}/\mathrm{d}s\|$, so for general curves we can use $\|\mathrm{d}\mathbf{T}/\mathrm{d}s\|$ as a definition of (unsigned) curvature. This will equal the curvature $1/r$ of the osculating circle of radius $r$. A given orientation on the plane is what allows us the $\theta$ coordinate, in which case we can define signed curvature, which is positive or negative depending on concavity (whether $\mathbf{T}$ is turning clockwise or counterclockwise).
For the graph $y=f(x)$ of a one-variable function $f$, the (signed) curvature is
$$ \kappa = \frac{f''}{(1+(f')^2)^{3/2}}. \tag{$\ast$}$$
This is all discussed in the first four subsections of the "plane curves" section of WP's curvature article, and there are plenty of derivations around (or it would make a good exercise).
If we consider the graph $y=f(\mathbf{x})$ of a multivariable scalar function $f$, we can restrict ourselves to a 2D cross section which intersects the domain perpendicularly. Given a unit vector $\mathbf{p}$ giving the "direction" from a particular point $\mathbf{x}$ in the domain, we can define the one-variable function $g(t)=f(\mathbf{x}+t\mathbf{p})$; its graph is the 2D cross section of the graph of $y=f(\mathbf{x})$. If we apply $(\ast)$ to $g(t)$ at $t=0$, we get the extrinsic scalar curvature
$$ \kappa=\frac{\mathbf{p}^T(Hf)\mathbf{p}}{(1+(\mathbf{p}^T\nabla f)^2)^{3/2}} $$
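To see where the numerator and denominator come from, here is the chain-rule computation, assuming $f$ is twice continuously differentiable and $\mathbf{p}$ is a unit vector (so that $t$ measures distance in the domain):

$$ g'(t)=\sum_i p_i\,\frac{\partial f}{\partial x_i}(\mathbf{x}+t\mathbf{p})=\mathbf{p}^T\nabla f(\mathbf{x}+t\mathbf{p}), \qquad g''(t)=\sum_{i,j} p_i p_j\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}(\mathbf{x}+t\mathbf{p})=\mathbf{p}^T(Hf)(\mathbf{x}+t\mathbf{p})\,\mathbf{p}. $$

Setting $t=0$ and substituting $g'(0)$ and $g''(0)$ for $f'$ and $f''$ in $(\ast)$ gives the displayed expression for $\kappa$. In particular, $\mathbf{p}^T(Hf)\mathbf{p}$ is the second directional derivative of $f$ along $\mathbf{p}$.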
At a critical point $\mathbf{x}$, where $\nabla f(\mathbf{x})=0$, or in a direction $\mathbf{p}$ orthogonal to the gradient $\nabla f(\mathbf{x})$, this simplifies to $\kappa=\mathbf{p}^T(Hf)\mathbf{p}$. Just as the gradient $\nabla f(\mathbf{x})$ encodes the magnitude and direction of the maximal directional derivative of $f$, the eigenvectors and eigenvalues of the Hessian $Hf$ encode the directions and scalar values of the principal curvatures of the graph at a critical point. To generalize to other points we need to normalize $Hf$ with $\nabla f$ somehow; for the graph, the second fundamental form is $\mathrm{I\!I}=Hf/\sqrt{1+\|\nabla f\|^2}$ (interpreted as a local bilinear form varying with $\mathbf{x}$), and the usual principal curvatures are defined with respect to $\mathrm{I\!I}$ (relative to the first fundamental form).
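A quick numerical sanity check may help make this concrete. The sketch below is only an illustration: the test function $f(x,y)=x^2+3y^2$, the helper names, and the step size `h` are all assumptions of mine, not anything from the book. It compares the closed-form cross-section curvature with one computed from finite differences of $g(t)=f(\mathbf{x}+t\mathbf{p})$ alone.

```python
import numpy as np

# Hypothetical test function f(x, y) = x^2 + 3 y^2, chosen only for illustration.
def f(x):
    return x[0] ** 2 + 3.0 * x[1] ** 2

def grad_f(x):
    # Analytic gradient of the test function.
    return np.array([2.0 * x[0], 6.0 * x[1]])

def hess_f(x):
    # Analytic Hessian of the test function (constant for a quadratic).
    return np.array([[2.0, 0.0], [0.0, 6.0]])

def section_curvature(x, p):
    """Curvature at t = 0 of the graph of g(t) = f(x + t p), via gradient and Hessian."""
    p = p / np.linalg.norm(p)           # the formula assumes a unit direction
    g1 = p @ grad_f(x)                  # g'(0)  = p^T grad f
    g2 = p @ hess_f(x) @ p              # g''(0) = p^T H p
    return g2 / (1.0 + g1 ** 2) ** 1.5

def section_curvature_fd(x, p, h=1e-4):
    """Same quantity, using central finite differences on g only."""
    p = p / np.linalg.norm(p)
    g = lambda t: f(x + t * p)
    g1 = (g(h) - g(-h)) / (2.0 * h)
    g2 = (g(h) - 2.0 * g(0.0) + g(-h)) / h ** 2
    return g2 / (1.0 + g1 ** 2) ** 1.5

p_diag = np.array([1.0, 1.0])

# At the critical point (origin) the curvature reduces to p^T H p.
origin = np.zeros(2)
k_origin = section_curvature(origin, p_diag)

# Away from the critical point the gradient term in the denominator matters.
x0 = np.array([0.5, -0.25])
k_x0 = section_curvature(x0, p_diag)
k_x0_fd = section_curvature_fd(x0, p_diag)

print(k_origin, k_x0, k_x0_fd)
```

At the origin the gradient vanishes, so the value is just $\mathbf{p}^T(Hf)\mathbf{p}=(2+6)/2=4$ for the normalized diagonal direction; at the second point the two routines should agree up to finite-difference error.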