Geometric interpretation of the Hessian

Assume we have a smooth function $f:\mathbb{R}^2 \to \mathbb{R}$. We may then form the differential of $f$, denoted by $Df$, given by the row vector $$ Df=\Big[\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2}\Big] $$ This quantity is a $1$-form, i.e. for every $p \in \mathbb{R}^2$, $Df(p):\mathbb{R}^2 \to \mathbb{R}$ is a linear map. The action of this object is easy to visualize:

It is simply the linearization of the given function at a point $p$, i.e., $Df(p)$ takes in a vector $v \in \mathbb{R}^2$ and spits out the directional derivative of $f$ at $p$ in the direction of $v$. We may then also form the Hessian,

$$ Hf = \begin{bmatrix}\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1\partial x_2} \\ \frac{\partial^2 f}{\partial x_2\partial x_1} & \frac{\partial^2 f}{\partial x_2^2}\end{bmatrix} $$

The action of this matrix is more difficult for me to understand. According to my understanding, this object should be interpreted as a $2$-form, i.e., for each point $p \in \mathbb{R}^2$ it eats two vectors $v_1,v_2 \in \mathbb{R}^2$ and spits out a number. However, I am wondering what the geometric interpretation of these two vectors is. In the case of the differential of $f$, it was clear that the vector it ate was to be interpreted as the direction of the directional derivative. What is the geometric intuition behind the two vectors that the Hessian takes in as arguments?

There is 1 best solution below

While you say that the Hessian should be viewed as a $2$-form eating two vectors $v_1,v_2$, I disagree. The Hessian should be viewed as a quadratic form $q$, eating a single vector $v$ like this:

$$q(v)=v^T\mathrm Hfv.$$

The reason is that we can then write down the second-order Taylor approximation

$$f(x+v)=f(x)+\mathrm Df(x)v+\tfrac{1}{2}v^T\mathrm Hf(x)v+o(\vert v\vert^2).$$

We have a constant term, a linear term and a quadratic term. The quadratic term is given by the Hessian. To understand it geometrically: near $x$, the graph of $f$ is approximated by a quadric surface, namely an elliptic paraboloid if the Hessian is definite, or a hyperbolic paraboloid (a saddle) if it is indefinite (ignoring degenerate cases). Plotting the quadratic form given by the Hessian gives us the shape of this surface, while the affine term $f(x)+\mathrm Df(x)v$ shifts and tilts the surface so that it is actually tangent to the graph of $f$ at the desired point.
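
If you want to see this numerically, here is a minimal sketch. The function `f`, the point `x` and the direction are my own illustrative choices, not anything from the question, and the derivatives are computed by hand. It uses the standard second-order Taylor polynomial, with the factor $1/2$ in front of the quadratic term. It checks that the remainder shrinks faster than $\vert v\vert^2$, and uses the signs of the Hessian's eigenvalues to tell a bowl from a saddle.

```python
import numpy as np

# Hypothetical example function, chosen only for illustration.
def f(x):
    return np.sin(x[0]) + x[0] * x[1] ** 2

# Differential (row vector of first partials), computed by hand for this f.
def Df(x):
    return np.array([np.cos(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]])

# Hessian (matrix of second partials), computed by hand for this f.
def Hf(x):
    return np.array([[-np.sin(x[0]), 2.0 * x[1]],
                     [2.0 * x[1],    2.0 * x[0]]])

# Second-order Taylor polynomial: constant + linear + (1/2) * quadratic form.
def taylor2(x, v):
    return f(x) + Df(x) @ v + 0.5 * (v @ Hf(x) @ v)

x = np.array([0.3, -1.2])                       # base point (arbitrary)
direction = np.array([1.0, 2.0]) / np.sqrt(5.0)  # unit vector (arbitrary)

# If the remainder is o(|v|^2), then error / |v|^2 must tend to 0 as |v| -> 0.
ratios = []
for h in [1e-1, 1e-2, 1e-3]:
    v = h * direction
    err = abs(f(x + v) - taylor2(x, v))
    ratios.append(err / h ** 2)
print("error / |v|^2 for shrinking |v|:", ratios)

# Signs of the eigenvalues decide the shape of the approximating surface:
# same sign -> elliptic paraboloid (bowl), mixed signs -> saddle.
eigenvalues = np.linalg.eigvalsh(Hf(x))  # ascending order, symmetric input
print("Hessian eigenvalues:", eigenvalues)
```

At this particular point the eigenvalues have opposite signs, so the approximating surface is a saddle. Moving `x` to a point where both eigenvalues have the same sign would give a bowl-shaped paraboloid instead.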