Hessian matrix in cylindrical coordinate basis


I have a scalar-valued function $f$ defined on a $2N$-dimensional Euclidean space. I want to Taylor expand this function about a point $P$, and I need to be able to write explicitly all terms in the expansion up to at least second order.

If I were working in Cartesian coordinates, I would define a basis such that $P = (x_1^\prime,y_1^\prime,x_2^\prime,y_2^\prime,...,x_N^\prime,y_N^\prime)$, and the Taylor expansion would be given by $$f(x_1,y_1,...) = f(x_1^\prime,y_1^\prime,...) + \sum_{i=1}^N \Big[(x_i-x_i^\prime)\frac{\partial f}{\partial x_i}|_{x_1^\prime,y_1^\prime,...} + (y_i-y_i^\prime)\frac{\partial f}{\partial y_i}|_{x_1^\prime,y_1^\prime,...}\Big] + \\ \frac{1}{2!}\sum_{i=1}^N \sum_{j=1}^N \Big[ (x_i-x_i^\prime)(x_j-x_j^\prime)\frac{\partial^2 f}{\partial x_i \partial x_j}|_{x_1^\prime,y_1^\prime,...} + (x_i-x_i^\prime)(y_j-y_j^\prime)\frac{\partial^2 f}{\partial x_i \partial y_j}|_{x_1^\prime,y_1^\prime,...} + (y_i-y_i^\prime)(x_j-x_j^\prime)\frac{\partial^2 f}{\partial y_i \partial x_j}|_{x_1^\prime,y_1^\prime,...} + (y_i-y_i^\prime)(y_j-y_j^\prime)\frac{\partial^2 f}{\partial y_i \partial y_j}|_{x_1^\prime,y_1^\prime,...} \Big] + ...$$

However, I want to work in polar coordinates, $(r_1,\theta_1,r_2,\theta_2,...)$. So, I should define $P = (r_1^\prime,\theta_1^\prime,...)$, and the Taylor expansion, written explicitly to first order, looks like the following (if I have this correct).

$$\require{enclose} \enclose{horizontalstrike}{f(r_1,\theta_1,r_2,\theta_2,...) = f(r_1^\prime,\theta_1^\prime,...) + \sum_{i=1}^N \Big[ (r_i-r_i^\prime)\frac{\partial f}{\partial r_i}|_{r_1^\prime,\theta_1^\prime,...} + r_i(\theta_i-\theta_i^\prime)\frac{\partial f}{\partial \theta_i}|_{r_1^\prime,\theta_1^\prime,...} \Big] + ...}$$

$$f(r_1,\theta_1,r_2,\theta_2,...) = f(r_1^\prime,\theta_1^\prime,...) + \sum_{i=1}^N \Big[ (r_i-r_i^\prime)\frac{\partial f}{\partial r_i}|_{r_1^\prime,\theta_1^\prime,...} + (\theta_i-\theta_i^\prime)\frac{\partial f}{\partial \theta_i}|_{r_1^\prime,\theta_1^\prime,...} \Big] + ...$$
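As a quick numerical sanity check of this first-order formula (a sketch for a single $(r,\theta)$ pair; the test function and expansion point are arbitrary choices of mine, and the partial derivatives are approximated by finite differences):

```python
import math

# Arbitrary smooth test function of one (r, theta) pair.
def f(r, theta):
    return r**2 * math.cos(theta) + r * math.sin(2 * theta)

def df_dr(r, theta, h=1e-6):      # central finite differences
    return (f(r + h, theta) - f(r - h, theta)) / (2 * h)

def df_dtheta(r, theta, h=1e-6):
    return (f(r, theta + h) - f(r, theta - h)) / (2 * h)

r0, t0 = 1.3, 0.7                 # expansion point P (arbitrary)

def first_order(r, theta):
    return (f(r0, t0)
            + (r - r0) * df_dr(r0, t0)
            + (theta - t0) * df_dtheta(r0, t0))

# The remainder of a correct first-order expansion is quadratic in the
# displacement, so halving the displacement should cut it by about 4.
e1 = abs(f(r0 + 0.02, t0 + 0.02) - first_order(r0 + 0.02, t0 + 0.02))
e2 = abs(f(r0 + 0.01, t0 + 0.01) - first_order(r0 + 0.01, t0 + 0.01))
print(e1 / e2)
```

The ratio comes out close to 4, consistent with a quadratic remainder; the struck-through version, with its extra factor of $r_i$, fails this test.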

I feel like this formula should be written somewhere, but I cannot find it. I know the second-order terms can be written as the quadratic form $x^i H_{ij} x^j$, where $H_{ij}$ is the Hessian matrix (tensor), which would be helpful if I could find an explicit formula for the Hessian in a polar coordinate basis.

Can anyone write the second-order terms in the Taylor expansion, or equivalently, provide the elements of the Hessian in a polar basis? Please keep in mind that I am an engineer, so I am ideally looking for an answer written explicitly using the polar coordinates, rather than covariant gradients, Levi-Civita symbols, etc. Though any help achieving progress toward the explicit formula is much appreciated.


BEST ANSWER

I'll use the Einstein summation convention throughout (i.e., any indices that appear twice in the same term are implicitly summed over).

The Taylor expansion of a function $f(x^1,...,x^n)$ in arbitrary curvilinear coordinates on $\mathbb{R}^n$ can be written as $$ f(x^1+\delta^1,...,x^n+\delta^n)=f(x^1,...,x^n)+(\nabla f)_i\delta^i+\frac{1}{2}(\nabla\nabla f)_{ij}\delta^i\delta^j+... $$ where $(\nabla f)_i$, $(\nabla\nabla f)_{ij}$, etc. are tensors that give the linear, quadratic, etc. approximations when the series is truncated at some order, and which correspond (up to that order) to the Taylor expansion in Euclidean coordinates. These objects are called the covariant derivatives of $f$; in Euclidean coordinates they are of course just the partial derivatives of $f$.

It turns out the zeroth- and first-order terms work as one would expect in all coordinates: $(\nabla f)_i=\frac{\partial f}{\partial x^i}$.

The higher-order terms are not so straightforward. This relates to the fact that vectors and tensors cannot be differentiated componentwise in curvilinear coordinates: a set of coordinates induces a basis $\partial_1,...,\partial_n$ at each point (equating vectors with directional derivatives, the partial derivatives form a basis). These basis elements are not generally constant, and their derivatives also show up in higher covariant derivatives in non-Euclidean coordinates. The Christoffel symbols of a set of coordinates, defined by $(\nabla\partial_k)_j=\Gamma^i_{jk}\partial_i$, are a convenient way of organizing all these derivatives. For polar coordinates they are $$ \Gamma^r_{\theta\theta}=-r,\ \ \ \Gamma^\theta_{r\theta}=\Gamma^\theta_{\theta r}=\frac{1}{r} $$ with the remaining entries zero (these can be computed by converting $\partial_r,\partial_\theta$ to Euclidean partial derivatives using the chain rule).

The second covariant derivative, the covariant Hessian, can then be written in these terms: $$ (\nabla\nabla f)_{ij}=\frac{\partial^2 f}{\partial x^i\partial x^j}-\Gamma ^k_{ij}\frac{\partial f}{\partial x^k} $$ The first term can be thought of as differentiating the components of $\nabla f$, while the second differentiates the basis elements. In polar coordinates we now have everything we need to write these out explicitly: $$ (\nabla\nabla f)_{rr}=\frac{\partial^2 f}{\partial r^2},\ \ \ (\nabla\nabla f)_{\theta\theta}=\frac{\partial^2 f}{\partial \theta^2}+r\frac{\partial f}{\partial r},\ \ \ (\nabla\nabla f)_{r\theta}=(\nabla\nabla f)_{\theta r}=\frac{\partial^2 f}{\partial r\partial \theta}-\frac{1}{r}\frac{\partial f}{\partial\theta} $$

Of course, some nontrivial details were skipped in this setup, but the above approach works for any curvilinear coordinates on $\mathbb{R}^n$, as well as on any smooth manifold equipped with an affine connection. Similar formulas express higher covariant derivatives in terms of partial derivatives and Christoffel symbols.
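These component formulas can be checked numerically: since the covariant Hessian is a tensor, its polar components must equal the Cartesian Hessian contracted with the coordinate basis vectors $\partial_r=(\cos\theta,\sin\theta)$ and $\partial_\theta=(-r\sin\theta,r\cos\theta)$. The sketch below does this with finite differences; the test function and evaluation point are arbitrary choices of mine.

```python
import math

def g(x, y):                      # arbitrary smooth Cartesian function
    return x**3 * y + math.exp(0.3 * y)

def f(r, t):                      # the same function in polar coordinates
    return g(r * math.cos(t), r * math.sin(t))

h = 1e-4
def d1(fun, a, b, wrt):           # first partials by central differences
    if wrt == 0:
        return (fun(a + h, b) - fun(a - h, b)) / (2 * h)
    return (fun(a, b + h) - fun(a, b - h)) / (2 * h)

def d2(fun, a, b, i, j):          # second partials by nested differences
    if j == 0:
        return (d1(fun, a + h, b, i) - d1(fun, a - h, b, i)) / (2 * h)
    return (d1(fun, a, b + h, i) - d1(fun, a, b - h, i)) / (2 * h)

r0, t0 = 1.2, 0.5
x0, y0 = r0 * math.cos(t0), r0 * math.sin(t0)

# Covariant Hessian components in polar coordinates, per the formulas above.
H_rr = d2(f, r0, t0, 0, 0)
H_tt = d2(f, r0, t0, 1, 1) + r0 * d1(f, r0, t0, 0)
H_rt = d2(f, r0, t0, 0, 1) - d1(f, r0, t0, 1) / r0

# Cartesian Hessian of g, contracted with the polar coordinate basis.
G = [[d2(g, x0, y0, i, j) for j in range(2)] for i in range(2)]
er = (math.cos(t0), math.sin(t0))
et = (-r0 * math.sin(t0), r0 * math.cos(t0))
def quad(u, v):
    return sum(u[i] * G[i][j] * v[j] for i in range(2) for j in range(2))

print(abs(H_rr - quad(er, er)))   # all three differences are tiny
print(abs(H_tt - quad(et, et)))
print(abs(H_rt - quad(er, et)))
```

All three differences vanish to within finite-difference accuracy, confirming the $\Gamma$-correction terms.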

ANSWER

I would just add a version for those more familiar with differentials (Fréchet derivative).

I will let $f$ be the function of your curvilinear coordinates and $h$ a parametrization of the space through those coordinates; here $h(r,\theta) = (r\cos\theta, r\sin\theta)$. Define $g = f \circ h^{-1}$, which is essentially $f$ written in Cartesian coordinates. I will not dwell on domains or the existence of $h^{-1}$: the coordinates are well defined locally wherever $h$ is smooth and $\mathrm dh(P)$ is invertible.

I will loosely use the letter $P$ to denote the point of the plane irrespective of the coordinates used. I also work with differentials rather than gradients, Jacobians, and Hessians, because they are easier to manipulate here, and I use the convention that the second differential, denoted $\mathrm d^2$, is a bilinear map (symmetric thanks to Schwarz's theorem). Locally around $P$, \begin{equation} f = g \circ h, \end{equation} thus at first order, in terms of linear maps, \begin{equation} \mathrm df(P) = \mathrm dg(P) \mathrm dh(P). \end{equation}

At second order now, \begin{equation} \mathrm d^2f(P) = \mathrm d^2g(P)(\mathrm dh(P) ~\cdot, \mathrm dh(P) ~\cdot) + \mathrm dg(P) \mathrm d^2h(P). \end{equation} The dots indicate that $\mathrm dh(P)$ applies to each argument. Altogether, \begin{align} \mathrm d^2g(P)(\mathrm dh(P) ~\cdot, \mathrm dh(P) ~\cdot) &= \mathrm d^2f(P) - \mathrm dg(P) \mathrm d^2h(P) \\ &= \mathrm d^2f(P) - \mathrm df(P) \mathrm dh(P)^{-1} \mathrm d^2h(P). \end{align} The left-hand side is the bilinear map we are after; when $f$ is scalar, it is the bilinear form associated with the Hessian. The differential $\mathrm dh(P)$ applied to the arguments of $\mathrm d^2 g(P)$ ensures that we translate changes $\delta r, \delta \theta$ (if you will) into $\delta x, \delta y$, just like in $\mathrm df(P) = \mathrm dg(P) \mathrm dh(P)$.

So now, using vectors $u,v$, we write everything out explicitly: \begin{align} \mathrm dh(P) &= \begin{bmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta\end{bmatrix} \\ \mathrm dh(P)^{-1} &= \begin{bmatrix} \cos\theta & \sin\theta \\ -\frac{\sin\theta}r & \frac{\cos\theta}r\end{bmatrix}\\ \mathrm d^2h(P)(u,v) &= u_r \begin{bmatrix} 0 & - \sin\theta \\ 0 & \cos\theta\end{bmatrix} v + u_\theta \begin{bmatrix} -\sin\theta & - r\cos\theta \\ \cos\theta & -r\sin\theta\end{bmatrix} v \\ \mathrm dh(P)^{-1} \mathrm d^2h(P)(u,v) &= \begin{bmatrix} -ru_\theta v_\theta \\ \frac1r (u_r v_\theta+ u_\theta v_r)\end{bmatrix} \\ \mathrm df(P) \mathrm dh(P)^{-1} \mathrm d^2h(P)(u,v) &= - r \frac{\partial f}{\partial r} u_\theta v_\theta + \frac1r\frac{\partial f}{\partial \theta} (u_r v_\theta+ u_\theta v_r) \\ &= u^\top \begin{bmatrix} 0 & \frac1r\frac{\partial f}{\partial \theta} \\ \frac1r\frac{\partial f}{\partial \theta} & - r \frac{\partial f}{\partial r} \end{bmatrix}v. \end{align}

Therefore, \begin{equation} \mathrm d^2g(P)(\mathrm dh(P) u, \mathrm dh(P) v) = u^\top \begin{bmatrix} \frac{\partial^2 f}{\partial r^2} & \frac{\partial^2 f}{\partial r\partial \theta} -\frac1r\frac{\partial f}{\partial \theta} \\ \frac{\partial^2 f}{\partial r\partial \theta} -\frac1r\frac{\partial f}{\partial \theta} & \frac{\partial^2 f}{\partial \theta^2} + r \frac{\partial f}{\partial r} \end{bmatrix} v. \end{equation}
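The key intermediate step above, the closed form $\mathrm dh(P)^{-1} \mathrm d^2h(P)(u,v) = \big({-r\,u_\theta v_\theta},\ \tfrac1r(u_r v_\theta + u_\theta v_r)\big)$, can also be checked numerically. The sketch below approximates $\mathrm d^2h$ by a symmetric second difference plus a polarization identity; the point and direction vectors are arbitrary choices of mine.

```python
import math

def h(r, t):                       # the parametrization h(r, theta)
    return (r * math.cos(t), r * math.sin(t))

r0, t0 = 0.9, 1.1                  # arbitrary point P
eps = 1e-4

# d^2 h(P)(u, v) via the polarization identity
#   d2h(u, v) = [D2(u + v) - D2(u - v)] / 4,
# where D2(w) is the second directional difference of h along w.
def d2h(u, v):
    def D2(w):
        p = h(r0 + eps * w[0], t0 + eps * w[1])
        m = h(r0 - eps * w[0], t0 - eps * w[1])
        c = h(r0, t0)
        return tuple((p[i] - 2 * c[i] + m[i]) / eps**2 for i in range(2))
    a = D2((u[0] + v[0], u[1] + v[1]))
    b = D2((u[0] - v[0], u[1] - v[1]))
    return tuple((a[i] - b[i]) / 4 for i in range(2))

# Inverse Jacobian dh(P)^{-1}, written out from the matrix above.
def dh_inv(vec):
    c, s = math.cos(t0), math.sin(t0)
    return (c * vec[0] + s * vec[1], (-s * vec[0] + c * vec[1]) / r0)

u, v = (0.3, -0.8), (1.1, 0.4)     # arbitrary directions (u_r, u_theta), ...
got = dh_inv(d2h(u, v))
expected = (-r0 * u[1] * v[1], (u[0] * v[1] + u[1] * v[0]) / r0)
print(got, expected)               # should agree to finite-difference accuracy
```

The two tuples match to within the finite-difference error, and combining this term with $\mathrm d^2 f$ reproduces the Hessian matrix in the equation above (the same matrix as in the accepted answer).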