Second-order directional derivative: better understanding


I understand how to calculate the second-order directional derivative; I want a better understanding of its formula. The first-order directional derivative of $f(x,y)$ in the direction of a unit vector $u = (a, b)$ is: $$D_uf(x,y) = \vec{\nabla}f\cdot{u}=f_x(x,y)a + f_y(x,y)b$$ So if I want to calculate the second-order derivative (in the direction of $u$) I will have: $$D_u(D_uf(x,y)) = \vec{\nabla}(f_x(x,y)a + f_y(x,y)b)\cdot{u} =$$ $$(f_{xx}(x,y)a + f_{xy}(x,y)b)a + (f_{yx}(x,y)a + f_{yy}(x,y)b)b$$ Am I right? I'm not sure how to properly ask this question, but I will try:
When I take the partial derivative of $$(f_x(x,y)a + f_y(x,y)b)$$ with respect to $x$ (or $y$, but let's take only $x$), I can use the sum rule for derivatives and differentiate $$(f_x(x,y)a)$$ and $$(f_y(x,y)b)$$ separately, right? That gives two partial derivatives, and using second-order partial derivative notation for each I get $$(f_x(x,y)a)_x=f_{xx}a$$ and $$(f_y(x,y)b)_x=f_{xy}b$$ Correct? I feel like I'm wrong somewhere, but don't know exactly where.
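As a numerical sanity check of the expanded formula $D_u(D_uf) = f_{xx}a^2 + 2f_{xy}ab + f_{yy}b^2$ (valid when $f_{xy} = f_{yx}$, by Clairaut's theorem), here is a short sketch with a made-up test function $f(x,y) = x^3y^2$, comparing the analytic expression against a finite-difference second derivative along the direction $u$:

```python
# Check D_u(D_u f) = f_xx*a^2 + 2*f_xy*a*b + f_yy*b^2
# for the (hypothetical) test function f(x, y) = x^3 * y^2.

def f(x, y):
    return x**3 * y**2

# Analytic second-order partials of this f.
def f_xx(x, y): return 6 * x * y**2
def f_xy(x, y): return 6 * x**2 * y
def f_yy(x, y): return 2 * x**3

x0, y0 = 1.0, 2.0   # evaluation point
a, b = 0.6, 0.8     # unit direction u = (a, b)

# The formula from the question (with f_yx = f_xy).
analytic = f_xx(x0, y0)*a*a + 2*f_xy(x0, y0)*a*b + f_yy(x0, y0)*b*b

# Finite-difference second derivative of g(t) = f(x0 + t*a, y0 + t*b),
# which is exactly the second directional derivative along u at t = 0.
h = 1e-4
g = lambda t: f(x0 + t*a, y0 + t*b)
numeric = (g(h) - 2*g(0.0) + g(-h)) / h**2

print(analytic, numeric)  # the two values should agree closely
```

The function, the point, and the direction are arbitrary choices for illustration; any smooth $f$ and unit $u$ would do.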

Best answer:

Yes, your expression for the second directional derivative is correct. Below I've derived the general form for vectors of any dimension (explicitly writing out $f_{xx}, f_{xy}, \dots$ stops being convenient).

I will change the notation slightly and denote the derivative (i.e. the Jacobian vector/matrix) as $\frac{df}{dx}$ instead of $\nabla f$, where $x$ is the $n$-dimensional input column vector. This is a matter of convention, but it's customary to use the former to denote the row-vector derivative, and the latter to denote its transpose, i.e. the gradient.

Therefore the first directional derivative along vector v is: $D_vf(x) = \frac{df}{dx}\cdot v$

Then, the second order directional derivative is $D_v[D_vf(x)] = D_v[\frac{df}{dx}\cdot v] = \frac{d}{dx}[\frac{df}{dx}\cdot v] \cdot v$

The inner expression $[\frac{df}{dx} \cdot v]$ is a scalar, therefore we have the following partial derivatives:

$$ \frac{\partial }{\partial x_i}[\frac{df}{dx} \cdot v] = \frac{\partial}{\partial x_i}{\sum_{j=1}^{N}{\frac{\partial f}{\partial x_j}v_j}} = \sum_{j=1}^{N}{\frac{\partial^2f}{\partial x_i \partial x_j} v_j}$$

Hence, the entire Jacobian vector becomes:

$$ \frac{d}{dx} [\frac{df}{dx} \cdot v] = \begin{bmatrix} \sum_{j=1}^{N}{\frac{\partial^2f}{\partial x_1 \partial x_j}v_j} & \cdots & \sum_{j=1}^{N}{\frac{\partial^2f}{\partial x_N \partial x_j}v_j} \end{bmatrix} = $$ $$ \begin{bmatrix} v_{1} & \cdots & v_{N} \end{bmatrix} \begin{bmatrix} \frac{\partial^2f}{\partial x_1^2} & \cdots & \frac{\partial^2f}{\partial x_N \partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2f}{\partial x_1 \partial x_N} & \cdots & \frac{\partial^2f}{\partial x_N^2} \end{bmatrix} = \begin{bmatrix} v_{1} & \cdots & v_{N} \end{bmatrix} \begin{bmatrix} \frac{\partial^2f}{\partial x_1^2} & \cdots & \frac{\partial^2f}{\partial x_1 \partial x_N} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2f}{\partial x_N \partial x_1} & \cdots & \frac{\partial^2f}{\partial x_N^2} \end{bmatrix} = v^TH$$

where H denotes the Hessian matrix. Note that it is valid to substitute the matrix of partial derivatives with its transpose in the above, thanks to Clairaut's theorem. Then, finally:

$$ D_v[D_vf(x)] = \frac{d}{dx}[\frac{df}{dx} \cdot v] \cdot v = v^THv$$
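The identity $D_v[D_vf(x)] = v^THv$ can be checked numerically in any dimension. Here is a sketch for $N = 3$ with a made-up function $f(x) = x_0^2 x_1 + x_0 x_2^3$ (the function, point, and direction are illustrative choices, not from the answer above):

```python
import numpy as np

# Hypothetical 3-variable test function f(x) = x0^2*x1 + x0*x2^3,
# used to check D_v[D_v f](x) = v^T H v numerically.

def f(x):
    return x[0]**2 * x[1] + x[0] * x[2]**3

def hessian(x):
    # Analytic Hessian of this f (symmetric, per Clairaut's theorem).
    return np.array([
        [2*x[1],    2*x[0], 3*x[2]**2],
        [2*x[0],    0.0,    0.0],
        [3*x[2]**2, 0.0,    6*x[0]*x[2]],
    ])

x = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 2.0, 2.0]) / 3.0  # unit direction

quad_form = v @ hessian(x) @ v       # v^T H v

# Finite-difference second derivative of g(t) = f(x + t*v) at t = 0,
# i.e. the second directional derivative along v, for comparison.
h = 1e-4
fd = (f(x + h*v) - 2*f(x) + f(x - h*v)) / h**2

print(quad_form, fd)
```

The same check works for any $N$: only `f`, `hessian`, `x`, and `v` change.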

You can verify that your example is a special case of the above, where $v = [a \ b]^T$ and $f$ is a function of two variables $x$ and $y$:

$$ D_v[D_vf(x, y)] = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} \frac{\partial^2f}{\partial x^2} & \frac{\partial^2f}{\partial x \partial y} \\ \frac{\partial^2f}{\partial y \partial x} & \frac{\partial^2f}{\partial y^2} \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}$$
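Multiplying out this $2 \times 2$ quadratic form recovers exactly the expression from the question, $f_{xx}a^2 + 2f_{xy}ab + f_{yy}b^2$. A minimal numerical illustration, using an arbitrary test function $f(x,y) = e^x y^2$ (my choice, not from the answer):

```python
import math
import numpy as np

# Hypothetical test function f(x, y) = exp(x) * y^2 for the 2D case.
x0, y0 = 0.5, 1.5
a, b = 0.6, 0.8

# Analytic second-order partials of this f.
f_xx = math.exp(x0) * y0**2
f_xy = 2 * math.exp(x0) * y0
f_yy = 2 * math.exp(x0)

H = np.array([[f_xx, f_xy],
              [f_xy, f_yy]])  # f_yx = f_xy by Clairaut's theorem
v = np.array([a, b])

quad = v @ H @ v                              # [a b] H [a b]^T
expanded = f_xx*a*a + 2*f_xy*a*b + f_yy*b*b   # the question's formula

print(quad, expanded)  # identical up to floating-point rounding
```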