How is the notation in this Hessian Matrix calculated?

I'm sorry I'm not sure how to word this question. I've returned to school after a long break where I was working full time. In school I did calculus, but most of it seems to have left me.

I have started schooling again in data science and I am trying to compute a Hessian Matrix for a simple function. The function is:

$$f(x_1, x_2) = (x_1-1)^2 + 100(x_1^2-x_2)^2$$

I have calculated the gradient vector by taking the first-order partial derivatives:

$$\nabla f(x_1, x_2) = \begin{bmatrix} 2(x_1-1) + 400x_1(x_1^2-x_2) \\ -200(x_1^2-x_2) \end{bmatrix}$$

In attempting to calculate the Hessian matrix, I am confused by the notation for entries (1,2) and (2,1):

$$ \nabla^2f(x_1, x_2) = \begin{bmatrix} \frac{\partial^2f}{\partial x_1^2} & \frac{\partial^2f}{\partial x_2 \partial x_1}\\ \frac{\partial^2f}{\partial x_1 \partial x_2} & \frac{\partial^2f}{\partial x_2^2} \\ \end{bmatrix}$$

For entries (1,1) and (2,2), I just differentiate the matching component of the gradient vector above a second time with respect to the same variable:

$$ \frac{\partial^2f}{\partial x_1^2} = \frac{\partial}{\partial x_1} [2(x_1-1) + 400x_1(x_1^2-x_2)] = 1200x_1^2-400x_2+2 $$

and

$$ \frac{\partial^2f}{\partial x_2^2} = \frac{\partial}{\partial x_2} [ -200x_1^2+200x_2 ] = 200 $$

Therefore the matrix as it stands is:

$$ \nabla^2f(x_1, x_2) = \begin{bmatrix} 1200x_1^2-400x_2+2 & \frac{\partial^2f}{\partial x_2 \partial x_1}\\ \frac{\partial^2f}{\partial x_1 \partial x_2} & 200 \\ \end{bmatrix}$$

How would I go about calculating $\frac{\partial^2f}{\partial x_2 \partial x_1}$ and $\frac{\partial^2f}{\partial x_1 \partial x_2}$?

Thank you for your time

Edit: Updated \Delta to \nabla and \delta to \partial as suggested by the top answer.

Best answer

The mixed derivatives can also be obtained from the gradient: differentiate each entry of the gradient with respect to the other variable.

$$\frac{\partial^2 f}{\partial x_2 \ \partial x_1} = \frac{\partial}{\partial x_2} \frac{\partial f}{\partial x_1} = \frac{\partial}{\partial x_2} [2(x_1-1) + 400 x_1(x_1^2-x_2)] = -400x_1$$

$$\frac{\partial^2 f}{\partial x_1 \ \partial x_2} = \frac{\partial}{\partial x_1} \frac{\partial f}{\partial x_2} = \frac{\partial}{\partial x_1} [-200(x_1^2-x_2)] = -400x_1$$

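Note that the order in which you take the derivatives does not matter if the function has continuous second partial derivatives (this is Schwarz's theorem, also known as Clairaut's theorem). Your $f$ is a polynomial, so the condition holds, which is why both mixed entries come out to $-400x_1$ and the Hessian is symmetric.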
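
Putting the four entries together, the full Hessian is

$$ \nabla^2f(x_1, x_2) = \begin{bmatrix} 1200x_1^2-400x_2+2 & -400x_1 \\ -400x_1 & 200 \end{bmatrix}$$

If you want to sanity-check a hand computation like this, one option is to compare it against finite differences. Below is a minimal sketch in plain Python. The function names `hessian_analytic` and `hessian_numeric`, the step size `h`, and the test point `(1.5, 0.5)` are my own choices for illustration, not anything from your course material.

```python
def f(x1, x2):
    # The function from the question.
    return (x1 - 1) ** 2 + 100 * (x1 ** 2 - x2) ** 2


def hessian_analytic(x1, x2):
    # The Hessian derived by hand above.
    return [
        [1200 * x1 ** 2 - 400 * x2 + 2, -400 * x1],
        [-400 * x1, 200],
    ]


def hessian_numeric(func, x1, x2, h=1e-3):
    # Central finite-difference approximations of the second partials.
    # h is an assumed step size; it trades truncation error against round-off.
    f0 = func(x1, x2)
    d11 = (func(x1 + h, x2) - 2 * f0 + func(x1 - h, x2)) / h ** 2
    d22 = (func(x1, x2 + h) - 2 * f0 + func(x1, x2 - h)) / h ** 2
    d12 = (
        func(x1 + h, x2 + h)
        - func(x1 + h, x2 - h)
        - func(x1 - h, x2 + h)
        + func(x1 - h, x2 - h)
    ) / (4 * h ** 2)
    # The same estimate is used for both mixed entries, so this checks the
    # hand-derived values but does not by itself test symmetry.
    return [[d11, d12], [d12, d22]]


if __name__ == "__main__":
    point = (1.5, 0.5)  # arbitrary test point
    print("analytic:", hessian_analytic(*point))
    print("numeric: ", hessian_numeric(f, *point))
```

The two printed matrices should agree to several decimal places. A large mismatch in one entry usually points to a slip in that particular derivative.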


LaTeX comments: generally the gradient and Hessian are denoted with $\nabla$ and $\nabla^2$ respectively (\nabla), not $\Delta$ (\Delta). Also, the symbol for partial derivatives is usually $\partial$ (\partial) rather than $\delta$ (\delta).