I'm working with Newton's method, and while trying to see why we get this expression, I ran into a problem: I find the same expression, but with the transpose of the Hessian matrix instead of the Hessian matrix itself. I can't figure out where I went wrong D:
Here is how I got there:
Quick recap of Newton's method for approximating a zero of a function $h$: $$h: \mathbb{R}^n \rightarrow \mathbb{R}^n$$ and $$D_h : \mathbb{R}^n \rightarrow \mathbb{L}(\mathbb{R}^n)$$
We must find the $x$ such that: $$D_h(x_n) (x-x_n)+h(x_n)=0$$ $$ x= x_n - D_h^{-1}(x_n)\ h(x_n)$$ so $$x_{n+1}= x_n - D_h^{-1}(x_n)\ h(x_n)$$
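To make the recap concrete, here is a minimal sketch of this iteration in $\mathbb{R}^2$, in plain Python. The test function `h`, its Jacobian `jac`, the starting point and the helper names are all my own assumptions for illustration, not part of any library. Instead of forming $D_h^{-1}(x_n)$ explicitly, it solves the linear system $D_h(x_n)\, s = h(x_n)$.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ s = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ZeroDivisionError("singular Jacobian")
    s0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    s1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [s0, s1]


def newton(h, jac, x, steps=20, tol=1e-12):
    """Iterate x_{n+1} = x_n - D_h(x_n)^{-1} h(x_n) in R^2."""
    for _ in range(steps):
        hx = h(x)
        if max(abs(v) for v in hx) < tol:
            break
        # Solve D_h(x_n) s = h(x_n) rather than inverting the matrix.
        s = solve_2x2(jac(x), hx)
        x = [x[0] - s[0], x[1] - s[1]]
    return x


# Hypothetical example: h(x, y) = (x^2 + y^2 - 4, x - y),
# which has a zero at (sqrt(2), sqrt(2)).
def h(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]


def jac(x):
    # Row i holds the partial derivatives of h_i with respect to x_1, x_2.
    return [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]


root = newton(h, jac, [1.0, 2.0])
print(root)
```

Note the convention used in `jac`: row $i$ is the derivative of the component $h_i$, which is the same layout as the matrix $D_g$ further down.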
First I define my function $f$, twice differentiable: $$ f: \mathbb{R}^n \rightarrow \mathbb{R} $$
We want to minimize this function, so we want to find where its differential is zero. $$ D_f: \mathbb{R}^n \rightarrow \mathbb{L}(\mathbb{R}^n, \mathbb{R}) $$
This is equivalent to finding the zeros of the gradient: $$ \nabla f: \mathbb{R}^n \rightarrow \mathbb{R}^n$$
So our objective is to find the zeros of the gradient vector.
Let $$g= \nabla f $$
To find the zeros of $g$ we can use Newton's method:
$$x_{n+1}= x_n - D_g^{-1}(x_n)\ g(x_n)$$
with
$$D_g : \mathbb{R}^n \rightarrow \mathbb{M}_n $$
$$\hspace{30pt} x \mapsto \begin{bmatrix}
\frac{\partial g_1(x)}{\partial x_1} & \frac{\partial g_1(x)}{\partial x_2} & \cdots & \frac{\partial g_1(x)}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial g_n(x)}{\partial x_1} & \frac{\partial g_n(x)}{\partial x_2} & \cdots & \frac{\partial g_n(x)}{\partial x_n} \end{bmatrix}$$
$$\hspace{30pt} x \mapsto \begin{bmatrix} \frac{\partial (\nabla f)_1(x)}{\partial x_1} & \frac{\partial (\nabla f)_1(x)}{\partial x_2} & \cdots & \frac{\partial (\nabla f)_1(x)}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial (\nabla f)_n(x)}{\partial x_1} & \frac{\partial (\nabla f)_n(x)}{\partial x_2} & \cdots & \frac{\partial (\nabla f)_n(x)}{\partial x_n} \end{bmatrix}$$
$$\hspace{30pt} x \mapsto \begin{bmatrix} \frac{\partial^2 f(x)}{\partial x_1 \partial x_1} & \frac{\partial^2 f(x)}{\partial x_2 \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_1} \\ \vdots & \vdots & & \vdots \\ \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} & \frac{\partial^2 f(x)}{\partial x_2 \partial x_n} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_n} \end{bmatrix}=H^T(f(x))$$
Newton's method becomes:
$$x_{n+1}= x_n -H^T(f(x_n))^{-1}\ \nabla f(x_n)$$
But on the internet I find that Newton's method is: $$ x_{n+1} = x_n -H(f(x_n))^{-1} \nabla f(x_n)$$
$$H(f) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \\ \end{bmatrix}$$
So why do I find the transpose of the Hessian matrix instead of the Hessian matrix itself? D:
(I know that if the function is $C^2$ the Hessian matrix is symmetric, but even if that is true and I'm right, why define the Hessian matrix like that rather than directly as the transpose of the current one?)
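As a quick numerical sanity check of the symmetry remark, here is a small sketch in plain Python. The test function $f(x, y) = x^3 y + x y^2$, the evaluation point and the helper names are my own assumptions, chosen only for illustration. It approximates the Jacobian of the gradient by central differences, with entry $[i][j]$ standing for $\partial g_i / \partial x_j$, and measures how far that matrix is from its transpose.

```python
def grad(x):
    """Gradient of the hypothetical test function f(x, y) = x^3 y + x y^2."""
    return [3.0 * x[0] ** 2 * x[1] + x[1] ** 2,
            x[0] ** 3 + 2.0 * x[0] * x[1]]


def jacobian_of_gradient(g, x, eps=1e-6):
    """Central differences: entry [i][j] approximates d g_i / d x_j."""
    n = len(x)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += eps
        xm[j] -= eps
        gp, gm = g(xp), g(xm)
        for i in range(n):
            jac[i][j] = (gp[i] - gm[i]) / (2.0 * eps)
    return jac


point = [1.5, -0.7]
J = jacobian_of_gradient(grad, point)

# Largest gap between J and its transpose; it should be close to zero
# for a C^2 function, up to finite-difference error.
asymmetry = max(abs(J[i][j] - J[j][i]) for i in range(2) for j in range(2))
print(J)
print(asymmetry)
```

For this smooth test function the off-diagonal entries agree up to finite-difference error, so using the matrix or its transpose in the Newton step gives the same update here. That does not settle the question of which index convention is the "right" definition.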
Thanks for reading. Please be kind about my English and my LaTeX (this is my first time using LaTeX).
THANKS :)