I know there are a lot of topics about this on the internet, and trust me, I've googled it. But things are getting more and more confusing for me.
From my understanding, the gradient points in the direction of steepest ascent. Moving against it (i.e., descending along the negative gradient) will most rapidly decrease your cost function (the typical goal).
Could anyone explain in simple words (and maybe with an example) how the gradient generalizes to the Jacobian, Hessian, Wronskian, and Laplacian?
If $f: \mathbb{R}^n \rightarrow \mathbb{R}$, then applying the vector of operators $$ \nabla = (\partial/\partial x_1, \partial/\partial x_2, \ldots, \partial/\partial x_n) $$ to it gives you the gradient.
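To make this concrete, here is a small numerical sketch in plain Python, using central finite differences (the test function $f(x, y) = x^2 + y^2$ and the step size `h` are just illustrative choices):

```python
# Numerical gradient of f: R^n -> R via central differences:
# the i-th entry approximates df/dx_i at the point x.
def grad(f, x, h=1e-6):
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

# Example: f(x, y) = x^2 + y^2 has analytic gradient (2x, 2y).
f = lambda v: v[0] ** 2 + v[1] ** 2
print(grad(f, [1.0, 2.0]))  # approximately [2.0, 4.0]
```

Descending along the negative of this vector is exactly one step of gradient descent.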
If $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$, $m>1$, then applying $\nabla$ gives you an $m \times n$ matrix, where the $ij$ entry is $\partial f_i / \partial x_j$. It is a matrix where each row is a gradient, since $f = (f_1, ..., f_m)$ is a vector of functions. That is the Jacobian.
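A sketch of the same idea for a vector-valued function, again with central differences (the example map $f(x, y) = (xy,\, x + y)$ is just an illustrative choice):

```python
# Numerical Jacobian of f: R^n -> R^m via central differences.
# Row i is the gradient of the i-th component function f_i,
# so entry (i, j) approximates df_i/dx_j.
def jacobian(f, x, h=1e-6):
    m = len(f(x))
    J = []
    for i in range(m):
        row = []
        for j in range(len(x)):
            xp, xm = list(x), list(x)
            xp[j] += h
            xm[j] -= h
            row.append((f(xp)[i] - f(xm)[i]) / (2 * h))
        J.append(row)
    return J

# Example: f(x, y) = (x*y, x + y) has analytic Jacobian [[y, x], [1, 1]].
f = lambda v: [v[0] * v[1], v[0] + v[1]]
print(jacobian(f, [2.0, 3.0]))  # approximately [[3, 2], [1, 1]]
```

Note that for $m = 1$ the Jacobian collapses to a single row, which is just the gradient.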
The Hessian is the application of the outer product $$ \nabla \nabla' = \left[ \dfrac{\partial^2}{\partial x_i \partial x_j} \right]_{ij} $$ to a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$. The diagonal of the matrix holds the pure second partials of the function, and the off-diagonal entries are the cross-partials.
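The same finite-difference idea extends to second derivatives; here is a sketch using the standard four-point central formula for $\partial^2 f / \partial x_i \partial x_j$ (the test function $f(x, y) = x^2 y$ and the step `h` are illustrative choices):

```python
# Numerical Hessian of f: R^n -> R via four-point central differences.
# Entry (i, j) approximates d^2 f / (dx_i dx_j); when i == j this reduces
# to the usual second central difference with step 2h.
def hessian(f, x, h=1e-4):
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp, xpm, xmp, xmm = list(x), list(x), list(x), list(x)
            xpp[i] += h; xpp[j] += h
            xpm[i] += h; xpm[j] -= h
            xmp[i] -= h; xmp[j] += h
            xmm[i] -= h; xmm[j] -= h
            H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4 * h * h)
    return H

# Example: f(x, y) = x^2 * y has analytic Hessian [[2y, 2x], [2x, 0]].
f = lambda v: v[0] ** 2 * v[1]
print(hessian(f, [1.0, 2.0]))  # approximately [[4, 2], [2, 0]]
```

The symmetry of the off-diagonal entries ($H_{ij} \approx H_{ji}$) reflects the equality of mixed partials for smooth functions.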
The Laplacian is the inner product of $\nabla$ with itself, rather than the outer product used for the Hessian in the previous paragraph. So $$ \nabla'\nabla = \dfrac{\partial^2}{\partial x_1^2} + \dfrac{\partial^2}{\partial x_2^2} + \cdots + \dfrac{\partial^2}{\partial x_n^2} $$ applied to a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$. You get the sum of the pure second partial derivatives, i.e., the trace of the Hessian.
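Since only the diagonal second partials are summed, a numerical sketch needs just $n$ one-dimensional second differences (the test function $f(x, y) = x^2 + y^2$, whose Laplacian is $2 + 2 = 4$ everywhere, is an illustrative choice):

```python
# Numerical Laplacian of f: R^n -> R: the sum of the second central
# differences along each coordinate axis (the trace of the Hessian).
def laplacian(f, x, h=1e-4):
    f0 = f(x)
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (f(xp) - 2 * f0 + f(xm)) / (h * h)
    return total

# Example: f(x, y) = x^2 + y^2 has Laplacian 2 + 2 = 4 at every point.
f = lambda v: v[0] ** 2 + v[1] ** 2
print(laplacian(f, [1.0, 2.0]))  # approximately 4.0
```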
I have no particular interest in the Wronskian and I don't really think you should either. The strength of this opinion increased after I just scanned the Wikipedia page.