Are these two gradient notations equivalent? ($\nabla f(x_n) = \frac{\partial}{\partial x_n} f $)


In gradient descent algorithms (especially in the context of multivariate regression), the gradient at a given point is sometimes written $\nabla f(x_n)$ and sometimes $\frac{\partial}{\partial x_n} f$.

For example:

  • Wikipedia says: $x_{n+1} = x_n - \alpha \nabla f(x_n) $
  • Andrew Ng (on Coursera) says: $\theta_j = \theta_j - \alpha \frac{\partial}{\partial \theta_j} J (\theta)$

So I was wondering whether the two are the same, or whether there are cases where we should use one notation over the other.

BEST ANSWER

$\nabla f(x_n)$ does NOT mean $\partial f/\partial x_n$. In the context of the Wikipedia article on gradient descent, $x_n$ is just a point in (say) $\mathbb{R}^3$, and the subscript $n$ counts iterations rather than coordinates. For example, $x_n$ could be the point $(1,0,2)$.

$\nabla f(x_n)$ is the vector $(\frac{\partial f}{\partial x} (x_n),\frac{\partial f}{\partial y} (x_n),\frac{\partial f}{\partial z} (x_n))$.

$\partial f/\partial x_n$ implies that you are labelling the coordinates $x_1, x_2,$ etc. If $x_n$ happens to correspond to the coordinate $z$ in the 3d case, then this would be $\partial f/\partial z$ (which you would evaluate at some particular point). It is a single number at each point, not a vector.
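To make the difference concrete, here is a minimal Python sketch. The function $f(x,y,z) = x^2 z + 3y$ and the helper names `grad_f` and `partial_f` are made up for illustration and do not come from the question or from either source. It evaluates the full gradient and a single partial derivative at the point $(1,0,2)$.

```python
def f(x, y, z):
    # Hypothetical example function on R^3 (an assumption for illustration).
    return x**2 * z + 3 * y

def grad_f(point):
    # The gradient: all three partial derivatives of f, evaluated at `point`.
    x, y, z = point
    return (2 * x * z, 3, x**2)

def partial_f(point, i):
    # One partial derivative: the i-th coordinate (0-based) of the gradient.
    return grad_f(point)[i]

p = (1, 0, 2)
gradient = grad_f(p)     # a vector with one entry per coordinate
df_dz = partial_f(p, 2)  # a single number: the partial with respect to z
print(gradient, df_dz)
```

The gradient is a 3-tuple, while the partial derivative with respect to $z$ is just its last entry.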

Basically, I think the Wikipedia article is using vector notation and Ng is using coordinate notation, and the two are essentially the same: Ng's update, applied to every coordinate $j$ at once, is the $j$-th component of the vector update (though I'm not sure, especially since there's a mistake in your given example). But the real issue is what you stated in your first line.
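As a rough check of that claim, the sketch below performs one gradient descent step both ways. The cost $J(\theta) = (\theta_0 - 3)^2 + 2(\theta_1 + 1)^2$, the step size, and the function names are assumptions chosen for illustration. The coordinate version evaluates every partial derivative at the old $\theta$ before changing anything (a simultaneous update), which is what makes it match the vector form.

```python
def grad_J(theta):
    # Gradient of the hypothetical cost J(theta) = (t0 - 3)^2 + 2*(t1 + 1)^2.
    t0, t1 = theta
    return [2 * (t0 - 3), 4 * (t1 + 1)]

def step_vector(theta, alpha):
    # Vector form: theta_new = theta - alpha * grad J(theta).
    g = grad_J(theta)
    return [t - alpha * gi for t, gi in zip(theta, g)]

def step_coordinates(theta, alpha):
    # Coordinate form: theta_j := theta_j - alpha * dJ/dtheta_j, for every j.
    # Each partial is evaluated at the OLD theta (simultaneous update).
    new_theta = list(theta)
    for j in range(len(theta)):
        new_theta[j] = theta[j] - alpha * grad_J(theta)[j]
    return new_theta

theta = [0.0, 0.0]
alpha = 0.1  # step size, chosen arbitrarily
a = step_vector(theta, alpha)
b = step_coordinates(theta, alpha)
print(a == b)
```

If the coordinate loop instead used already-updated entries of $\theta$ when computing later partials, the two forms would generally differ.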