In gradient descent algorithms (especially in the context of multivariate regression), the gradient at a given point is sometimes written as $\nabla f(x_n)$ and sometimes as $\frac{\partial}{\partial x_n} f$.
For example:
- Wikipedia says: $x_{n+1} = x_n - \alpha \nabla f(x_n) $
- Andrew Ng (on Coursera) says: $\theta_j = \theta_j - \alpha \frac{\partial}{\partial \theta_j} J (\theta)$
So I was wondering if the two are the same, or if there are differences and cases where one notation should be used over the other.
$\nabla f(x_n)$ does NOT mean $\partial f/\partial x_n$. In the context of the Wikipedia article on gradient descent, $x_n$ is just a point in (say) $\mathbb{R}^3$; the subscript $n$ indexes the iteration, not a coordinate. For example, $x_n$ could be the point $(1,0,2)$.
$\nabla f(x_n)$ is the vector $(\frac{\partial f}{\partial x} (x_n),\frac{\partial f}{\partial y} (x_n),\frac{\partial f}{\partial z} (x_n))$.
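As a minimal sketch of this definition, the gradient at a point can be approximated numerically as the vector of partial derivatives, one per coordinate. The function $f(x,y,z) = x^2 + y^2 + z^2$ and the helper `grad` below are my own illustrative choices, not from the original post:

```python
def grad(f, p, h=1e-6):
    """Numerical gradient of f at point p: the vector of partial derivatives."""
    g = []
    for i in range(len(p)):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        # Central difference approximates the partial derivative along coordinate i
        g.append((f(hi) - f(lo)) / (2 * h))
    return g

f = lambda p: p[0]**2 + p[1]**2 + p[2]**2  # example: f(x, y, z) = x^2 + y^2 + z^2
x_n = (1.0, 0.0, 2.0)                      # the point (1, 0, 2) from above
print(grad(f, x_n))  # approximately [2.0, 0.0, 4.0] = (df/dx, df/dy, df/dz) at x_n
```

Note that the output is one vector evaluated at the single point $x_n$: all three partials are computed there.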
$\partial f/\partial x_n$ implies that you are labelling the coordinates $x_1, x_2,$ etc. Maybe $x_n$ happens to correspond to the coordinate $z$ in the 3d case, then this would be $\partial f/\partial z$ (which you would evaluate at some particular point).
Basically, I think the Wikipedia article is using vector notation and Ng is using coordinate notation, and these describe essentially the same update: Ng's rule, applied simultaneously for every $j$, is exactly the vector update (though I can't be entirely sure what was intended, given the confusion in your example). But the real issue is the one in your first line: in the Wikipedia formula, $n$ indexes iterations, so $\nabla f(x_n)$ is the whole gradient vector evaluated at the $n$-th iterate, not a partial derivative with respect to an $n$-th coordinate.
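To see that the two notations describe the same step, here is a small sketch under assumed choices of my own (a hypothetical quadratic cost standing in for $J(\theta)$, a step size $\alpha = 0.1$, and a numerical `partial` helper). Updating each coordinate with its own partial derivative gives the same result as subtracting $\alpha$ times the gradient vector:

```python
def partial(f, p, j, h=1e-6):
    """Numerical partial derivative of f with respect to coordinate j at point p."""
    hi = list(p)
    lo = list(p)
    hi[j] += h
    lo[j] -= h
    return (f(hi) - f(lo)) / (2 * h)

alpha = 0.1
f = lambda p: (p[0] - 1)**2 + (p[1] + 2)**2  # hypothetical cost J(theta)
theta = [3.0, 3.0]

# Coordinate notation (Ng): theta_j <- theta_j - alpha * dJ/dtheta_j, for all j at once
coord_update = [theta[j] - alpha * partial(f, theta, j) for j in range(len(theta))]

# Vector notation (Wikipedia): x_{n+1} = x_n - alpha * grad f(x_n)
g = [partial(f, theta, j) for j in range(len(theta))]
vector_update = [theta[j] - alpha * g[j] for j in range(len(theta))]

print(coord_update, vector_update)  # the two updates agree component by component
```

The key detail is that the coordinate-wise updates must be computed simultaneously, with all partials evaluated at the old $\theta$, which is precisely what the vector form makes explicit.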