Consider a constrained optimization problem $$ \underset{\beta\in\mathbb R^p}{\text{minimize}}f(\beta)\quad\text{such that}\ \beta\in\mathcal C, $$ where $f:\mathbb R^p\to\mathbb R$ is a convex objective function to be minimized, and $\mathcal C\subset\mathbb R^p$ is a convex constraint set.
When the objective function is differentiable, a necessary and sufficient condition for a vector $\beta^*\in\mathcal C$ to be a global optimum is that $$ \langle\nabla f(\beta^*),\beta-\beta^*\rangle\ge0 $$ for all $\beta\in\mathcal C$.
I am looking for an intuitive explanation of this condition.
The first-order condition for convexity, $$ f(\beta)\ge f(\beta^*)+\langle\nabla f(\beta^*),\beta-\beta^*\rangle, $$ shows that the condition is indeed sufficient: when the inner product is nonnegative, the right-hand side is at least $f(\beta^*)$ for every $\beta\in\mathcal C$. The condition also says that the angle between $\nabla f(\beta^*)$ and $\beta-\beta^*$ is at most $90^\circ$ (non-obtuse, rather than acute) for all $\beta\in\mathcal C$. But why does this mean that $\beta^*$ is a minimizer?
Any help is much appreciated!
At a minimizer, the directional derivative in any feasible direction must be nonnegative; otherwise we could decrease the objective value by moving a little along a feasible direction whose directional derivative is negative. Since $\mathcal C$ is convex, the feasible directions at $\beta^*$ are exactly the directions $\beta-\beta^*$ with $\beta\in\mathcal C$ (the whole segment from $\beta^*$ to $\beta$ stays in $\mathcal C$), and the directional derivative of $f$ at $\beta^*$ along $\beta-\beta^*$ is precisely $\langle\nabla f(\beta^*),\beta-\beta^*\rangle$. So the condition says exactly that no feasible direction is a descent direction.
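As a quick numerical sanity check of the variational inequality (a sketch I made up for illustration; the choices of $f$, $\mathcal C$, and the point $c$ are all assumptions of the example, not from the question), take $f(\beta)=\|\beta-c\|^2$ over the unit ball with $c$ outside the ball, so the minimizer is the projection of $c$ onto the ball:

```python
import numpy as np

# Illustration only: minimize f(beta) = ||beta - c||^2 over the unit ball
# C = {beta : ||beta|| <= 1}, with c outside C so the constraint is active.
rng = np.random.default_rng(0)
c = np.array([2.0, 0.0])
beta_star = c / np.linalg.norm(c)   # the minimizer: projection of c onto C
grad = 2.0 * (beta_star - c)        # gradient of f at beta_star

# Sample feasible points beta in C and check <grad f(beta*), beta - beta*> >= 0.
betas = rng.normal(size=(1000, 2))
norms = np.linalg.norm(betas, axis=1, keepdims=True)
betas = betas / np.maximum(1.0, norms)   # scale points outside the ball back into C
inner = betas @ grad - grad @ beta_star  # <grad, beta> - <grad, beta*>
print(inner.min() >= -1e-12)             # True: the inequality holds for all samples
```

Here $\nabla f(\beta^*)$ points from $c$ toward $\beta^*$, and every feasible $\beta$ lies on the non-obtuse side of the supporting hyperplane at $\beta^*$, which is exactly the geometric content of the condition.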