I'm learning nonlinear optimization and encounter a definition as follows: $f$ is convex if and only if for all $x$ and $y$, $f(y) \ge f(x) + \left<\nabla f(x), y-x\right>$
Are x and y both vectors? Is $\left<\nabla f(x), y-x\right>$ a number(dot product)? Is there any intuitive explanation behind this formula? Because I feel this formula is kind of wired and hard to connect it to the original definition of convex.
Thanks a lot!
This is best described by drawing a picture in two dimensions (the idea generalizes to the multidimensional case). The following plot is courtesy of: Boyd, Stephen, and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004
The idea is that $f(x) + \nabla f(x)^T (y-x)$ defines a tangent line (plane, in the multidimensional case) at the point $(x,f(x))$ parameterized by the variable $y$. As can be seen from the above plot, the function $f(y)$ lies above $f(x) + \nabla f(x)^T (y-x)$ for all $y$. If this test holds for every point along the graph, that is for every $(x,f(x))$, then the function $f$ is convex.