I'm independently working through Boyd and Vandenberghe's "Convex Optimization" and am trying to prove the first-order convexity condition for $f: \mathbb{R}^n \rightarrow \mathbb{R}$.
I'm at the following step:
$$g'(0) + f(x) \leq f(y)$$
where $g$ is defined as $g(\theta) := f(x + \theta(y-x))$ and am unsure how to get to
$$\nabla f(x)^T(y-x) + f(x) \leq f(y)$$
My current attempt is the following.
$f(x + \theta(y - x))$ is a composite function and can be rewritten as $f(u)$ where $u = x + \theta(y-x)$. Thus, we have the following.
$$\begin{align} \frac{d}{d\theta} g &= \frac{\partial}{\partial\theta}f \\ &= \frac{\partial f}{\partial u} \frac{\partial u}{\partial\theta} \\ &= \frac{\partial f}{\partial u} (y - x) \end{align}$$
How do I get from $\frac{\partial f}{\partial u} (y - x)$ to $\nabla f(x)^T(y-x)$? I can provide any additional details about the question if necessary.
Here, $f$ is a function of several variables, so your $u$ stands for a vector $(u_1,u_2,\dots,u_n)$, and the multivariable chain rule gives $$ g'(\theta)=\sum_{i=1}^n \frac{\partial f}{\partial u_i}\,\frac{du_i}{d\theta}=\nabla f(u)\cdot (y-x). $$ Evaluating at $\theta = 0$ gives $u = x$, so $g'(0)=\nabla f(x)\cdot(y-x)$. In matrix notation this inner product is the product of the transposed ('row') vector $\nabla f(x)^T$ with the column vector $y-x$, which is exactly the $\nabla f(x)^T(y-x)$ you're after.
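If it helps to see this numerically, here is a quick sanity check (a sketch using NumPy; the concrete convex function $f(u)=\|u\|^2$ and the points $x, y$ are my own choices for illustration, not from the question) comparing a finite-difference estimate of $g'(0)$ against $\nabla f(x)^T(y-x)$:

```python
import numpy as np

# Example convex function f(u) = ||u||^2 with its analytic gradient 2u.
def f(u):
    return float(u @ u)

def grad_f(u):
    return 2.0 * u

# Arbitrary points x, y in R^3 (chosen for illustration).
x = np.array([1.0, -2.0, 0.5])
y = np.array([0.0, 3.0, 1.0])

# g(theta) = f(x + theta*(y - x)), the restriction of f to the line through x and y.
g = lambda t: f(x + t * (y - x))

h = 1e-6
g_prime_0 = (g(h) - g(-h)) / (2 * h)  # central finite difference for g'(0)
inner = grad_f(x) @ (y - x)           # the claimed value, grad f(x)^T (y - x)

print(g_prime_0, inner)  # the two values agree up to finite-difference error
```

For this $f$, $g'(0) = 2x\cdot(y-x)$, and both numbers come out around $-21.5$, matching the chain-rule identity.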