Given a differentiable function $f\colon \mathbb R^n \to \mathbb R$, we have $\frac{d}{dt} f(x + tu) \Big|_{t=0} = \langle \nabla f(x), u \rangle$.
Here is a proof, but I have some questions.
Write $x(t) = x + t u = (x_1+tu_1,\ldots, x_n+tu_n)$ and apply the chain rule: $$\frac d{dt} f(x(t)) = \sum_{k=1}^n \frac{\partial}{\partial x_k} f(x(t)) \frac{d}{dt}(x_k + t u_k) = \sum_{k=1}^n \frac{\partial}{\partial x_k} f(x(t))u_k = \nabla f(x+tu) \cdot u.$$
I think the notation $x(t) = x + t u = (x_1+tu_1,\ldots, x_n+tu_n)$ is confusing, since the chain rule step contains $\sum_{k=1}^n \frac{\partial}{\partial x_k} f(x(t)) \frac{d}{dt}(x_k + t u_k)$. I think $\partial x_k$ should be read as $\partial (x_k +tu_k)$. Am I correct?
Does the last equality hold? Isn't $\sum_{k=1}^n \frac{\partial}{\partial x_k} f(x(t))u_k$ a number, while $\nabla f(x+tu) \cdot u$ is a vector?
How do we use the information $t=0$?
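The symbol $\partial_{x_k}f$ denotes the partial derivative of $f$ with respect to its $k$-th coordinate $x_k$. It should not be read as $\partial_{x_k+tu_k}f$. In $\frac{\partial}{\partial x_k} f(x(t))$ you first differentiate $f$ with respect to its $k$-th argument and only then evaluate the resulting function at the point $x(t) = x+tu$.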
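As a small illustration (the function below is only a made-up example, not taken from the question), take $n=2$ and $f(x_1,x_2) = x_1^2 x_2$. The partial derivatives are computed first, as functions of the arguments of $f$, and the point $x(t)$ is substituted afterwards:
$$\frac{\partial f}{\partial x_1}(x_1,x_2) = 2x_1x_2, \qquad \frac{\partial f}{\partial x_2}(x_1,x_2) = x_1^2,$$
$$\frac{\partial}{\partial x_1} f(x(t)) = 2(x_1+tu_1)(x_2+tu_2), \qquad \frac{\partial}{\partial x_2} f(x(t)) = (x_1+tu_1)^2.$$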
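In the expression $\nabla f(x+tu) \cdot u$ the dot is the scalar product of the two vectors $\nabla f(x+tu)$ and $u$, so the result is a number: it is exactly the sum $\sum_{k=1}^n \frac{\partial}{\partial x_k} f(x(t))\,u_k$.
The evaluation at $t=0$ is performed at the end: since $x(0) = x$, the result of $\frac{d}{dt}f(x(t))\Big|_{t=0}$ is in fact $\nabla f(x(0)) \cdot u = \nabla f(x) \cdot u = \langle \nabla f(x), u \rangle$.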
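If it helps, the identity can also be checked numerically. The following is only a sketch under assumptions: the test function `f`, its hand-computed gradient `grad_f`, the point `x`, and the direction `u` are all hypothetical choices made for illustration, and the derivative in $t$ at $t=0$ is approximated by a central difference.

```python
import math

def f(x):
    # Hypothetical test function f(x1, x2) = x1^2 * x2 + sin(x2)
    x1, x2 = x
    return x1 ** 2 * x2 + math.sin(x2)

def grad_f(x):
    # Gradient of the test function, computed by hand
    x1, x2 = x
    return [2 * x1 * x2, x1 ** 2 + math.cos(x2)]

def dot(a, b):
    # Scalar product of two vectors: the result is a single number
    return sum(ai * bi for ai, bi in zip(a, b))

def line_point(x, u, t):
    # The point x(t) = x + t*u
    return [xi + t * ui for xi, ui in zip(x, u)]

def directional_derivative_fd(f, x, u, h=1e-6):
    # Central difference approximation of d/dt f(x + t*u) at t = 0
    return (f(line_point(x, u, h)) - f(line_point(x, u, -h))) / (2 * h)

x = [1.5, -0.7]   # arbitrary base point (assumption)
u = [0.3, 2.0]    # arbitrary direction (assumption)

numeric = directional_derivative_fd(f, x, u)
exact = dot(grad_f(x), u)
print(numeric, exact)
```

The two printed values should agree up to the finite-difference error, and both are plain numbers, not vectors.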