What I want to understand
I am trying to understand why the following holds:
$$\lim_{t \to 0} \frac{f(x + t(y-x)) - f(x)}{t} = \nabla f(x)^T\cdot (y-x)$$
with $f: \mathbb{R}^p \to \mathbb{R}$, $x, y \in \mathbb{R}^p$, and $t \in \mathbb{R}^+$.
Also I am using the following definition of the gradient:
$\nabla{f(x)} = \begin{bmatrix} \frac{\partial f}{\partial x_1}(x) & \frac{\partial f}{\partial x_2}(x) & \dots & \frac{\partial f}{\partial x_p}(x) \end{bmatrix}^\intercal$
with $x_i$ being the $i$-th element of $x$.
Why I want to understand it
It is used when proving that a convex function always lies above its tangent line (see p. 5-6 of this example if you want)
What I know
The solution probably has something to do with the definition of the derivative as the limit of a difference quotient:
$$\lim_{\Delta z \to 0} \frac{g(z + \Delta z) - g(z)}{\Delta z} = g'(z)$$
with $z, \Delta z \in \mathbb{R}$ and $g: \mathbb{R} \to \mathbb{R}$.
However, I do not understand exactly how (or whether) I can get from this one-dimensional definition to the higher-dimensional limit at the top of my question, or what additional information I need.
Thank you very much!
I'm adding this as an answer so as to not have too many comments.
Thanks for editing your question several times. It now (as far as I can tell) makes perfect mathematical sense, although there is still a missing link: you did not give a definition of the gradient.
I know I'm being very annoying, but there is a reason for that: the expression of the gradient in your question can actually itself be taken as a definition of the gradient. This is called the Gâteaux derivative of a function (see https://en.wikipedia.org/wiki/Gateaux_derivative).
Any answer to your question needs to know which definition you are considering, of course.
Best,
EDIT:
Thanks for adding the gradient definition. Note that your definition can be rephrased as: $$ (\nabla f)_i(x) = \nabla f(x)^T e_i = \frac{\partial f}{\partial x_i}(x) = \left.\frac{d}{dt} f(x + t e_i)\right|_{t=0} = \lim_{t \to 0} \frac{f(x + t e_i) - f(x)}{t}, $$ where $(e_i)$ is the basis in which you defined the coordinates.
So at least when $y-x$ is one of the basis vectors, you have your answer. Now if you write $z = y-x = \sum_i z_i e_i$ and use the chain rule (which requires $f$ to be differentiable at $x$, not merely to have partial derivatives there), you will get your result.
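In case it helps, here is a sketch of how the chain-rule step could be written out. It assumes $f$ is differentiable at $x$; the helper function $g$ is introduced only for this illustration. With $z = y - x$, define $g: \mathbb{R} \to \mathbb{R}$ by $g(t) = f(x + tz)$. The limit in the question is then exactly the one-dimensional difference quotient of $g$ at $0$:
$$\lim_{t \to 0} \frac{f(x + t(y-x)) - f(x)}{t} = \lim_{t \to 0} \frac{g(t) - g(0)}{t} = g'(0).$$
Since $g$ is the composition of $f$ with the map $t \mapsto x + tz$, whose $i$-th component has derivative $z_i$, the chain rule gives
$$g'(0) = \sum_{i=1}^{p} \frac{\partial f}{\partial x_i}(x)\, z_i = \nabla f(x)^T z = \nabla f(x)^T (y - x).$$
Because the question restricts $t$ to $\mathbb{R}^+$, the limit there is one-sided, but it agrees with $g'(0)$ whenever the two-sided derivative exists.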