I'm working through Fact 1 on page 11, Chapter 1 of Implicit Functions and Solution Mappings: A View from Variational Analysis by Dontchev and Rockafellar (famed author of Convex Analysis). Here is the statement and proof:
I follow the proof up to the point where the authors invoke the triangle inequality and the continuity of the Jacobian $\nabla f(x)$ at $\bar{x}$. It seems to me that property $(a)$ is strictly a result of the function $f$ being differentiable everywhere in a convex neighborhood of $\bar{x}$. I actually can't quite see why property $(b)$ is true.
So I don't see how the conclusions follow from the triangle inequality and continuity. What am I missing? Also, is there something a little Lipschitzian about these inequalities?

Subtracting from both sides, you say you agree with the author(s) that: $$\langle h,f(x’)-f(x)-\nabla f(x)(x’-x)\rangle=\langle h,[\nabla f(x+t(x’-x))-\nabla f(x)](x’-x)\rangle$$ for some $0<t<1$ (which may depend on $h$) and any (unit) vector $h\in\Bbb R^m$. Note that $\|x+t(x’-x)-x\|<\|x’-x\|$ and, since balls are convex, $x+t(x’-x)$ lies in any ball around $\tilde x$ that contains both $x$ and $x’$: here is where we use continuity of $\nabla f$ at $\tilde x$. By continuity, for arbitrary $\varepsilon>0$ there is $\delta>0$ such that $\|y-\tilde x\|<\delta$ implies the operator norm bound $\|\nabla f(y)-\nabla f(\tilde x)\|<\frac{\varepsilon}{2\sqrt{m}}$. Then, using the triangle inequality (valid for operator norms too!): $$\begin{align}\|\nabla f(y)-\nabla f(y’)\|&\le\|\nabla f(y)-\nabla f(\tilde x)\|+\|\nabla f(y’)-\nabla f(\tilde x)\|\\&<\frac{\varepsilon}{2\sqrt{m}}+\frac{\varepsilon}{2\sqrt{m}}=\frac{\varepsilon}{\sqrt{m}}\end{align}$$ for $y,y’\in B_{\delta}(\tilde x)$. For $x,x’$ in this ball we get, setting $y’:=x+t(x’-x)\in B_{\delta}(\tilde x)$ and $y:=x$: $$\|[\nabla f(x+t(x’-x))-\nabla f(x)](x’-x)\|<\frac{\varepsilon}{\sqrt{m}}\|x’-x\|$$
Take the standard coordinate basis $\{e_i\}_{i=1}^m$ for $\Bbb R^m$. By substituting $h=e_i$ into the inner products - with the effect of just picking out individual coordinates - we get that the $i$th coordinate of $f(x’)-f(x)-\nabla f(x)(x’-x)$ is equal to the $i$th coordinate of $[\nabla f(x+t_i(x’-x))-\nabla f(x)](x’-x)$ for some $0<t_i<1$ depending on $i$. This is bounded in absolute value by $\frac{\varepsilon}{\sqrt{m}}\|x’-x\|$ since, for any vector $y$, $\max(|y_i|)\le\|y\|$, and the bound above holds for every $t\in(0,1)$. Then, we can estimate: $$\|f(x’)-f(x)-\nabla f(x)(x’-x)\|<\frac{\varepsilon}{\sqrt{m}}\sqrt{m}\cdot\|x’-x\|=\varepsilon\cdot\|x’-x\|$$ for $x,x’$ in the mentioned ball. The estimate comes from the fact that: $$\begin{align}\|y\|&=\sqrt{y_1^2+\cdots+y_m^2}\\&\le\sqrt{\underset{m\text{ times}}{\underbrace{\max(y_i^2)+\max(y_i^2)+\cdots+\max(y_i^2)}}}\\&=\sqrt{m\max(y_i^2)}\\&=\max(|y_i|)\sqrt{m}\end{align}$$ for any vector $y$. If each coordinate is bounded by $C$ then the norm is bounded by $C\sqrt{m}$.
The continuity of the derivative is very important. You said it felt like $(a)$ was just a restatement of the definition of the derivative at $\tilde x$. Not quite. If $(a)$ had $x’$ replaced with some $y$ and $x$ replaced with $\tilde x$, then we would have the definition of the derivative at $\tilde x$. By allowing derivatives evaluated at arbitrary $x$, with $x’,x$ close to, but not necessarily equal to, $\tilde x$, you are involving a continuity argument. Indeed, $(a)$ would be the definition of the derivative at $x$ instead, were it not for the fact that $x’,x$ are both being chosen close to $\tilde x$: the quantification of the variables is different. The problem? Well, if you fixed an $x$ close to $\tilde x$, then the inequality in $(a)$ would hold for all $x’$ close enough to $x$. But as you pick $\varepsilon$ smaller, the necessary $\delta$ about $x$ may become so small that $\tilde x\notin B_{\delta}(x)$, and nothing guarantees that a single $\delta$ works for every $x$ near $\tilde x$ at once. Then the variable quantification in $(a)$ would become inappropriate: we would be forbidden from choosing such an $x$. I hope that clarifies the difference.
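To see that continuity of the derivative is genuinely needed, here is a small numerical sketch. It uses the standard example $f(t)=t^2\sin(1/t)$, $f(0)=0$ (my choice for illustration, not taken from the book), which is differentiable everywhere but whose derivative is not continuous at $\tilde x=0$. The pairs $x,x’$ are picked so that $f(x)=f(x’)=0$ and $f’(x)=1$, so the ratio that $(a)$ wants to be small should stay near $1$ no matter how close to $0$ we go.

```python
import math

def f(x):
    # Differentiable everywhere, but f' is not continuous at 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

def df(x):
    # f'(0) = 0 from the difference quotient; elsewhere the product rule.
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x) if x != 0.0 else 0.0

def ratio_a(x, xp):
    # |f(x') - f(x) - f'(x)(x' - x)| / |x' - x|, the quantity in (a).
    return abs(f(xp) - f(x) - df(x) * (xp - x)) / abs(xp - x)

ratios = []
for k in (10, 100, 1000, 10000):
    xp = 1.0 / (2.0 * math.pi * k)            # sin(1/x') = 0
    x = 1.0 / (2.0 * math.pi * k + math.pi)   # sin(1/x) = 0, cos(1/x) = -1
    ratios.append(ratio_a(x, xp))
    print(k, x, xp, ratios[-1])
```

Both points tend to $0$ as $k$ grows, yet the ratio does not shrink, so no $\delta$ works in $(a)$ for, say, $\varepsilon=\frac12$, even though $f$ is differentiable at every point.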
To pass from $(a)$ to $(b)$ requires continuity also. Essentially, $\|\nabla f(x)-\nabla f(\tilde x)\|$ is very small if $x$ is very close to $\tilde x$, and the triangle inequality allows you to add the two small errors to get another small error, so you can effectively replace $\nabla f(x)$ with $\nabla f(\tilde x)$.
More precisely, since we know $(a)$ is true, for $\varepsilon>0$ there is some $\delta’>0$ for which the inequality in $(a)$ holds for all $x’,x\in B_{\delta’}(\tilde x)$ with $\varepsilon/2$ in place of $\varepsilon$. By continuity of the derivative, there is some $\delta’’>0$ such that $\|x-\tilde{x}\|<\delta’’$ implies $\|\nabla f(x)-\nabla f(\tilde x)\|<\varepsilon/2$. Then let $\delta=\min(\delta’,\delta’’)>0$. We have: $$\begin{align}\|f(x’)-f(x)-\nabla f(x)(x’-x)\|&\le\frac{1}{2}\varepsilon\|x’-x\|\\\|\nabla f(x)(x’-x)-\nabla f(\tilde x)(x’-x)\|&\le\frac{1}{2}\varepsilon\|x’-x\|\end{align}$$ for all $x,x’\in B_{\delta}(\tilde x)$, since $B_{\delta}(\tilde x)$ is a subset of both $B_{\delta’}(\tilde x)$ and $B_{\delta’’}(\tilde x)$. By the triangle inequality: $$\begin{align}\|f(x’)-f(x)-\nabla f(\tilde x)(x’-x)\|&=\|[f(x’)-f(x)-\nabla f(x)(x’-x)]+[\nabla f(x)(x’-x)-\nabla f(\tilde x)(x’-x)]\| \\&\le\|f(x’)-f(x)-\nabla f(x)(x’-x)\|+\|\nabla f(x)(x’-x)-\nabla f(\tilde x)(x’-x)\|\\&\le\frac{1}{2}\varepsilon\|x’-x\|+\frac{1}{2}\varepsilon\|x’-x\|\\&=\varepsilon\|x’-x\|\end{align}$$ as required.
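If you want to see $(b)$ in action, the following is a rough numerical sketch rather than a proof. The map $f(x,y)=(\sin x+y^2,\;xy)$, the centre $\tilde x=(0.5,-1)$, and the helper names are all my own illustrative choices. It samples random pairs $x,x’$ in $B_{\delta}(\tilde x)$ and records the worst value of $\|f(x’)-f(x)-\nabla f(\tilde x)(x’-x)\|/\|x’-x\|$ for a few radii $\delta$.

```python
import math
import random

def f(p):
    # Illustrative continuously differentiable map R^2 -> R^2.
    x, y = p
    return (math.sin(x) + y * y, x * y)

def jac(p):
    # Jacobian of f, computed by hand.
    x, y = p
    return ((math.cos(x), 2.0 * y), (y, x))

def worst_ratio(center, delta, samples=2000, seed=0):
    # Largest sampled value of ||f(x') - f(x) - J(center)(x' - x)|| / ||x' - x||
    # over random pairs x, x' in the ball of radius delta around center.
    rng = random.Random(seed)
    J = jac(center)

    def sample():
        r = delta * math.sqrt(rng.random())
        t = 2.0 * math.pi * rng.random()
        return (center[0] + r * math.cos(t), center[1] + r * math.sin(t))

    worst = 0.0
    for _ in range(samples):
        p, q = sample(), sample()
        d = (q[0] - p[0], q[1] - p[1])
        dist = math.hypot(d[0], d[1])
        if dist == 0.0:
            continue  # skip degenerate pairs
        fp, fq = f(p), f(q)
        e0 = fq[0] - fp[0] - (J[0][0] * d[0] + J[0][1] * d[1])
        e1 = fq[1] - fp[1] - (J[1][0] * d[0] + J[1][1] * d[1])
        worst = max(worst, math.hypot(e0, e1) / dist)
    return worst

center = (0.5, -1.0)
results = {delta: worst_ratio(center, delta) for delta in (1e-1, 1e-2, 1e-3)}
for delta, w in results.items():
    print(delta, w)
```

For this particular $f$ the Jacobian is itself Lipschitz, so the worst ratio should shrink roughly in proportion to $\delta$; in general, continuity of $\nabla f$ only guarantees that it tends to $0$ as $\delta\to0$, which is exactly what $(b)$ asserts.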
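As for the Lipschitz remark, yes: continuously differentiable functions are locally Lipschitz for this very reason (though not always globally). Indeed, $(b)$ and the triangle inequality give $\|f(x’)-f(x)\|\le(\|\nabla f(\tilde x)\|+\varepsilon)\|x’-x\|$ for all $x,x’\in B_{\delta}(\tilde x)$, whereas $f(x)=x^2$ on $\Bbb R$ is continuously differentiable but not globally Lipschitz.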