I have a question about the proof of Rademacher's theorem due to C. B. Morrey. (I'm reading it in Simon's Lectures on geometric measure theory.) The proof can be summarized in the following steps:
1. For every $ v \in R^n $ with $ |v|=1 $, the directional derivative $ D_vf $ exists a.e. in $ R^n $.
2. The directional derivative $ D_vf $ is a weak derivative, and therefore
   $$D_vf(x)= v\cdot \nabla f(x) \text{ a.e. in } R^n.$$
3. There exists $ A \subset R^n $ such that $L^n(R^n-A)=0 $ ($L^n $ is the Lebesgue measure) and such that for every $ x \in A $ and for every $ v $ the directional derivative $ D_vf(x) $ exists. Note the difference between this step and the first step.
As I've read, these three steps should conclude the proof. But my doubt is the following:
Since a continuous function with all of its directional derivatives at a point $ x $ is not necessarily differentiable at $ x $, how can I conclude the proof using only these three steps?
When moving to step 3, you should not forget the result of step 2: the directional derivative is linear in $v$. Still, the existence and linearity of $D_vf(x)$ do not imply the differentiability of $f$ at $x$. The Lipschitz condition must be used yet again.
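To see why the Lipschitz condition cannot simply be dropped, here is a standard illustrative example (my own choice, not taken from Simon or Heinonen) of a continuous function on $R^2$ whose directional derivatives at the origin all exist and are linear in $v$, yet which is not differentiable there. Let $$f(x,y)=\frac{x^3y}{x^4+y^2}\ \text{ for } (x,y)\ne(0,0),\qquad f(0,0)=0.$$ Since $|x^2y|\le \tfrac12(x^4+y^2)$, we have $|f(x,y)|\le |x|/2$, so $f$ is continuous at the origin. For $v=(a,b)\ne 0$ and $t\ne 0$, $$\frac{f(ta,tb)}{t}=\frac{t\,a^3b}{t^2a^4+b^2}\to 0\quad\text{as } t\to 0$$ (if $b=0$ the quotient is identically zero). Hence $D_vf(0)=0$ for every $v$, which is linear in $v$ with $A=0$. Along the parabola $y=x^2$, however, $f(x,x^2)=x/2$ while $|(x,x^2)|=|x|\sqrt{1+x^2}$, so $$\frac{f(x,x^2)}{|(x,x^2)|}\to \frac12\quad\text{as } x\to 0^+,$$ and $f$ is not differentiable at the origin. In view of the claim proved below, this $f$ cannot be Lipschitz in any neighborhood of the origin.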
You may want to consult Lectures on Lipschitz Analysis by Heinonen, where the proof of Rademacher's theorem is carefully presented in Chapter 3. The following quote directly addresses your question.
Anyway, here is a self-contained proof of the "missing piece".
Claim. Suppose $f$ is Lipschitz and $x$ is a point such that $D_vf(x)$ exists for all $v$ and is linear in $v$; that is, $D_vf(x)=Av$ for some matrix $A$. Then $f$ is differentiable at $x$, and $Df(x)=A$.
Proof. Let $L$ be the Lipschitz constant of $f$. Introduce $g(h)=f(x+h)-f(x)-Ah$. The function $g$ is $2L$-Lipschitz because the norm of $A$ is at most $L$. To prove the differentiability of $f$ at $x$, we must show that $$\lim_{h\to 0}\frac{g(h)}{|h|}=0.$$ Given $\epsilon>0$, let $v_1,\dots,v_N$ be an $\epsilon$-net in the unit sphere (since the sphere is compact, it has a finite $\epsilon$-net for every $\epsilon>0$). By the directional differentiability assumption, there exists $\delta>0$ such that $|g(tv_j)/t|<{\epsilon}$ whenever $0<|t|\le \delta$ and $1\le j\le N$.
Given a vector $h$ with $0<|h|<\delta$, let $t=|h|$ and pick $j$ such that $|v_j-h/t|\le \epsilon$. Since $|tv_j-h|\le t\epsilon$, it follows that $$|g(h)|\le |g(tv_j)|+2Lt\epsilon \le t\epsilon + 2Lt\epsilon = (2L+1) t\epsilon.$$ Hence $|g(h)|/|h|\le (2L+1)\epsilon$. Since $\epsilon$ was arbitrary, we are done. $\Box$
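For readers who want the two estimates used in the proof spelled out, here is a short derivation; it uses only the definitions above. First, for a unit vector $v$, $$|Av|=|D_vf(x)|=\lim_{t\to 0}\frac{|f(x+tv)-f(x)|}{|t|}\le L,$$ so the norm of $A$ is at most $L$. Consequently, for all $h,h'$, $$|g(h)-g(h')|\le |f(x+h)-f(x+h')|+|A(h-h')|\le 2L|h-h'|,$$ which is the $2L$-Lipschitz property of $g$. Applying this with $h'=tv_j$ gives the first inequality in the final estimate: $$|g(h)|\le |g(tv_j)|+|g(h)-g(tv_j)|\le |g(tv_j)|+2L|h-tv_j|\le t\epsilon+2Lt\epsilon.$$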