I am trying to do the following exercise: prove that a Lipschitz function $f: \mathbb{R}^n \to \mathbb{R}^m$ is approximately differentiable $\mathcal{L}^n$-almost everywhere, using the following fact:
"For every $\epsilon > 0$ there exists a function $g \in C^1$ such that $\mathcal{L}^n (\{ g \neq f \} ) < \epsilon$."
I know about Rademacher's theorem, but of course I cannot use it here, since this is just an exercise.
Now, a function $f$ is said to be approximately differentiable at a point $x_0$ if there exists a linear map $L \colon \mathbb{R}^n \to \mathbb{R}^m$ (depending on $x_0$) such that the set $\left \lbrace x \text{ s.t. } \frac{|f(x) - f(x_0) - L (x-x_0)|}{|x-x_0| } < \delta \right \rbrace$ has density 1 at $x_0$ for every $\delta>0$.
A set $A$ having density 1 at $x_0$ means
$$ \lim_{r \to 0} \frac{\mathcal{L}^n (B(x_0,r) \cap A)}{\omega_n r^n} =1.$$
Consider the set $A = \bigcap_m \{ g_m = f\}$, where each $g_m \in C^1$ satisfies $\mathcal{L}^n (\{ g_m \neq f \} ) < \epsilon/2^m$ for a fixed $\epsilon >0$. The set $A$ has density 1 at almost every point $x_0 \in A$, i.e.:
$$\lim_{r \to 0} \frac{\mathcal{L}^n (B_r (x_0) \cap A)}{\omega_n r^n} =1. $$
This follows simply by noticing that almost every point of $\mathbb{R}^n$ is a Lebesgue point for $\mathcal{I}_A$, the characteristic function of $A$.
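For completeness, the Lebesgue-point computation can be written out as follows (a short sketch, using that $A$ is measurable as an intersection of the closed sets $\{g_m = f\}$ and that $\mathcal{L}^n(B_r(x_0)) = \omega_n r^n$):
$$\lim_{r \to 0} \frac{\mathcal{L}^n (B_r (x_0) \cap A)}{\omega_n r^n} = \lim_{r \to 0} \frac{1}{\mathcal{L}^n(B_r(x_0))} \int_{B_r(x_0)} \mathcal{I}_A(x)\, dx = \mathcal{I}_A(x_0) = 1 \quad \text{for a.e. } x_0 \in A.$$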
Then, writing $\operatorname{Lip}(f)$ for the Lipschitz constant of $f$, at a density point $x_0$ of $A$ one has:
$$\operatorname{Lip}(f) \geq \limsup_{A \ni x \to x_0}\frac{|f(x_0) - f(x)|}{|x_0 -x|} = \limsup_{A \ni x \to x_0}\frac{|g_m(x_0) - g_m(x)|}{|x_0 -x|} =|\nabla g_m |(x_0)$$
for every $m$, so the gradients $\nabla g_m(x_0)$ are bounded uniformly in $m$. To make things easier, assume $x_0 = 0 \in A$, $f(0)=0$ and denote $L_m := \nabla g_m (0)$. Passing to a subsequence if necessary, there exists a limit $\tilde {L}$ of the $L_m$ (in the operator norm), and choosing $L= \tilde{L}$ in the definition of approximate differentiability gives:
$$ \frac{|f(x)- \tilde{L}x|}{|x|} \leq \frac{|f(x) - g_m (x)|}{|x|} + \frac{|g_m(x) - L_m (x)|}{|x|} + \frac{|L_m(x) - \tilde{L} (x)|}{|x|}$$
But if $x_0$ is a density point of $A$, the first term is 0 on a set of density 1 at $x_0$ (namely $A$), while the second and the third can be made arbitrarily small by choosing $|x|$ sufficiently small and $m$ sufficiently large, respectively. Then $f$ is approximately differentiable at almost every $x_0 \in A$. But $\mathcal{L}^n (A^c) \leq \epsilon$, and I can conclude by sending $\epsilon$ to 0. Is this correct?
There are some problems.
Possible solution. It is better to consider the following situation. For each $k$, let $A_k$ be a measurable set for which there exists a $C^1$ function $g_k\colon \mathbb R^n \to \mathbb R^m$ with
$$ f(x) = g_k(x) \quad\text{for all $x\in A_k$,} \quad \mathcal L^n(\mathbb R^n\setminus A_k) \le 2^{-k}. $$
Then you have the decomposition $\mathbb R^n = \Sigma \cup \bigcup\limits_{k=1}^\infty A_k$, where $\mathcal L^n(\Sigma) = 0$ and $\left. f \right|_{A_k}$ is the "trace" of a $C^1$ function. We can also assume that the sets are disjoint by passing to $B_k = A_k \setminus\bigcup\limits_{i=1}^{k-1}A_i$: then $\bigcup\limits_{k=1}^\infty A_k= \bigcup\limits_{k=1}^\infty B_k$ and $\left. f\right|_{B_k} = \left. g_k\right|_{B_k}$. If $x \in B_k$, then $\left. f\right|_{B_k}$ is approximately differentiable at $x$. Moreover, almost every $x \in B_k$ is a point of density $0$ for the set $\mathbb R^n \setminus B_k$, so $f$ itself is approximately differentiable at almost every $x \in B_k$, with approximate differential $\nabla g_k(x)$.
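The last step can be spelled out as follows. This is only a sketch; the names $E_\delta$ and $r_\delta$ are introduced here for illustration and do not appear in the question. Fix $\delta > 0$ and a point $x_0 \in B_k$ at which $B_k$ has density 1, and set
$$ E_\delta = \left\{ x \neq x_0 : |f(x) - f(x_0) - \nabla g_k(x_0)(x - x_0)| < \delta\, |x - x_0| \right\}. $$
Since $g_k$ is differentiable at $x_0$, there is $r_\delta > 0$ such that $|g_k(x) - g_k(x_0) - \nabla g_k(x_0)(x - x_0)| < \delta\,|x - x_0|$ whenever $0 < |x - x_0| < r_\delta$. Because $f = g_k$ on $B_k$, this gives $(B(x_0,r) \cap B_k) \setminus \{x_0\} \subset E_\delta$ for all $r < r_\delta$, and therefore
$$ 1 \ge \frac{\mathcal L^n(B(x_0,r) \cap E_\delta)}{\omega_n r^n} \ge \frac{\mathcal L^n(B(x_0,r) \cap B_k)}{\omega_n r^n} \xrightarrow[r \to 0]{} 1. $$
So $E_\delta$ has density 1 at $x_0$ for every $\delta > 0$, which is exactly approximate differentiability of $f$ at $x_0$ with $L = \nabla g_k(x_0)$.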
As for your solution: it is almost correct. When you define $A = \bigcap\limits_m \{g_m = f\}$, all the $g_m$ coincide with $f$ on $A$, so on $A$ you effectively have a single function $g$ with $g_m = g$ for every $m$. Finding the limit of a subsequence of $\nabla g_m(x_0)$ is then not a big deal: at a density point $x_0$ of $A$ this limit equals $\nabla g(x_0)$ (and, of course, equals every $\nabla g_m(x_0)$). Finally, given $x\in \mathbb R^n$, you can vary $\varepsilon>0$ and find an $\varepsilon_0$ and a corresponding set $A$ with $\mathcal L^n(\mathbb R^n\setminus A) \le \varepsilon_0$ and the other properties such that, in addition, $x\in A$ (this is possible for almost every $x \in \mathbb R^n$).
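Here is one way to see why the gradients agree. It is a sketch under the assumption that $A$ has density 1 at $x_0 \in A$; the symbols $u$, $e$, $c$, $C$ and $\theta$ are introduced only for this argument. Take two of the functions, say $g_m$ and $g_k$, and put $u = g_m - g_k$, so that $u \in C^1$ and $u = 0$ on $A$. Suppose $\nabla u(x_0) \neq 0$ and pick a unit vector $e$ with $c := |\nabla u(x_0)\, e| > 0$. The cone
$$ C = \left\{ x : |\nabla u(x_0)(x - x_0)| \ge \tfrac{c}{2}\, |x - x_0| \right\} $$
contains an open cone around the direction $e$, so by scaling $\frac{\mathcal L^n(B(x_0,r) \cap C)}{\omega_n r^n} = \theta > 0$ for every $r > 0$. By differentiability of $u$ at $x_0$, for $x \in C$ with $0 < |x - x_0|$ small enough,
$$ |u(x)| \ge |\nabla u(x_0)(x - x_0)| - \tfrac{c}{4}\,|x - x_0| \ge \tfrac{c}{4}\,|x - x_0| > 0, $$
so such $x$ do not belong to $A$. Hence $\frac{\mathcal L^n(B(x_0,r) \cap A)}{\omega_n r^n} \le 1 - \theta$ for small $r$, contradicting density 1. Therefore $\nabla g_m(x_0) = \nabla g_k(x_0)$.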