Statement of the chain rule: let $f: A \to \mathbb{R}$ and $g: B \to \mathbb{R}$ be such that $f(A) \subset B$, so that $g \circ f$ is defined. If $f$ is differentiable at $a \in A$ and $g$ is differentiable at $f(a) \in B$, then $g \circ f$ is differentiable at $a$ with $(g \circ f)'(a) = g'(f(a))f'(a)$.
This is the proof that was given:
1. $g'(f(a))$ exists by assumption, so: $g'(f(a)) = \lim_{y \to f(a)} \frac{g(y)-g(f(a))}{y-f(a)}$
2. $f'(a)$ exists by assumption, so: $f'(a) = \lim_{x \to a} \frac{f(x)-f(a)}{x-a}$
3. $f$ is differentiable at $a$ $\implies$ $f$ is continuous at $a$ $\implies$ $\lim_{x \to a} f(x) = f(a)$
4. Step 3 implies: $g'(f(a)) = \lim_{x \to a} \frac{g(f(x))-g(f(a))}{f(x)-f(a)}$
5. So, multiplying the two limits together, we get: $g'(f(a))f'(a) = \lim_{x \to a} \frac{g(f(x))-g(f(a))}{f(x)-f(a)}\frac{f(x)-f(a)}{x-a} = \lim_{x \to a} \frac{g(f(x))-g(f(a))}{x-a} = (g \circ f)'(a)$
I do not understand step 4 of the proof. How does step 3 imply that the limit on the RHS of step 1 equals the limit on the RHS of step 4? Can you please help me understand, as simply and in as much detail as possible?
It is easiest to see this using the sequential criterion for limits.
In Step 1: $g'(f(a)) = \lim_{y \to f(a)} \frac{g(y) - g(f(a))}{y - f(a)}$ can be rephrased as: for all sequences $\{y_n\}_{n = 1}^\infty$ converging to $f(a)$, with $y_n \neq f(a)$ for all $n$, we must have $g'(f(a)) = \lim_{n \to \infty} \frac{g(y_n) - g(f(a))}{y_n - f(a)}$.
In Step 3: $\lim_{x \to a} f(x) = f(a)$ can be rephrased as: for all sequences $\{x_n\}_{n = 1}^\infty$ converging to $a$ we must have $\lim_{n \to \infty} f(x_n) = f(a)$.
Now let us finally look at your question:
In Step 4: we need to show $g'(f(a)) = \lim_{x \to a} \frac{g(f(x)) - g(f(a))}{f(x) - f(a)}$. To do this with the sequential criterion, we need to consider some arbitrary sequence $\{x_n\}_{n = 1}^\infty$ converging to $a$.
But if such a sequence $x_n \to a$ is given, notice that $\{f(x_n)\}_{n = 1}^\infty$ is a new sequence converging to $f(a)$ by our rephrasing $\lim_{n \to \infty} f(x_n) = f(a)$ of Step 3 above.
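But then, by our rephrasing $g'(f(a)) = \lim_{n \to \infty} \frac{g(y_n) - g(f(a))}{y_n - f(a)}$ of Step 1 above, we can set $y_n := f(x_n)$ to get $g'(f(a)) = \lim_{n \to \infty} \frac{g(y_n) - g(f(a))}{y_n - f(a)} = \lim_{n \to \infty} \frac{g(f(x_n)) - g(f(a))}{f(x_n) - f(a)}$. (This substitution needs $f(x_n) \neq f(a)$ so that the quotient is defined; the given proof tacitly assumes this whenever it divides by $f(x) - f(a)$.)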
And this is exactly what it means to have $g'(f(a)) = \lim_{x \to a} \frac{g(f(x)) - g(f(a))}{f(x) - f(a)}$ according to the sequential criterion.
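If it helps to see the substitution $y_n := f(x_n)$ with actual numbers, here is a small numerical sketch. The choices $f(x) = x^2$, $g(y) = \sin y$, $a = 1$ and $x_n = a + 1/n$ are my own illustrative assumptions, not part of the proof; any differentiable pair with $f(x_n) \neq f(a)$ would do. It only illustrates the convergence and does not prove anything.

```python
import math

def f(x):
    return x * x          # assumed inner function; f'(x) = 2x

def g(y):
    return math.sin(y)    # assumed outer function; g'(y) = cos(y)

a = 1.0
target = math.cos(f(a))   # g'(f(a)) = cos(1) for this choice of f and g

def quotient_in_y(y):
    """Difference quotient of g at the point f(a), as in Step 1."""
    return (g(y) - g(f(a))) / (y - f(a))

def step4_quotients(ns):
    """Evaluate the Step 1 quotient along y_n = f(x_n), as in Step 4."""
    out = []
    for n in ns:
        x_n = a + 1.0 / n      # x_n -> a and x_n != a
        y_n = f(x_n)           # y_n -> f(a) by continuity; here y_n != f(a)
        out.append(quotient_in_y(y_n))
    return out

ns = [10, 100, 1000, 10000]
errors = [abs(q - target) for q in step4_quotients(ns)]
for n, e in zip(ns, errors):
    print(n, e)   # distance from g'(f(a)) along the sequence
```

The printed distances shrink as $n$ grows: the quotient from Step 1, evaluated along the particular sequence $y_n = f(x_n)$, approaches $g'(f(a))$, which is what Step 4 claims.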