From Statistical Inference by Casella and Berger:
$\text{(Delta Method)}$ Let $Y_n$ be a sequence of random variables that satisfies $\sqrt{n}(Y_n - \theta) \rightarrow n(0,\sigma^2)$ in distribution. For a given function $g$ and a specific value of $\theta$, suppose that $g'(\theta)$ exists and is not $0$. Then $\sqrt{n}[g(Y_n) - g(\theta)] \rightarrow n(0, \sigma^2[g'(\theta)]^2)$ in distribution.
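As a sanity check of the statement (my own illustration, not from the book), one can simulate a concrete case: take $Y_n$ to be the mean of $n$ iid Exponential(1) draws, so $\theta = 1$, $\sigma^2 = 1$, and let $g(x) = x^2$, so $g'(\theta) = 2$. The theorem predicts $\sqrt{n}[g(Y_n) - g(\theta)]$ is approximately $n(0, \sigma^2[g'(\theta)]^2) = n(0,4)$, i.e. standard deviation $2$.

```python
# Monte Carlo check of the Delta Method for Y_n = mean of n iid Exp(1)
# draws (theta = 1, sigma^2 = 1) and g(x) = x^2 (so g'(theta) = 2).
# The predicted limiting standard deviation of sqrt(n)[g(Y_n) - g(theta)]
# is sigma * |g'(theta)| = 2.
import random
import statistics

random.seed(0)

def delta_sample(n, reps):
    """Draw `reps` realizations of sqrt(n) * (Y_n**2 - 1)."""
    out = []
    for _ in range(reps):
        y_n = sum(random.expovariate(1.0) for _ in range(n)) / n
        out.append(n ** 0.5 * (y_n ** 2 - 1.0))
    return out

sd = statistics.stdev(delta_sample(n=500, reps=4000))
print(round(sd, 2))  # should be close to 2
```

The empirical standard deviation lands near $2$, matching $\sigma|g'(\theta)|$ from the theorem.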
The Taylor expansion of $g(Y_n)$ around $Y_n = \theta$ is $$g(Y_n) = g(\theta) + g'(\theta)(Y_n-\theta) + \text{ Remainder, }$$ where the remainder $\rightarrow 0$ as $Y_n \rightarrow \theta$.
Since $Y_n \rightarrow \theta$ in probability, it follows that the remainder $\rightarrow 0 $ in probability.
Why does $Y_n \rightarrow \theta$ in probability? This isn't assumed in the statement of the theorem, so why is this true?
Because $\sqrt{n}(Y_n-\theta) \to n(0,\sigma^2)$ in distribution and the limiting distribution function is continuous, the convergence of the distribution functions is uniform in the argument. Hence $$P(|\sqrt{n}(Y_n-\theta)| > t) - 2\left(1-\Phi\left(\tfrac{t}{\sigma}\right)\right) \to 0$$ uniformly in $t$. Fix $\epsilon > 0$ and take $t = \sqrt{n}\,\epsilon$: then $$P(|Y_n-\theta| > \epsilon) - 2\left(1-\Phi\left(\tfrac{\sqrt{n}\,\epsilon}{\sigma}\right)\right) \to 0.$$ Since $1-\Phi(\sqrt{n}\,\epsilon/\sigma) \to 0$ as $n \to \infty$, we get $P(|Y_n-\theta| > \epsilon) \to 0$, i.e. $Y_n \to \theta$ in probability.
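A quick Monte Carlo sketch of this conclusion (my own illustration, under the same assumed setup as above): with $Y_n$ the mean of $n$ iid Exponential(1) draws we have $\theta = 1$, $\sigma^2 = 1$, and $\sqrt{n}(Y_n - \theta) \to n(0,1)$ by the CLT. The empirical probability $P(|Y_n - \theta| > \epsilon)$ should therefore shrink to $0$ as $n$ grows.

```python
# Empirical check that Y_n -> theta in probability when
# sqrt(n)(Y_n - theta) converges in distribution: here Y_n is the mean
# of n iid Exp(1) draws, theta = 1, and eps is a fixed tolerance.
import random

random.seed(0)

def tail_prob(n, eps=0.1, reps=2000):
    """Estimate P(|Y_n - theta| > eps) with theta = 1."""
    hits = 0
    for _ in range(reps):
        y_n = sum(random.expovariate(1.0) for _ in range(n)) / n
        if abs(y_n - 1.0) > eps:
            hits += 1
    return hits / reps

for n in (10, 100, 1000):
    print(n, tail_prob(n))
```

The estimated tail probability drops sharply with $n$, consistent with $P(|Y_n-\theta| > \epsilon) \approx 2(1-\Phi(\sqrt{n}\,\epsilon/\sigma)) \to 0$.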