There is the very well-known proof of Schwarz's lemma in complex analysis. When I read it, I feel much like the answers described here. Although the proof is relatively short, I'm not sure how I would motivate it, or explain why one should expect it to work, when talking to someone new to complex analysis.
We are given that $f$ is a holomorphic map on the unit disc $D$ such that $f(0)=0$ and $|f(z)| \le 1$ on $D$. I never understood why there isn't some way to take this and look at $$f(z) = a_1z+a_2z^2 +\cdots$$ where the $a_i$ are constrained by $|f(z)|\le 1$ in such a way that one can conclude that $|f(z)|\le |z|$ and $|f'(0)|\le 1$. Why can't such an approach work?
The Schwarz lemma is a profound result of non-Euclidean geometry: it says that holomorphic maps of the unit disc into itself do not increase hyperbolic distance on the unit disc. There is a book by S. Dineen called precisely that (The Schwarz Lemma, Clarendon Press, 1989, reprinted as an inexpensive paperback by Dover) that goes into its many facets and extensions.
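To spell out the geometric statement, here is the usual invariant form (often called the Schwarz-Pick lemma), written as a sketch; the symbols $z, w$ are just arbitrary points of $D$. For holomorphic $f$ mapping $D$ into $D$,

$$\left|\frac{f(z)-f(w)}{1-\overline{f(w)}\,f(z)}\right| \le \left|\frac{z-w}{1-\overline{w}\,z}\right| \qquad (z,w\in D).$$

Both sides are the pseudo-hyperbolic distance, of which the hyperbolic distance is an increasing function. Taking $w=0$ and $f(0)=0$ recovers $|f(z)|\le |z|$, and dividing by $|z|$ and letting $z\to 0$ gives $|f'(0)|\le 1$.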
The results you want about the Taylor coefficients of functions that satisfy it (or, more generally, of bounded analytic functions on the disc) are known collectively as Schur's theorem, with extensions by Carathéodory, Pick and Nevanlinna. They are not very useful directly and are more important theoretically, as they are expressed in terms of the positivity of quadratic forms in an arbitrary number of variables and similar inequalities, or, more abstractly, in terms of shift/multiplication operators on Hardy spaces. Their abstract characterization implies the classical Schwarz lemma fairly easily, though it feels rather like going backward.
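To make the last remark concrete, here is a sketch under one standard formulation of Schur's criterion; the notation $T_n$ and $g$ is introduced here for illustration only. For $f(z)=\sum_{k\ge 0} a_k z^k$ holomorphic on $D$, the criterion says that $|f|\le 1$ on $D$ if and only if, for every $n$, the lower triangular Toeplitz matrix

$$T_n(f)=\begin{pmatrix} a_0 & & & \\ a_1 & a_0 & & \\ \vdots & \ddots & \ddots & \\ a_n & \cdots & a_1 & a_0 \end{pmatrix}$$

is a contraction, that is, $I - T_n(f)^{*}T_n(f)\ge 0$. If $a_0=0$, the first row and the last column of $T_n(f)$ vanish, and deleting them leaves exactly $T_{n-1}(g)$ for $g(z)=f(z)/z=a_1+a_2z+\cdots$. A submatrix of a contraction is again a contraction, so $|g|\le 1$ on $D$, which gives

$$|f(z)|\le |z| \quad\text{and}\quad |f'(0)|=|a_1|=|g(0)|\le 1.$$

So the coefficient approach does work, but only after the constraint on the $a_i$ has been phrased as positivity of a whole family of quadratic forms rather than as a bound on each coefficient separately.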