For nonempty sets $A$ and $B$ and functions $f \colon A \to B$ and $g \colon B \to A$, suppose that $g \circ f =$ the identity function on $A$. $(♦)$
(e) $(\Leftarrow)$ Assume that $g$ is one-to-one. Because $g$ is a function, for all $b \in B$, there exists $a \in A$ such that $a = g(b)$. Apply $\color{orangered}{f}$ to both sides: $\color{orangered}{f}(a)= \color{orangered}{f}(g(b))$. Next, apply $g$ to both sides: \begin{align} g(\color{orangered}{f}(a))& =g(\color{orangered}{f}(g(b))) \\ \text{ thanks to } (♦) &= g(b). \end{align} $g$ is given as one-to-one, so $f(a)=b$ and so $f$ is onto.
What's the proof strategy? This isn't a duplicate of https://math.stackexchange.com/a/750495/53259. I'm posting de novo because this direction looks wilier and more guileful. I'm not asking about the proof or formal arguments. For example, how would one determine when to apply $f$ or $g$?
I realise that the proof leverages $(♦)$. Are there pictures?
The overall proof strategy is called "chasing definitions". For the first three steps there is only one thing to do, so we follow the strategy of "fitting square pegs in square holes".
So let's begin:
The definition of $f:A\rightarrow B$ being onto is that for every $b\in B$ there exists $a\in A$ such that $f(a)=b$.
So we have to start with an arbitrary $b\in B$ and then find a suitable $a$.
What can we do with a $b\in B$? Look at what we have: we have $b\in B$, we have a function that takes things in $A$ to $B$, and a function which takes things in $B$ to $A$. We have one square peg, and one square hole. The only thing we can do is apply $g$. So we apply $g$.
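Written out, calling the result $a$ (the same name the quoted proof uses):
$$g(b) = a.$$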
Okay now we have an $a\in A$. What can we do with an $a\in A$? We have one round peg and one round hole. The only thing we can do is apply $f$.
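In symbols, applying $f$ to both sides of $g(b) = a$ gives
$$f(g(b)) = f(a).$$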
Now we have an $f$ on the outside of each side. If we knew that $f\circ g$ were the identity we would be done (remember our goal? we are trying to get $f(a)=b$). But we only know that $g\circ f = \mbox{id}_A$. To apply that tool we need a $g$ on the outside. Our other tool is that we are allowed to cancel $g$ when it appears on both sides, because $g$ is 1-1. Again, this requires $g$ on the outside. To apply either of our tools then, we need $g$ on the outside. So we apply $g$.
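Applying $g$ to both sides of $f(g(b)) = f(a)$, the line now reads
$$g(f(g(b))) = g(f(a)).$$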
Alright, now we are in a situation where both of our two tools apply. So we have to choose which tool to use. Obviously we shouldn't use the tool that cancels $g$ on both sides, because that takes us back to the previous step (don't walk backwards). So we should use the tool that cancels $g\circ f$. But where should we use it? If we collapse $g(f(a))$ to $a$, then that side has run out of outside $g$'s and we won't be able to use our other tool (don't go down a dead-end street). We want to use our tool but still have a $g$ left on each side to apply our other tool on. So we cancel $g\circ f$ on the other side, turning $g(f(g(b)))$ into $g(b)$.
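Concretely, since $g\circ f = \mbox{id}_A$ and $g(b)\in A$, the expression $g(f(g(b)))$ collapses to $g(b)$, so $g(f(g(b))) = g(f(a))$ becomes
$$g(b) = g(f(a)).$$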
Now we have a $g$ on both sides and can apply our 1-1 tool.
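Since $g$ is 1-1, the equation $g(b) = g(f(a))$ forces
$$b = f(a).$$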
Which was our goal, so we stop.
Notice that once we get to the fourth step, we have to start deciding which hypothesis to apply: we have two of them, one is that $g\circ f$ is the identity and the other is that we can cancel $g$'s when they appear on both sides. When we have a choice between two tools, one way to choose is to rule out the other. In this case it was clear which tool we shouldn't use, which left us with only one other tool to apply. To make our decisions we applied two maxims: don't go down a dead-end street and don't walk backwards.