I understand the proof of the Picard-Lindelöf theorem, but I am having trouble understanding why someone would attempt to use Lipschitz continuity in the first place. The condition of Lipschitz continuity seems ad hoc. What is the motivation for it on an intuitive level? This could be addressed by answering one of the following questions:
Why would the failure to be Lipschitz continuous at a point allow for a break in solutions, that is, a loss of uniqueness? (I am not looking for an example, but an intuitive understanding.)
Why would one look for a solution through a Picard iteration? That is, why would someone have the intuition that the integral equation (induced by the differential equation) should contract approximations?
You are trying to prove the existence of a solution of an equation via a fixed-point reformulation. The natural choices for the ODE are $$ y'_{n+1}=f(x,y_n(x)) $$ and $$ y'_{n}=f(x,y_{n+1}(x)). $$ The second one has several problems. First, one needs to solve an implicit equation to get the value of $y_{n+1}$. Second, differentiation is smoothness-destroying, which severely restricts the applicability of this iteration.
In contrast, the first variant requires integrating $y_{n+1}'$ to get $$ y_{n+1}(x)=y_{n+1}(x_0)+\int_{x_0}^xf(s,y_n(s))\,ds. $$ Using the natural choice $y_{n+1}(x_0)=y_0$, one finds the Picard iteration. Due to the integration, the smoothness class of the iterates increases by one order in each step, until the smoothness of $f$ is surpassed by one.
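To see the iteration at work, here is a minimal numerical sketch in Python. The test problem $y'=y$, $y(0)=1$ on $[0,1]$, the grid size, the trapezoidal rule for the integral, and the helper name `picard_iterate` are all my own assumptions for illustration; nothing about them is canonical.

```python
import math

def picard_iterate(f, x0, y0, x_end, n_grid=1000, n_iter=20):
    """Hypothetical helper: run n_iter Picard steps on a uniform grid.

    Each step computes y_{n+1}(x) = y0 + integral_{x0}^{x} f(s, y_n(s)) ds,
    approximating the integral with the cumulative trapezoidal rule.
    """
    h = (x_end - x0) / n_grid
    xs = [x0 + i * h for i in range(n_grid + 1)]
    y = [y0] * (n_grid + 1)          # initial guess: the constant function y0
    sup_changes = []                 # sup-norm distance between successive iterates
    for _ in range(n_iter):
        g = [f(x, yi) for x, yi in zip(xs, y)]   # integrand f(s, y_n(s))
        y_new = [y0]                              # enforces y_{n+1}(x0) = y0
        for i in range(n_grid):
            y_new.append(y_new[-1] + 0.5 * h * (g[i] + g[i + 1]))
        sup_changes.append(max(abs(a - b) for a, b in zip(y_new, y)))
        y = y_new
    return xs, y, sup_changes

# Assumed test problem: y' = y, y(0) = 1, whose exact solution is exp(x).
xs, y, changes = picard_iterate(lambda x, y: y, 0.0, 1.0, 1.0)
print("approximation of y(1):", y[-1], " exact:", math.e)
print("first sup-norm changes:", changes[:5])
```

For this particular problem the iterates are, up to quadrature error, the Taylor partial sums of $e^x$, so the distance between successive iterates shrinks rapidly. That is the contraction showing up numerically.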
Additionally, $C^{k+1}$ is a small, thin subset of $C^k$, so the projection down after a Picard step results in something close to a contraction from these qualitative considerations alone. With a Lipschitz condition one obtains a quantifiable contraction.
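To make "quantifiable" concrete, here is the standard estimate, sketched under assumptions that are mine and not stated above: $T$ denotes the Picard map $T(y)(x)=y_0+\int_{x_0}^x f(s,y(s))\,ds$, $L$ is a Lipschitz constant of $f$ in its second argument, and $a$ is the half-length of the interval $|x-x_0|\le a$. For two continuous functions $y$ and $z$ on that interval, $$ |T(y)(x)-T(z)(x)| \le \left|\int_{x_0}^x |f(s,y(s))-f(s,z(s))|\,ds\right| \le L\,|x-x_0|\,\|y-z\|_\infty \le La\,\|y-z\|_\infty. $$ So $T$ is a contraction in the sup norm as soon as $La<1$, and the Lipschitz constant is exactly what turns the qualitative picture into a number.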