My Understanding:
I would derive the exponential random variable as follows:
I consider an experiment consisting of a continuum of trials on an interval $[0,t)$. The outcome of the experiment is an ordered $n$-tuple, for some $n \in \mathbb{N}$, of distinct points on the interval. Every outcome is equally likely, and I measure the size of the set of tuples of $n$ distinct points by $I_n$:
$$ I_n = \int_0^{t} \int_0^{x_{n}} \int_0^{x_{n-1}} \cdots \int_0^{ x_2 } dx_1 dx_{2} dx_{3} \dots dx_{n-1} dx_{n} = \frac{ t^n } { n! }$$
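As a sanity check on this formula (a sketch in Python; `estimate_I_n` is an illustrative helper, not from the original): the ordered region $\{0 < x_1 < \cdots < x_n < t\}$ occupies a $1/n!$ fraction of the cube $[0,t]^n$, so counting how often uniform draws come out already sorted estimates $I_n$.

```python
import math
import random

def estimate_I_n(t, n, trials=200_000):
    """Monte Carlo estimate of the volume of {0 < x_1 < ... < x_n < t}."""
    random.seed(0)  # reproducibility
    hits = 0
    for _ in range(trials):
        xs = [random.uniform(0, t) for _ in range(n)]
        if all(xs[i] < xs[i + 1] for i in range(n - 1)):
            hits += 1
    # Fraction of ordered draws times the volume of the cube [0, t]^n.
    return t**n * hits / trials

t, n = 2.0, 3
print(estimate_I_n(t, n))          # Monte Carlo estimate of I_3
print(t**n / math.factorial(n))    # exact value t^n / n!
```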
Since the set in which no success occurs has size $I_0 = 1$, the ratio of this size to the total size gives the probability that no event occurs, and its complement yields the c.d.f.:
$$ \begin{align} P(\text{no successful trial in } [0,t)) &= \frac{I_0}{\sum_{n = 0}^{\infty} I_n(t)} = \Big(\sum_{n = 0}^{\infty} I_n(t)\Big)^{-1} = e^{-t} \\ P(X \leq t) &= 1 - e^{-t} \\ \end{align} $$
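This c.d.f. is that of a rate-1 exponential variable, which can be checked empirically (a sketch using Python's standard library; the sample size and test points are arbitrary):

```python
import math
import random

random.seed(0)
# Waiting times drawn from a rate-1 exponential distribution.
samples = [random.expovariate(1.0) for _ in range(100_000)]

for t in (0.5, 1.0, 2.0):
    empirical = sum(1 for x in samples if x <= t) / len(samples)
    print(t, empirical, 1 - math.exp(-t))  # empirical c.d.f. vs. 1 - e^{-t}
```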
Where I Lose Intuition:
It's easy to arrive at a power series solution to the ODE:
$$y' = y \text{ with } y(0) = 1$$ $$ \boxed{ y = \sum_{n = 0}^{\infty} \frac{ t^n }{ n ! } = e^{t}} $$
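Note that the $n$-th term of this series is exactly the $I_n = t^n/n!$ from the derivation above, so the partial sums converge to $e^t$. A short numerical check (a sketch; `partial_sum` is an illustrative name):

```python
import math

def partial_sum(t, N):
    """Sum of the series terms t^n / n! for n = 0..N (each term equals I_n)."""
    return sum(t**n / math.factorial(n) for n in range(N + 1))

t = 1.5
for N in (2, 5, 10, 20):
    print(N, partial_sum(t, N), math.exp(t))  # partial sum vs. e^t
```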
My problem is that I do not understand the role of each term in the expansion. Substitution by power series is an attractive idea, but I have no deep intuition as to why we'd do this and hence, I'm having trouble putting it all together to understand the solution.
Question
How do I interpret the power series solution of the ODE? Hopefully this will allow me to reconcile my understanding of both these processes.
I am not willing to accept this as pure coincidence.
A Poisson process with rate $\lambda$ is the limit, as $n \to \infty$, of a Bernoulli process that attempts a jump at each time $k/n$, $k \in \mathbb{N}$, and succeeds with probability $\lambda/n$. Here $n$ is also in $\mathbb{N}$, but you must have $n \geq \lambda$ for this to make any sense.
So the probability of no successes in $[0,k/n]$ is given by the Binomial($k$,$\lambda/n$) distribution to be $(1-\lambda/n)^k$. You now set $k=t n$ for some $t>0$ and send $n \to \infty$. This constraint on $k$, or something very similar, is necessary so that the limit will be in $(0,1)$. It is also physically meaningful, since it is just setting the total time to wait.
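Numerically, the convergence of $(1-\lambda/n)^{tn}$ to $e^{-\lambda t}$ is easy to watch (a sketch; the values of $\lambda$ and $t$ are arbitrary):

```python
import math

lam, t = 2.0, 1.5
for n in (10, 100, 1_000, 100_000):
    k = int(t * n)  # number of Bernoulli trials up to time t
    print(n, (1 - lam / n) ** k, math.exp(-lam * t))
```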
The limit you get is $e^{-\lambda t}$, which is the survival function of the exponential distribution (the hazard rate itself is the constant $\lambda$). The intuition comes from going back to $(1-\lambda/n)^{tn}$ and recognizing something like compound depreciation: each attempt at another success depreciates your probability of having had no jumps by a further factor of $(1-\lambda/n)$, which in the continuum limit behaves like an exponential. The power series $e^{-\lambda t}=\sum_{k=0}^\infty (-\lambda t)^k/k!$ has no direct probabilistic significance, but you can compare it to the binomial expansion of $(1-\lambda/n)^{tn}$: the terms differ, but the terms of the latter converge to the terms of the former.
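That term-by-term convergence can be checked directly: the $k$-th binomial term $\binom{tn}{k}(-\lambda/n)^k$ approaches the $k$-th series term $(-\lambda t)^k/k!$ as $n$ grows (a sketch with arbitrary parameter values):

```python
import math

lam, t, n = 2.0, 1.5, 1_000_000
m = int(t * n)  # number of Bernoulli trials up to time t

for k in range(5):
    binom_term = math.comb(m, k) * (-lam / n) ** k  # from (1 - lam/n)^m
    series_term = (-lam * t) ** k / math.factorial(k)  # from e^{-lam t}
    print(k, binom_term, series_term)
```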