We assume that $n$ is a very large number. Then
\begin{align*} (1+n)^n &\simeq n^n + \binom{n}{1} n^{n-1} + \binom{n}{2}n^{n-2} + \cdots \\ &\simeq n^n(1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots) \\ &\simeq n^n e. \end{align*}
Therefore
$$ n^n = n n^{n-1} = n (1 + n-1)^{n-1} \simeq n e (n-1)^{n-1}. $$
Also, of course $e^n = e e^{n-1}$. We thus deduce that
$$ \left( \frac{n}{e} \right)^n \simeq n \left( \frac{n-1}{e} \right)^{n-1}.$$
So if we let $f(n) = (n/e)^n$, we see that if $n$ is very large, then
$$f(n) \simeq n f(n-1),$$
which is the functional equation of the factorial (actually of the Gamma function, up to some translation by $1$).
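As a sanity check, the approximate functional equation can be tested numerically. Working in log space to avoid overflow (the helper `log_f` below is just for illustration, not any standard function), the ratio $f(n) / \big( n\, f(n-1) \big)$ should drift toward $1$ as $n$ grows:

```python
import math

def log_f(n):
    # log of f(n) = (n/e)^n, i.e. n * (log n - 1); working with logs avoids overflow
    return n * (math.log(n) - 1)

# ratio f(n) / (n * f(n-1)), computed in log space; it approaches 1 as n grows
for n in [10, 100, 1000, 10000]:
    ratio = math.exp(log_f(n) - math.log(n) - log_f(n - 1))
    print(n, ratio)
```

The ratio behaves like $e^{-1/(2n)}$, consistent with the error analysis in the answer below being of size $O(1/n)$ in the exponent.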
I know this is far from a proof, but I wonder whether there is any truth to this heuristic argument. Is it perhaps possible to build on it and turn it into a proof?
Here's a version of the argument that is a bit more careful about error terms. Write $e_n = \left( 1 + \frac{1}{n} \right)^n = \frac{(n+1)^n}{n^n}$. We have $\frac{(n+1)^{n+1}}{n^n} = (n+1) e_n$, and iterating this gives
$$n^n = n! \prod_{i=1}^{n-1} e_i$$
and hence $n! = \frac{n^n}{\prod_{i=1}^{n-1} e_i}$. Since $e_i$ converges rapidly to $e$, we expect this to give, approximately, $n! \approx \left( \frac{n}{e} \right)^n$, and we can make this precise by bounding the error. As a rough first pass we have
$$\log e_n = n \log \left( 1 + \frac{1}{n} \right) = 1 - \frac{1}{2n} + O \left( \frac{1}{n^2} \right)$$
(using the first three terms of the Taylor series expansion of $\log (1 + x)$), which gives
$$\log \prod_{i=1}^{n-1} e_i = \sum_{i=1}^{n-1} \log e_i = (n-1) - \frac{H_{n-1}}{2} + O(1) = n - \frac{\log n}{2} + O(1)$$
and exponentiating this gives
$$n! = e^{O(1)} \sqrt{n} \left( \frac{n}{e} \right)^n$$
which gets us within a constant factor of Stirling's approximation. A more careful version of this argument would replace the $O(1)$ term with an actual constant, given by the sum of the error terms in the Taylor series approximations above (a sum which converges). This yields Stirling's approximation with a fixed but unknown multiplicative constant (a statement several other techniques can also prove), up to a multiplicative error of order $e^{ O \left( \frac{1}{n} \right)} = 1 + O \left( \frac{1}{n} \right)$, which can itself be estimated and bounded more precisely.
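Both steps of this argument can be checked numerically: the telescoping identity $n^n = n! \prod_{i=1}^{n-1} e_i$ is exact (here verified with exact rational arithmetic), and the ratio $n! / \big( \sqrt{n} \left( n/e \right)^n \big)$ should settle toward a constant, which by Stirling's formula is $\sqrt{2\pi} \approx 2.5066$. A rough sketch:

```python
import math
from fractions import Fraction

# Exact check of the telescoping identity n^n = n! * prod_{i=1}^{n-1} e_i,
# where e_i = (1 + 1/i)^i = (i+1)^i / i^i, using exact rationals.
n = 12
prod = math.prod(Fraction((i + 1) ** i, i ** i) for i in range(1, n))
assert Fraction(n ** n) == math.factorial(n) * prod

# The constant: n! / (sqrt(n) * (n/e)^n) should approach sqrt(2*pi) ~ 2.5066.
# lgamma(n + 1) = log(n!); everything is computed in log space to avoid overflow.
for n in [10, 100, 1000]:
    log_ratio = math.lgamma(n + 1) - 0.5 * math.log(n) - n * (math.log(n) - 1)
    print(n, math.exp(log_ratio))
```

The residual error in the printed ratio shrinks like $1 + O(1/n)$, matching the $e^{O(1/n)}$ factor above.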
By the way, this is a nice observation! I don't think I've seen anyone explain Stirling's approximation this way before. It is a sort of disguised discrete version of the argument based on comparing $\sum_{i=1}^n \log i$ to the integral $\int_1^n \log x \, dx$, where we instead compare the sum to $n \log n - n$ by taking finite differences.
You may also be interested in reading Terence Tao's notes on Stirling's approximation which I found very helpful. In particular the $\sqrt{2 \pi n}$ factor can be interpreted conceptually as coming from the central limit theorem.