Euler's method for integration can be written as,
$$ f(x) = x + g(x) $$
Assuming that $g$ has a Lipschitz constant $<1$, the iteration map $x \mapsto y - g(x)$ is a contraction and therefore has a unique fixed point by the Banach fixed point theorem. If we run the equation in reverse given the endpoint, we can use fixed point iteration to recover the correct starting point for each time step: given $y = f(x)$ and an initial guess $x_{[0]}$, which could be anywhere,
$$ x_{[i+1]} = y - g(x_{[i]}) $$
At convergence this gives us $x = f^{-1}(y)$. I understand why the iteration has a fixed point, but I cannot understand why that fixed point is going to be exactly the specific point we are looking for, namely $x$.
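For concreteness, a small numeric sketch of the iteration (with an assumed $g(x) = \tfrac12\sin x$, so the Lipschitz constant is $1/2 < 1$) does seem to recover the original $x$ from any starting guess:

```python
import math

# Assumed example: g(x) = 0.5*sin(x) has Lipschitz constant 0.5 < 1,
# so x -> y - g(x) is a contraction.
def g(x):
    return 0.5 * math.sin(x)

x_true = 1.2
y = x_true + g(x_true)      # y = f(x_true) with f(x) = x + g(x)

x = 0.0                     # arbitrary initial guess
for _ in range(50):
    x = y - g(x)            # x_{[i+1]} = y - g(x_{[i]})

# At convergence x satisfies x = y - g(x), i.e. f(x) = y.
print(abs(x - x_true))      # essentially zero
```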
Is there any easy way to see why this is true?
First correct the fixed-point equation to $x=y+g(x)$ with $y=x_{n-1}$ and $g(x)=hf(t_n,x)$ for an ODE $\dot x=f(t,x)$; the resulting fixed point $x_n=x^*$ is then expected to be $O(h)$-close to $y$.
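Spelled out, the reason the fixed point is the correct point is a one-line equivalence:
$$ x^* = y + g(x^*) \iff x^* = x_{n-1} + h\,f(t_n, x^*) $$
so any fixed point solves exactly the implicit Euler equation for $x_n$, and since the map is a contraction for $hL<1$, the Banach theorem makes that fixed point unique: there is no other point the iteration could land on.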
For $h$ sufficiently small you have both a small enough Lipschitz constant for $g$ and can take the previous value $y$ on the numerical trajectory as initial value for the fixed-point iteration, as it will be inside the basin of attraction.
Using the forward Euler value $x=x_{n-1}+hf(t_{n-1},x_{n-1})$ as a predictor results in faster convergence for step sizes that produce good accuracy. For stiff ODEs with large step sizes, a sizeable disconnect between forward and backward Euler is expected.
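A minimal sketch of implicit Euler steps solved this way, with the forward Euler value as predictor (the linear test problem $f(t,x)=-x$ and the step size are assumed for illustration):

```python
# Implicit (backward) Euler for dx/dt = f(t, x): each step solves
# x_n = x_{n-1} + h*f(t_n, x_n) by fixed-point iteration, started
# from the forward Euler predictor.
def f(t, x):
    return -x                       # assumed linear test problem

h, steps = 0.1, 10
t, x = 0.0, 1.0
for _ in range(steps):
    y = x                           # previous value x_{n-1}
    x = y + h * f(t, y)             # forward Euler predictor
    for _ in range(30):
        x = y + h * f(t + h, x)     # corrector: x <- y + g(x)
    t += h

# For this linear problem the implicit Euler value has the closed
# form x_n = x_0 / (1 + h)**n, which the iteration should reproduce.
print(x)
```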
You are correct in your doubt: for stiff ODEs and large step sizes, the iteration may either fail to converge, or jump to a fixed point that is not close to the trajectory.
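For instance (a hypothetical stiff case), with $\dot x=-100x$ and $h=0.1$ the map $x\mapsto y+g(x)$ has Lipschitz constant $hL=10$, and the iteration blows up even though the implicit Euler equation itself is perfectly solvable:

```python
# Assumed stiff example: f(t, x) = -100*x with h = 0.1 gives
# g(x) = -10*x, so the iteration map is an expansion, not a contraction.
lam, h = 100.0, 0.1
y = 1.0                          # previous value x_{n-1}
x = y                            # initial guess
for _ in range(10):
    x = y + h * (-lam * x)       # contraction fails: factor |h*lam| = 10
print(abs(x))                    # ~9.1e9 after 10 iterations: divergence
```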