Let $E$ be a $\mathbb R$-Banach space, $\Omega\subseteq E$ be open, $f:\Omega\to\Omega$ be continuously Fréchet differentiable, $x_0\in\Omega$ and $\varepsilon>0$ with $B_\varepsilon(x_0)\subseteq\Omega$.
If I got the intuition right, the Lyapunov stability theory of the dynamical system $$x_n=f(x_{n-1})=\cdots=f^n(x_0)\;\;\;\text{for all }n\in\mathbb N,\tag1$$ is trying to measure the change of the system when $x_0$ is perturbed in direction $h\in E$, $\left\|h\right\|_E=1$, to $\tilde x_0=x_0+\varepsilon h$. By $(1)$, this change is equal to $$y_n:=f^n(\tilde x_0)-f^n(x_0)$$ after $n\in\mathbb N$ iterations.
If $E=\mathbb R$, I've read that "the" Lyapunov exponent $\lambda$ is measuring the "exponential change of the distance", $$\varepsilon e^{n\lambda}=|y_n|\;\;\;\text{for all }n\in\mathbb N\tag2.$$
Now, my question is: How do we know that the "change of the distance" happens at an exponential rate? Or is $(2)$ a "model assumption"?
I started asking myself this question when I saw the multiplicative ergodic theorem and wondered why the logarithm occurs in it (this form can be derived from $(2)$ by solving for $\lambda$ and letting $\varepsilon\to0$). Why don't we consider solely the distance instead of its logarithm?
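To spell out the derivation mentioned in parentheses, as far as I understand it (still for $E=\mathbb R$, so $h=\pm1$): solving $(2)$ for $\lambda$ gives

$$\lambda=\frac1n\log\frac{|y_n|}{\varepsilon}=\frac1n\log\frac{\left|f^n(x_0+\varepsilon h)-f^n(x_0)\right|}{\varepsilon}\xrightarrow{\varepsilon\to0}\frac1n\log\left|(f^n)'(x_0)\right|,$$

and it is the limit of the right-hand side as $n\to\infty$ that seems to appear in the ergodic-theoretic statements.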
And of course, my follow-up question is how $(2)$ generalizes to the general Banach space case. From what I've read so far, I assume one does not look at perturbations in arbitrary directions, but at those which form bases of the eigenspaces of $A:={\rm D}f(x_0)$ (which together form a basis of $\overline{\mathcal R(A)}$, but what about directions in $\mathcal N(A)$?).
I understand that the latter considerations need to be understood with respect to the linearization of $(1)$.
I will talk about the one-dimensional case, which is the one I understand better, in a not-so-general setting.
Suppose you have a transformation $T\colon[0,1]\to[0,1]$ which is somewhat nice, let's say piecewise $\mathcal{C}^1$. Think of maps such as the doubling map $x\mapsto 2x \pmod 1$ or the Gauss map $x\mapsto \frac{1}{x} \pmod 1$. We are interested in the rate at which nearby orbits diverge from each other as you iterate the map. Take two points $x,x+\delta\in[0,1]$ which are very close. You can approximate the difference $T(x+\delta)-T(x)$ by $T'(x)\,\delta$. If you keep iterating, the same works with $(T^n)'(x)$ in place of $T'(x)$: $T^n(x+\delta)-T^n(x)\approx(T^n)'(x)\,\delta$, as long as $\delta$ is small enough for the given $n$.
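As a quick numerical sanity check of this linearization, here is a minimal Python sketch for the doubling map. The function names, the starting point $0.1$ and the perturbation $10^{-8}$ are choices made only for this illustration. Because the map is taken mod $1$, the separation is measured as a distance on the circle, so that a pair of points straddling the discontinuity is not counted as far apart.

```python
def doubling(x):
    """Doubling map x -> 2x mod 1 on [0, 1)."""
    return (2.0 * x) % 1.0

def doubling_deriv(x):
    """Derivative of the doubling map away from its discontinuity."""
    return 2.0

def circle_distance(a, b):
    """Distance between a and b when [0, 1) is viewed as a circle."""
    d = abs(a - b)
    return min(d, 1.0 - d)

def separation_vs_linearization(T, dT, x, delta, n):
    """Return (actual separation after n steps, |(T^n)'(x)| * delta)."""
    y = x + delta
    deriv = 1.0
    for _ in range(n):
        deriv *= dT(x)      # chain rule: (T^n)'(x) = prod_k T'(T^k x)
        x, y = T(x), T(y)
    return circle_distance(x, y), abs(deriv) * delta

for n in (1, 5, 10, 15):
    actual, predicted = separation_vs_linearization(
        doubling, doubling_deriv, 0.1, 1e-8, n
    )
    print(n, actual, predicted)
```

The two printed columns should agree closely while $2^n\delta$ is still small. Once the predicted separation is of order $1$ the linear approximation stops being meaningful, since the orbits can no longer move further apart inside $[0,1]$.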
Now comes something important: you are interested in the typical behavior of the growth of $(T^n)'(x)$ for generic $x$ with respect to a given measure. For the maps mentioned above the choice of measure is not hard: the Lebesgue measure for the doubling map and the Gauss measure for the Gauss map, each of which is invariant and ergodic for its map (the Gauss measure is equivalent to the Lebesgue measure, so their null sets are the same). For the doubling map, $(T^n)'(x)=2^n$ for almost every $x\in[0,1]$ (in fact for all but countably many). Thus it only makes sense to look at the exponential growth rate of $(T^n)'(x)$, which is why we look at the almost sure value of
$$ \lambda(x) = \lim_{n\to\infty} \dfrac{1}{n}\log |(T^n)'x| $$
The magic part is that, by the chain rule, $(T^n)'(x)=\prod_{k=0}^{n-1}T'(T^kx)$, so the expression above is a Birkhoff average of $\log|T'|$. Provided $\log|T'|$ is $\mu$-integrable, the ergodic theorem gives
$$ \lambda(x) = \lim_{n\to\infty} \dfrac{1}{n}\log |(T^n)'x| = \lim_{n\to\infty}\dfrac{1}{n}\sum_{k=0}^{n-1} \log|T'\circ T^k x| = \int_{[0,1]} \log |T'| d\mu $$
for $\mu$-almost every point. For higher-dimensional systems the magic comes from Kingman's subadditive ergodic theorem, from which the multiplicative ergodic theorem can be derived.
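The identity "time average = space average" can be checked numerically for the Gauss map. The following is only a sketch under stated assumptions: the Gauss measure is taken to have density $\frac{1}{\log 2}\frac{1}{1+x}$, the closed form $\frac{\pi^2}{6\log 2}\approx 2.37$ for the integral is the standard value rather than something derived here, and the starting point and sample sizes are arbitrary choices. A floating-point orbit is not a true orbit, so the time average is merely expected to land near the integral.

```python
import math

def gauss_map(x):
    """Gauss map x -> 1/x mod 1 on (0, 1]."""
    return (1.0 / x) % 1.0

def log_abs_deriv(x):
    """log|T'(x)| for the Gauss map, where T'(x) = -1/x**2."""
    return -2.0 * math.log(x)

def birkhoff_average(x0, n):
    """Time average of log|T'| along a floating-point orbit of length <= n."""
    x, total, steps = x0, 0.0, 0
    for _ in range(n):
        if x == 0.0:        # orbit fell onto the singularity; stop early
            break
        total += log_abs_deriv(x)
        x = gauss_map(x)
        steps += 1
    return total / steps

def space_average(m):
    """Midpoint-rule approximation of the integral of log|T'| d(mu)."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        x = (k + 0.5) * h
        density = 1.0 / (math.log(2.0) * (1.0 + x))  # Gauss density
        total += log_abs_deriv(x) * density * h
    return total

time_avg = birkhoff_average(0.1234567, 200_000)
space_avg = space_average(200_000)
closed_form = math.pi ** 2 / (6.0 * math.log(2.0))
print(time_avg, space_avg, closed_form)
```

The quadrature and the closed form should agree to several digits. The time average fluctuates with the starting point and the orbit length, so only rough agreement should be expected from it.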
Now, about the rate of change: it depends on the assumptions on your map. For a one-dimensional example, just consider an irrational rotation $x\mapsto x + \alpha \pmod 1$. This map has constant derivative equal to one, so the derivative of its iterates does not grow at all. In higher dimensions things get even more complicated: you can have maps with expanding and contracting directions, and possibly neutral directions where there is neither expansion nor contraction. Again, the rate of expansion/contraction need not be exponential here. You can build examples by taking products of whatever one-dimensional systems you want.
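To make these two points concrete, here is a short worked computation (the names $R_\alpha$ and $F$ are notation introduced only for this illustration). For the rotation $R_\alpha(x)=x+\alpha \pmod 1$,

$$ (R_\alpha^n)'(x)=1 \quad\text{for all } n, \qquad\text{so}\qquad \lambda(x)=\lim_{n\to\infty}\dfrac{1}{n}\log 1=0. $$

For the product $F(x,y)=(Tx,R_\alpha y)$ of the doubling map $T$ with this rotation, the derivative of $F^n$ is diagonal wherever it exists,

$$ {\rm D}F^n(x,y)=\begin{pmatrix}2^n & 0\\ 0 & 1\end{pmatrix}, $$

so a perturbation in the direction $(1,0)$ grows at exponential rate $\log 2$, while a perturbation in the direction $(0,1)$ does not grow at all. The growth rate therefore depends on the direction of the perturbation.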
To wrap things up, people have looked at exponential contraction/expansion because it is one of the easiest setups to investigate. In uniformly hyperbolic dynamics, the second you allow a point where the derivative does not have modulus bigger than $1$, things start going bad really quickly. For instance, one of the most important tools in dynamics is the transfer operator, which lets you understand some dynamical properties in spectral terms. One of the fundamental properties of this operator in this context is that it is quasi-compact, and this yields many important properties of your system (I can give you some references if you are interested). In the context where you don't have uniform expansion, this property of the operator is lost, and the theory becomes much more difficult.
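To give a feeling for what the transfer operator does in the simplest uniformly expanding case, here is a small Python sketch for the doubling map. It assumes the standard explicit form $(Lf)(x)=\tfrac12\left[f\left(\tfrac x2\right)+f\left(\tfrac{x+1}2\right)\right]$ with respect to Lebesgue measure. The initial density $2x$ and the evaluation point $0.9$ are arbitrary choices for illustration. It only shows densities being pushed toward the invariant density $1$; it is not a demonstration of quasi-compactness itself.

```python
def transfer(f):
    """Transfer operator of the doubling map w.r.t. Lebesgue measure."""
    def Lf(x):
        # Sum over the two preimages x/2 and (x+1)/2,
        # each weighted by 1/|T'| = 1/2.
        return 0.5 * (f(0.5 * x) + f(0.5 * (x + 1.0)))
    return Lf

def iterate_transfer(f, n):
    """Return L^n f as a function (one evaluation costs about 2**n calls)."""
    for _ in range(n):
        f = transfer(f)
    return f

def initial_density(x):
    """A non-invariant probability density on [0, 1], for illustration only."""
    return 2.0 * x

for n in (0, 1, 2, 5, 10):
    g = iterate_transfer(initial_density, n)
    print(n, g(0.9) - 1.0)
```

For this particular initial density one can check by hand that $L^nf(x)=1+\frac{2x-1}{2^n}$, so the printed deviation from the invariant density halves with every application of $L$. This exponential convergence is the kind of behavior that spectral properties of the operator encode.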