I'm currently working through a book on differential geometry, and I have a lot of difficulty understanding how things act on other things, since the same object can be defined from different points of view (a vector field can be a class of curves, a derivation...).
In particular, I am having trouble understanding the following statement:
Let $X,Y$ be two (complete) vector fields on $M$ (smooth, compact). $\phi_X^t$ denotes the flow of $X$. We define the Lie derivative the following way: $\mathcal L_X Y := \frac{d}{dt}_{|t=0} \left(\phi_X^t\right)^* Y$. It is easy to show that for a smooth function $f$,
$$ (\mathcal L_X Y)_p f = \frac{d}{dt}_{|t=0}\frac{d}{ds}_{|s=0}f\circ\phi_X^{-t}\circ\phi_Y^{s}\circ \phi_X^{t}(p) $$
I don't understand two things in this statement: why the composition $\phi_X^{-t}\circ\phi_Y^{s}\circ \phi_X^{t}(p)$ appears, and why $f$ is on the left side.
This is a good exercise in stubbornly keeping track of things. Let's do it. Throughout, let $M$ be a smooth manifold. The (local) flow of a vector field $X$ will be denoted $(p,t)\mapsto\Phi_X^t(p)$. This is sometimes viewed as a function of $t$ with $p$ constant and at other times vice versa. Feel free to ask if this causes additional confusion.

First, let's recall something about the two equivalent ways of thinking about tangent vectors you mention. Let $\gamma\colon I\rightarrow M$ be a smooth curve, where $I\subseteq\mathbb{R}$ is an interval about $0$. Then, this defines a tangent vector $\gamma^{\prime}(0)\in T_{\gamma(0)}M$. If you view tangent vectors as equivalence classes of curves, this tangent vector is simply the equivalence class containing $\gamma$. If you view tangent vectors as derivations, which is the convention I will use for the rest of this answer, this vector is obtained as $\gamma^{\prime}(0)=d\gamma\vert_0\left(\frac{\partial}{\partial t}\Big\vert_{t=0}\right)\in T_{\gamma(0)}M$, where $d\gamma\vert_0\colon T_0I\rightarrow T_{\gamma(0)}M$ is the differential of $\gamma$ at $0$ and $\frac{\partial}{\partial t}\Big\vert_{t=0}$ is the derivation that takes the classical analysis derivative at $0$. Thus, if $f\in C_{\gamma(0)}^{\infty}(M)$ is a smooth function germ at $\gamma(0)$, $$\gamma^{\prime}(0)(f)=d\gamma\vert_0\left(\frac{\partial}{\partial t}\Big\vert_{t=0}\right)(f)=\frac{\partial}{\partial t}\Big\vert_{t=0}(f\circ\gamma)=(f\circ\gamma)^{\prime}(0).$$ Here, the RHS is the classical analysis derivative of the germ of $f\circ\gamma\colon I\rightarrow\mathbb{R}$ at $0$.

Now why does this matter? Recall that the flow of $X$ has the defining property that the curve $t\mapsto\Phi_X^t(p)$, defined on some interval about $0$, defines the tangent vector $X_p$ under this correspondence.
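Thus, for any smooth function germ $f\in C_p^{\infty}(M)$, $$X_pf=(f\circ\Phi_X^t(p))^{\prime}(0)=\lim_{t\rightarrow0}\frac{f(\Phi_X^t(p))-f(p)}{t}.$$

Now, we are ready to unwind the claim. First of all, let's spell out the definition of the Lie derivative at $p$: $$(\mathcal{L}_XY)_p=\frac{\mathrm{d}}{\mathrm{d}t}\Big\vert_{t=0}(\Phi_X^t)^{\ast}Y=\lim_{t\rightarrow0}\frac{((\Phi_X^t)^{\ast}Y)_p-Y_p}{t}=\lim_{t\rightarrow0}\frac{d\Phi_X^{-t}\vert_{\Phi_X^t(p)}(Y_{\Phi_X^t(p)})-Y_p}{t}.$$ Note that every term in these difference quotients lives in the single vector space $T_pM$, so the limit makes sense.

Now, let $f\in C_p^{\infty}(M)$ be a smooth function germ at $p$. We compute \begin{align*} (\mathcal{L}_XY)_p(f)&=\left(\lim_{t\rightarrow0}\frac{d\Phi_X^{-t}\vert_{\Phi_X^t(p)}(Y_{\Phi_X^t(p)})-Y_p}{t}\right)(f)\\ &=\lim_{t\rightarrow0}\frac{d\Phi_X^{-t}\vert_{\Phi_X^t(p)}(Y_{\Phi_X^t(p)})(f)-Y_p(f)}{t}\\ &=\lim_{t\rightarrow0}\frac{Y_{\Phi_X^t(p)}(f\circ\Phi_X^{-t})-Y_p(f)}{t}\\ &=\lim_{t\rightarrow0}\frac{\lim_{s\rightarrow0}\frac{f(\Phi_X^{-t}(\Phi_Y^s(\Phi_X^t(p))))-f(\Phi_X^{-t}(\Phi_X^t(p)))}{s}-\lim_{s\rightarrow0}\frac{f(\Phi_Y^s(p))-f(p)}{s}}{t}\\ &=\frac{\partial}{\partial t}\Big\vert_0\lim_{s\rightarrow0}\frac{f(\Phi_X^{-t}(\Phi_Y^s(\Phi_X^t(p))))-f(p)}{s}\\ &=\frac{\partial}{\partial t}\Big\vert_{t=0}\frac{\partial}{\partial s}\Big\vert_{s=0}f(\Phi_X^{-t}(\Phi_Y^s(\Phi_X^t(p)))).\end{align*}

In the second line, commuting evaluation on $f$ with the limit is legitimate since evaluating a tangent vector at $p$ on $f$ is a linear map (indeed, it is the linear map $df\vert_p$) and linear maps between finite-dimensional vector spaces are continuous. The third line is the definition of the differential acting on a derivation: $d\Phi\vert_q(v)(f)=v(f\circ\Phi)$. In the fourth line, we have used the formula for $X_pf$ above, applied to $Y$: once at the point $\Phi_X^t(p)$ with the germ $f\circ\Phi_X^{-t}$, and once at $p$ with the germ $f$. In the fifth line, we use $\Phi_X^{-t}\circ\Phi_X^t=\mathrm{id}$ and observe that the subtracted term is exactly the value of the first term at $t=0$, so the outer limit is a derivative in $t$ at $0$. The rest is just rewriting definitions.

This also answers your two questions. The composition $\Phi_X^{-t}\circ\Phi_Y^s\circ\Phi_X^t$ appears because the pullback first moves $p$ to $\Phi_X^t(p)$, where $Y$ is evaluated (this is where the flow of $Y$ enters), and then transports the result back to $T_pM$ via $d\Phi_X^{-t}$. The function $f$ sits on the left because a tangent vector, viewed as a derivation, acts on $f$ by differentiating $f$ along a curve representing it.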
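
As a sanity check, here is a small worked example of the final formula. The choices $M=\mathbb{R}^2$, $X=\partial_x$ and $Y=x\,\partial_y$ are mine and purely illustrative (they are not compact, but both fields are complete, which is all the computation needs). The flows are $\Phi_X^t(x,y)=(x+t,y)$ and $\Phi_Y^s(x,y)=(x,y+sx)$, so at $p=(x,y)$ $$\Phi_X^{-t}(\Phi_Y^s(\Phi_X^t(x,y)))=\Phi_X^{-t}(x+t,\,y+s(x+t))=(x,\,y+s(x+t)).$$ For a smooth function $f$ this gives \begin{align*} \frac{\partial}{\partial s}\Big\vert_{s=0}f(x,\,y+s(x+t))&=(x+t)\,\frac{\partial f}{\partial y}(x,y),\\ \frac{\partial}{\partial t}\Big\vert_{t=0}\,(x+t)\,\frac{\partial f}{\partial y}(x,y)&=\frac{\partial f}{\partial y}(x,y).\end{align*} Hence $(\mathcal{L}_XY)_p=\frac{\partial}{\partial y}\Big\vert_p$ in this example, which agrees with the Lie bracket $[X,Y]=[\partial_x,\,x\,\partial_y]=\partial_y$, as one would expect from the general identity $\mathcal{L}_XY=[X,Y]$.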