I'm reading the Wikipedia page on the method of averaging. The system has the following form $$\dot x=\varepsilon f(x,t,\varepsilon ),\quad 0<\varepsilon \ll 1,\tag{1}$$ of a phase space variable $\dot x$.
Q1) What does "a phase space variable $\dot x$" mean?
The fast oscillation is given by $f$ versus a slow drift of $\dot x$.
Q2) I don't really understand what they mean by fast oscillation vs. slow drift of $\dot x$. Any ideas?
Then they suggest solving $$\dot y=\varepsilon \frac{1}{T}\int_0^T f(y,t,0)\,dt=:\varepsilon \bar f(y),$$ and they say: $y$ approximates the solution curves of $\dot x$ inside a connected and compact region of the phase space and over a time of $\frac{1}{\varepsilon }$.
Q3) Could someone tell me what this means? What is a solution curve of $\dot x$? Do they mean the solution of equation $(1)$?
Q4) Also, I don't get what they mean by "inside a connected and compact region of the phase space and over a time of $\frac{1}{\varepsilon }$".
The basic assumption is that $f$ takes "normal" values, in particular that it is bounded independently of $ε$. This allows one to conclude that any solution will be nearly constant, especially if one considers its evolution over time steps $\Delta t\ll \frac1ε$. This is just a consequence of the mean value theorem, or of the triangle inequality for the norm applied to the integral formulation of the ODE.
For instance, one can say that over time steps $Δt\sim\frac1{\sqrt{ε}}$ the solution moves by $x(t+Δt)=x(t)+O(\sqrt{ε})$. As a consequence, $$ x(t+Δt)=x(t)+\int_t^{t+Δt}εf(x(s),s,ε)\,ds =x(t)+\int_0^{Δt}εf(x(t),t+s,ε)\,ds+O(εΔt^2) $$ Now if $εΔt^2\ll 1$, as for instance for $Δt\sim ε^{-1/3}$, then there will be only a very small difference between the actual dynamics and the averaged dynamics of $$ \dot y(t) = ε\bar f(y(t),t,ε) ~~\text{where}~~ \bar f(y,t,ε)=\frac1{Δt}\int_0^{Δt}f(y,t+s,ε)\,ds $$ This introduces some simplification if $f$ contains high-frequency terms and $Δt$ is either their period or much larger than their period. In the first case the high-frequency terms average to zero; in the second case their amplitude is reduced by a factor proportional to the number of periods contained in $Δt$.
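As a small worked illustration of the last remark, suppose (purely as an assumption for this example) that $f$ contains a single oscillating term $a\cos(ωt)$ with amplitude $a$ and frequency $ω$; these symbols do not appear in the source article. Its window average is $$ \frac1{Δt}\int_0^{Δt}a\cos(ω(t+s))\,ds = a\,\frac{\sin(ω(t+Δt))-\sin(ωt)}{ωΔt}. $$ If $Δt$ is a whole multiple of the period $\frac{2π}{ω}$, the numerator vanishes and the term averages to zero. Otherwise, writing $N=\frac{ωΔt}{2π}$ for the number of periods in the window, $$ \left|\frac1{Δt}\int_0^{Δt}a\cos(ω(t+s))\,ds\right|\le\frac{2|a|}{ωΔt}=\frac{|a|}{πN}, $$ so the amplitude is reduced by a factor of order $N$.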
Q1) The phase or state space variable in the source article is $x$, not $\dot x$.
Q2) The formulation of slow drift vs. fast oscillation is unfortunate, since both are equally present in $\dot x$ and in $f$. As said above, if $f$ contains fast oscillating terms, their contribution will average out without moving $x$ by much, because the full dynamics is scaled by $ε$. So the (slow) trend of the solution $x$ will not change much if these terms are averaged out in $f$ directly.
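To see this numerically, here is a minimal sketch, not taken from the article. The right-hand side $f(x,t)=-x+2\cos^2 t$ is an assumed toy example whose average over one period is $\bar f(y)=1-y$, and the helper names `rk4_step` and `max_gap` are made up for illustration. The code integrates the full and the averaged equation side by side over $0\le t\le\frac1ε$ and records the largest distance between them.

```python
import math


def f(x, t):
    # Assumed toy right-hand side: slow relaxation plus a fast oscillating term.
    return -x + 2.0 * math.cos(t) ** 2


def f_bar(y):
    # Time average of f over one period; the mean of 2*cos(t)^2 is 1.
    return 1.0 - y


def rk4_step(rhs, x, t, h):
    # One classical fourth-order Runge-Kutta step for x' = rhs(x, t).
    k1 = rhs(x, t)
    k2 = rhs(x + 0.5 * h * k1, t + 0.5 * h)
    k3 = rhs(x + 0.5 * h * k2, t + 0.5 * h)
    k4 = rhs(x + h * k3, t + h)
    return x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def max_gap(eps, x0=0.0, steps_per_unit=50):
    # Largest |x(t) - y(t)| on the grid over 0 <= t <= 1/eps, with x(0) = y(0) = x0.
    t_end = 1.0 / eps
    n = int(t_end * steps_per_unit)
    h = t_end / n

    def full(x, t):
        return eps * f(x, t)

    def averaged(y, t):
        return eps * f_bar(y)

    x = y = x0
    gap = 0.0
    for i in range(n):
        t = i * h
        x = rk4_step(full, x, t, h)
        y = rk4_step(averaged, y, t, h)
        gap = max(gap, abs(x - y))
    return gap


if __name__ == "__main__":
    for eps in (0.1, 0.01):
        print(eps, max_gap(eps))
```

For this particular example the gap should come out on the order of $ε$: the oscillating term only makes $x$ wiggle slightly around the slow trend followed by $y$.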
Q3) The phrase "solution curves of $\dot x$" is simply wrong. The solution curves are $x$; they are the solutions of equation $(1)$, $\dot x=εf(x,t,ε)$.
The differences between $x(t)$ and $y(t)$ will accumulate over time, and the two curves will drift away from each other. Thus the approximating behavior will be restricted to some finite interval. Taking the difference of $f$ and $\bar f$ to be of size $Δt$, with $L$ a Lipschitz constant of $f$ in $x$ and $M$ the constant in that bound, the derivatives differ by $$ \dot x(t)-\dot y(t) =ε(f(x(t),t,ε)-f(y(t),t,ε))+ε(f(y(t),t,ε)-\bar f(y(t),t,ε)) \\~\\\implies \|\dot x(t)-\dot y(t)\|\le Lε\|x(t)-y(t)\|+MεΔt $$ so that by Grönwall's lemma, for $x(0)=y(0)$, $$\|x(t)-y(t)\|\le \frac{M(e^{Lεt}-1)}{L}Δt$$ which for $Lεt\lesssim 1$ is about $M\,t\,εΔt$ and thus only stays small for $t\cdot εΔt\ll 1$.
Q4) To make estimates involving the size of $f$ and its derivatives, in general one will need to restrict the considerations to a compact region so that the existence of the maximum is guaranteed.