Let the curve $C:x = x(t)$ be the extremum path.
Let the curve $C': x_u = x(t) + u\eta(t)$ be the varied path (where $u$ is small).
The curve $C'$ lies in the $\epsilon$-neighbourhood of $C$ if $|x-x_u| < \epsilon$ for all $t$, where $\epsilon > 0$,
and the curve $C'$ lies in the $(\epsilon,\epsilon')$-neighbourhood of $C$ if both $|x-x_u| < \epsilon$ and $|\dot{x}-\dot{x}_u| < \epsilon'$ for all $t$, where $\epsilon,\epsilon' > 0$.
Basically, a $\textbf{weak extremum}$ is found if we compare the curve $C$ to the curves in the $(\epsilon,\epsilon')$-neighbourhood, and similarly a $\textbf{strong extremum}$ is found if we compare the curve $C$ to the curves in the $\epsilon$-neighbourhood.
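To make the difference between the two neighbourhoods concrete, here is a minimal numerical sketch. Everything in it is an assumption chosen for illustration: the interval $[0,1]$, the base curve $x(t) = 0$, the comparison curve $\sin(n^2\pi t)/n$, and the helper name `sup_distances`. It only estimates the two distances on a sample grid, so it is a rough check rather than a proof.

```python
import math

def sup_distances(x, dx, x_u, dx_u, a=0.0, b=1.0, samples=2001):
    """Estimate max |x - x_u| and max |x' - x_u'| on [a, b] from a uniform grid."""
    ts = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    d0 = max(abs(x(t) - x_u(t)) for t in ts)      # distance between the curves
    d1 = max(abs(dx(t) - dx_u(t)) for t in ts)    # distance between their slopes
    return d0, d1

# Hypothetical example: C is x(t) = 0, C' is a small but rapidly oscillating wiggle.
n = 10
x = lambda t: 0.0
dx = lambda t: 0.0
x_u = lambda t: math.sin(n * n * math.pi * t) / n
dx_u = lambda t: n * math.pi * math.cos(n * n * math.pi * t)

d0, d1 = sup_distances(x, dx, x_u, dx_u)
print(f"max |x - x_u|   ~ {d0:.3f}")
print(f"max |x' - x_u'| ~ {d1:.3f}")
```

In this toy case the wiggle stays within about $1/n$ of $C$, so it sits in a small $\epsilon$-neighbourhood, while its slope differs from that of $C$ by about $n\pi$, so it falls outside any $(\epsilon,\epsilon')$-neighbourhood with small $\epsilon'$. Curves of this kind are admitted in the strong comparison but excluded from the weak one.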
During the formulation of the Euler-Lagrange equations we have $I(u) = \int^b_a L(t,x(t,u),\dot{x}(t,u))\,dt = \int^b_a L(t,x(t) + u\eta(t),\dot{x}(t) + u\dot{\eta}(t))\,dt$.
The Taylor expansion of our functional is: $I(u) = I(0) + u \left(\frac{\textrm{d}I}{\textrm{d}u}\right)_{u=0} + O(u^2)$.
First, I believe that $u$ is small so that the $O(u^2)$ term can be dropped and thus $I(u) - I(0) \approx u \left(\frac{\textrm{d}I}{\textrm{d}u}\right)_{u=0}$, allowing the LHS to have the same sign as the RHS (is this the reason for the smallness of $u$?).
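As a sanity check on the expansion, here is a small numerical sketch. The choices are hypothetical and only for illustration: $L = \tfrac{1}{2}\dot{x}^2$ on $[0,1]$, $x(t) = t^2$ (deliberately not an extremal of this $L$), and $\eta(t) = t(1-t)$. The functions `action` and `first_order_gap` are names made up for this example. For these choices one can also integrate by hand and get $I(u) = \tfrac{2}{3} - \tfrac{u}{3} + \tfrac{u^2}{6}$.

```python
def action(u, n=2000):
    """Midpoint-rule approximation of I(u) for the assumed L = x_dot**2 / 2 on [0, 1],
    with x(t) = t**2 and eta(t) = t * (1 - t)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x_dot = 2.0 * t + u * (1.0 - 2.0 * t)  # derivative of x(t) + u * eta(t)
        total += 0.5 * x_dot ** 2 * h
    return total

def first_order_gap(u, du=1e-6):
    """Return I(u) - I(0), the first-order term u * dI/du at u = 0, and their difference."""
    slope = (action(du) - action(-du)) / (2.0 * du)  # central-difference estimate of dI/du at 0
    exact = action(u) - action(0.0)
    linear = u * slope
    return exact, linear, exact - linear

for u in (0.1, 0.01, 0.001):
    exact, linear, gap = first_order_gap(u)
    print(f"u={u}: I(u)-I(0)={exact:.3e}, first-order term={linear:.3e}, gap={gap:.3e}")
```

In this sketch the gap shrinks like $u^2$ (it is $u^2/6$ here), while the first-order term shrinks only like $u$, so for small $u$ the first-order term fixes the sign of $I(u) - I(0)$ whenever $\left(\frac{\textrm{d}I}{\textrm{d}u}\right)_{u=0} \neq 0$. When that derivative vanishes, the sign has to come from the $u^2$ term instead.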
For the weak condition, I believe that the first part of the neighbourhood, $|x-x_u| < \epsilon$, is only used to show that the extrema we are trying to find are local extrema, and thus that we are only comparing $C$ with these neighbouring curves (is this correct?).
But I don't know why or where we use the second part of the neighbourhood, $|\dot{x}-\dot{x}_u| < \epsilon'$. Why do we need the curve to have this property?
In which situations do we need to find the weak rather than the strong extremum, and vice versa?
I would greatly appreciate any information you can give. I know I asked a lot of questions here, so even if you only want to answer 1 or 2 of them I will still really appreciate it.