The order statistic formula is given as follows: $ P(X_{(r)}\leq x) = \sum_{j=r}^n \binom{n}{j} F(x)^j (1-F(x))^{n-j} $
I understand that the combinations come from picking $r$ of the $n$ $X$-s to be less than or equal to $x$, but where does the sum come from, intuitively? If more than $r$ of the $X$-s are less than or equal to $x$, shouldn't that be smaller?
I'll review a derivation that I hope will make it more intuitive; see the bold part for the key observation.
Recall that if $Y$ is binomial with parameters $n$ and $p$, then
$$(*)\quad P(Y \ge r) = \sum_{j=r}^n \binom{n}{j} p^j (1-p)^{n-j}.$$
Now let's assume $X_1,\dots,X_n$ are IID with a CDF $F$. Fix some value $x$. Here's the key:
**The random variable $X_{(r)}$, the $r$-th order statistic, is less than or equal to $x$ if and only if at least $r$ of the $X_i$'s are less than or equal to $x$.**
The number of $X_i$'s less than or equal to $x$ is binomial with parameters $n$ and $p=F(x)$, so we can apply $(*)$. The sum starts at $j=r$ because "at least $r$" splits into the disjoint cases "exactly $j$" for $j=r,\dots,n$, and the probabilities of disjoint cases add.
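If a numerical check helps, here is a minimal sketch that compares the binomial-tail formula with a simulation. It assumes the $X_i$ are Uniform(0,1), so that $F(x)=x$; the function names `order_stat_cdf` and `simulate`, and the values of $n$, $r$, $x$, are only illustrative choices of mine.

```python
import random
from math import comb


def order_stat_cdf(n, r, p):
    """P(X_(r) <= x) with p = F(x), computed as the binomial tail P(Y >= r)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(r, n + 1))


def simulate(n, r, x, trials=20000, seed=0):
    """Estimate P(X_(r) <= x) for IID Uniform(0,1) samples (assumed, so F(x) = x)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = sorted(rng.random() for _ in range(n))
        if sample[r - 1] <= x:  # sample[r - 1] is the r-th smallest value
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, r, x = 5, 2, 0.3
    print("formula   :", order_stat_cdf(n, r, x))
    print("simulation:", simulate(n, r, x))
```

The two printed numbers should agree up to simulation noise. Each trial counts as a hit whenever the $r$-th smallest value is at most $x$, which is exactly the event that at least $r$ of the sample values are at most $x$.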