Let's say that a sequence of random variables $(X_n)_{n \in \mathbb{N}}$ is $\mathcal{o}(Y_n)$ for some other sequence of random variables $(Y_n)_{n \in \mathbb{N}}$ if $\frac{X_n}{Y_n} \to 0$ in distribution as $n \to \infty$.
We can use this to interpret the central limit theorem for a sequence of i.i.d. random variables $(X_j)_{j \in \mathbb{N}}$ with mean $\mu$ and variance $\sigma^2$ as an asymptotic expansion (of the distribution) $$ \frac{1}{n}\sum_{j=1}^{n} X_j = \mu + \frac{\xi}{\sqrt{n}} + \mathcal{o}\left(\frac{1}{\sqrt{n}}\right), $$ where $\xi$ is a random variable distributed as $\mathcal{N}(0,\sigma^2)$.
Intuitively, the first term comes from the law of large numbers, and the second term is a correction of order $\frac{1}{\sqrt{n}}$ coming from the central limit theorem. Is there some intuition for why this term should be of order $\frac{1}{\sqrt{n}}$? Why isn't the expansion in some other power of $n$, e.g. $\frac{1}{n}$, as in the definition of differentiability?
Assuming it is intuitively clear that, for large $n$, the sample average should be approximately normally distributed, why should the quality of this approximation be exactly of order $\frac{1}{\sqrt{n}}$ and not some other rate?
For simplicity set $\sigma^2=1$, $\mu=0$. If you take $X_1,\dots,X_n$ to be i.i.d. $N(0,1)$ random variables, then $\xi=(X_1+\dots+X_n)/\sqrt n$ has the $N(0,1)$ distribution for every $n$, and the expansion $$ \frac{1}{n}\sum_{i=1}^n X_i = \frac{\xi}{\sqrt n} $$ is exact, so there is really no freedom in the choice of scaling in front of $\xi$.
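
The same scaling can be seen without assuming normality, using only a variance computation. This is a sketch that assumes the $X_j$ are independent with common mean $\mu$ and finite variance $\sigma^2$; the exponent $\alpha$ is notation introduced here for illustration. Since variances of independent variables add,
$$ \operatorname{Var}\left(\frac{1}{n}\sum_{j=1}^{n} X_j\right) = \frac{1}{n^2}\sum_{j=1}^{n}\operatorname{Var}(X_j) = \frac{\sigma^2}{n}, $$
so the fluctuation of the sample average around $\mu$ has standard deviation $\sigma/\sqrt{n}$. Rescaling that fluctuation by a general power $n^{\alpha}$ gives
$$ \operatorname{Var}\left(n^{\alpha}\left(\frac{1}{n}\sum_{j=1}^{n} X_j - \mu\right)\right) = n^{2\alpha-1}\sigma^2, $$
which tends to $0$ for $\alpha<\frac12$, diverges for $\alpha>\frac12$, and stays equal to $\sigma^2$ only for $\alpha=\frac12$. So $\frac{1}{\sqrt n}$ is the only power of $n$ at which the correction term can have a non-degenerate limit with finite, non-zero variance.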