Show that
$$\sum_{n\le x} \mu ^2(n)=\frac{x}{\zeta(2)}+o(\sqrt{x}) \; (x\to \infty)$$
I've proven so far that $\sum_{n\le x} \mu ^2(n)=\frac{x}{\zeta(2)}+O(\sqrt{x})$.
I want to reduce this error term $O(\sqrt{x})$ to $o(\sqrt{x})$, and I'm thinking of using the relation $M(x)=\sum_{n\le x}\mu(n)=o(x)$ which is equivalent to the Prime Number Theorem. How can I use this relation to reduce the error term?
I would greatly appreciate any help.
First, let's see if we can understand why this "should" be true from an analytic perspective.
We know that $$ \sum_{n \geq 1} \frac{\mu(n)^2}{n^s} = \frac{\zeta(s)}{\zeta(2s)},$$ and a general way of extracting information from Dirichlet series is to perform a cutoff integral transform (or a type of Mellin transform). In this case, we get that $$ \sum_{n \leq X} \mu(n)^2 = \frac{1}{2\pi i} \int_{(2)} \frac{\zeta(s)}{\zeta(2s)} X^s \frac{ds}{s},$$ where the contour is the vertical line $\text{Re }s = 2$. By Cauchy's theorem, we shift the line of integration left and poles contribute terms or large order. The pole of $\zeta(s)$ at $s = 1$ has residue $$ \frac{X}{\zeta(2)},$$ so we expect this to be the leading order. Naively, since we know that there are no zeroes of $\zeta(2s)$ on the line $\text{Re } s = \frac{1}{2}$, we might expect to push our line to exactly there, leading to an error of $O(\sqrt X)$. But in fact, we know more. We know the zero-free region, which allows us to extend the line of integration ever so slightly inwards, leading to a $o(\sqrt X)$ result (or more specifically, something along the lines of $O(\sqrt X e^{-c (\log X)^\alpha})$ where $\alpha$ and $c$ come from the strength of our known zero-free region.
In this heuristic analysis, I have omitted bounding the top, bottom, and left boundaries of the rectangles of integration. But proceeding in a similar way as in the proof of the analytic prime number theorem, you could proceed here. So we expect the answer to look like $$ \frac{X}{\zeta(2)} + O(\sqrt X e^{-c (\log X)^\alpha})$$ using no more than the zero-free region that goes into the prime number theorem.
We will now prove this result, but in an entirely elementary way (except that I will refer to a result from the prime number theorem).
We do this in a series of steps.
Proof. This comes almost immediately upon noticing that this is a multiplicative function, and it's trivial to prove it for prime powers. $\spadesuit$
So to sum up the squarefree numbers up to $X$, we look at $$ \sum_{n \leq X} \sum_{d^2 \mid n} \mu(d) = \sum_{d^2e \leq X} \mu(d)= \sum_{d^2 \leq X} \mu(d) \left\lfloor \frac{X}{d^2} \right\rfloor.$$
This last expression is written in one of the links in Marty's answer, and they prove it with inclusion-exclusion. I happen to find this derivation more intuitive, but it's our launching point forwards.
We approximate the floored bit. Notice that $$ \left \lfloor \frac{X}{d^2} \right \rfloor = \frac{X}{d^2} + E(X,d)$$ with $\lvert E(x,d) \rvert \leq 1$ (we think of it as the Error of the approximation). So the number of squarefree numbers up to $X$ is $$ \sum_{d^2 \leq X} \mu(d) \frac{X}{d^2} + \sum_{d^2 \leq X}\mu(d) E(x,d).$$ We look at the two terms separately.
The first term
The first term can be approximated by the infinite series plus an error term. $$ \sum_{d^2 \leq X} \frac{\mu(d)}{d^2} = X\sum_{d \geq 1} \frac{\mu(d)}{d^2} - X\sum_{d > \sqrt X} \frac{\mu(d)}{d^2} = \frac{X}{\zeta(2)} - X\sum_{d > \sqrt X} \frac{\mu(d)}{d^2}.$$
We must now be a bit careful. If we perform the naive bound, by bounding $\mu(n) \leq 1$, then this last sum is of size $O(\sqrt X)$. That's too big!
So instead, we integrate by parts (Riemann-Stieltjes integration) or equivalently we perform summation by parts to see that $$ \sum_{d > \sqrt X} \frac{\mu(d)}{d^2} = O\left( \int_{\sqrt X}^\infty \frac{M(t)}{t^3} dt \right)$$ where $$ M(t) = \sum_{n \leq t} \mu(n).$$ By the prime number theorem (and as you mention in your question), we know that $M(t) = o(t)$. (In fact, the analytic prime number theorem in one of the easy forms is that $M(X) = O(X e^{-c (\log X)^{1/9}})$, which we might use here). This means that this last term is bounded by $$ o(\sqrt X)$$ if we just use that $M(t) = o(t)$, or $$ O(\sqrt X e^{-c (\log X)^{1/9}})$$ if we use more. This completes the first term $\spadesuit$
Second term
This is easier now. We again use integration by parts. Notice that $$ \begin{align} \sum_{d \leq \sqrt X} \mu(d) E(X,d) &= \sum_{d \leq \sqrt X} (M(d) - M(d - 1))E(X,d) \\ &= M(\lfloor \sqrt X \rfloor) E(X, \lfloor \sqrt X \rfloor) + \sum_{d \leq \sqrt X - 1} M(d) (E(X, d) - E(X, d+1)). \end{align}$$
By using that $E(X,d) \leq 1$ and $M(\sqrt X) = o(\sqrt X)$ (or the stronger version), we match the results from the first part. $\spadesuit$
Putting these two results together, we have proven that the number of squarefree integers up to $X$ is $$\frac{X}{\zeta(2)} + o(\sqrt X)$$ using only that $M(X) = o(\sqrt X)$, and alternately $$ \frac{X}{\zeta(2)} + O(\sqrt X e^{-c (\log X)^{1/9}})$$ using a bit of the zero-free region. This completes the proof. $\diamondsuit$