I was introduced to the Markov inequality:
For any r.v. $X$ and constant $a > 0$,
$$P(\vert X \vert \ge a) \le \dfrac{E \vert X \vert}{a}.$$
I was then introduced to the Chebyshev inequality:
Let $X$ have mean $\mu$ and variance $\sigma^2$. Then for any $a > 0$,
$$P(\vert X - \mu \vert \ge a) \le \dfrac{\sigma^2}{a^2}$$
The proof for the Chebyshev inequality was given as follows:
By Markov's inequality,
$$P(\vert X - \mu \vert \ge a) = P((X - \mu)^2 \ge a^2) \le \dfrac{E(X - \mu)^2}{a^2} = \dfrac{\sigma^2}{a^2}.$$
Substituting $c \sigma$ for $a$, for $c > 0$, we have the following equivalent form of Chebyshev's inequality:
$$P(\vert X - \mu \vert \ge c \sigma) \le \dfrac{1}{c^2}.$$
This gives us the upper bound on the probability of an r.v. being more than $c$ standard deviations away from its mean, e.g., there can't be more than a 25% chance of being 2 or more standard deviations from the mean.
I have two questions. First, I'm wondering how the authors went from $P(\vert X - \mu \vert \ge a)$ to $P((X - \mu)^2 \ge a^2)$? I'm unsure of the algebra that $\vert X - \mu \vert = (X - \mu)^2$. And then, how did the authors go from "being more than $c$ standard deviations away from its mean" to the specific claim that "there can't be more than a 25% chance of being 2 or more standard deviations from the mean"? Thank you.
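For the first question: the authors are not claiming that $|X-\mu| = (X-\mu)^2$. The claim is that the event $|X-\mu|\ge a$ is the same event as $(|X-\mu|)^2\ge a^2$. Both $|X-\mu|$ and $a$ are non-negative, and squaring preserves order on non-negative numbers, so one inequality holds exactly when the other does. Since $(|X-\mu|)^2 = (X-\mu)^2$, we can omit the absolute value.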
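To spell out where Markov's inequality enters, apply it to the non-negative random variable $Y = (X-\mu)^2$ (the name $Y$ is introduced here only for illustration), with the constant $a^2$ playing the role of $a$. Because $Y \ge 0$, we have $\vert Y \vert = Y$, so
$$P\big((X-\mu)^2 \ge a^2\big) = P(\vert Y \vert \ge a^2) \le \dfrac{E\vert Y \vert}{a^2} = \dfrac{E(X-\mu)^2}{a^2} = \dfrac{\sigma^2}{a^2}.$$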
For the second question, it is just an example of applying this inequality with $c=2$: the bound becomes $1/c^2 = 1/2^2 = 1/4$, i.e. 25%.
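If it helps to see both points numerically, here is a minimal Python sketch. It assumes an exponential distribution with rate 1 purely as a convenient skewed example (it is not from the book), and the names `tail_fraction` and `same_event` are made up for this illustration. It checks that the two events pick out exactly the same samples, and that the fraction of samples at least $c$ standard deviations from the mean stays below $1/c^2$.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Assumption: exponential(1) samples, chosen only as a skewed example.
n = 100_000
xs = [random.expovariate(1.0) for _ in range(n)]

# Mean and standard deviation of the sample itself.
mu = sum(xs) / n
sigma = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5

def tail_fraction(c):
    """Fraction of samples at least c standard deviations from the mean."""
    return sum(abs(x - mu) >= c * sigma for x in xs) / n

# First question: |x - mu| >= a and (x - mu)^2 >= a^2 select the same samples.
a = 2 * sigma
same_event = all((abs(x - mu) >= a) == ((x - mu) ** 2 >= a * a) for x in xs)
print("events coincide:", same_event)

# Second question: compare the observed tail fraction with the bound 1/c^2.
for c in (1.5, 2, 3):
    print(f"c={c}: observed {tail_fraction(c):.4f}, bound {1 / c**2:.4f}")
```

For $c=2$ the observed fraction is far below 0.25 for this distribution. Chebyshev's bound is a worst-case guarantee over all distributions with finite variance, so it is often loose for any particular one.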