Here, Terence Tao writes:
Proposition 4 (No universal set) There does not exist a set which contains all sets (including itself).
Proof: Suppose for contradiction that there existed a universal set ${X}$ which contained all sets. Using the axiom schema of specification, one can then construct the set
$\displaystyle Y := \{ A \in X: A \not \in A\}$
of all sets in the universe which did not contain themselves. As ${X}$ is universal, ${Y}$ is contained in ${X}$. But then, by definition of ${Y}$, one sees that ${Y \in Y}$ if and only if ${Y \not \in Y}$, a contradiction. $\Box$
[…]
One can “localise” the above argument to a smaller domain than the entire universe, leading to the important
Proposition 5 (Cantor’s theorem) Let ${X}$ be a set. Then the power set ${2^X := \{ A: A \subset X \}}$ of ${X}$ cannot be enumerated by ${X}$, i.e. one cannot write ${2^X := \{ A_x: x \in X \}}$ for some collection ${(A_x)_{x \in X}}$ of subsets of ${X}$.
Proof: Suppose for contradiction that there existed a set ${X}$ whose power set ${2^X}$ could be enumerated as ${\{ A_x: x \in X \}}$ for some ${(A_x)_{x \in X}}$. Using the axiom schema of specification, one can then construct the set
$\displaystyle Y := \{ x \in X: x \not \in A_x \}$.
The set ${Y}$ is an element of the power set ${2^X}$. As ${2^X}$ is enumerated by ${\{ A_x: x \in X \}}$, we have ${Y = A_y}$ for some ${y \in X}$. But then by the definition of ${Y}$, one sees that ${y \in A_y}$ if and only if ${y \not \in A_y}$, a contradiction. $\Box$
What does Terence Tao mean with this:
One can “localise” the above argument to a smaller domain than the entire universe, leading to the important [Cantor's theorem]
But for me the arguments of proposition 4 and 5 seem not to be the same. Of course, there are somewhat similar but however, it doesn't seem to be an easy job to prove Cantor's theorem if one already knows Russell's paradox. Can you guess what Tao had in mind when saying this quote ("one can 'localise' the above …")? Or isn't there something deep behind it, and he is just trying to say that we use a similar argument to prove something about "sets" rather than "classes"?
The idea is that we think of the proposed bijection taking $x$ to $A_x$ as an identification - so we're thinking of $x$ and $A_x$ as being the same thing. Our universe is $X$. In this argument, we then interpret Russell's set within this universe: $\{x \in X : x \notin x\}$. But we're thinking of $x$ and $A_x$ as being the same thing, so the set is $\{x \in X : x \notin A_x\}$. By the same argument as Russell's Paradox, this set can't be (the $A$ identified with) any member of $X$.
It's not completely the same, which is why Tao doesn't list Cantor's Theorem as a corollary rather than a proposition in its own right; but Cantor's Theorem and Russell's Paradox are "basically" the same thing. It might be easier to see in the other direction: if we had a universal set $X$, then $2^X$ would have to be a subset of it, and then we could use the identity function as the map from $X$ to $2^X$ (so $A_x = x$). Then the set used in Cantor's Theorem, $\{x \in X : x \notin A_x\}$ is exactly Russell's set.