Section 3.5 of the homotopy type theory book describes the type $\text{Set}_{\mathcal{U}_{i}}\equiv\sum_{(A:\mathcal{U}_i)}\text{isSet}(A)$, which can be thought as the "type of all sets in the universe $\mathcal{U}_i$". Note, there is an obvious map $\varepsilon:\equiv\lambda x.x : \text{Set}_{\mathcal{U}_{i}}\rightarrow \text{Set}_{\mathcal{U}_{i+1}}$. (It is easy to verify this is well-typed as we have a deduction rule that from $A:\mathcal{U}_i$ we can deduce $A:\mathcal{U}_{i+1}$.)
Now, the book says that this map "cannot be an equivalence, since then we could reproduce the paradoxes of self-reference that are familiar from Cantorian set theory", but does not elaborate. I think I have a sketch proof of this (ie a way of constructing an element of $\text{isEquiv}(\varepsilon)\rightarrow \mathbf{0}$), but I'm struggling to fill in the details at two points. This is my attempt thus far:
(Disclaimer: throughout I will use the standard abuse of notation of "forgetting" the second entries of elements of $\text{Set}_{\mathcal{U}_{i}}$, allowed by lemma 3.5.1.)
To recreate a Cantor style paradox, we would first like to give a notion of a "power set" of a set $S:\text{Set}_{\mathcal{U}_{i}}$. The obvious candidate is the type $(S\rightarrow \mathbf{2})$. However, I found that an alternative formulation makes the proof easier: for $A:\mathcal{U}_{i}$ (not necessarily a set) define $\mathcal{P}_i(A):\equiv \sum_{(P:A\rightarrow\mathcal{U_i})}\prod_{(x:A)}\text{isProp}(P(x))$. I believe I have a proof that $\mathcal{P}_i(A)\simeq (A\rightarrow\mathbf{2})$ using $\text{LEM}$, but as it is rather technical and also not relevant I will omit it. (Tell me if you are skeptical and I will write it up.)
Now, clearly $\mathcal{P}_i(A):\mathcal{U}_{i}$ (this follows immediately from the formal deduction rules in the second appendix), and it is also not so hard to see that $\mathcal{P}_i(A)$ is a set by applying function extensionality several times. (Just use the two facts that (a) $\text{isProp}(P(x))$ is a mere proposition for each $P:A\rightarrow \mathcal{U}_i$, and hence by extensionality so is $\prod_{(x:A)}\text{isProp}(P(x))$, and that (b) if $P, Q:A\rightarrow \mathcal{U}_i$ are families of mere propositions, then for each $x:a$ the type $P(x)=_{\mathcal{U}_i} Q(x)$ is a mere proposition – this just follows from a univalence argument.) Thus we have $\mathcal{P}_i(A):\text{Set}_{\mathcal{U}_{i}}$ for any $A:\mathcal{U}_i$.
To proceed, I imagine we wish to show that $\text{isEquiv}(\varepsilon)$ proves the existence of a set $S:\text{Set}_{\mathcal{U}_{i}}$ such that $S\simeq\mathcal{P}_i(S)$, and then derive an element of $\mathbf{0}$ from this. I have been able to show the latter half of this sentence assuming $\text{LEM}$ (I will give my proof below), but showing the first half is my first roadblock.
My suspicion for why we could do such a thing is that, since $\mathcal{U}_i :\mathcal{U}_{i+1}$, we have $\mathcal{P}_{i+1}(\mathcal{U}_i):\text{Set}_{\mathcal{U}_{i+1}}$. Hence if we had a quasi-inverse $\delta:\text{Set}_{\mathcal{U}_{i+1}}\rightarrow \text{Set}_{\mathcal{U}_{i}}$ to $\varepsilon$, we would have an element $\delta (\mathcal{P}_{i+1}(\mathcal{U}_i)):\text{Set}_{\mathcal{U}_i}$, such that $\varepsilon(\delta(\mathcal{P}_{i+1}(\mathcal{U}_i)))=_{\text{Set}_{\mathcal{U}_{i+1}}} \mathcal{P}_{i+1}(\mathcal{U}_i)$, ie such that $\delta(\mathcal{P}_{i+1}(\mathcal{U}_i))\simeq\mathcal{P}_{i+1}(\mathcal{U}_i)$. This feels fishy to me (surely $\mathcal{P}_{i+1}(\mathcal{U}_i)$ is "too big" to be an element of $\mathcal{U}_i$), and I suspect there should be some way of using it to get the desired $S:\mathcal{U}_i$ with $S\simeq \mathcal{P}_i(S)$, but I have struggled to figure out how to do this. So, question 1: how can we do this? (And is it even possible?)
In any case, assume this is possible, and suppose given $S:\mathcal{U}_i$ with $S\simeq \mathcal{P}_i(S)$. We can now formulate a version of Cantor's paradox in the language of type theory; indeed let $e:S\rightarrow \mathcal{P}_i(S)$ be an equivalence. (Throughout we will abuse notation by identifying elements of $\mathcal{P}_i(S)$ with their first projections, again exploiting fact (a) above and lemma 3.5.1.) Thus in particular, for each $x:S$ we have $e(x):S\rightarrow\mathcal{U}_i$, and so defining the type family $Q:S\rightarrow\mathcal{U}_i$ by $Q:\equiv \lambda x.(e(x)(x)\rightarrow \mathbf{0})$ is evidently well-typed. Furthermore, since $\mathbf{0}$ is a mere proposition, each $Q(x)\equiv (e(x)(x)\rightarrow \mathbf{0})$ is also a mere proposition, and thus we have $Q:\mathcal{P}_i(S)$.
Now, to derive the desired contradiction, we will of course consider $s:\equiv f(Q):S$, where $f:\mathcal{P}_i(S)$ is a quasi-inverse to $e$. We will also assume $\text{LEM}$. Note that since $e(s):\mathcal{P}_i(S)$, we have $\prod_{(x:S)}\text{isProp}(e(s)(x))$, and so in particular $e(s)(s)$ is a mere proposition. Thus by $\text{LEM}$ we have an element $z: e(s)(s)+(e(s)(s)\rightarrow \mathbf{0})$. But now, since $e$ is quasi-inverse to $f$, we have $e(s)(s)\equiv e(f(Q))(s)=Q(s)\equiv (e(s)(s)\rightarrow\mathbf{0})$ and likewise $(e(s)(s)\rightarrow\mathbf{0})=((e(s)(s)\rightarrow\mathbf{0})\rightarrow\mathbf{0})$. Thus, finally, by case analysis and appropriate transport along the relevant path (say taking $z$ to $z'$), we have $z'(z):\mathbf{0}$ as desired.
I believe this argument works; if anyone could confirm that would be greatly appreciated. But it also raises a second question for me: is it possible to reach this desired result without assuming $\text{LEM}$? I don't really see how we could go without it, but I am not sure. There is also the issue of question 1 above, which is really stumping me, and I am not sure if it is an unfixable hole in the proof. Any insight – either a (partial) answer to question 1 or question 2, or a (reference to a) nicer proof of this entire result – would be enormously appreciated. In all cases apologies for the long-windedness of this question.
For your zeroth question: Using the universal property for mapping into sigma-types, your $\mathcal{P}_i(A)$ is equivalent to $A \to \rm Prop$, where ${\rm Prop} = \sum_{P:\mathcal{U}_i} {\rm isProp}(P)$. Thus, if we assume LEM then it is equivalent to $A\to \mathbf{2}$, since LEM is equivalent to saying ${\rm Prop} \simeq \mathbf{2}$.
For your second question: Yes. First suppose $e(s)(s)$. Then since $e(s) = e(f(Q)) = Q$, we have $Q(s)$, i.e. $e(s)(s) \to \mathbf{0}$. But since we also supposed $e(s)(s)$, we have $\mathbf{0}$. That is, supposing $e(s)(s)$ we have deduced $\mathbf{0}$, so we have $e(s)(s)\to \mathbf{0}$, in other words $Q(s)$, in other words $e(s)(s)$. But also $e(s)(s)\to \mathbf{0}$ (as we just proved), hence $\mathbf{0}$. Note that the proof only uses that $e$ is a surjection, not an equivalence. See here for some further refinements and generalizations, none of which use LEM.
For your first question: Yes, it's possible, but it's a little tricky because of the structural nature of type theory. It's actually easier to reproduce the Burali-Forti paradox, which in type theory goes by the name of "Girard's paradox" -- this is because well-orderings can be built as structure on a set. To reproduce Cantor's paradox, as you intend, you can use a similar notion of well-founded relation: let $S'$ be the type of types in $\mathcal{U}_i$ equipped with extensional well-founded accessible pointed graphs (see chapter 10 of the book for well-foundedness, and e.g. here for other terminology) and $S = \delta(S')$.
Another, simpler, approach is possible if you use W-types (introduced in chapter 5 of the book), as noticed by Thierry Coquand. Let $S = \mathsf{W}_{(A:\mathcal{U}_i)} A$ (regarded as in $\mathcal{U}_i$ by $\delta$). Define $e:S \to \mathcal{P}_i(S)$ by $e(\mathsf{sup}(A,f))(t) :\equiv \exists(a:A) (f(a)=t)$, where $A:\mathcal{U}_i$, $f:A\to S$, and $t:S$. Then $e$ is surjective, since for any $W:\mathcal{P}_i(S)$ we can let $A:\equiv \delta(\sum_{s:S} W(s))$ and $f:A\to S$ the first projection, and then $e(\mathsf{sup}(A,f)) = W$.