On page $17$ of these Lecture Notes (theorem on page 15, and section of question copied in image below), why is $H(E) = H(P_e)$? Specifically, $E = \{1 \text{ if } \hat X \neq X, \space 0 \text{ if } \hat X = X\}$ and $P_e = p(\hat X \neq X)$.
Shouldn't we have:
$H(E) = -p(\hat X \neq X) \log p(\hat X \neq X) - p(\hat X = X) \log p(\hat X = X)$
and
$H(P_e) = H(p(\hat X \neq X)) = \sum_{\hat x, x} p(\hat x, x) \log p(\hat x \neq x)$
What am I misunderstanding?

Let $P_e = p(\hat{X}\neq X)$, which is a number (not a probability distribution). By a slight abuse of notation, for $p\in[0,1]$ the entropy $H(p)$ denotes the binary entropy function, i.e. the entropy of a Bernoulli r.v. with parameter $p$: $$ H(p) = -p \log p - (1-p)\log(1-p) $$ See lecture 2, p. 8 for confirmation. (I personally hate this notation, and would encourage you to use $h(p)$ for the binary entropy, since it is a function $h\colon[0,1]\to\mathbb{R}$ taking a number, not an r.v. or a probability distribution, as its argument.)
So here, indeed, $$\begin{align} H(P_e) &= -P_e\log P_e - (1-P_e)\log(1-P_e)\\ &= -p(\hat{X}\neq X)\log p(\hat{X}\neq X) - p(\hat{X}= X)\log p(\hat{X}= X)\\ &= H(E) \end{align}$$
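If it helps, here is a quick numerical sanity check of the identity. The joint distribution `joint` over $(X, \hat X)$ is made up purely for illustration; the point is that $H(E)$, computed from the Bernoulli distribution of the error indicator $E$, coincides with the binary entropy function evaluated at the single number $P_e$:

```python
import math

def h(p):
    """Binary entropy h(p) = -p log2 p - (1-p) log2(1-p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Toy joint pmf p(x, xhat) over {0,1}^2 (arbitrary, for illustration only).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# P_e = p(Xhat != X): a single number, obtained by summing the off-diagonal mass.
P_e = sum(p for (x, xhat), p in joint.items() if x != xhat)

# E is Bernoulli(P_e): p(E=1) = P_e, p(E=0) = 1 - P_e, so H(E) is computed
# directly from that two-point distribution.
H_E = -P_e * math.log2(P_e) - (1 - P_e) * math.log2(1 - P_e)

print(P_e, H_E, h(P_e))  # H_E and h(P_e) agree
```

The two quantities agree for any joint distribution you plug in, because $H(E)$ only ever depends on the marginal of $E$, which is determined by the single number $P_e$.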