I'm having trouble relating the content of these notes to these ones from MIT OCW.
Specifically, the first set of notes describes the specific half-space $a^Tx \leq a^T x_0$, while the second concludes that one of the two half-spaces must contain $C$.
Generally, I'm having trouble visualizing this for all $a$. Does the theorem assume the same $a$ for all boundary points?
The two statements are equivalent. For any given point $x_0$ on the boundary, there is a hyperplane $a^Tx=k=a^Tx_0$ (with $a \neq 0$) such that your set lies in one of the two closed half-spaces it defines. You can always assume it is the "positive" half-space $a^Tx\ge k$; otherwise, replace $a$ by $-a$. The same sign flip turns it into the form $a^Tx \le a^Tx_0$ used in the first set of notes. And yes, the linear form $a$ depends on the point: in general, it is different at every boundary point.
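
To help visualize how $a$ changes with the boundary point, here is a small worked example. It is not taken from either set of notes; the choice of $C$ as the closed unit disk in $\mathbb{R}^2$ is just an assumption made for illustration. Take

$$
C = \{x \in \mathbb{R}^2 : \|x\|_2 \le 1\}, \qquad \|x_0\|_2 = 1, \qquad a = x_0 .
$$

Then for every $x \in C$, by the Cauchy-Schwarz inequality,

$$
a^Tx = x_0^Tx \;\le\; \|x_0\|_2\,\|x\|_2 \;\le\; 1 = x_0^Tx_0 = a^Tx_0 ,
$$

so $C$ lies in the half-space $a^Tx \le a^Tx_0$, and the supporting hyperplane $a^Tx = a^Tx_0$ is the tangent line at $x_0$. Replacing $a$ by $-a$ describes the same half-space as $(-a)^Tx \ge (-a)^Tx_0$. In this example $a = x_0$, so moving $x_0$ around the circle rotates $a$ with it: no single $a$ works for all boundary points.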