On page 28 of Ziegler's "Lectures on Polytopes", we are told that an $H$-polyhedron is an intersection of closed halfspaces. He then defines a halfspace as a set $P \subset \mathbb{R}^d$ presented in the form $P = P(A, z) = \{x \in \mathbb{R}^d: Ax \leq z\}$ for some $A \in \mathbb{R}^{m \times d}, z \in \mathbb{R}^m$. (Here $Ax \leq z$ is the usual shorthand for a system of inequalities, namely $a_1x \leq z_1, \dots, a_mx \leq z_m$, where $a_1, \dots, a_m$ are the rows of $A$ and $z_1, \dots, z_m$ are the components of $z$.)
Can anyone please explain to me why halfspaces are defined this way? Also, why is $z$ an $m$-dimensional vector? I tried thinking about lower dimensions, and in $\mathbb{R}^2$ I still don't understand why a halfspace is defined as $ax+by + c \leq 0$. I get that $ax+by + c = 0$ is a hyperplane in $\mathbb{R}^2$, but why does turning it into an inequality make it a halfspace? How do we know that all the points satisfying $ax+by + c \leq 0$ necessarily lie below this line?
Thanks a lot in advance. I would really appreciate any help. Thanks again.
A hyperplane is defined by its normal vector $\vec{n}$ and an offset $d$, which is the signed distance of the hyperplane from the origin scaled by the length of the normal vector (for a unit normal, $d$ is exactly the signed distance; in general the signed distance is $d / \lVert \vec{n} \rVert$). A point $\vec{p}$ is on the hyperplane if and only if $$\vec{n} \cdot \vec{p} = d$$ Note that this also defines $d$: it is the value of $\vec{n} \cdot \vec{p}$ for any point $\vec{p}$ on the hyperplane itself. If the hyperplane passes through the origin, then $d = 0$. If the hyperplane lies on the side of the origin that the normal points towards, then $d \gt 0$. If it lies on the opposite side, then $d \lt 0$.
The hyperplane splits the space into two sides: the side the normal vector points towards, and the opposite side. The opposite side consists of the points $\vec{p}$ satisfying $$\vec{n} \cdot \vec{p} \lt d \label{1}\tag{1}$$ which is the definition of a single halfspace.
(The comparison direction varies, since it is just an arbitrary choice. It can also include or exclude the hyperplane. Thus, all of $\lt$, $\le$, $\ge$, and $\gt$ are acceptable in $\eqref{1}$ above.)
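As a rough illustration of the test in $\eqref{1}$, here is a minimal Python sketch. The function names `dot` and `in_halfspace`, and the `closed` flag that switches between $\lt$ and $\le$, are made up for this example and are not from any library.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    return sum(ui * vi for ui, vi in zip(u, v))


def in_halfspace(n, d, p, closed=False):
    """Return True if n . p < d, or n . p <= d when closed=True.

    n is the normal vector, d the offset, p the point being tested.
    """
    value = dot(n, p)
    return value <= d if closed else value < d


# Halfspace x + y < 2 in the plane: normal (1, 1), offset 2.
n, d = (1.0, 1.0), 2.0
print(in_halfspace(n, d, (0.0, 0.0)))               # origin: 0 < 2 holds
print(in_halfspace(n, d, (1.0, 1.0)))               # on the line: 2 < 2 fails
print(in_halfspace(n, d, (1.0, 1.0), closed=True))  # on the line: 2 <= 2 holds
print(in_halfspace(n, d, (3.0, 0.0)))               # other side: 3 < 2 fails
```

Whether the hyperplane itself counts as part of the halfspace is exactly the choice between the strict and non-strict comparison.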
In $\mathbb{R}^n$, we can express this relationship using a matrix $\mathbf{A}$ with $m$ row vectors (each representing a single normal $\vec{n}$), and a column vector $\vec{z}$ whose $m$ components are the corresponding offsets (each representing the $d$ that belongs to that normal, i.e. the signed distance from the origin scaled by the length of the normal). Then, the comparison $\lt$ in $$\mathbf{A}\vec{p} \lt \vec{z} \tag{2a}\label{2a}$$ refers to component-wise comparison; it is true if and only if each component of $\mathbf{A}\vec{p}$ is less than the corresponding component of $\vec{z}$. So, it is a very simple shorthand compared to something like $$\vec{n}_i \cdot \vec{p} \lt d_i \quad \text{for all } i = 1, \dots, m \tag{2b}\label{2b}$$ which you can see in texts using basic vector algebra notation instead of linear algebra notation. The two ($\eqref{2a}$ and $\eqref{2b}$) describe exactly the same thing, assuming $\mathbf{A}$ has $m$ rows and $\vec{z}$ is a column vector with $m$ components. This is why $\vec{z}$ is $m$-dimensional: there is one offset per halfspace. The convex polyhedron (a polytope, if it is bounded) is then the intersection of these $m$ halfspaces.
In this definition, the normal vectors point outwards from the polytope, and strictly speaking, the surface of the polytope is not included in its volume. (It does not change the volume at all, of course, but it will affect computer programs that implement the algorithm, since they work with limited precision numbers.)
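The component-wise comparison can be sketched in a few lines of Python. This is only an illustration: `in_polyhedron`, the `closed` flag and the `tol` slack for floating-point rounding are names and choices assumed here, not part of Ziegler's text or of any library.

```python
def in_polyhedron(A, z, p, closed=True, tol=0.0):
    """Check A p <= z (closed) or A p < z (open), one row at a time.

    A is a list of m rows (the normal vectors), z a list of m offsets,
    p the point to test. tol loosens each bound by a small amount.
    """
    if len(A) != len(z):
        raise ValueError("A needs one row per component of z")
    for row, z_i in zip(A, z):
        value = sum(a * x for a, x in zip(row, p))  # row . p
        if closed and value > z_i + tol:
            return False
        if not closed and value >= z_i + tol:
            return False
    return True


# Unit square 0 <= x <= 1, 0 <= y <= 1 as four halfspaces, outward normals.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
z = [1, 0, 1, 0]

print(in_polyhedron(A, z, (0.5, 0.5)))                # interior point
print(in_polyhedron(A, z, (1.0, 0.5)))                # on the boundary, closed
print(in_polyhedron(A, z, (1.0, 0.5), closed=False))  # on the boundary, open
print(in_polyhedron(A, z, (1.5, 0.5)))                # outside
```

The boundary point is accepted by the closed version and rejected by the open one, which is the only place where the two definitions differ.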
In two dimensions, if we use $\vec{n} = (a, b)$ and $\vec{p} = (x, y)$, this is again $$\vec{n} \cdot \vec{p} \lt c$$ where $\vec{n}$ is the normal vector of the hyperplane, perpendicular to the 2D line $a x + b y = c$. (OP's form $a x + b y + c \le 0$ is the same thing with the sign of $c$ flipped.) If we pick any point $\vec{p}_0$ on the line itself, then $$\vec{n} \cdot (\vec{p} - \vec{p}_0) = \vec{n} \cdot \vec{p} - c$$ because by definition, $c = \vec{n} \cdot \vec{p}_0$. In vector algebra terms, $\vec{n} \cdot \vec{p}$ is the signed length of the projection of $\vec{p}$ onto $\vec{n}$, multiplied by the length of $\vec{n}$: if $\vec{p}$ points into the same halfspace as $\vec{n}$, the dot product is positive and grows as $\vec{p}$'s magnitude increases; if the two are perpendicular, the dot product is zero; and if $\vec{p}$ points into the opposite halfspace, the dot product is negative and decreases as $\vec{p}$'s magnitude increases.
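One way to see why the sign of $\vec{n} \cdot \vec{p} - c$ is the same for every point on one side of the line is to split $\vec{p} - \vec{p}_0$ into a part along the normal and a part along the line. The scalar $t$ and the vector $\vec{w}$ below are introduced only for this sketch: $$\vec{p} = \vec{p}_0 + t\,\vec{n} + \vec{w}, \qquad \vec{n} \cdot \vec{w} = 0$$ Substituting this into the identity above gives $$\vec{n} \cdot \vec{p} - c = \vec{n} \cdot (\vec{p} - \vec{p}_0) = t\,\lVert \vec{n} \rVert^2 + \vec{n} \cdot \vec{w} = t\,\lVert \vec{n} \rVert^2$$ Since $\lVert \vec{n} \rVert^2 \gt 0$, the sign of $\vec{n} \cdot \vec{p} - c$ is the sign of $t$. Points reached from the line by moving in the direction of $\vec{n}$ have $t \gt 0$, points reached by moving against $\vec{n}$ have $t \lt 0$, and points on the line have $t = 0$. So $\vec{n} \cdot \vec{p} \lt c$ picks out exactly the points on the side opposite to the normal.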
A practical example. Let's choose the line $y = 1$, with the normal pointing towards the negative $y$ axis, so that we select the halfspace above $y = 1$. Then, $\vec{n} = (0, -1)$, i.e. $a = 0$, $b = -1$, $c = -1$ (negative, because the line $y = 1$ lies in the direction opposite to $\vec{n}$ as seen from the origin), and we have $$a x + b y - c = 0 x - 1 y - (-1) = 1 - y \lt 0 \quad \forall y \gt 1$$
Another practical example. Let's choose the line $x = 0$, with the normal pointing towards the positive $x$ axis, so that we select the negative $x$ halfspace. Now, $\vec{n} = (1, 0)$, $a = 1$, $b = 0$, $c = 0$, and $$a x + b y - c = 1 x + 0 y - 0 = x \lt 0 \quad \forall x \lt 0$$ So, OP's notion of "below" must be understood in relation to the normal vector: it means "the halfspace on the side of the hyperplane opposite to the one the normal vector points towards".
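The two examples can be checked numerically with a short Python sketch. The helper name `side_value` is made up for this illustration; it simply evaluates $a x + b y - c$.

```python
def side_value(a, b, c, x, y):
    """Evaluate a*x + b*y - c for the line a*x + b*y = c.

    A negative result means (x, y) lies on the side opposite
    to the normal (a, b); a positive result means the normal's side.
    """
    return a * x + b * y - c


# Example 1: line y = 1 with normal (0, -1), so c = -1.
print(side_value(0, -1, -1, 0, 2))  # point above the line: 1 - 2 = -1
print(side_value(0, -1, -1, 0, 0))  # point below the line: 1 - 0 = 1

# Example 2: line x = 0 with normal (1, 0), so c = 0.
print(side_value(1, 0, 0, -3, 5))   # point with negative x: -3

# Negating the normal and c gives the same line y = 1, but now the
# point above the line is on the normal's side, so the sign flips.
print(side_value(0, 1, 1, 0, 2))    # 2 - 1 = 1
```

The last call shows that which side counts as "below" depends only on the chosen direction of the normal, not on the line itself.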