I'm reading about Zero-order conditions in Nonlinear Programming and the following confuses me (my questions are below the theory):

Consider the set $\Gamma \subset E^{n+1} = \{(r,\textbf{x}): r\geq f(\textbf{x}), \textbf{x}\in E^n\}.$ In a figure of the graph of $f$, the set $\Gamma$ is the region above the graph, shown in the upper part of the figure. This set is called the epigraph of $f$. It is easy to verify that the set $\Gamma$ is convex if $f$ is a convex function. Suppose that $\textbf{x}^*\in \Omega$ is the minimizing point with value $f^*= f(\textbf{x}^*).$ We construct a tubular region with cross section $\Omega$ and extending vertically from $-\infty$ up to $f^*$, shown as $B$ in the upper part of the figure. This is also a convex set, and it overlaps the set $\Gamma$ only at the boundary point $(f^*, \textbf{b}^*)$ above $\textbf{x}^*$ (or possibly many boundary points if $f$ is flat near $\textbf{x}^*$).
According to the separating hyperplane theorem, there is a hyperplane separating these two sets. This hyperplane can be represented by a nonzero vector of the form $(s,\boldsymbol \lambda)\in E^{n+1}$ with $s$ a scalar and $\boldsymbol \lambda\in E^n$, and a separation constant $c$. The separation conditions are
$$sr + \boldsymbol \lambda^T\textbf{x} \geq c\;\;\;\text{for all $\textbf{x} \in E^n$ and $r\geq f(\textbf{x})$}\;\;\;\;\;(1)$$ $$sr + \boldsymbol \lambda^T\textbf{x} \leq c\;\;\;\text{for all $\textbf{x} \in \Omega$ and $r\leq f^*$}.\;\;\;\;\;\;\;\;\;(2)$$
It follows that $s\neq0$; for otherwise $\boldsymbol \lambda \neq\textbf{0}$ and then (1) would be violated for some $\textbf{x}\in E^n$. It also follows that $s\geq0$ since otherwise (2) would be violated by very negative values of $r$. Hence, together we find $s>0$ and by appropriate scaling we may take $s=1$.
My questions are:
1) Why did $B$ overlap $\Gamma$ at the point $(f^*, \textbf{b}^*)$? I thought it should be $(f^*, \textbf{x}^*)$
2) Why is $s\neq0$ and $s\geq0$? I didn't understand this
3) Why may we take $s=1$?
Please let me know if you need more information, thank you for any help! =)
I think 1) is just a misprint. Indeed, it should read $(f^*, x^*)$.
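Ad 2): Assume $s = 0$. Then (1) reads $\lambda^\top \, x \ge c$ for all $x \in E^n$. This forces $\lambda = 0$: if $\lambda \ne 0$, taking $x = -t \, \lambda$ gives $\lambda^\top \, x = -t \, \|\lambda\|^2 \to -\infty$ as $t \to \infty$, which eventually drops below $c$. But $\lambda = 0$ contradicts $(s, \lambda) \ne 0$.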
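Now assume $s < 0$. Then (2), i.e. $s \, r + \lambda^\top \, x \le c$, is violated: fix any $x \in \Omega$ and let $r \to -\infty$ (allowed, since (2) must hold for all $r \le f^*$), so that $s \, r \to +\infty$. Hence $s \ge 0$.
Together with $s \ne 0$, this yields $s > 0$.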
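Ad 3): This is just a simple scaling argument. Since $s > 0$, dividing (1) and (2) by $s$ preserves the direction of both inequalities. Set $s^* = s/s = 1$, $\lambda^* = \lambda / s$ and $c^* = c / s$; then $(s^*, \lambda^*)$ and $c^*$ describe the same hyperplane and still satisfy (1) and (2).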
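To make the scaling concrete, here is a small example of my own (it is not from your book, so treat the specific choices as assumptions): take $n = 1$, $f(x) = x^2$ and $\Omega = [1, 2]$, so that $x^* = 1$ and $f^* = 1$. One separating hyperplane is given by $(s, \lambda) = (3, -6)$ and $c = -3$. Dividing by $s = 3$ gives the normalized data

$$(s, \lambda) = (1, -2), \qquad c = -1.$$

Condition (1) holds because, for $r \ge x^2$,

$$r - 2x \;\ge\; x^2 - 2x \;=\; (x-1)^2 - 1 \;\ge\; -1,$$

and condition (2) holds because, for $x \in [1,2]$ and $r \le 1$,

$$r - 2x \;\le\; 1 - 2 \;=\; -1.$$

Both inequalities are tight exactly at $(r, x) = (1, 1) = (f^*, x^*)$, which is the point where $B$ and $\Gamma$ touch.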
By the way: I am wondering why you called these conditions "zero-order" conditions. Indeed, by evaluating (1) and (2) with $s = 1$, one should arrive at $-\lambda \in \partial f(x^*)$ (the convex subdifferential of $f$ at $x^*$) and $\lambda \in N_\Omega(x^*)$ (the normal cone of $\Omega$ at $x^*$), which is just the first-order condition $0 \in \partial f(x^*) + N_\Omega(x^*)$.
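A sketch of how this follows, assuming the normalization $s = 1$ and the usual definitions of $\partial f$ and $N_\Omega$ for convex $f$ and convex $\Omega$: put $r = f(x)$ in (1) and $r = f^*$ in (2) to get

$$f(x) + \lambda^\top x \ge c \quad \text{for all } x \in E^n, \qquad f^* + \lambda^\top x \le c \quad \text{for all } x \in \Omega.$$

Evaluating both at $x = x^* \in \Omega$, where $f(x^*) = f^*$, gives $c = f^* + \lambda^\top x^*$. Substituting this value of $c$ back yields

$$f(x) \ge f(x^*) + (-\lambda)^\top (x - x^*) \quad \text{for all } x \in E^n,$$

which is the subgradient inequality, i.e. $-\lambda \in \partial f(x^*)$, and

$$\lambda^\top (x - x^*) \le 0 \quad \text{for all } x \in \Omega,$$

which is the definition of $\lambda \in N_\Omega(x^*)$. Adding the two inclusions gives $0 = -\lambda + \lambda \in \partial f(x^*) + N_\Omega(x^*)$.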