I am struggling to understand the following theorem.
Theorem. Suppose that $f$ and $h$ are sufficiently smooth functions on the open set $C$. Let $x \in C$ be a local minimum of $f$ subject to the constraints $h(x) = 0$, $x \in C$. In addition, assume the regularity condition (constraint qualification)
$$h'(x) \text{ is onto.}$$
Then there exists $\lambda$ (a Lagrange multiplier vector) such that $$0 = \nabla f(x) + \langle\lambda, h'(x)\rangle.$$
Question
What does it mean for $h'(x)$ to be onto at a local minimum (how can a function be onto at only one point)? And why does linear independence of $\{\nabla h_k(x)\}_k$ imply that $h'(x)$ is onto, where $h(x)=[\, h_1(x) \ \ldots \ h_m(x) \,]^T$?
You can say that an $m \times n$ matrix $A$ is onto if the linear map
$q(x)=Ax$
is onto $\mathbb{R}^{m}$, or equivalently if $\operatorname{rank}(A)=m$, or equivalently if $A$ has $m$ linearly independent rows.
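For a particular point $x$, $h'(x)$ is just an $m \times n$ matrix of numbers (the Jacobian of $h$ evaluated at $x$), so "onto" refers to the linear map $v \mapsto h'(x)v$, not to $h$ itself. This matrix is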
$h'(x)=\left[ \begin{array}{c} \nabla h_{1}(x)^{T} \\ \nabla h_{2}(x)^{T} \\ \vdots \\ \nabla h_{m}(x)^{T} \end{array} \right] $
Saying that $h'(x)$ is onto is simply an unusual way of saying that the vectors $\nabla h_{1}(x)$, $\nabla h_{2}(x)$, $\ldots$, $\nabla h_{m}(x)$ are linearly independent.
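If it helps to see this numerically, here is a minimal sketch that checks the condition by computing the rank of the Jacobian at a given point. The constraints below are a made-up example, not taken from the theorem: $h_1(x) = x_1^2 + x_2^2 + x_3^2 - 1$ and $h_2(x) = x_1^2 + x_2^2 - x_1$ (a sphere intersected with a cylinder). The helper names `h`, `jacobian_h`, and `is_onto` are likewise hypothetical. Both test points satisfy $h(x) = 0$, but the regularity condition holds at only one of them.

```python
import numpy as np

def h(x):
    """Hypothetical constraints: unit sphere and a cylinder through the origin."""
    x1, x2, x3 = x
    return np.array([x1**2 + x2**2 + x3**2 - 1.0,
                     x1**2 + x2**2 - x1])

def jacobian_h(x):
    """h'(x): the 2-by-3 matrix whose rows are grad h_1(x)^T and grad h_2(x)^T."""
    x1, x2, x3 = x
    return np.array([[2.0 * x1,       2.0 * x2, 2.0 * x3],
                     [2.0 * x1 - 1.0, 2.0 * x2, 0.0]])

def is_onto(J):
    """True if the m-by-n matrix J has rank m, i.e. its rows are linearly independent."""
    J = np.atleast_2d(np.asarray(J, dtype=float))
    return bool(np.linalg.matrix_rank(J) == J.shape[0])

regular_point = np.array([0.0, 0.0, 1.0])    # gradients (0,0,2) and (-1,0,0)
singular_point = np.array([1.0, 0.0, 0.0])   # gradients (2,0,0) and (1,0,0)

for p in (regular_point, singular_point):
    # Both points are feasible; only the rank of h'(p) differs.
    print(p, "feasible:", np.allclose(h(p), 0.0), "onto:", is_onto(jacobian_h(p)))
```

At $(0,0,1)$ the two gradients are independent, so $h'(x)$ has rank 2 and is onto $\mathbb{R}^2$. At $(1,0,0)$ the gradients are parallel, the rank drops to 1, and the constraint qualification fails there. So whether $h'(x)$ is onto is a property of the point $x$, which is why the theorem assumes it at the minimizer.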