I am having some trouble understanding the proof of the necessary conditions for the equality constrained problem (ECP) via Lagrange multipliers in Bertsekas' book *Nonlinear Programming* (1999).
The part relevant to my inquiry is the following partial statement:
Proposition 3.1.1: (Lagrange Multiplier Theorem - Necessary Conditions)
Let $x^*$ be a local minimum of $f$ subject to $h(x) = 0$, and assume that the constraint gradients $\nabla h_1(x^*), \dots, \nabla h_m(x^*)$ are linearly independent. Then there exists a unique vector $\lambda^*=(\lambda_1^*, \dots, \lambda_m^*)$ called Lagrange multiplier vector, such that $$\nabla f(x^*)+\sum_{i=1}^{m}\lambda_i^*\nabla h_i(x^*)=0$$ [...]
Regarding notation, we have the following declaration at the start of the section NECESSARY CONDITIONS FOR EQUALITY CONSTRAINTS:
We assume $f:\mathbb{R}^n→\mathbb{R}$, $h_i:\mathbb{R}^n→\mathbb{R}$ for $i\in\{1,\dots,m\}$ are continuously differentiable functions. [...]
For notational convenience, we introduce the constraint function $h:\mathbb{R}^n→\mathbb{R}^m$, where $$h=(h_1,\dots,h_m)$$
For the record, $x^*$ is a local minimum of $f$ if there exists a neighborhood of $x^*$ within which $f(x^*)\leq f(x)$ for every feasible $x$. A feasible point is a point that satisfies the constraints of the problem.
Now, I am going to write down a transcription of the proof presented in the book using the penalty approach.
Penalty Approach
Here we approximate the original constrained problem by an unconstrained optimization problem that involves a penalty for violation of the constraints. In particular, for $k\in\mathbb{N}$, we introduce the cost function $$F^k(x)=f(x)+\frac{k}{2}‖h(x)‖^2+\frac{\alpha}{2}‖x-x^*‖^2$$ where $x^*$ is the local minimum of the constrained problem and $\alpha$ is some positive scalar. [...]
Since $x^*$ is a local minimum, we can select $\epsilon>0$ such that $f(x^*)\leq f(x)$ for all feasible $x$ in the closed sphere $$S=\{x : ‖x-x^*‖ \leq \epsilon\}$$ Let $x^k$ be an optimal solution of the problem of minimizing $F^k$ subject to $x\in S$ [An optimal solution exists because of Weierstrass' theorem [...]].
We will show that the sequence $\{x^k\}$ converges to $x^*$.
We have for all $k$ $$F^k(x^k)=f(x^k)+\frac{k}{2}‖h(x^k)‖^2+\frac{\alpha}{2}‖x^k-x^*‖^2\leq F^k(x^*)=f(x^*) \tag{1}$$ (using that $x^k$ minimizes $F^k$ over $S$ and that $h(x^*)=0$) and, since $f(x^k)$ is bounded over $S$, we obtain $$\lim_{k→\infty}‖h(x^k)‖ = 0,$$ since otherwise the left-hand side of Eq. (1) would become unbounded above as $k→\infty$. Therefore, every limit point $\bar{x}$ of $\{x^k\}$ satisfies $h(\bar{x})=0$. Furthermore, Eq. (1) yields $f(x^k)+\frac{\alpha}{2}‖x^k-x^*‖^2\leq f(x^*)$ for all $k$, so by taking the limit as $k→\infty$, we obtain $$f(\bar{x})+\frac{\alpha}{2}‖\bar{x}-x^*‖^2\leq f(x^*)$$ [...]
My question is quite simple and concerns two uses of the concept of a limit.
Regarding this statement:
We have for all $k$ $$F^k(x^k)=f(x^k)+\frac{k}{2}‖h(x^k)‖^2+\frac{\alpha}{2}‖x^k-x^*‖^2\leq F^k(x^*)=f(x^*) \tag{1}$$ and since $f(x^k)$ is bounded over $S$, we obtain $$\lim_{k→\infty}‖h(x^k)‖ = 0$$
I don't see why we can take the limit to be equal to $0$; as far as I can tell, we don't know whether the sequence $\{x^k\}$ converges at all.
Regarding this statement:
Furthermore, Eq. (1) yields $f(x^k)+\frac{\alpha}{2}‖x^k-x^*‖^2\leq f(x^*)$ for all $k$, so by taking the limit as $k→\infty$, we obtain $$f(\bar{x})+\frac{\alpha}{2}‖\bar{x}-x^*‖^2\leq f(x^*)$$
For the same reason as before, I can't quite justify that $\{x^k\}$ converges.
I have tried modifying the proof by passing to a convergent subsequence: since $\{x^k\}$ is bounded, it must have a limit point, and working along such a subsequence I see no problem at all in asserting the previous statements.
However, I am probably missing something in the original proof, so I would appreciate it if someone could shed some light on this.
Thank you in advance.
Question 1.
There is a misconception here. At this point of the argument, the claim is not that the sequence $\{x^k\}$ converges, but that the sequence $\{\|h(x^k)\|\}$ converges to zero. This is already explained in the excerpt you quote from the book:
Equation (1) says that $$f(x^k)+\frac{k}{2}\|h(x^k)\|^2+\frac{\alpha}{2}\|x^k-x^*\|^2\leq f(x^*). \tag{1}$$ Since $f$ is bounded over the compact set $S$ and the term $\frac{\alpha}{2}\|x^k-x^*\|^2$ is nonnegative, there exists some $M>0$ such that for all $k$, $$ \frac{k}{2}\|h(x^k)\|^2\le M\;, $$ i.e., $\|h(x^k)\|^2\le 2M/k$ for all $k$. It thus follows that $\lim_k \|h(x^k)\|=0$, with no assumption that $\{x^k\}$ itself converges.
Question 2.
The logic regarding your second question is the following: in order to show that $\lim_{k}x^k=x^*$, the author shows that $\|\bar{x}-x^*\|=0$ for every possible limit point $\bar{x}$ of the sequence $\{x^k\}$; since the sequence lies in the compact set $S$, this implies that $x^k$ converges to $x^*$.
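For completeness, the implication "every limit point equals $x^*$, hence $x^k\to x^*$" deserves a line of justification. Here is a sketch of the standard argument; it uses only the compactness of the sphere $S$:

```latex
\textbf{Claim.} If $x^k \in S$ for all $k$, $S$ is compact, and every
limit point of $\{x^k\}$ equals $x^*$, then $x^k \to x^*$.

\textbf{Sketch.} Suppose $x^k \not\to x^*$. Then there exist
$\varepsilon > 0$ and a subsequence $\{x^{k_j}\}$ with
$$\|x^{k_j} - x^*\| \ge \varepsilon \quad \text{for all } j.$$
Since this subsequence lies in the compact set $S$, it has a further
subsequence converging to some $\bar{x} \in S$, and by continuity of
the norm, $\|\bar{x} - x^*\| \ge \varepsilon$, so $\bar{x} \ne x^*$.
But $\bar{x}$ is then a limit point of $\{x^k\}$ different from $x^*$,
contradicting the hypothesis. Hence $x^k \to x^*$.
```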
Now if $\bar{x}$ is a limit point of the sequence $\{x^k\}$, then there exists a subsequence, still denoted $\{x^k\}$ by a common abuse of notation, such that $x^k\to \bar{x}$ as $k\to\infty$. By the limit obtained in Question 1 and the continuity of $h$, you get $h(\bar{x})=0$.
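To make the penalty mechanism concrete, here is a small numerical sketch on a toy problem of my own choosing (not from the book): minimize $f(x)=x_1+x_2$ subject to $h(x)=x_1^2+x_2^2-2=0$, whose solution is $x^*=(-1,-1)$ with multiplier $\lambda^*=1/2$, since $\nabla f(x^*)+\lambda^*\nabla h(x^*)=(1,1)+\frac12(-2,-2)=0$. Minimizing $F^k$ by plain gradient descent for increasing $k$ shows $x^k\to x^*$ and $\|h(x^k)\|\to 0$; the quantity $k\,h(x^k)$, which is essentially how the remainder of the proof recovers the multiplier, approaches $\lambda^*$.

```python
# Toy illustration of the penalty approach (my own example, not from the book):
#   minimize f(x) = x1 + x2   subject to   h(x) = x1^2 + x2^2 - 2 = 0,
# with constrained minimum x* = (-1, -1) and Lagrange multiplier lambda* = 1/2.

def grad_Fk(x, k, alpha, xstar):
    """Gradient of F^k(x) = f(x) + (k/2)||h(x)||^2 + (alpha/2)||x - x*||^2."""
    h = x[0] ** 2 + x[1] ** 2 - 2.0
    return [
        1.0 + k * h * 2.0 * x[0] + alpha * (x[0] - xstar[0]),
        1.0 + k * h * 2.0 * x[1] + alpha * (x[1] - xstar[1]),
    ]

def minimize_Fk(k, alpha=1.0, xstar=(-1.0, -1.0), iters=5000):
    """Crude gradient descent on F^k, started at x* (well inside the sphere S)."""
    x = list(xstar)
    # The local curvature of F^k near x* is roughly k*||grad h||^2 ~ 8k,
    # so keep the step size safely below its inverse.
    step = 1.0 / (8.0 * k + alpha + 10.0)
    for _ in range(iters):
        g = grad_Fk(x, k, alpha, xstar)
        x = [x[0] - step * g[0], x[1] - step * g[1]]
    return x

for k in (10, 100, 1000):
    xk = minimize_Fk(k)
    h = xk[0] ** 2 + xk[1] ** 2 - 2.0
    # As k grows: x^k approaches x*, ||h(x^k)|| shrinks, and k*h(x^k)
    # approaches the multiplier lambda* = 1/2.
    print(k, xk, k * h)
```

Note that each $x^k$ is only an approximate minimizer here (gradient descent with a fixed step size), but the qualitative behavior in the proof is clearly visible: the constraint violation decays like $O(1/k)$ while $x^k$ drifts toward $x^*$.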