Help Needed for Understanding Sequential Quadratic Programming

32 Views Asked by At

I am trying to understand the formulation of the sequential quadratic programming with inequality constraints using interior point method by reading the PhD thesis of Long Hei entitled "Practical Techniques for Nonlinear Programming".

$$\min_{x\in R^n} f\left(x\right)\\{\text{subject to}}\quad c\left(x\right)\ge0$$

where $f\left(x\right): R^n\rightarrow R$ and $c\left(x\right): R^n\rightarrow R^m$

to do so, a slack variable is introduced and the problem is converted into a barrier problem:

$$\min_{x\in R^n, s\in R^m} f\left(x\right)-\mu\sum_{i=1}^{m}\ln{s_i}\\{\text{subject to}}\quad c\left(x\right)-s=0$$

where $\mu\rightarrow 0$ is the penalty parameter. The Lagrangian for the barrier problem is:

$$\mathcal{L}\left(x,s,\lambda\right) = f\left(x\right)-\mu\sum_{i=1}^{m}\ln{s_i}-\lambda^T\left(c\left(x\right)-s\right)$$

where $\lambda\in R^m$ are the Lagrangian multipliers of the barrier problem.

All the above makes perfect sense to me. However, when I linearize $\mathcal{L}$ around the point $\left(x^k,s^k,\lambda^k\right)$

$$\begin{split}\mathcal{L}\left(x,s,\lambda\right) &\approx \\ &\mathcal{L}\left(x^k,s^k,\lambda^k\right) + \\ &\nabla_x\mathcal{L}\left(x^k,s^k,\lambda^k\right)^Td_x^k + \\ &\nabla_s\mathcal{L}\left(x^k,s^k,\lambda^k\right)^Td_s^k + \\ &{d_x^k}^T\nabla_{xx}^2\mathcal{L}\left(x^k,s^k,\lambda^k\right)^Td_x^k + \\ &{d_s^k}^T\nabla_{ss}^2\mathcal{L}\left(x^k,s^k,\lambda^k\right)^Td_s^k \end{split}$$

I don't get the following minimization formulation given in the thesis as (so-called the local quadratic programming model)

$$\min_{d_x\in R^n, d_s\in R^m} \frac 12 d_x^T\left(\nabla_{xx}^2\mathcal{L}\right)d_x+\frac 12 d_s^T\Sigma d_s + \nabla f^Td_x-\mu\left(S^{-1}e\right)^Td_s\\{\text{subject to}}\quad Ad_x-d_s+\left(c-s\right)=0$$

where $\nabla_{xx}^2\mathcal{L}$ is the Hessian of the Lagrangian and $\Sigma \in R^{m\times m}$ is the Hessian of the barrier terms:

$$\Sigma = S^{-1}\Lambda,\quad S = \text{diag}\{s_i\}_{i=1}^{m},\quad \Lambda = \text{diag}\{\lambda_i\}_{i=1}^{m}$$

My question is:

  • how are the terms in the local quadratic programming model obtain? For example, shouldn't the Hessian of the barrier terms contain $\frac{1}{s^2}$?

I would appreciate it if someone could kindly help me.