I'm trying to find the minimum of a function of 3 variables subject to a set of constraints involving 5 variables. I'm applying the penalty method to turn this into an unconstrained problem, which I then optimise with the steepest descent method, using a backtracking algorithm to progressively compute the step sizes (alphas).
The objective function and constraints are the following:
min h(A, C, E) = 5.35*C^2+0.83*A*E+37.29*A-40.14
subject to:
0 <= 85.33+0.0056*B*C+0.00026*A*D-0.0022*C*E <= 92
90 <= 80.51+0.0071*B*E+0.0029*A*B+0.0021*C^2 <= 110
20 <= 9.3+0.0047*C*E+0.0012*A*C+0.0019*C*D <= 25
78 <= A <= 102
33 <= B <= 45
27 <= C <= 45
27 <= D <= 45
27 <= E <= 45
(the numbers don't really matter, my question is more from a theoretical point of view)
The full function to minimise will therefore be:
f(A, B, C, D, E) = h(A, C, E) + c*P(A, B, C, D, E)
where P(A, B, C, D, E) = 0.5*sum_i(max(0, g_i)^2)
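In code, the penalized objective can be evaluated like this (a Python sketch of my setup; I split each double-sided constraint lo <= q <= hi into two one-sided constraints g_i <= 0, and the function names are just my own choices):

```python
def h(A, C, E):
    # the original objective, which does not involve B or D
    return 5.35*C**2 + 0.83*A*E + 37.29*A - 40.14

def constraint_values(A, B, C, D, E):
    q1 = 85.33 + 0.0056*B*C + 0.00026*A*D - 0.0022*C*E
    q2 = 80.51 + 0.0071*B*E + 0.0029*A*B + 0.0021*C**2
    q3 = 9.3 + 0.0047*C*E + 0.0012*A*C + 0.0019*C*D
    bounds = [(q1, 0, 92), (q2, 90, 110), (q3, 20, 25),
              (A, 78, 102), (B, 33, 45), (C, 27, 45),
              (D, 27, 45), (E, 27, 45)]
    # lo <= q <= hi  becomes  lo - q <= 0  and  q - hi <= 0
    gs = []
    for q, lo, hi in bounds:
        gs += [lo - q, q - hi]
    return gs

def f(A, B, C, D, E, c):
    # penalized objective: f = h + c*P,  P = 0.5 * sum(max(0, g_i)^2)
    P = 0.5 * sum(max(0.0, g)**2 for g in constraint_values(A, B, C, D, E))
    return h(A, C, E) + c * P
```

At a feasible point the penalty vanishes and f reduces to h; at an infeasible point the violated constraints add a positive term.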
Now, after substituting the constraints into the penalty formula, we need to calculate the gradient vector (negated below, to give the descent direction), which will be:
g[0] = -(0.83*E+37.29);
g[1] = 0;
g[2] = -2*5.35*C;
g[3] = 0;
g[4] = -0.83*A;
The issue I have is with the steepest descent direction: it doesn't tell variables B and D where to go, so they will obviously remain at their initial values regardless of whether they satisfy the constraints or not. How should I proceed with that?
Thanks a lot!
Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, and let the minimization problem $\mathbf{P}$ be:
$$ \min_{x, y} \quad f_{0}(x) \\ \text{s.t.} \quad f_{i}(x, y) \leq 0 \quad \forall i \in \mathcal{I}. $$
One way to find an approximate solution to this problem is using the penalty method, where a series of unconstrained minimization problems $\mathbf{\tilde{P}}_t$ are solved instead, where $\mathbf{\tilde{P}}_t$ is given by:
$$ \min_{x, y} \quad f_{0}(x) + \lambda_t \sum_{i \in \mathcal{I} } g(f_{i}(x, y)), $$
where $g:\mathbb{R} \rightarrow \mathbb{R}$ is an exterior penalty function and $\lambda_t$ is the penalty coefficient of the $t$-th problem. Let $D = \{(x, y) \in \mathbb{R}^n \times \mathbb{R}^m : f_{i}(x, y) \leq 0 \quad \forall i \in \mathcal{I}\}$ be the feasible set of the main problem. The main idea of this method is to penalize the solutions of $\mathbf{\tilde{P}}_t$ that violate the constraints of the original problem; therefore, we want $g$ to take a large value for a positive input and to be close to zero for a non-positive one. Moreover, at each iteration $t$ the value of $\lambda_t$ is increased to bias the solution of the unconstrained problem toward the feasible set of the original problem.
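As a concrete illustration, here is a minimal sketch of the whole scheme on a toy one-dimensional problem ($\min x^2$ s.t. $x \geq 1$, i.e. $g(x) = 1 - x \leq 0$, with solution $x^* = 1$), using steepest descent with Armijo backtracking as the inner unconstrained solver; all names and parameter values are illustrative choices, not prescriptions:

```python
def steepest_descent(obj, grad, x, iters=200):
    """Inner solver: steepest descent with Armijo backtracking for alpha."""
    for _ in range(iters):
        d = -grad(x)                          # steepest-descent direction
        alpha, fx = 1.0, obj(x)
        # shrink alpha until the sufficient-decrease condition holds
        while obj(x + alpha*d) > fx - 1e-4*alpha*d*d and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha*d
    return x

def penalty_method(f0, df0, g, dg, x0, lam=1.0, growth=10.0, outer=8):
    """Outer loop: solve the penalized problem, then increase lambda_t."""
    x = x0
    for _ in range(outer):
        obj  = lambda x: f0(x) + lam*0.5*max(0.0, g(x))**2
        grad = lambda x: df0(x) + lam*max(0.0, g(x))*dg(x)
        x = steepest_descent(obj, grad, x)    # warm-start from previous solution
        lam *= growth                         # tighten the penalty
    return x

# min x^2  s.t.  1 - x <= 0; the iterates approach x* = 1 from outside
x_star = penalty_method(f0=lambda x: x*x, df0=lambda x: 2*x,
                        g=lambda x: 1.0 - x, dg=lambda x: -1.0,
                        x0=0.0)
```

For a fixed $\lambda$ the inner minimizer here is $x = \lambda/(2+\lambda)$, so increasing $\lambda$ drives the iterates toward the constraint boundary $x^* = 1$.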
Back to your question: since each $f_{i}$ is a function of both $x$ and $y$, the gradient of the approximated objective function (i.e. $f_{0}(x) + \lambda_t \sum_{i \in \mathcal{I} } g(f_{i}(x, y))$) will depend on the values of $y$, even though the gradient of the original objective function does not. Thus, whenever a penalized constraint involving $y$ is violated, the values of $y$ (or, in the problem you presented, $B$ and $D$) will be updated by the steepest descent method.
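To see this numerically, here is a small check (a Python sketch at a hypothetical infeasible point, with finite differences used purely for illustration and only two of your bound constraints kept): the gradient of $h$ alone is zero in the $B$ and $D$ components, but the gradient of the penalized objective is not.

```python
def h(x):
    A, B, C, D, E = x
    return 5.35*C**2 + 0.83*A*E + 37.29*A - 40.14

def penalty(x):
    A, B, C, D, E = x
    gs = [B - 45.0, D - 45.0]          # upper bounds from 33<=B<=45, 27<=D<=45
    return 0.5 * sum(max(0.0, g)**2 for g in gs)

def grad(fun, x, eps=1e-6):
    """Central finite-difference gradient, for illustration only."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((fun(xp) - fun(xm)) / (2*eps))
    return g

c = 100.0
x = [80.0, 50.0, 30.0, 50.0, 30.0]     # B = D = 50: above their upper bounds
grad_h = grad(h, x)                    # components 1 and 3 (B, D) are zero
grad_f = grad(lambda z: h(z) + c*penalty(z), x)  # now c*(B-45) = 500 there
```

So the steepest descent direction on the penalized objective pushes $B$ and $D$ back toward their feasible ranges, which is exactly why the method updates them.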