I'm trying to build a neural network to solve this optimization problem
minimize $f(x)$
s.t. $h(x)=0$
where $x = (x_1, x_2, \dots, x_n)^T \in R^n$, $f: R^n \rightarrow R$ and $h: R^n \rightarrow R^m$ are given functions with $m \le n$. Both $f$ and $h$ are assumed to be twice continuously differentiable.
The idea is presented in this paper (IEEE abstract) and consists of constructing a neural network whose equilibrium point satisfies the necessary conditions of optimality.
Based on the Lagrange multiplier theory, the neural network will be governed by:
$\frac{dx}{dt} = - \nabla_x L(x, \lambda) $
$\frac{d\lambda}{dt} = \nabla_{\lambda} L(x, \lambda) $
where $L: R^{m+n} \rightarrow R$ is the Lagrange function defined by: $L(x, \lambda) = f(x) + \lambda^T h(x)$, $\lambda \in R^m$
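For concreteness, here is how I understand the dynamics could be simulated as an ODE system, using forward Euler on a toy problem of my own choosing (the objective $f(x) = x_1^2 + x_2^2$, the constraint $h(x) = x_1 + x_2 - 1$, and the step size are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Toy problem (my own example, not from the paper):
#   minimize f(x) = x1^2 + x2^2  subject to  h(x) = x1 + x2 - 1 = 0
# Analytic solution: x* = (0.5, 0.5), lambda* = -1.

def grad_f(x):
    # Gradient of f(x) = x1^2 + x2^2
    return 2.0 * x

def h(x):
    # Constraint function, shape (m,) with m = 1
    return np.array([x[0] + x[1] - 1.0])

def jac_h(x):
    # Jacobian of h, shape (m, n)
    return np.array([[1.0, 1.0]])

def simulate(x0, lam0, dt=0.01, steps=5000):
    """Forward-Euler integration of the Lagrange dynamics:
         dx/dt      = -grad_x L = -(grad_f(x) + jac_h(x)^T @ lam)
         dlam/dt    = +grad_lam L = h(x)
    """
    x = np.array(x0, dtype=float)
    lam = np.array(lam0, dtype=float)
    for _ in range(steps):
        dx = -(grad_f(x) + jac_h(x).T @ lam)
        dlam = h(x)
        x = x + dt * dx
        lam = lam + dt * dlam
    return x, lam

x_star, lam_star = simulate(x0=[0.0, 0.0], lam0=[0.0])
print(x_star, lam_star)  # converges to approx [0.5, 0.5] and [-1.0]
```

On this convex toy problem the trajectory converges to the KKT point, since the linearized dynamics are a stable saddle-point flow; in general, convergence to the equilibrium depends on the problem.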
- I do not see how these differential equations can be used to define a neural network.
- For a standard neural network, the target output $(x^*, \lambda^*)$ would have to be known in advance; moreover, how would forward propagation and backpropagation be implemented here?
- Could someone help me understand how to build such a network or, if possible, provide Matlab or Python code implementing it?
Many thanks.