I have been working on a problem with a spherical constraint and another normalization constraint.
To be precise I have a function $\mathrm{H}(X_{i})$, and the $X_{i}$ are the variables that I wish to optimise. The constraints are
\begin{equation} \sum_{i=1}^{N} X_{i}^2 = 1 \qquad \sum_{i=1}^{N} X_{i} = m \end{equation}
I tried the method of lagrange multipliers to construct the function $\mathrm{H}'$ in the following form:
\begin{equation} \mathrm{H}' = H + \nu_{1}(\sum_{i=1}^{N} X_{i}^2 - 1) + \nu_{2} (\sum_{i=1}^{N} X_{i} - m) \end{equation}.
However, for some reason that I haven't been able to figure out this doesn't seem to work numerically. (I can compute $\nu_{1}$ and $\nu_{2}$ analytically. This is what I injected into the final gradient descent routine).
Following the response here, I was wondering whether a similar thing could be done for the normalization constraint above. I tried the following procedure:
Compute the gradient of the function $\mathrm{H}$ i.e. without the constraints. Then follow the link above. This fixes the spherical constraint.
Then "reproject" the new variable $X_{i}(t+\Delta t)$ on the plane defined by $\sum_{i}^{N} X_{i} = m$.
For the last step, I chose a vector $((1-m)/N, ... (1-m)/N)$ and then performed standard linear algebra operations for projecting $X_{i}(t+\Delta t)$.
This however doesn't seem to work too well in practice: The gradient decreases and so does the function $\mathrm{H}$. The spherical constraint is satisfied as well. However the normalization constraint isn't.
Any suggestions/ideas/references for such a problem? I have scoured the web for problems of such type. The spherical constraint seems a pretty standard one but the other one doesn't seem to occur in many places. I haven't seen any references that treat the two together. Thanks!
First of all, the general "brute force" solution of doing projected gradient descent should work:
In the question that you linked, they have used the fact that the sphere has a closed-form exponential map to substantially simplify step 4: they compute a position $\tilde X$ directly on the sphere without needing to step + project.
Generally speaking the constraint manifold is too complex to allow such tricks. However in your case, notice that the constraint manifold is the intersection of a sphere and a plane, e.g. it is a sphere of dimension $N-1$, and therefore it is possible to write down a reduced representation of the set of all feasible points. In particular, let $v_1,\ldots,v_{n-1}$ be a completion of an orthonormal basis for $\mathbb{R}^N$, together with $\mathbf{1}/\sqrt{N}.$ Then all points satisfying your constraints are of the form $$X = \frac{m}{N}\mathbf{1} + \sum_{i=1}^{n-1} \alpha_iv_i$$ with $\|\alpha\|^2 = 1-\frac{m^2}{N^2}.$
You can thus do gradient descent directly on $h(\alpha) = H \circ X$ using the technique from the linked post, and avoid needing to deal with the constraints at all.