I'm solving an optimization problem
\begin{equation}
\begin{aligned}
& \underset{A\in \mathbb{R}^{m\times d_1},\ B\in \mathbb{R}^{n\times d_2}}{\text{minimize}}
& & \phi(A,B) \\
& \text{subject to}
& & \|Ha_j\|_2^2=1, \ j = 1, \ldots, d_1,\\
&&& \|Vb_j\|_2^2=1, \ j = 1, \ldots, d_2,
\end{aligned}
\end{equation}
where $H$ and $V$ are known matrices, and the $a_j$'s and $b_j$'s are the columns of $A$ and $B$, respectively.
I can't find the optimal solution in closed form, so I want to use a numerical method like gradient descent, but I don't know how to apply gradient descent under such constraints.
I already have the derivatives of $\phi$ with respect to $A$ and $B$.
My idea is to first take the gradient step $A_{k+1}^{\text{temp}} = A_k - \alpha_k\frac{\partial\phi}{\partial A}(A_k,B_k)$, then choose as $A_{k+1}$ the matrix $A'$ in the constraint set with the smallest distance (e.g. in Frobenius norm) to $A_{k+1}^{\text{temp}}$, and update $B$ in the same way.
That is, solve
\begin{equation}
\begin{aligned}
& \underset{A\in \mathbb{R}^{m\times d_1}}{\text{minimize}}
& & \|A-A_{k+1}^{\text{temp}}\|_F^2 \\
& \text{subject to}
& & \|Ha_j\|_2^2=1, \ j = 1, \ldots, d_1.
\end{aligned}
\end{equation}
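For general $H$ this projection subproblem has no closed form, but a cheap feasible-point mapping is to rescale each column, $a_j \leftarrow a_j / \|Ha_j\|_2$, which satisfies the constraint exactly (though it is a retraction onto the constraint set, not the exact Frobenius-norm projection). A minimal NumPy sketch of one such step (the function names are mine):

```python
import numpy as np

def project_columns(A, H):
    """Rescale each column a_j of A so that ||H a_j||_2 = 1.

    Note: this is a simple feasible-point retraction, not the exact
    Frobenius-norm projection, which has no closed form for general H.
    """
    norms = np.linalg.norm(H @ A, axis=0)  # ||H a_j||_2 for each column j
    return A / norms

def projected_gradient_step(A, grad_A, H, alpha):
    """One step: gradient descent on phi, then map back to the constraint set."""
    A_temp = A - alpha * grad_A
    return project_columns(A_temp, H)
```

The update for $B$ with $V$ is identical in form.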
Is this the right approach to constrained gradient descent?
Any help would be appreciated!
Following Rem's answer, BFGS may work better. But how does BFGS handle a matrix variable and such constraints?
You probably want to look at the dual problem: move the constraints into your cost function as penalty terms and solve a larger but unconstrained, easier problem instead. See duality and Lagrange multipliers.
The idea is to solve a new, unconstrained problem, namely to find a saddle point of $$L(A,B,\lambda,\mu) = \phi(A,B) - \sum_{j=1}^{d_1}\lambda_j(\lVert Ha_j\rVert_2^2-1) - \sum_{j=1}^{d_2}\mu_j(\lVert Vb_j\rVert_2^2-1).$$
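A simple way to search for such a saddle point is alternating primal descent and dual ascent: descend on $(A,B)$ using $\nabla_A L = \nabla_A\phi - 2H^\top H A\,\mathrm{diag}(\lambda)$ (and the analogue for $B$), then move the multipliers against the constraint violation. A minimal NumPy sketch of one iteration (names, step sizes, and the descent-ascent scheme itself are my illustration, not a prescribed algorithm):

```python
import numpy as np

def lagrangian_saddle_step(A, B, lam, mu, grad_phi_A, grad_phi_B, H, V, eta, rho):
    """One primal-descent / dual-ascent step on L(A, B, lambda, mu).

    grad_phi_A, grad_phi_B are callables returning dphi/dA and dphi/dB.
    The A-gradient of -sum_j lambda_j (||H a_j||^2 - 1) is
    -2 H^T H A diag(lambda); analogously for B with V and mu.
    """
    HtH, VtV = H.T @ H, V.T @ V
    gA = grad_phi_A(A, B) - 2.0 * (HtH @ A) * lam  # lam broadcasts over columns
    gB = grad_phi_B(A, B) - 2.0 * (VtV @ B) * mu
    A_new = A - eta * gA
    B_new = B - eta * gB
    # Dual ascent on L: dL/dlambda_j = -(||H a_j||^2 - 1),
    # so lambda moves against the constraint violation.
    lam_new = lam - rho * (np.sum((H @ A_new) ** 2, axis=0) - 1.0)
    mu_new = mu - rho * (np.sum((V @ B_new) ** 2, axis=0) - 1.0)
    return A_new, B_new, lam_new, mu_new
```

At a stationary point the multiplier updates vanish, which is exactly the statement that the constraints $\|Ha_j\|_2^2 = 1$ and $\|Vb_j\|_2^2 = 1$ hold.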