I have an optimization problem with two sets of variables, $x \in \mathbb{R}^d$ and $y \in \mathbb{R}^p$, and I want to minimize the objective function $f(x, y)$.
The basic idea of alternating gradient descent (AGD) is to update one set of variables while keeping the other fixed, then switch roles and update the other set. The algorithm alternates between the two sets of variables in this way.
Here's a high-level description of the AGD algorithm for this problem:
Initialize: Choose initial values for $x$ and $y$.
Iterative Update: a. Update $x$ while keeping $y$ fixed: $$x^{(k+1)} = x^{(k)} - \alpha \nabla_x f(x^{(k)}, y^{(k)})$$ Here, $\alpha$ is the step size, and $\nabla_x f(x^{(k)}, y^{(k)})$ is the gradient of the objective function with respect to $x$ evaluated at the current values of $x$ and $y$.
b. Update $y$ while keeping $x$ fixed: $$y^{(k+1)} = y^{(k)} - \beta \nabla_y f(x^{(k+1)}, y^{(k)})$$ Similarly to step (a), $\beta$ is the step size, and $\nabla_y f(x^{(k+1)}, y^{(k)})$ is the gradient of the objective with respect to $y$, evaluated at the *updated* value of $x$ and the current value of $y$.
Repeat: Repeat the iterative update process until convergence or a stopping criterion is met.
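To make the scheme concrete, here is a minimal sketch of the iteration above on a strongly convex quadratic $f(x, y) = \tfrac{1}{2}\|Ax + By - c\|^2 + \tfrac{\mu}{2}(\|x\|^2 + \|y\|^2)$. The matrices $A, B$, the vector $c$, the regularization $\mu$, the step-size choice, and the iteration count are all illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, m, mu = 3, 2, 5, 0.1
A = rng.standard_normal((m, d))
B = rng.standard_normal((m, p))
c = rng.standard_normal(m)

def grad_x(x, y):
    # nabla_x f = A^T (A x + B y - c) + mu * x
    return A.T @ (A @ x + B @ y - c) + mu * x

def grad_y(x, y):
    # nabla_y f = B^T (A x + B y - c) + mu * y
    return B.T @ (A @ x + B @ y - c) + mu * y

# A conservative step size 1/L, where L bounds the Lipschitz constant
# of the full gradient (and hence of each block gradient)
L = np.linalg.norm(np.hstack([A, B]), 2) ** 2 + mu
alpha = beta = 1.0 / L

x, y = np.zeros(d), np.zeros(p)
for k in range(20000):
    x = x - alpha * grad_x(x, y)   # step (a): update x with y fixed
    y = y - beta * grad_y(x, y)    # step (b): update y using the new x

# Sanity check against the closed-form minimizer of the quadratic
M = np.hstack([A, B])
z_star = np.linalg.solve(M.T @ M + mu * np.eye(d + p), M.T @ c)
print(np.allclose(np.concatenate([x, y]), z_star, atol=1e-6))
```

On this toy instance the alternating iterates match the closed-form minimizer, which is the kind of linear convergence I would expect a reference to establish for the strongly convex case.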
I have been looking for a reference that proves convergence of this algorithm in the (strongly) convex case, but I haven't been able to find one so far, so I'm hoping someone can point me in the right direction.