What Does Coordinate Descent Actually Do? Any Intuition Behind Coordinate Descent?

320 Views Asked by Bumbble Comm At 27 Mar 2026 - 10:20

We've done a bunch of theoretical stuff in my optimization class, but basically no time for the actual implementation details. I'm trying to get an understanding of coordinate descent, which if I'm understanding it correctly utilizes the Gauss-Sediel method for solving a set of linear equations.

What I'm not getting is WHICH system of linear equations we're solving.

In the case of $f$ being convex, then $\text{min. } f(x) \Leftrightarrow \nabla f(x) = 0$, which leads to a set of equations, but they're only linear when $f$ is quadratic... It gets even worse if we're relaxing some constraint, in particular a convex, separable but nondifferentiable constraint - here we can't lean on the gradient to produce a set of equations, let alone a linear set of equations.

So where does the linear set of equation solving come into the picture? What are the actual computational steps for coordinate descent?

Original Q&A

There are 1 best solutions below

Bumbble Comm On 23 Aug 2017 - 8:23

What is the Gradient Descent Method?

Let's generalize it by the Steepest Descent Method.
We'll define the Steepest Descent using the following optimization problem:

$$ \arg \min_{ \left\| d \right\|_{p} \leq 1} \left\{ {\nabla f \left( x \right)}^{T} d \right\} $$

If we chose $ p = 2 $ you will get the known Gradient Descent.
The interesting thing happens when you chose $ p = 1 $.
Then the optimal direction (Easily derived form the KKT) is given by:

$$ {d}_{i} = \begin{cases} -\operatorname{sign} \left( {\nabla f \left( x \right)}_{i} \right) & \text{ if } \left| {\nabla f \left( x \right)}_{i} \right| = \max_{i} \left\{ \left| {\nabla f \left( x \right)}_{i} \right| \right\} \\ 0 & \text{ if } \left| {\nabla f \left( x \right)}_{i} \right| \neq \max_{i} \left\{ \left| {\nabla f \left( x \right)}_{i} \right| \right\} \end{cases} $$

Namely the Steepest Descent in the $ {L}_{1} $ Norm sense is taking the direction which is aligned to coordinate with the steepest slope.

This is basically Coordinate Descent, solve the problem in a direction aligned with single coordinate.

So you can think of Coordinate Descents something like Steepest Decedent in $ {L}_{1} $ Norm.

What Does Coordinate Descent Actually Do? Any Intuition Behind Coordinate Descent?

There are 1 best solutions below

Related Questions in OPTIMIZATION

Related Questions in CONVEX-OPTIMIZATION

Related Questions in NONLINEAR-OPTIMIZATION

Related Questions in NUMERICAL-OPTIMIZATION

Trending Questions

Popular # Hahtags

Popular Questions