Background: Regular gradient descent can be written as $x_{t + 1} = x_t - \eta g_t$, where $g_t = \nabla f(x_t)$ is the gradient at $x_t$ of the function $f$ we're trying to optimize.
Problem: If we have a (symmetric, positive definite) matrix $Q$, then I want to show that the "preconditioned" gradient descent $x_{t + 1} = x_t - \eta Q^{-1} g_t$ can equivalently be written
$$x_{t + 1} = \text{argmin}_x f(x_t) + g_t^T(x - x_t) + \frac{1}{2\eta}\lVert x - x_t \rVert_Q^2,$$
where $\lVert v \rVert_Q^2 = v^T Q v$.
My attempt:
I want to start with the arg min expression for $x_{t + 1}$ and show that this is equal to $x_t - \eta Q^{-1}g_t$. We have
\begin{align*} &\text{argmin}_x f(x_t) + g_t^T(x - x_t) + \frac{1}{2\eta}\Vert x - x_t \Vert_Q^2\\ &= \text{argmin}_x g_t \cdot x + \frac{1}{2\eta}(x - x_t)^T Q (x - x_t), \end{align*}
where I expanded the $Q$-norm and dropped the terms $f(x_t)$ and $-g_t^T x_t$, since they don't depend on $x$.
At this point I want to differentiate and set the result equal to zero. I know that the derivative of $g_t \cdot x$ with respect to $x$ is $g_t$, but how do I differentiate $\frac{1}{2\eta}(x - x_t)^T Q (x - x_t)$?
If its derivative is $\frac{1}{\eta}Q(x - x_t)$, then I can write
\begin{align*} g_t + \frac{1}{\eta}Q(x_{t + 1} - x_t) &= 0\\ \implies Q(x_{t + 1} - x_t) &= -\eta g_t\\ \implies x_{t + 1} &= x_t - \eta Q^{-1}g_t. \end{align*}
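As a numerical sanity check of the derivative claim (a sketch using NumPy; $Q$, $x_t$, and the evaluation point below are arbitrary made-up values), one can compare $\frac{1}{\eta}Q(x - x_t)$ against central finite differences of the quadratic term:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
eta = 0.1

# Build an arbitrary symmetric positive definite Q for testing
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)

x_t = rng.standard_normal(n)
x = rng.standard_normal(n)  # arbitrary point at which to check the gradient

def h(x):
    """The quadratic term (1/(2*eta)) * (x - x_t)^T Q (x - x_t)."""
    d = x - x_t
    return d @ Q @ d / (2 * eta)

# Claimed gradient: (1/eta) * Q (x - x_t)
grad_claimed = Q @ (x - x_t) / eta

# Central finite differences, one coordinate at a time
eps = 1e-6
grad_fd = np.array([
    (h(x + eps * e) - h(x - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.allclose(grad_claimed, grad_fd, atol=1e-5))  # True
```

(Since $h$ is quadratic, central differences agree with the true gradient up to roundoff.) Of course this only checks the formula at one point, so I'd still like a proper derivation.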
Is this legitimate? I've been reading about matrix calculus and I'm not at all sure the derivative of the squared $Q$ norm is correct.
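For what it's worth, here is a numerical check of the overall equivalence (a sketch assuming NumPy and SciPy; the quadratic data are made up, and $g_t$ is just an arbitrary vector standing in for the gradient at $x_t$). It minimizes the surrogate objective with a generic optimizer and compares against the closed-form preconditioned step:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 5
eta = 0.05

# Arbitrary symmetric positive definite Q
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)

x_t = rng.standard_normal(n)
g_t = rng.standard_normal(n)  # stands in for the gradient at x_t

def phi(x):
    """Surrogate objective: g_t . x + (1/(2*eta)) (x - x_t)^T Q (x - x_t)."""
    d = x - x_t
    return g_t @ x + d @ Q @ d / (2 * eta)

# Numerical argmin of the surrogate objective
x_num = minimize(phi, x0=x_t, method="BFGS").x

# Closed-form preconditioned step x_t - eta * Q^{-1} g_t
x_closed = x_t - eta * np.linalg.solve(Q, g_t)

print(np.allclose(x_num, x_closed, atol=1e-4))  # True
```

The two agree to optimizer tolerance, which at least supports the equivalence numerically.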
Thanks a lot.