I have a function:
$f(\textbf{x}) = \frac{1}{2} \|\textbf{Ax} - \textbf{b}\|_2^2$
I am trying to minimize this function over $\textbf{x}$ using gradient-based optimization. The textbook I am following suggests the gradient can be computed as:
$\nabla_\textbf{x}f(\textbf{x}) = \textbf{A}^T(\textbf{Ax} - \textbf{b})$.
I am not understanding what has happened here. I am familiar with the notion of a gradient and with how the function itself works, but I do not see why this computation yields the gradient.
In $$f(\textbf{x}) = \frac{1}{2} \|\textbf{Ax} - \textbf{b}\|_2^2$$ you have a vector $\textbf v=\textbf{Ax} - \textbf{b}$. The squared Euclidean norm is the dot product of $\textbf v$ with itself, which in matrix form is $\|\textbf v\|_2^2 = \textbf v^T\textbf v$, so $f(\textbf x) = \frac{1}{2}\textbf v^T\textbf v$.
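Writing the norm this way, you can also expand the quadratic first and differentiate term by term, which arrives at the same gradient:
$$f(\textbf x) = \frac{1}{2}(\textbf{Ax}-\textbf b)^T(\textbf{Ax}-\textbf b) = \frac{1}{2}\textbf x^T\textbf A^T\textbf{Ax} - \textbf b^T\textbf{Ax} + \frac{1}{2}\textbf b^T\textbf b.$$
Using the standard identities $\nabla_\textbf x\left(\textbf x^T\textbf M\textbf x\right) = (\textbf M + \textbf M^T)\textbf x$ and $\nabla_\textbf x\left(\textbf c^T\textbf x\right) = \textbf c$ (here $\textbf M = \textbf A^T\textbf A$ is symmetric and $\textbf c = \textbf A^T\textbf b$):
$$\nabla_\textbf x f(\textbf x) = \textbf A^T\textbf{Ax} - \textbf A^T\textbf b = \textbf A^T(\textbf{Ax}-\textbf b).$$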
Now take the gradient with respect to $\textbf x$ using the chain rule. The Jacobian of $\textbf v = \textbf{Ax}-\textbf b$ is $\frac{d\textbf v}{d\textbf x} = \textbf A$, and the gradient of $\frac{1}{2}\textbf v^T\textbf v$ with respect to $\textbf v$ is $\textbf v$ (the factor $\frac{1}{2}$ cancels the $2$ from differentiating the square), so
$$\nabla_\textbf{x}f(\textbf{x}) = \left(\frac{d\textbf v}{d\textbf x}\right)^T \nabla_\textbf{v}\left(\tfrac{1}{2}\textbf v^T\textbf v\right) = \textbf A^T\textbf v = \textbf{A}^T(\textbf{Ax} - \textbf{b}).$$
The transpose on $\textbf A$ is exactly where the chain rule for vector-valued functions puts the Jacobian; equivalently, it comes from the property of the transpose of a matrix-vector product, $(\textbf{Ax})^T = \textbf x^T\textbf A^T$.
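If you want to convince yourself numerically, here is a quick sanity check (a sketch using NumPy; the matrix sizes and random data are arbitrary choices) that compares the closed-form gradient $\textbf A^T(\textbf{Ax}-\textbf b)$ against central finite differences:

```python
import numpy as np

# Arbitrary problem data for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

def f(x):
    # f(x) = 0.5 * ||Ax - b||_2^2
    r = A @ x - b
    return 0.5 * r @ r

# Closed-form gradient from the derivation above.
analytic = A.T @ (A @ x - b)

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(x)
for i in range(len(x)):
    e = np.zeros_like(x)
    e[i] = eps
    numeric[i] = (f(x + e) - f(x - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True
```

Central differences have $O(\varepsilon^2)$ truncation error, so the two gradients agree to many digits and the check passes comfortably.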