Suppose $M\in \mathbb{R}^{N\times N}$ is a positive definite matrix whose entries $M_{ij}=f_{ij}(x)$ are functions of a $p$-dimensional input $x\in \mathbb{R}^{p}$. We want to use gradient descent to optimize a function $g(x)=h(M^{-1})$ (assume all the required gradients can be computed).
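(To make the source of the cost concrete: by the chain rule, the gradient of $g$ involves the derivative of the inverse, which by the standard identity is
$$\frac{\partial M^{-1}}{\partial x_k} \;=\; -\,M^{-1}\,\frac{\partial M}{\partial x_k}\,M^{-1}, \qquad k=1,\dots,p,$$
so each partial derivative requires multiplications by $M^{-1}$, i.e. a factorization or inverse of $M$ must be available at every iteration.)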
(For example, in a Gaussian process model we can use a gradient-based method to optimize the marginal log-likelihood $ \log p(\mathbf{y}\mid \mathbf{X},\Theta,\sigma)=-\frac{1}{2}\mathbf{y}^{T}(K+\sigma^2 I)^{-1}\mathbf{y}-\frac{1}{2}\log |K+\sigma^2 I| - \frac{N}{2}\log 2\pi$ with respect to the hyperparameters $\Theta$ and $\sigma$, where the kernel matrix $K$ is a function of $\Theta$.)
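To make the example concrete, here is a minimal sketch of evaluating this marginal log-likelihood in the usual way, via a Cholesky factorization rather than an explicit inverse. The kernel choice (an RBF kernel) and the hyperparameter names `ell`, `sigma_f`, `sigma` are my own assumptions, not from the question; the $O(N^3)$ factorization is the dominant cost either way.

```python
import numpy as np

def rbf_kernel(X, ell=1.0, sigma_f=1.0):
    """Squared-exponential kernel matrix K(X, X); K depends on (ell, sigma_f) = Theta."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return sigma_f**2 * np.exp(-0.5 * d2 / ell**2)

def gp_log_marginal_likelihood(X, y, ell=1.0, sigma_f=1.0, sigma=0.1):
    """log p(y | X, Theta, sigma) for a zero-mean GP with Gaussian noise."""
    N = len(y)
    K = rbf_kernel(X, ell, sigma_f) + sigma**2 * np.eye(N)
    L = np.linalg.cholesky(K)                             # O(N^3): the dominant cost
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # (K + sigma^2 I)^{-1} y, two O(N^2) triangular solves
    logdet = 2.0 * np.sum(np.log(np.diag(L)))             # log|K + sigma^2 I| from the Cholesky diagonal
    return -0.5 * y @ alpha - 0.5 * logdet - 0.5 * N * np.log(2.0 * np.pi)
```

Even with this structure, every fresh evaluation (and hence every gradient step that changes $\Theta$ or $\sigma$) pays for a new $O(N^3)$ factorization.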
Suppose $N=\Theta(p)$. I am curious about the cost of computing the gradient of this function at each iteration. Naively, each iteration requires computing the inverse (or a factorization) of $M$ once, so when $N=\Theta(p)$ the per-iteration cost of gradient descent is at least $O(p^3)$, which is much larger than the per-iteration cost gradient descent pays in general, namely $O(p)$.
Is this analysis correct? Are there any methods that can help reduce the cost in this kind of situation?
Thank you!