To minimize an MSE, a common approach is gradient descent on the objective. For example, the derivative is: $\frac{d}{dw} \sum_{i=1}^n (t_i - w x_i)^2 = -\sum_{i=1}^n 2 (t_i - w x_i) x_i$ (note the minus sign from the inner derivative). My question is what happens if we do not apply the chain rule to the inner function. To be specific, suppose we perform gradient descent with $-\sum_{i=1}^n 2 (t_i - w x_i)$ instead ($x_i$ is not present here).
As far as I can tell, the location of the minimum point (the fixed point of the update) does not change. However, I suspect the convergence rate may differ. Could you suggest some ways to analyze and compare the convergence rate of the update with the chain rule against the one without it?
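For concreteness, here is a small numerical sketch of the setup I mean. The toy data below is my own illustrative assumption: it is noiseless ($t_i = w^* x_i$ exactly), so the fixed points of both updates coincide at $w^*$; with noisy data the chain-rule-free update would instead settle at $\sum_i t_i / \sum_i x_i$, which need not equal the least-squares minimizer.

```python
import numpy as np

# Hypothetical toy data (my assumption, not from the question):
# noiseless linear targets, so both updates share the fixed point w* = 3.
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.0, size=50)
w_star = 3.0
t = w_star * x

def descend(grad, w0=0.0, lr=1e-3, steps=500):
    """Plain gradient descent driven by a supplied 'gradient' function."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# True gradient: d/dw sum (t_i - w x_i)^2 = -2 sum (t_i - w x_i) x_i
grad_chain = lambda w: -2.0 * np.sum((t - w * x) * x)

# Modified update without the chain-rule factor x_i
grad_no_chain = lambda w: -2.0 * np.sum(t - w * x)

w1 = descend(grad_chain)
w2 = descend(grad_no_chain)
print(w1, w2)  # both approach w* = 3 on this noiseless data
```

Since both updates are affine in $w$, each run is a linear recurrence, so their per-step contraction factors ($|1 - 2\eta \sum_i x_i^2|$ versus $|1 - 2\eta \sum_i x_i|$) are what I would like to compare rigorously.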
Thank you!