I found a snippet of C++ code that computes the square root of a non-negative integer x via an MSE loss function and gradient descent.
class Solution {
public:
    double mySqrt(int x) {
        int c = x;
        // Mean Square Error (MSE) loss function
        auto L = [c](double xi) {
            return (xi * xi - c) * (xi * xi - c);
        };
        // gradient
        auto newton = [c](double xi) {
            return 4 * xi * (xi * xi - c) / (4 * (xi * xi - c) + 8 * xi * xi);
        };
        // init
        double xNew = x;
        // train
        while (L(xNew) > 1e-7) {
            xNew = xNew - newton(xNew);
        }
        return xNew;
    }
};
// https://leetcode-cn.com/problems/sqrtx/solution/yong-ji-qi-xue-xi-he-niu-dun-fa-de-jie-t-lvxc/
The gradient defined by the newton function is
$$ \frac{4x_i(x_i^2-c)}{4 (x_i^2 - c) + 8x_i^2} $$
My question is how to understand this gradient. The numerator is $\frac{\partial{L}}{\partial{x_i}}=\frac{\partial{(x_i^2-c)^2}}{\partial{x_i}}$, but what about the denominator?
Finally, I understand what the newton function means. The Mean Square Error (MSE) loss function is $L(x)=(x^2-c)^2$; the numerator is the first-order derivative of $L(x)$, $L'(x)=4x(x^2-c)$, and the denominator is the second-order derivative of $L(x)$, $L''(x)=4(3x^2-c)$.
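Writing out Newton's method for minimizing $L$ makes the denominator transparent, since the update subtracts the ratio of the first and second derivatives:
$$ x_{new} = x_i - \frac{L'(x_i)}{L''(x_i)} = x_i - \frac{4x_i(x_i^2-c)}{12x_i^2-4c} = x_i - \frac{4x_i(x_i^2-c)}{4(x_i^2-c)+8x_i^2}, $$
which is exactly the subtraction performed inside the while loop.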
The snippet of code uses Newton's method to find a minimum of the MSE loss function.
I think this solution cannot be considered a neural-network method, because the "training" process just finds the root of an equation instead of learning parameters.