I found a snippet of C++ code that computes the square root of a non-negative integer x via an MSE loss function and gradient descent.
class Solution {
public:
    double mySqrt(int x) {
        int c = x;
        // Mean Square Error (MSE) loss function
        auto L = [c](double xi) {
            return (xi * xi - c) * (xi * xi - c);
        };
        // gradient
        auto newton = [c](double xi) {
            return 4 * xi * (xi * xi - c) / (4 * (xi * xi - c) + 8 * xi * xi);
        };
        // init
        double xNew = x;
        // train
        while (L(xNew) > 1e-7) {
            xNew = xNew - newton(xNew);
        }
        return xNew;
    }
};
// https://leetcode-cn.com/problems/sqrtx/solution/yong-ji-qi-xue-xi-he-niu-dun-fa-de-jie-t-lvxc/
The gradient defined by the newton function is
$$ \frac{4x_i(x_i^2-c)}{4 (x_i^2 - c) + 8x_i^2} $$
My question is how to understand this gradient. The numerator is $\frac{\partial{L}}{\partial{x_i}}=\frac{\partial{(x_i^2-c)^2}}{\partial{x_i}}$, but what about the denominator?
Finally, I understand what the newton function means. The Mean Square Error (MSE) loss function is $L(x)=(x^2-c)^2$; the numerator is the first-order derivative of $L(x)$, $L'(x)=4x(x^2-c)$, and the denominator is the second-order derivative of $L(x)$, $L''(x)=4(3x^2-c)$.
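Writing out Newton's method for minimizing $L$ makes the denominator transparent, since the update subtracts the ratio of the first and second derivatives:
$$ x_{new} = x_i - \frac{L'(x_i)}{L''(x_i)} = x_i - \frac{4x_i(x_i^2-c)}{12x_i^2-4c} = x_i - \frac{4x_i(x_i^2-c)}{4(x_i^2-c)+8x_i^2}, $$
which is exactly the subtraction performed inside the while loop.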
The snippet of code uses Newton's method to find a minimum of the MSE loss function.
I think this solution cannot be considered a neural-network method, because the "training" process just finds the root of an equation instead of learning parameters.