What is the relationship between minimizing a function and finding a root of an equation? Are they the same? I know both problems have similar algorithms, such as gradient descent or Newton's method.
For example, let us assume $x$ is a scalar. Finding a root of the equation $f(x)=b$ means finding where the function $f(x)-b$ crosses the x-axis. This is definitely not the same as minimizing $f(x)-b$.
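To make the distinction concrete, here is a small sketch (my own example, not from any book) with $f(x)=x^2$ and $b=1$: the roots of $f(x)-b$ are at $x=\pm 1$, while the minimizer of $f(x)-b$ is at $x=0$. Newton's method targets the root; gradient descent targets the minimum:

```python
# Example: f(x) = x^2, b = 1.
# Root of g(x) = x^2 - 1 is at x = +/- 1,
# but the minimum of g(x) is at x = 0.

def g(x):        # g(x) = f(x) - b
    return x**2 - 1.0

def dg(x):       # derivative of g
    return 2.0 * x

# Newton's method for a ROOT of g: solve g(x) = 0
x = 2.0
for _ in range(50):
    x = x - g(x) / dg(x)
root = x         # converges to 1.0

# Gradient descent for the MINIMUM of g: solve g'(x) = 0
x = 2.0
for _ in range(2000):
    x = x - 0.1 * dg(x)
minimizer = x    # converges to 0.0

print(root, minimizer)
```

The two iterations start from the same point but converge to different answers, which is exactly why the two problems are not interchangeable in general.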
But in the convex optimization book, minimizing $\|Ax-b\|_2^2$ is equivalent to solving the linear system $Ax=b$.
What am I missing? In which cases can we convert an optimization problem into root finding?
They are definitely not the same in general. In your example, $\|Ax-b\|_2$ is a norm and hence non-negative, so its minimum value is $0$ exactly when $Ax-b=0$ has a solution; assuming $b$ lies in the span of the columns of $A$, minimizing $\|Ax-b\|_2^2$ is therefore the same as solving $Ax=b$. More generally, minimizing a differentiable function $f$ is related to finding a root of $f'(x)=0$, though a stationary point need not be a minimum.
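A quick numerical check of this equivalence (a sketch with a hypothetical $A$ and $b$; here $A$ is invertible, so $b$ is automatically in its column space):

```python
import numpy as np

# Hypothetical data: A is square and invertible,
# so b lies in the column space of A.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Root-finding view: solve Ax - b = 0 directly
x_solve = np.linalg.solve(A, b)

# Optimization view: minimize ||Ax - b||_2^2 (least squares)
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]

# The two answers coincide, and the minimum objective value is 0
print(np.allclose(x_solve, x_lstsq))
print(np.linalg.norm(A @ x_lstsq - b))
```

When $b$ is *not* in the column space of $A$, the least-squares problem still has a minimizer, but the minimum value is strictly positive and $Ax=b$ has no solution, which is where the two viewpoints diverge.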