Non-linear system vs minimisation problem


If you have a non-linear system of equations which can be formally written as :

\begin{equation} \begin{cases} F_1(\mathbf{x})=0\\ F_2(\mathbf{x})=0\\ \ \ \ \ \vdots\\ F_n(\mathbf{x})=0\\ \end{cases} \end{equation}

we can write the above system as a minimisation problem

\begin{equation} \min_{\mathbf{x}}\{[F_1(\mathbf{x})]^2+[F_2(\mathbf{x})]^2+...+[F_n(\mathbf{x})]^2\} \end{equation}

What are the benefits and disadvantages of doing this? I'm interested in both the analytical and the numerical points of view.

Edit To make the question more specific, let's say that we are looking to apply Newton's method to solve the system of equations, and Newton's method to solve the minimisation problem.

One advantage of using the minimisation form is that you can, for example, impose constraints, which is not possible with the system formulation. With this question I am trying to understand the other benefits and disadvantages of the two formulations.
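To make the two formulations concrete, here is a minimal sketch using SciPy's generic solvers on a made-up $2\times 2$ system (my own example, not from the question): `fsolve` attacks the system directly, while `minimize` works on the sum of squares.

```python
import numpy as np
from scipy.optimize import fsolve, minimize

# Hypothetical 2x2 system: F1(x) = x0 + x1 - 3, F2(x) = x0*x1 - 2,
# with roots at (1, 2) and (2, 1).
def F(x):
    return [x[0] + x[1] - 3.0, x[0] * x[1] - 2.0]

def G(x):
    # The least-squares objective: sum of squared residuals.
    return sum(f * f for f in F(x))

root = fsolve(F, x0=[0.0, 1.0])        # solve the system directly
opt = minimize(G, x0=[0.0, 1.0]).x     # minimise the sum of squares
print(root, opt)                       # here both land on a root of F
```

On this benign example the two approaches agree; the answer below discusses when they can diverge.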


For simplicity I'm going to assume the functions $F_k$ are real-valued, globally defined, and smooth on $\mathbb{R}^n$.

As a broad-brush observation, we throw away some information when we tackle the solution of a nonlinear system via the least-squares minimization of the sum of squares. This should be clear from a moment's reflection on the fact that the scalar minimization problem can be derived from the $n$ separate scalar functions $F_k(x)$, but the converse derivation is impossible.

Working with these distinct scalar functions allows the gradient $\nabla F_k(x)$ of each $F_k(x)$ to be computed and analyzed separately, even if we ultimately assemble them into a Jacobian.
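Assembling the row gradients into a Jacobian is exactly what a Newton step on the system does. A minimal sketch on a made-up $2\times 2$ system (a circle intersected with a line; my own example):

```python
import numpy as np

# Hypothetical system: F1(x) = x0^2 + x1^2 - 4 (circle of radius 2),
#                      F2(x) = x0 - x1        (line x0 = x1).
def F(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0,
                     x[0] - x[1]])

def J(x):
    # Jacobian: row k holds the gradient of F_k.
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [1.0, -1.0]])

x = np.array([1.0, 0.5])
for _ in range(20):
    x = x - np.linalg.solve(J(x), F(x))  # Newton step: solve J(x) d = F(x)
print(x)  # -> approximately (sqrt(2), sqrt(2))
```

Note that the step solves a linear system with the full Jacobian; no squaring of residuals is involved.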

Once those functions are combined via least squares:

$$ G(x) = \sum_{k=1}^n (F_k(x))^2 $$

we only have the gradient of $G$ (and its Hessian) to assess:

$$ \frac{\partial G}{\partial x_i} = 2 \sum_{k=1}^n F_k(x) \frac{\partial F_k}{\partial x_i} $$
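The chain-rule expression above can be checked numerically. A small sketch (my own two-equation example) that computes $\nabla G = 2\,J^{\mathsf T}F$ and compares it against central finite differences of $G$:

```python
import numpy as np

# Hypothetical example: F1(x) = x0^2 + x1 - 3, F2(x) = x0 + x1^2 - 2.
def F(x):
    return np.array([x[0]**2 + x[1] - 3.0,
                     x[0] + x[1]**2 - 2.0])

def J(x):
    # Jacobian: J[k, i] = dF_k/dx_i
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, 2.0 * x[1]]])

def grad_G(x):
    # dG/dx_i = 2 * sum_k F_k(x) * dF_k/dx_i, i.e. grad G = 2 J^T F
    return 2.0 * J(x).T @ F(x)

# Compare against central finite differences of G(x) = sum_k F_k(x)^2.
x = np.array([0.7, -1.3])
h = 1e-6
fd = np.array([(np.sum(F(x + h * e)**2) - np.sum(F(x - h * e)**2)) / (2 * h)
               for e in np.eye(2)])
print(np.allclose(grad_G(x), fd, atol=1e-6))  # True
```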

A Newton-Raphson iteration on the original nonlinear system may converge to a root, while the corresponding Gauss-Newton least-squares minimization may get sidetracked by a local minimum and fail to reach the desired solution, or converge at such a slow rate as to be impractical without modification.
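This failure mode already shows up in one dimension. A sketch with my own example: $F(x) = x^3 - 2x + 2$ has a single real root near $x \approx -1.769$, but $G(x) = F(x)^2$ has a spurious local minimum near $x = \sqrt{2/3} \approx 0.816$, where $F'(x) = 0$ but $F(x) \approx 0.91 \neq 0$.

```python
# Hypothetical 1-D example: F(x) = x^3 - 2x + 2.
def F(x):  return x**3 - 2.0 * x + 2.0
def dF(x): return 3.0 * x**2 - 2.0

# Newton's method on F itself, started at x = -2: converges to the root.
x = -2.0
for _ in range(20):
    x -= F(x) / dF(x)
print(x, F(x))  # root near -1.769, F(x) ~ 0

# Gradient descent on G = F^2, started at x = 1: G'(y) = 2 F(y) F'(y)
# drives the iterate into the local minimum of G, not the root.
y = 1.0
for _ in range(2000):
    y -= 0.05 * 2.0 * F(y) * dF(y)
print(y, F(y))  # stuck near 0.816 with F(y) ~ 0.91, not a root
```

A descent method on $G$ has no way of knowing that the stationary point it found leaves a large residual; only the original system formulation distinguishes "gradient is zero" from "residual is zero".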