Why solving a system of linear equations by minimizing residual is not working?

210 Views Asked by At

I've faced a totally unexpected situation while I'm studying on FEA and optimization. I have the following system of linear equations (according to FEA) with constraints. $$ \begin{aligned} \mathbf K \mathbf u &= \mathbf f, \\ \mathbf Q \mathbf u &= \mathbf b, \end{aligned} $$

where $\mathbf K$ is a symmetric, banded, and positive semi-definite system matrix, $\mathbf u$ is the field variable, $\mathbf f$ is the vector of external loads, $\mathbf Q$ is a sparse constraint matrix, and $\mathbf b$ is the vector of constrained values.

I can solve the above system equations in two ways:

  1. Lagrange multiplier method $$ \begin{bmatrix}\mathbf u \\ \boldsymbol\lambda\end{bmatrix}= \begin{bmatrix}\mathbf K & \mathbf Q^\mathsf T \\ \mathbf Q & \mathbf 0\end{bmatrix}^{-1}\begin{bmatrix}\mathbf f \\ \mathbf b\end{bmatrix}. $$

  2. Or, just directly using eliminated system $$ \bar{\mathbf u} = \bar{\mathbf K}^{-1}\bar{\mathbf f}, $$ where bar indicates the vectors or matrices that are eliminated. Of course, both provide the same solutions.


I thought that it's equivalent to solve an optimization problem minimizing the residual, i.e., $$ \mathbf u^* = \mathrm{arg}\min_{\mathbf Q\mathbf u=\mathbf b} f\left(\mathbf u\right), $$ where $$ f\left(\mathbf u\right) = \frac 1 2 \mathbf r^\mathsf T\mathbf r, \\ \mathbf r = \mathbf K\mathbf u - \mathbf f. $$ However, it seems not working at all.

I've also tried an unconstrained version it, i.e., $$ \bar{\mathbf u}^* = \mathrm{arg}\min \bar f\left(\bar{\mathbf u}\right), $$ where $$ \bar f\left(\bar{\mathbf u}\right) = \frac 1 2 \bar{\mathbf r}^\mathsf T\bar{\mathbf r}, \\ \bar{\mathbf r} = \bar{\mathbf K}\bar{\mathbf u} - \bar{\mathbf f}. $$

The eliminated version seems to give a bit of an improved solution, but it's still wrong. Heat conduction on 10x10 grid These are FE solutions of heat conduction on 10x10 grid mesh; from left, solving $\mathbf K\mathbf u = \mathbf f$, minimizing residual, and eliminated version of minimizing residual. Although it seems that the third one results in the correct answer, it goes wrong when the discretization gets bigger. enter image description here

After a bit of investigation, I found that there was a trivial solution of the optimization problem, that is, a vector $\mathbf u$ with zeros for most of its components except near the constrained and loaded DOFs. It's because $\mathbf f$ is also a very sparse vector.


Finally, here's my question:

  1. Is my hypothesis that solving a system of linear equations directly and minimizing residual are equivalent to each other wrong? What am I missing?

  2. Is the problem related to one of the following facts?

  • Ill-conditioned and narrowly banded Hessian (c.f. $\mathbf H = \mathbf K^\mathsf T\mathbf K$, $\mathrm{cond}\left(\mathbf H\right) = \mathcal O\left(10^{5}\right)$)

enter image description here

  • A very sparse load vector

enter image description here

  1. Is there any way to get a nontrivial solution by optimization?

Any comment will be helpful, thanks