very slow convergence of Picard method for solving nonlinear system of equations


I have a nonlinear system of equations $$ \left(\mathbf{K}_{\mathbf{L}}+\mathbf{K}_{\mathbf{NL}}(\mathbf{X})\right) \mathbf{X}=\mathbf{F} $$ in which $\mathbf{K}_{\mathbf{NL}}(\mathbf{X})$ is the nonlinear stiffness matrix, which depends on $\mathbf{X}$. I'm solving it with a Picard iteration like this:

  1. First, ignore the nonlinear stiffness matrix and solve the linear system for $\mathbf{X}$.
  2. Substitute the resulting $\mathbf{X}$ into the nonlinear stiffness matrix and solve the full equation for $\mathbf{X}$ again.
  3. Check convergence and repeat step 2 until the convergence criterion is satisfied (see the sketch after this list).
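
For concreteness, here is a minimal sketch of this loop in Python/NumPy; `assemble_KL`, `assemble_KNL`, and `F` are placeholders standing in for my actual assembly routines and load vector:

```python
import numpy as np

def picard_solve(assemble_KL, assemble_KNL, F, tol=1e-8, max_iter=200):
    """Picard (fixed-point) iteration for (K_L + K_NL(X)) X = F."""
    KL = assemble_KL()
    # Step 1: ignore the nonlinear stiffness and solve the linear problem.
    X = np.linalg.solve(KL, F)
    for _ in range(max_iter):
        # Step 2: evaluate K_NL at the previous iterate and re-solve.
        X_new = np.linalg.solve(KL + assemble_KNL(X), F)
        # Step 3: check convergence of the relative update.
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X_new):
            return X_new
        X = X_new
    raise RuntimeError("Picard iteration did not converge")
```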

The problem I have is that when the force vector $\mathbf{F}$ is small, the nonlinear system solves very quickly, but when I increase the force beyond some threshold it takes ages to converge. I have also tried MATLAB's fsolve with the 'trust-region' and 'levenberg-marquardt' algorithms, but the same thing happens with large force vectors.

Is there any way I can improve the convergence speed?

P.S. Here is a GIF of the solution vector $\mathbf{X}$ inside the convergence loop, with a force vector slightly over the threshold.

[GIF: iterates of $\mathbf{X}$ oscillating back and forth during the convergence loop]

Edit (more details): My problem is the bending of a nonlinear Timoshenko beam, whose three governing equations are: $$ -\frac{d}{d x}\left\{A_{x x}\left[\frac{d u}{d x}+\frac{1}{2}\left(\frac{d w}{d x}\right)^{2}\right]+B_{x x} \frac{d \phi_{x}}{d x}\right\}=0 $$ $$ -\frac{d}{d x}\left\{A_{x x} \frac{d w}{d x}\left[\frac{d u}{d x}+\frac{1}{2}\left(\frac{d w}{d x}\right)^{2}\right]+B_{x x} \frac{d w}{d x} \frac{d \phi_{x}}{d x}\right\}-\frac{d}{d x}\left[S_{x x}\left(\frac{d w}{d x}+\phi_{x}\right)\right]=q $$ $$ -\frac{d}{d x}\left\{D_{x x} \frac{d \phi_{x}}{d x}+B_{x x}\left[\frac{d u}{d x}+\frac{1}{2}\left(\frac{d w}{d x}\right)^{2}\right]\right\}+S_{x x}\left(\frac{d w}{d x}+\phi_{x}\right)=0 $$ Together with the appropriate boundary conditions, these are discretized with finite differences; when assembled they form: $$ \left(\mathbf{K}_{\mathbf{L}}+\mathbf{K}_{\mathbf{NL}}(\mathbf{X})\right) \mathbf{X}=\mathbf{F} $$

3 Answers

Accepted Answer

I assume that $K_{NL}(0) = 0$.

Currently you are using the iteration $$ (K_L + K_{NL}(X_n))X_{n+1} = F $$ with $X_0 = 0$. Instead, first solve $$ (K_L + K_{NL}(X_\sigma))X_\sigma = \sigma F $$ for small $\sigma$ using this method; as you noticed, this converges quickly. Then solve $$ (K_L + K_{NL}(X_{\sigma'}))X_{\sigma'} = \sigma' F $$ for some $\sigma' > \sigma$ with the same iteration, but now starting from the previously found $X_\sigma$ rather than the zero vector. Continue in this way until the right-hand side is $F$.

For example, choose $\sigma = N^{-1}, \, \sigma' = 2 N^{-1}$ and so on, for sufficiently large $N$.
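
Here is a minimal Python/NumPy sketch of this load-stepping (continuation) scheme, reusing the hypothetical `assemble_KL`/`assemble_KNL` placeholders from the question and the assumption $K_{NL}(0)=0$:

```python
import numpy as np

def picard_continuation(assemble_KL, assemble_KNL, F, N=10,
                        tol=1e-8, max_iter=200):
    """Ramp the load from 0 to F in N steps; warm-start each Picard
    solve from the previous load step's solution."""
    KL = assemble_KL()
    X = np.zeros_like(F)       # X_0 = 0, consistent with K_NL(0) = 0
    for step in range(1, N + 1):
        sigma = step / N       # sigma = 1/N, 2/N, ..., 1
        for _ in range(max_iter):
            X_new = np.linalg.solve(KL + assemble_KNL(X), sigma * F)
            converged = np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X_new)
            X = X_new
            if converged:
                break
        else:
            raise RuntimeError(f"No convergence at load step {step}")
    return X
```

Because each load step starts close to its own solution, every inner Picard solve should only need a few iterations, even when the full load $F$ is past your threshold.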

Of course Anderson acceleration is also a good idea here :)

Answer

Here's a simple idea you could try with very little extra effort. Your GIF shows the iterates oscillating back and forth, a phenomenon that can also occur in classical gradient descent when the problem is ill-conditioned. A very popular and powerful way to alleviate this kind of problem is momentum, which basically consists of averaging over previous iterates.

So instead of throwing away all the previous iterates, you can do something like

$$ x_{k+1} = (1-\beta)g(x_{k}) + \beta x_k$$

Note that when $\beta=0$, we recover the standard fixed-point iteration. Consider a simple fixed-point problem like $x=\cos(x)$, which exhibits this oscillatory phenomenon. Then, starting from the same seed, here are the residuals $|x_*-x_k|$ for different values of $\beta$:

$$ \small\begin{array}{lllllll} k & \beta=0 & \beta=0.1 &\beta=0.2 &\beta=0.3 &\beta=0.4 &\beta=0.5 \\\hline 0 & 5.45787 &5.45787 &5.45787 &5.45787 &5.45787 &5.45787 \\1 & 0.2572 & 0.777267 & 1.29733 & 1.8174 & 2.33747 & 2.85754 \\2 & 0.19566 & 0.538475 & 0.690985 & 0.555697 & 0.107195 & 0.610102 \\3 & 0.116858 & 0.162927 & 0.0696096 & 0.00419339 & 0.00218156 & 0.0454083 \\4 & 0.0835784 & 0.0908543 & 0.0249916 & 0.000723828 & 8.0351e-06 & 0.0070347 \\5 & 0.053654 & 0.0431759 & 0.00828335 & 0.000124022 & 3.34983e-08 & 0.0011389 \\6 & 0.0371882 & 0.0224696 & 0.00282738 & 2.12772e-05 & 1.39595e-10 & 0.000185622 \\7 & 0.0245336 & 0.0112062 & 0.000955803 & 3.64953e-06 & 5.81757e-13 & 3.02859e-05 \\8 & 0.0167469 & 0.00571477 & 0.000324182 & 6.26001e-07 & 2.44249e-15 & 4.94232e-06 \\9 & 0.0111768 & 0.00288222 & 0.000109831 & 1.07377e-07 & 1.11022e-16 & 8.06552e-07 \end{array} $$

A well-chosen momentum can speed up convergence tremendously! A variant of this idea specific to fixed-point iterations appears to be known as Anderson acceleration.
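
For reference, here is a short Python script that reproduces this kind of experiment (the exact seed used above isn't stated; $x_0 \approx 6.197$ matches the first residual in the table):

```python
import numpy as np

def damped_fixed_point(g, x_star, x0, beta, n_steps=9):
    """Iterate x_{k+1} = (1 - beta) * g(x_k) + beta * x_k,
    returning the residuals |x_* - x_k| for k = 0..n_steps."""
    x = x0
    residuals = [abs(x_star - x)]
    for _ in range(n_steps):
        x = (1.0 - beta) * g(x) + beta * x
        residuals.append(abs(x_star - x))
    return residuals

x_star = 0.7390851332151607  # fixed point of cos(x), the Dottie number
for beta in (0.0, 0.2, 0.4):
    res = damped_fixed_point(np.cos, x_star, x0=6.197, beta=beta)
    print(f"beta = {beta}: residual at k = 9 is {res[-1]:.3g}")
```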

Answer

You can try one of the convergence acceleration algorithms, which can work very well for fixed-point (Picard-type) iterations. If you are using R, the SQUAREM package implements a reliable convergence acceleration scheme, based on the paper Varadhan and Roland, "Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm", Scandinavian Journal of Statistics, 2008. EM algorithms are essentially Picard-like algorithms: they are contraction mappings that converge slowly.
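
If you are working in Python rather than R, one comparable off-the-shelf option (my suggestion, not part of SQUAREM) is SciPy's Anderson mixing via `scipy.optimize.root`. A minimal sketch on the toy problem $x=\cos(x)$; your system would enter through its own residual function:

```python
import numpy as np
from scipy.optimize import root

# Anderson acceleration of the fixed-point map x = cos(x).
# `root` expects the residual g(x) - x rather than g itself.
def residual(x):
    return np.cos(x) - x

sol = root(residual, x0=np.array([6.197]), method="anderson")
print(sol.x, sol.success)
```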