Interpreting rank deficiency in Levenberg-Marquardt optimization


I'm solving a nonlinear least-squares problem which, depending on the data, exhibits varying degrees of dependence between the parameters of the estimated model. The dependence manifests itself, for example, as rank deficiency of the Jacobian.

An example of such a problem would be fitting the curve $f(x)=be^{ax}+c$ to a set of data points. If the data points form a horizontal line, then $a$ would be $0$, and the parameters $b$ and $c$ would be dependent (we can only estimate their sum from the data).

I would like to be able to analyze which parameters are dependent, and to what degree. I have a feeling that an SVD of the Jacobian might help, but I have no idea how to approach this.

EDIT:

I've made a small Python 2.7 test program for the above example, with the Corr matrix calculation as per vibe's answer.

import sympy as sp
import numpy as np

def merge_dicts(a, b):
    # Return a new dict with the entries of b merged into a copy of a.
    c = a.copy()
    c.update(b)
    return c

a, b, c, x, y = sp.symbols("a b c x y")
params = [a, b, c]

# Model and residual; the Jacobian columns are d(residual)/d(param).
fun = sp.exp(a*x)*b + c
residual = fun - y
diffs = [sp.diff(residual, p) for p in params]
print("diffs:\n", diffs)

# Degenerate case: a = 0 makes the data a horizontal line y = b + c.
sol = {a: 0.0, b: 10.0, c: -5.0}
data = [{x: xv, y: float(fun.subs(merge_dicts(sol, {x: xv})))} for xv in range(-2, 5)]
print("data:\n", data)

# Evaluate the Jacobian at the solution, one row per data point.
J = np.array([[float(diffs[i].subs(merge_dicts(data[j], sol))) for i in range(len(diffs))] for j in range(len(data))])

print("J:\n", J)

# Covariance matrix; pinv, since J^T J is singular in the degenerate case.
C = np.linalg.pinv(np.matmul(J.transpose(), J))
print("C:\n", C)

D = np.diag(np.sqrt(np.diag(C)))
print("D:\n", D)

# Correlation matrix: Corr = D^-1 C D^-1
Dinv = np.linalg.inv(D)
print("Corr:\n", np.matmul(np.matmul(Dinv, C), Dinv))

Answer (by vibe):

The SVD of the Jacobian for the problem you've described would likely have a singular value of zero, telling you that one of your model parameters (or a linear combination of model parameters) is not determined by the data. It won't, however, tell you which individual model parameter is the culprit (the right singular vector belonging to the zero singular value only identifies the undetermined combination).
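To make this concrete, here is a minimal NumPy sketch for the horizontal-line example from the question ($a=0$, $b=10$, $c=-5$), with the Jacobian hard-coded for that case rather than derived symbolically:

```python
import numpy as np

# Jacobian of r_j = b*exp(a*x_j) + c - y_j at the degenerate fit
# a=0, b=10, c=-5, for x = -2..4: columns are d/da = b*x*exp(a*x) = 10*x,
# d/db = exp(a*x) = 1, d/dc = 1.
xs = np.arange(-2, 5, dtype=float)
J = np.column_stack([10.0 * xs, np.ones_like(xs), np.ones_like(xs)])

U, s, Vt = np.linalg.svd(J)
print("singular values:", s)  # the last one is numerically zero

# The right singular vector of the (near-)zero singular value spans the
# direction in (a, b, c) space that the data leave unconstrained.
null_vec = Vt[-1]
print("unconstrained direction (a, b, c):", null_vec)
# Here it is proportional to (0, 1, -1): raising b and lowering c by the
# same amount leaves the residuals unchanged, so only b + c is determined.
```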

What you want to analyze is the covariance matrix, $$ C = (J^T J)^{-1} $$ (in the rank-deficient case $J^T J$ is singular, so the Moore–Penrose pseudoinverse is used in its place). The diagonal elements are the variances of your model parameters $a,b,c$, and the off-diagonal elements give you the covariances between different model parameters. My guess is that for your problem you would find that $Cov(b,c)$ is large, meaning they are nearly collinear. Ideally you would want the off-diagonal elements to be near zero, indicating your parameters are all independent of each other.

In some cases, if your model parameters have widely different scales, the covariance matrix can be misleading, since it carries units. You can then look at the correlation matrix instead, which is the dimensionless version of the covariance matrix: $$ Corr = D^{-1} C D^{-1} $$ where $D = \sqrt{\textrm{diag}(C)}$. Here the diagonal elements are all $1$, and the off-diagonal elements give you the correlation coefficients between the different model parameters.
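For the same horizontal-line example, a sketch of this recipe in NumPy (hard-coding the Jacobian for that case, and using the pseudoinverse since $J^T J$ is singular there):

```python
import numpy as np

# Jacobian of the horizontal-line example (a=0, b=10, c=-5):
# columns are d/da = 10*x, d/db = 1, d/dc = 1 for x = -2..4.
xs = np.arange(-2, 5, dtype=float)
J = np.column_stack([10.0 * xs, np.ones_like(xs), np.ones_like(xs)])

# J^T J is singular here (the b and c columns of J coincide), so use the
# Moore-Penrose pseudoinverse; rcond drops the numerically-zero singular value.
C = np.linalg.pinv(J.T @ J, rcond=1e-10)

# Corr = D^{-1} C D^{-1} with D = sqrt(diag(C)); equivalently, divide
# elementwise by the outer product of the standard deviations.
d = np.sqrt(np.diag(C))
Corr = C / np.outer(d, d)
print("Corr:\n", Corr)
# The (b, c) entry comes out as 1: b and c are perfectly correlated, so
# only their sum b + c is determined by this data set.
```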