I'm trying to implement the UKF for parameter estimation as described by Eric A. Wan and Rudolph van der Merwe in Chapter 7 of the Kalman Filtering and Neural Networks book: Free PDF
I am confused by the setting of $\lambda$ (used in the selection of sigma points and in the calculation of the weights for the mean). The authors recommend setting:
$\lambda = \alpha^2(L+k) - L$, where $L$ is the dimension of $x$, $\alpha$ is "small" ($0 < \alpha < 1$), and $k$ is either 0 or $3-L$ (different sources disagree on this). The sigma points are then calculated as the columns of a matrix $\chi$ with:
$\chi_{0} = x$
$\chi_{i} = x + \left(\sqrt{(L+\lambda)P_{x}}\right)_{i}$ for $i = 1, \dots, L$
$\chi_{i} = x - \left(\sqrt{(L+\lambda)P_{x}}\right)_{i-L}$ for $i = L+1, \dots, 2L$
where $\left(\sqrt{(L+\lambda)P_{x}}\right)_{i}$ is the $i$th column of the matrix square root of $(L+\lambda)P_{x}$, the scaled covariance matrix of $x$.
The sigma points are run through $f$:
$Y_{i} = f(\chi_{i})$ for $i = 0, \dots, 2L$
and the mean of Y is calculated as:
$\bar{Y} = \sum_{i=0}^{2L} w_{i}Y_{i}$
With the weights $w_{i}$ given as:
$$ w_{0} = \dfrac{\lambda}{L + \lambda} $$ $$ w_{i} = \dfrac{1}{2(L + \lambda)}, \quad i = 1, \dots, 2L $$
The issue I am running into is that for any reasonable values of $L$, $\alpha$ and $k$, $w_{0}$ ends up being negative (often with a very large magnitude). While the weights $w_{i}$ do sum to 1, the negative value results in the calculated mean being extremely far off. I'm sure there is something I am missing, but I can't figure out what.
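For illustration, here is a minimal Python sketch of the weight computation described above. The function name `ut_mean_weights` and the example values ($L = 3$, $\alpha = 10^{-3}$, $k = 0$) are assumptions chosen only for this example, not taken from the book.

```python
def ut_mean_weights(L, alpha, k):
    """Mean weights of the scaled unscented transform as defined above.

    Returns (lam, w0, wi), where wi is the common weight of the
    2L non-central sigma points.
    """
    lam = alpha ** 2 * (L + k) - L
    w0 = lam / (L + lam)
    wi = 1.0 / (2.0 * (L + lam))
    return lam, w0, wi


# Example values (assumed for illustration only).
L = 3
lam, w0, wi = ut_mean_weights(L=L, alpha=1e-3, k=0.0)
total = w0 + 2 * L * wi  # the weights should still sum to 1

print("lambda =", lam)
print("w0     =", w0)
print("wi     =", wi)
print("sum    =", total)
```

With these values $w_0$ comes out on the order of $-10^{6}$ while each $w_i$ is large and positive, so the sum is 1 only through heavy cancellation.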
I've been fighting with the same problem for some time now and I think that there is actually no real solution if you want to stick to the well-known algorithms. Here is what I found:
Adding to Mark's answer, in The Scaled Unscented Transform, Julier actually points out that (at least without $\beta$)
The untransformed weights in Wan and van der Merwe's paper correspond to Julier's original formulation in A New Extension of the Kalman Filter to Nonlinear Systems with $w_0 = \frac{k}{n+k}$ and $w_i = \frac{1}{2(n+k)}$. The $k + n = 3$ rule will thus still not guarantee a positive semidefinite covariance estimate if $n$ is larger than 3, because $k = 3 - n$ and hence $w_0$ are then negative. Actually, Wan and van der Merwe suggest setting $k$ to 0, which results in a non-negative untransformed $w_0 = 0$ but still gives a negative transformed $w_0$.
So it seems that, following the standard approaches, negative weights for the mean cannot be avoided if $n$ is larger than 3. This is also a result in A Numerical-Integration Perspective on Gaussian Filters, Section IV A, where the authors point out that having negative weights is indeed undesirable and can mess up the stability of the filter. So most probably you are not missing anything: the problem is the (scaled) UKF.
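To see why the sign comes out this way, here is a short derivation that uses only the definitions from the question and assumes $L + k > 0$. Substituting $\lambda = \alpha^2(L+k) - L$ gives $L + \lambda = \alpha^2(L+k)$, so

$$ w_0 = \frac{\lambda}{L+\lambda} = \frac{\alpha^2(L+k) - L}{\alpha^2(L+k)} = 1 - \frac{L}{\alpha^2(L+k)} $$

which is negative exactly when $\alpha^2(L+k) < L$. With $k = 0$ this condition reduces to $\alpha^2 < 1$, which holds for every $0 < \alpha < 1$. With $k = 3 - L$ it reduces to $\alpha^2 < L/3$, which holds for every $0 < \alpha < 1$ as soon as $L \geq 3$.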
My solution so far has been to use the version in A New Extension of the Kalman Filter to Nonlinear Systems with a small, positive $k$. This, however, also suffers somewhat with high-dimensional states, since the sigma points are placed farther away from the mean the higher $n$ is. (Which I think was the reason for introducing the scaled unscented transform to begin with...)
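As a rough sketch of that workaround, the following pure-Python code builds the sigma points and weights of the original (unscaled) formulation, with $w_0 = \frac{k}{n+k}$ and $w_i = \frac{1}{2(n+k)}$. The helper names `cholesky` and `original_ut`, the use of a Cholesky factor as the matrix square root, and the example state, covariance and $k = 0.5$ are all assumptions made for illustration.

```python
import math


def cholesky(P):
    """Lower-triangular A with A * A^T == P (P symmetric positive definite)."""
    n = len(P)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(A[i][m] * A[j][m] for m in range(j))
            if i == j:
                A[i][j] = math.sqrt(P[i][i] - s)
            else:
                A[i][j] = (P[i][j] - s) / A[j][j]
    return A


def original_ut(x, P, k):
    """Sigma points and weights of the unscaled unscented transform."""
    n = len(x)
    scale = math.sqrt(n + k)
    A = cholesky(P)
    points = [list(x)]  # chi_0 = x
    for sign in (1.0, -1.0):
        for i in range(n):
            # i-th column of the square root of (n + k) * P
            col = [scale * A[r][i] for r in range(n)]
            points.append([x[r] + sign * col[r] for r in range(n)])
    weights = [k / (n + k)] + [1.0 / (2.0 * (n + k))] * (2 * n)
    return points, weights


# Example with n = 5 and a small positive k (values assumed for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
P = [[2.0 if r == c else 0.5 if abs(r - c) == 1 else 0.0 for c in range(5)]
     for r in range(5)]
points, weights = original_ut(x, P, k=0.5)

# Weighted mean of the untransformed sigma points; it should reproduce x.
mean = [sum(w * p[r] for w, p in zip(weights, points)) for r in range(5)]
print("weights:", weights)
print("mean:   ", mean)
```

All weights are positive here, but the offsets from the mean scale with $\sqrt{n+k}$, which is the spreading effect mentioned above.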
Finally, if you feel adventurous, you can look into newer approaches for placing the sigma points. For example, here the authors tried to learn the parameters $\alpha$, $\beta$ and $k$ and ended up with very different and sometimes unconventional choices for different systems.