I've run across a problem which asks me to calculate a best fit line through data using a 'generalized linear least squares' approach where, instead of minimizing the residual:
$\vec{r} = \vec{b} - A\vec{X}$
we minimize the residual in:
$B\vec{r} = \vec{b} - A\vec{X},$
where $B$ is a symmetric positive definite matrix.
Is this a standard practice in approximation? Can anyone explain what this $B$ matrix is doing in terms of the approximation? Would the standard normal equation still apply (after multiplying through by $B^{-1}$)?
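Normally the problem is to minimize $r^T V^{-1} r$, where $r = \vec{b} - A\vec{X}$ is the ordinary residual and $V$ is the covariance matrix of the errors, which allows for correlations and non-constant variances. With your definition, $\vec{r} = B^{-1}(\vec{b} - A\vec{X})$, so $\|\vec{r}\|^2 = (\vec{b} - A\vec{X})^T B^{-2} (\vec{b} - A\vec{X})$ because $B$ is symmetric. So $B^{-1}$ plays the role of the square root of $V^{-1}$, i.e. $B^2 = V$. The normal equations still apply, but in weighted form: $A^T B^{-2} A \vec{X} = A^T B^{-2} \vec{b}$. Not sure why it has been introduced in this context.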
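To make this concrete, here is a minimal NumPy sketch under the question's definition $B\vec{r} = \vec{b} - A\vec{X}$. It checks numerically that ordinary least squares on the transformed system $B^{-1}A\vec{X} \approx B^{-1}\vec{b}$ gives the same solution as the weighted normal equations with weight $B^{-2}$. The data, the straight-line model, and the particular $B$ are all made up for illustration; none of them come from the original problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: fit y = c0 + c1 * t through 6 noisy points.
t = np.linspace(0.0, 5.0, 6)
A = np.column_stack([np.ones_like(t), t])   # design matrix for a line
b = 1.0 + 2.0 * t + rng.normal(scale=0.1, size=t.size)

# Hypothetical symmetric positive definite B (M M^T + 6 I is SPD by construction).
M = rng.normal(size=(6, 6))
B = M @ M.T + 6.0 * np.eye(6)

# Route 1: multiply through by B^{-1}, then solve ordinary least squares.
# r = B^{-1} b - B^{-1} A X, so we minimize ||r||^2 directly.
A_w = np.linalg.solve(B, A)
b_w = np.linalg.solve(B, b)
X_whitened, *_ = np.linalg.lstsq(A_w, b_w, rcond=None)

# Route 2: weighted normal equations A^T W A X = A^T W b with W = B^{-2}.
W = np.linalg.inv(B @ B)
X_normal = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

# Transformed residual at the solution; it should be orthogonal to B^{-1} A.
r = b_w - A_w @ X_whitened
```

The two routes agree up to rounding. In practice the first route (or a Cholesky-based variant) tends to be preferable, since forming $B^{-2}$ explicitly squares the condition number of $B$.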