I have a matrix $X = [x_1, ..., x_n]^T \in \mathbb{R}^{n \times m}$ consisting of row vectors $x_k^T$, for which I want to solve the overdetermined system $$y=X\beta$$ using least squares. I know that I can obtain $\beta$ via the normal equations: $$\beta=(X^TX)^{-1}X^Ty.$$
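For concreteness, here is the setup above in NumPy with random data (the variable names are my own); the normal-equations solution agrees with the numerically preferred `np.linalg.lstsq` solver:

```python
import numpy as np

rng = np.random.default_rng(42)
n, m = 100, 5                       # overdetermined: n > m
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

# Normal-equations solution beta = (X^T X)^{-1} X^T y
# (solve instead of explicitly inverting X^T X) ...
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# ... agrees with the dedicated least-squares solver.
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(beta_normal, beta_lstsq))  # True
```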
Question: Provided I have already obtained $\beta$, is there a computationally efficient way to obtain $\beta'$ for a matrix $X' = [x_1, ..., x_{i-1}, x_i', x_{i+1}, ..., x_n]^T$ (i.e. matrix $X$ with row $x^T_i$ replaced by a new $(x_i')^T$) such that $y' = X'\beta'$ is solved by least squares?
In short: is there a computationally efficient way to perform a rank-one update of the matrix $X$ in $\beta=(X^TX)^{-1}X^Ty$?
Previous ideas: It might appear that recursive least squares (RLS) is what I'm looking for. However, with RLS the number of rows of $X$ (conceptually) grows with each new observation, whereas I want to replace a specific row $x_i^T$ of $X$ with $(x_i')^T$ (not always the row at the same position in $X$, though).
Yes, you can perform rank-one updates, provided you have access to at least $\beta$ and $(X^{T}X)^{-1}$. The updated $\beta$ after appending an extra row to $X$ and $y$ can be related to the "old" $\beta$ via the matrix inversion lemma, in terms of the new predictor-response pair, $\beta$, and $(X^{T}X)^{-1}$.
Edit: @Martin: Yes, by the matrix inversion lemma here I meant what Wikipedia calls the Sherman-Morrison formula, which is the rank one version of the Woodbury identity. Either is commonly referred to as "the" matrix inversion lemma in the Machine Learning literature. Like a lot of useful results in linear algebra these seem to have been independently (re)discovered many times.
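A minimal NumPy sketch of the row-append case described above (variable names are my own): maintain $P = (X^TX)^{-1}$ and $b = X^Ty$, apply Sherman-Morrison to $P$ when a row $(x_\text{new}, y_\text{new})$ is appended, and check against a direct solve on the augmented system:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 4
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

# Quantities assumed already available: P = (X^T X)^{-1} and b = X^T y,
# so that beta = P @ b is the current least-squares solution.
P = np.linalg.inv(X.T @ X)
b = X.T @ y

# Append a new predictor-response pair (x_new, y_new).
x_new = rng.standard_normal(m)
y_new = rng.standard_normal()

# Sherman-Morrison:
# (A + x x^T)^{-1} = A^{-1} - (A^{-1} x x^T A^{-1}) / (1 + x^T A^{-1} x)
Px = P @ x_new
P_upd = P - np.outer(Px, Px) / (1.0 + x_new @ Px)
b_upd = b + y_new * x_new
beta_upd = P_upd @ b_upd            # O(m^2) instead of refitting from scratch

# Check against a direct least-squares solve on the augmented system.
X_aug = np.vstack([X, x_new])
y_aug = np.append(y, y_new)
beta_direct = np.linalg.lstsq(X_aug, y_aug, rcond=None)[0]
print(np.allclose(beta_upd, beta_direct))  # True
```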
I think that either this answer or Martin's somewhat fuller answer already contains the answer to OP's question, via two rank-one updates:

Replacing the i-th row $(x_i, y_i)$ of the data matrix $[X|y]$ by $(x', y')$ changes the Gram matrix to $X'^TX' = X^TX - x_ix_i^T + x'x'^T$ and the right-hand side to $X'^Ty' = X^Ty - x_iy_i + x'y'$, i.e. a rank-one *downdate* (removing the old row) followed by a rank-one *update* (adding the new one). The downdate is handled by the Sherman-Morrison formula with a sign flip, $$(A - x_ix_i^T)^{-1} = A^{-1} + \frac{A^{-1}x_ix_i^TA^{-1}}{1 - x_i^TA^{-1}x_i},$$ which is well defined as long as removing the row still leaves $X$ with full column rank; adding $(x', y')$ is then the standard rank-one update. (Note that a single update with the row $(x'-x_i, y'-y_i)$ is *not* equivalent: it changes the Gram matrix by $(x'-x_i)(x'-x_i)^T$, whose cross terms $-x'x_i^T - x_ix'^T$ do not cancel in general.) So OP can use the recursive least squares code of their choice, applied once in downdate mode for $(x_i, y_i)$ and once in update mode for $(x', y')$, at a total cost of $O(m^2)$ per replacement.
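The downdate-then-update row replacement can be sketched as follows in NumPy (variable names are my own; the final result is checked against refitting from scratch):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, i = 30, 3, 7                  # replace row i of a 30x3 system
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

# Assumed already available from the previous fit.
P = np.linalg.inv(X.T @ X)          # (X^T X)^{-1}
b = X.T @ y                         # X^T y

x_old, y_old = X[i], y[i]
x_new = rng.standard_normal(m)
y_new = rng.standard_normal()

# Step 1: downdate -- remove row i via Sherman-Morrison with a sign flip:
# (A - x x^T)^{-1} = A^{-1} + (A^{-1} x x^T A^{-1}) / (1 - x^T A^{-1} x)
Px = P @ x_old
P = P + np.outer(Px, Px) / (1.0 - x_old @ Px)
b = b - y_old * x_old

# Step 2: update -- add the replacement row (x_new, y_new):
# (A + x x^T)^{-1} = A^{-1} - (A^{-1} x x^T A^{-1}) / (1 + x^T A^{-1} x)
Px = P @ x_new
P = P - np.outer(Px, Px) / (1.0 + x_new @ Px)
b = b + y_new * x_new

beta_upd = P @ b                    # updated least-squares solution, O(m^2)

# Compare with refitting from scratch on the modified data.
X_mod = X.copy(); X_mod[i] = x_new
y_mod = y.copy(); y_mod[i] = y_new
beta_direct = np.linalg.lstsq(X_mod, y_mod, rcond=None)[0]
print(np.allclose(beta_upd, beta_direct))  # True
```

In practice, repeated downdates can be numerically delicate when the leverage $x_i^T P x_i$ is close to 1; QR- or Cholesky-based updating is more stable if that becomes an issue.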