QR and LQ solvers for over- and underdetermined systems


I am trying to code a solver for linear systems which is not limited by the shape and size of the input matrices. I have $$ A\cdot X = B$$

If $A$ is a tall matrix, I perform a pivoted, reduced QR decomposition. I obtain the rank of the matrix and solve by back substitution, using only the leading submatrix whose size equals the rank. This yields the least-squares solution.

If $A$ is a wide matrix, I perform a pivoted, reduced LQ decomposition. I obtain the rank of the matrix and solve by forward substitution, using only a submatrix whose size equals the rank. This yields the solution that minimizes $\|X\|_2$.
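The tall-matrix branch described above might be sketched as follows (a sketch only, using SciPy's column-pivoted QR; the function name `qr_lstsq` and the rank tolerance are my own choices, not from the question):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def qr_lstsq(A, B, tol=1e-10):
    """Least-squares solve of A X = B via column-pivoted QR.

    Returns a *basic* solution: entries for non-pivot columns are zero.
    """
    # Column-pivoted QR: A[:, piv] = Q R, with |R[0,0]| >= |R[1,1]| >= ...
    Q, R, piv = qr(A, mode="economic", pivoting=True)
    # Numerical rank = number of "large" diagonal entries of R
    r = int(np.sum(np.abs(np.diag(R)) > tol * abs(R[0, 0])))
    # Back substitution on the leading r-by-r triangular block
    Y = Q.T @ B
    X1 = solve_triangular(R[:r, :r], Y[:r], lower=False)
    # Zero the dropped (dependent) columns and undo the pivoting
    X = np.zeros((A.shape[1],) + np.shape(B)[1:])
    X[piv[:r]] = X1
    return X
```

On the rank-2 example below, this returns a least-squares solution whose first row is zero, because the zero column is never chosen as a pivot.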

When the matrix is tall with full column rank, or wide with full row rank, this works great. How should I decide which solver to use when I don't know the rank in advance? For example: $$\begin{bmatrix}0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} X = \begin{bmatrix}2 & 3 \\ 1 & 2 \\ 3.02 & 5.05 \end{bmatrix}$$ Matrix $A$ is obviously of rank 2.

The QR algorithm drops the first column and returns the least-squares solution of the tall system $$\begin{bmatrix}1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} X' = \begin{bmatrix}2 & 3 \\ 1 & 2 \\ 3.02 & 5.05 \end{bmatrix},$$ with the first row of $X$ set to zero,

but the LQ algorithm gives the solution without the last row.

$$\begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix} X' = \begin{bmatrix}2 & 3 \\ 1 & 2 \end{bmatrix}$$

I don't fully understand why this happens. How can I deal with this problem without having to compute the SVD? The correct answer is the one given by the QR algorithm.

Thanks

On BEST ANSWER

Please forgive me if I am not using the standard terminology here.

There are three common types of linear systems:

  1. For invertible square coefficient matrices, you have a unique solution; both the QR and LQ solvers can be used, though QR is usually preferred.
  2. For overdetermined systems (https://en.wikipedia.org/wiki/Overdetermined_system), there is no exact solution satisfying all equations, but the Ordinary Least Squares (OLS) solution is usually preferred in practice.
  3. For underdetermined systems (https://en.wikipedia.org/wiki/Underdetermined_system), there are infinitely many solutions. We usually compute the solution with minimal 2-norm ($\min\|X\|_2$).
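The three cases can be illustrated with NumPy (a sketch: `np.linalg.solve` for Case 1, `np.linalg.lstsq` for Cases 2 and 3; the matrices are made up for illustration):

```python
import numpy as np

# Case 1: invertible square matrix -> unique exact solution
A1 = np.array([[2.0, 1.0], [1.0, 3.0]])
b1 = np.array([3.0, 5.0])
x1 = np.linalg.solve(A1, b1)

# Case 2: overdetermined (tall, full column rank) -> OLS solution
A2 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b2 = np.array([2.0, 1.0, 3.02])
x2, *_ = np.linalg.lstsq(A2, b2, rcond=None)

# Case 3: underdetermined (wide, full row rank) -> minimum 2-norm solution
A3 = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
b3 = np.array([2.0, 1.0])
x3, *_ = np.linalg.lstsq(A3, b3, rcond=None)
```

Note that `lstsq` covers Cases 2 and 3 with a single call because it returns the minimum-norm least-squares solution in all cases.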

You did pick a special example, and it doesn't belong to any of those three cases; it actually combines Cases 2 and 3. First, the system is overdetermined: there is no $X$ that solves all three equations simultaneously, so OLS solutions are preferred. At the same time, no matter what value $x_1$ takes, it has no impact on the system, so there is more than one OLS solution. The best solution in your case therefore combines the criteria of Cases 2 and 3, i.e. an OLS solution with $x_1=0$. Note that this solution is UNIQUE under the combined, commonsense criteria.
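For your concrete example, the pseudoinverse computes exactly this combined solution: the minimum-norm least-squares solution automatically sets $x_1=0$ because the first column of $A$ is zero (a sketch, assuming NumPy):

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
B = np.array([[2.0, 3.0],
              [1.0, 2.0],
              [3.02, 5.05]])

# Minimum-norm least-squares solution: OLS among all solutions,
# with the free component x_1 forced to zero by the 2-norm criterion
X = np.linalg.pinv(A) @ B
```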

Though relatively straightforward in theory, I don't think any package or algorithm in practice can universally determine which case you are in, or which combination, especially if the coefficient matrix is sparse. Also, the SVD just gives you one type of "solution" following its own criteria. For example, the solution you get via the SVD for an overdetermined system still does NOT satisfy all the original equations; it only satisfies a new system transformed from the original one, and the old and new systems might not be equivalent.

In other words, our goal is to satisfy all the original equations if possible. If no such solution exists, or if there is more than one, we modify our criteria and/or our equations so that we still have one unique "optimal" solution with respect to the new system and/or criteria. The SVD is just one way to modify the problem.

Added on Sep 20: more notes on how the QR-decomposition and LQ-decomposition approaches differ.

Assume $A$ is an $m\times n$ matrix.

Let's first discuss the case of overdetermined systems (so $m>n$). Note that the term "overdetermined" here means that there is NO $X$ such that $AX=Y$. For simplicity, I assume here that $A$ has a full column rank.

  1. The solution from the QR decomposition ($A=QR$), say $X_R$, satisfies $AX_R=\operatorname{Proj}_A(Y)$ and $A^T(Y-AX_R)=0$: it maps $Y$ to its orthogonal projection onto the subspace spanned by the columns of $A$. You can see that such an $X_R$ is unique. Note that $R=(R_1^T,0)^T$, where $R_1$ is a nonsingular square upper triangular matrix.
  2. The situation is subtle if you put the orthogonal matrix $Q$ on the right, $A=LQ$. First, to compute the LQ decomposition you might need to apply an extra permutation matrix $P$ to $A$ from the left, i.e. $PA=LQ$; otherwise the desired decomposition $A=LQ$ might not exist. Again, for simplicity, assume $P=I$. Next, the shape of $L$ is where the subtlety comes from. Putting the lower triangular part on top, as in the QR decomposition, we write $L=(L_1^T,M^T)^T$, where $L_1$ is a nonsingular square lower triangular matrix. (The nonsingularity of $L_1$, like that of $R_1$ above, follows from $A$ having full column rank.) Is $M=0$? NO: if $M=0$, you could solve $LQX=Y$ directly, hence $AX=Y$ directly, contradicting the "overdetermined" assumption. In practice, one simply drops $M$, modifies $Y$, and instead solves $L_{new} QX=Y_{new}$, where $L_{new}=(L_1^T,0)^T$ and $Y_{new}=(Y_1^T,0)^T$. In other words, we drop all equations corresponding to rows of $M$.
  3. You can see that, in order to solve the original overdetermined system, we manually modify it or add conditions, and the conditions created by the QR and LQ approaches are different. If no permutation matrix $P$ is needed, the QR approach projects $Y$ onto the column space of $A$, while the LQ approach projects $Y$ onto the subspace spanned by $\{e_1,\dots,e_n\}$, where $e_i=(0,\dots,0,1,0,\dots,0)$ is the $i$-th standard basis vector. If a permutation $P$ is required it gets more complicated, but it is still clear that the QR and LQ approaches impose two different sets of criteria, so we'd expect their solutions to differ too.
  4. It'd be messier if $A$ doesn't have full column rank. Your example falls in this case, and in addition there is NO $X$ satisfying all the equations.
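To see points 1-3 numerically, here is a small full-column-rank example comparing the QR least-squares solution with the equation-dropping solution of the LQ approach (a sketch, assuming SciPy and $P=I$; with $P=I$, dropping the rows of $M$ amounts to solving the first $n$ equations exactly):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # tall, full column rank
y = np.array([2.0, 1.0, 3.02])

# QR approach: A = QR, solve R x = Q^T y -> orthogonal projection of y
Q, R = qr(A, mode="economic")
x_qr = solve_triangular(R, Q.T @ y, lower=False)

# LQ-style approach with P = I: dropping the rows of M is the same as
# keeping only the first n equations and solving them exactly
x_lq = np.linalg.solve(A[:2, :], y[:2])
```

The two vectors differ, as expected from the two different sets of criteria.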

For underdetermined systems, the QR and LQ decompositions interchange their roles. Also, instead of projecting $Y$ onto the subspace spanned by the columns of $A$, you'd look at the projection of $X$ onto the subspace spanned by the rows of $A$. You can work this out yourself.
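For completeness, here is the wide/underdetermined counterpart: computing the LQ factorization via a QR of $A^T$, forward-substituting, and lifting back, which returns the minimum-norm solution (a sketch, assuming SciPy and full row rank):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])  # wide, full row rank
b = np.array([2.0, 1.0])

# LQ of A via QR of A^T:  A^T = Qt Rt  =>  A = L Q with L = Rt^T, Q = Qt^T
Qt, Rt = qr(A.T, mode="economic")
L = Rt.T
# Forward substitution L z = b, then lift back: x = Q^T z = Qt z
z = solve_triangular(L, b, lower=True)
x = Qt @ z   # lies in the row space of A -> minimum 2-norm
```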

One more question you may ask is when the QR and LQ approaches give the same solution. They do if $A$ is an invertible square matrix, or if we transform the original equations into such a problem. Otherwise, the system is "overdetermined" (the most common case in practice) or "underdetermined", though some might count a combination of both as a third case.

As pointed out above, your problem is an "overdetermined" one. At the same time, the first element of $X$ can be anything without affecting the equations. That behavior resembles an "underdetermined" system, which might tempt some people to apply the LQ decomposition.

Hope it helps.