Determining consistency of a general overdetermined linear system

304 Views Asked by At

For $m > 2$, consider the $m \times 2$ (overdetermined) linear system $$A \mathbf{x} = \mathbf{b}$$ with (general) coefficients in a field $\mathbb{F}$; in components we write the system as $$\left(\begin{array}{cc}a_{11} & a_{12} \\ \vdots & \vdots \\ a_{m1} & a_{m2} \end{array}\right) \left(\begin{array}{c}x_1 \\ x_2\end{array}\right) = \left(\begin{array}{c}b_1 \\ \vdots \\ b_m\end{array}\right),$$ where $m > 2$, so that the system is overdetermined.

If $m = 3$ and the system is consistent, (equivalently, $\mathbf{b}$ is in the column space of $A$), then the columns of the augmented matrix $\pmatrix{ A \mid {\bf b}}$ are linearly dependent, and so $$\det \pmatrix{ A \mid {\bf b}} = 0.$$ In particular, we have produced a polynomial in the components $(a_{ij}, b_j)$ of the linear system for which vanishing is a necessary condition for system's consistency. I'll call such polynomials polynomial obstructions for the system.

If $m > 3$, then we can produce ${m}\choose{3}$ such polynomials by considering the determinants of the $3 \times 3$ minors of $\pmatrix{ A \mid {\bf b}}$.

Are essentially all polynomials obstructions to the system essentially given by these, or are there others? Put more precisely: By definition the polynomial obstructions comprise an ideal in the polynomial ring $\mathbb{F}[a_{11}, \ldots, a_{m2}, b_1, \ldots b_m]$---do the determinants of the $3 \times 3$ minors generate this ideal? If not, how does one produce a complete set of generators?

More generally, for an $m \times n$ overdetermined linear system (so that $m > n$) $$A \mathbf{x} = \mathbf{b},$$ we can produce polynomial obstructions by taking the determinants of the ${m}\choose{n+1}$ minors (of size $(n + 1) \times (n + 1)$). What are the answers to the obvious analogues to the above questions in the $n = 2$ case?

2

There are 2 best solutions below

5
On BEST ANSWER

When $m=3$ and $\mathbb F$ is infinite, there are no other obstructions besides the determinant. When $\mathbb F$ is finite, there are many others : for example if we put $\chi_{\mathbb F}(X)=\prod_{t\in {\mathbb F}^*} (X-t)$, $\chi(t)$ is zero iff $t$ is nonzero, so that the following $n$ polynomials are all obstructions :

$$ w_i=b_i\prod_{j=1}^n\chi_{\mathbb F}(a_{ij}) $$

For the infinite case, one can use the following lemma :

Generalized Euclidean division. Let $A$ and $B$ be two polynomials in ${\mathbb F}[X_1,X_2,\ldots,X_n,Y]$. Let $a={\sf deg}_Y(A)$, $b={\sf deg}_Y(B)$, and let $L$ be the leading coefficient of $B$ with respect to $Y$ (so that $L\in{\mathbb F}[X_1,X_2,\ldots,X_n]$ and $B-LY^b$ has degree $<b$ in $Y$). Then if $a \geq b$, there are two polynomials $Q,R\in {\mathbb F}[X_1,X_2,\ldots,X_n,Y]$ such that $L^{a-b+1}A=QB+R$ and ${\sf deg}_Y(R)<b$.

Proof. Let ${\mathbb K}={\mathbb F}(X_1,X_2,\ldots,X_n)$. We can view $A$ and $B$ as members of ${\mathbb K}[Y]$, and perform ordinary euclidian division ; this yields $Q^{\sharp},R^{\sharp}\in {\mathbb K}[Y]$ such that $A=Q^{\sharp}B+R^{\sharp}$. Since the division process involves $a-b+1$ divisions by $L$, we see that $Q^{\sharp}$ and $R^{\sharp}$ are of the form $\frac{Q}{L^{a-b+1}}$ and $\frac{R}{L^{a-b+1}}$ with $Q,R\in {\mathbb F}[X_1,X_2,\ldots,X_n,Y]$. This concludes the proof of the lemma.

Let us now explain how this can be used when $m=3$. Let $I$ be the ideal (in the ring ${\mathfrak R}={\mathbb F}(A_{11},A_{12},A_{13},A_{21},A_{22},A_{23},B_1,B_2,B_3)$ of all obstructions. In particular, the determinant

$$ \Delta=(A_{12}A_{23}-A_{13}A_{22})B_1+ (A_{13}A_{21}-A_{11}A_{23})B_2+ (A_{11}A_{22}-A_{12}A_{21})B_3 \tag{1} $$

is a member of $I$. Let $P\in I$, and let $p={\sf deg}_{B_3}(P)$. By the generalized Euclidean division property above, there are polynomials in $Q,R$ in $\mathfrak R$ such that $(A_{11}A_{22}-A_{12}A_{21})^p P=\Delta Q+R$, such that $R$ does not contain the variable $B_3$ (note that we need $p\geq 1$ in order to apply the lemma ; but if $p=0$, we can simply take $Q=0,R=P$). Then $R\in I$. Consider the set

$$ W=\bigg\lbrace (a_{11},a_{12},a_{13},a_{21},a_{22},a_{23},b_1,b_2) \in {\mathbb F}^{8} \ \bigg| \ a_{11}a_{22}-a_{12}a_{21} \neq 0\bigg\rbrace \tag{2} $$

Since $\mathbb F$ is infinite, $W$ is a Zariski-dense open subset of ${\mathbb F}^{8}$. We have a natural map $\phi : W \to V(I)$, defined by

$$ \phi(a_{11},a_{12},a_{13},a_{21},a_{22},a_{23},b_1,b_2)= \bigg(a_{11},a_{12},a_{13},a_{21},a_{22},a_{23},b_1,b_2, -\frac{(a_{12}a_{23}-a_{13}a_{22})b_1+ (a_{13}a_{21}-a_{11}a_{23})b_2}{a_{11}a_{22}-a_{12}a_{21}}\bigg) \tag{3} $$

For any $w\in W$, we have $R(\phi(w))=0$ since $R\in I$. We deduce $R(w)=0$ for all $w\in W$. Since $W$ is Zariski-dense, $R$ is zero eveywhere. So $R$ must be the zero polynomial, $(A_{11}A_{22}-A_{12}A_{21})^p P=\Delta Q$. Since $A_{11}A_{22}-A_{12}A_{21}$ and $\Delta$ have no common factors, we see that $\Delta$ divides $P$.

1
On

We assume that $b\not=0$ and that $\mathbb{F}$ is an infinite field. The required condition is $rank(A)=rank(A|b)$ or equivalently $rank(A|b)\leq rank(A)$. If $A$ is a generic matrix, then $rank(A)=n$ and your conditions are sufficient because they are equivalent to $rank(A|b)\leq n=rank(A)$. Yet, if $A$ is not generic and $rank(A)=r<n$, then it is not sufficient -for instance assume that the first column of $A$ is zero-. You must extract from $A$ a $r\times r$ invertible submatrix $A'$; we can assume that $A'$ is formed using the first $r$ columns and rows of $A$. After you construct, for $r<i\leq m$, the $(r+1)\times (r+1)$ matrices $U_i=\begin{pmatrix}A'&[b_1,\cdots,b_r]^T\\L_i&b_i\end{pmatrix}$ where $L_i=[a_{i,1},\cdots,a_{i,r}]$ and you say that their determinants are zero. Of course, if you know $r$ and not $A'$, then you must consider all the $(r+1)\times (r+1)$ matrices, extracted from $(A|b)$, containing in the last column a part of the vector $b$. If you do not know $r$, that is if $A$ is a formal matrix, then $r=0$ or $r=1$ or ...$r=n$ and you apply that is said above.