Invertible $X^TX$ - what happens when you clone rows of $X$?

48 Views Asked by Bumbble Comm At 16 May 2026 - 5:24

My question is inspired by https://stats.stackexchange.com/questions/70899/what-correlation-makes-a-matrix-singular-and-what-are-implications-of-singularit, in particular ttnphns's answer where they state "Also, duplicating observations in a dataset will lead the matrix towards singularity. The more times you clone a case the closer is singularity."

Consider the matrix $X \in \mathbb{R}^{n \times p}$, where $n$ is the number of observations. Assume that $X$ has full column rank, which requires $n \geq p$. By "duplicating an observation," I mean taking some row of $X$ and appending it to $X$, creating $X' \in \mathbb{R}^{(n + 1) \times p}$. The linked answer suggests that the more you do this, the closer $X'^TX'$ gets to singularity.

It is not intuitive to me why this would happen. You'd create linear dependency in the rows as you duplicate rows, but this shouldn't affect the column rank at all, and therefore the invertibility of $X^TX$ should not be affected?


Bumbble Comm On 30 Jun 2020 - 12:24

Imagine you add a row $y^\top$ to the matrix $X$ $n$ times (here $n$ counts the added copies, not the rows of $X$). Call this new matrix $X'$. Then

$$ (X')^\top X' = \begin{bmatrix} X^\top & y & \cdots & y \end{bmatrix}\begin{bmatrix} X \\ y^\top \\ \vdots \\ y^\top \end{bmatrix} = X^\top X + nyy^\top. $$

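By picking $n$ big enough, $X^\top X$ will be tiny relative to $nyy^\top$, so $(X')^\top X'$ will be nearly equal, in a relative sense, to the rank $1$ matrix $nyy^\top$ and thus nearly singular (for $p \geq 2$). Note that $(X')^\top X'$ remains invertible for every finite $n$, since it is the sum of the positive definite matrix $X^\top X$ and a positive semidefinite matrix; it is the conditioning that degrades, not the rank.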
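A small numerical sketch may make this concrete. The toy matrix `X`, the choice of cloned row, and the helper `gram_after_cloning` below are assumptions made for illustration and are not part of the original answer. The code clones one row an increasing number of times, checks the rank-one update identity, and records the rank and condition number of $(X')^\top X'$.

```python
import numpy as np

# Toy design matrix (an assumption for illustration): 4 observations, p = 3.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
y = X[-1]  # the observation to clone

def gram_after_cloning(X, y, copies):
    """Return (X')^T X', where X' is X with `copies` extra copies of row y."""
    X_prime = np.vstack([X, np.tile(y, (copies, 1))])
    return X_prime.T @ X_prime

results = {}
for copies in (0, 10, 1_000, 100_000):
    G = gram_after_cloning(X, y, copies)
    # Check the rank-one update identity: (X')^T X' = X^T X + n y y^T.
    assert np.allclose(G, X.T @ X + copies * np.outer(y, y))
    # Store (rank, 2-norm condition number) for this number of clones.
    results[copies] = (np.linalg.matrix_rank(G), np.linalg.cond(G))
    print(copies, *results[copies])
```

For this particular `X`, the eigenvalues of $(X')^\top X'$ are $1$, $1$, and $4 + 3n$, so the rank stays at $3$ while the condition number grows linearly in the number of clones. That matches both the asker's intuition (the rank never drops) and the answer's claim (the matrix becomes arbitrarily ill-conditioned).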

At the heart of your question is a question about what it means for a matrix to be "close to singularity". After all, singularity is a binary notion: a matrix is singular or it is not. What does it mean for a matrix to be close to singularity? Let's use the following loose definition:

A matrix $A$ is close to singularity if there exists a "small" perturbation matrix $E$ such that $A+E$ is singular.

For a matrix to be "small", we need a measure of how large a matrix is, and this is furnished by choosing any suitable matrix norm $\|\cdot\|$. Then a matrix $E$ is small relative to $A$ if $\|E\| \ll \|A\|$. One can see that adding one repeated data entry $y^\top$ a large number $n$ times makes $(X')^\top X'$ close to singularity: adding the small perturbation matrix $E = -X^\top X$ to $A = (X')^\top X'$ gives $A+E = nyy^\top$, a rank-one and thus singular matrix.
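The relative size of this perturbation can be checked numerically. The sketch below is illustrative only: the toy matrix `X`, the cloned row `y`, and the choice of the spectral norm are assumptions, not something fixed by the answer. It computes $\|E\| / \|A\|$ for increasing $n$ and confirms that $A + E$ has rank one.

```python
import numpy as np

# Toy design matrix (an assumption for illustration): 4 observations, p = 3.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
y = X[-1]  # the observation that gets cloned n times

ratios = {}
ranks = {}
for n in (1, 100, 10_000):
    A = X.T @ X + n * np.outer(y, y)  # (X')^T X' after n clones of y
    E = -(X.T @ X)                    # the perturbation used in the answer
    # Relative size of the perturbation in the spectral (2-) norm.
    ratios[n] = np.linalg.norm(E, 2) / np.linalg.norm(A, 2)
    # A + E equals n * y y^T, which should have rank one.
    ranks[n] = np.linalg.matrix_rank(A + E)
    print(n, ratios[n], ranks[n])
```

For this `X`, the ratio works out to $4 / (4 + 3n)$, so it shrinks toward zero as $n$ grows while $A + E$ stays exactly rank one.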

This explanation considers only a single data entry added multiple times, but a similar argument works with more than one repeated data entry.
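As a sketch of that extension (the symbols $y_1, \dots, y_k$ and $n_1, \dots, n_k$ are introduced here for illustration and do not appear in the original answer): if $k$ distinct rows $y_i^\top$ are cloned $n_i$ times each, the same block computation gives

$$ (X')^\top X' = X^\top X + \sum_{i=1}^{k} n_i\, y_i y_i^\top. $$

The sum has rank at most $k$, so when $k < p$ and the counts $n_i$ are large, $(X')^\top X'$ is again a small relative perturbation of a singular matrix. If the cloned rows span $\mathbb{R}^p$, the dominant term need not be singular and the argument does not apply as stated.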
