Difference between the eigenvalues of an $n \times n$ matrix $D$ and its "centered" version $DH_n$

137 Views Asked by At

Let $1_n$ denote the column vector of all $1$'s. Let $H_n := I_n - \frac{1}{n}1_n1_n'$ denote the centering matrix. We know that $H_n$ has eigenvalues $0$ (with multiplicity $n$) and $1$ with multiplicity $1$. We also know that: $H_n^2 = H_n$.

Let $D$ be a symmetric $n \times n$ matrix. I'm interested in knowing what can we say about the difference between the eigenvalues of $D$ and $DH_n$. See its properties here. So the type of questions I'm interested in are:

(1) What's a bound on the operator norm $||D - DH_n||_{op} ?$

(2) What's a bound on the operator norm $||D - H_nDH_n||_{op} ?$ ($H_nDH_n$ is often called double centering in statistics/machine learning literature.)

(3) What's the maximal difference of their eigenvalues: i.e. what is a bound on:

$max _{1 \le i \le n}|\lambda_i(D) - \lambda_{\sigma(i)}(DH_n)|, \sigma $ is a permutation of indices $\{1,2,...n\}.$

(4) What's the maximal difference of their eigenvalues: i.e. what is a bound on:

$max _{1 \le i \le n}|\lambda_i(D) - \lambda_{\sigma(i)}(DH_n)|, \sigma $ is a permutation of indices $\{1,2,...n\}.$

1

There are 1 best solutions below

0
On

I'm not sure if you're looking for bounds uniform in $D$ or not, but some easily derived properties that take a stab at your first two questions are as follows: We have that $D-DH_n = D-D(I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top) = \frac{1}{n}D\mathbf{1}\mathbf{1}^\top$. Hence, an immediate bound for (1) is \begin{equation*} \|D-DH_n\| = \frac{1}{n}\|D\mathbf{1}\mathbf{1}^\top\| \le \frac{1}{n}\|D\|\|\mathbf{1}\mathbf{1}^\top\| = \|D\|. \end{equation*} For (2), note that \begin{align*} D-H_nDH_n ={}& D-(I-\frac{1}{n}\mathbf{1}\mathbf{1}^\top)D(I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top) \\ ={}& D - (I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top)(D-\frac{1}{n}D\mathbf{1}\mathbf{1}^\top) \\ ={}& D - D + \frac{1}{n}D\mathbf{1}\mathbf{1}^\top + \frac{1}{n}\mathbf{1}\mathbf{1}^\top D - \frac{1}{n^2}\mathbf{1}\mathbf{1}^\top \mathbf{1}\mathbf{1}^\top \\ ={}& \frac{1}{n}(D\mathbf{1}\mathbf{1}^\top + \mathbf{1}\mathbf{1}^\top D - \mathbf{1}\mathbf{1}^\top). \end{align*} Using the bound for (1) and the triangle inequality gives \begin{equation*} \|D-H_nDH_n\| \le \frac{1}{n}(2\|D\| + \|\mathbf{1}\mathbf{1}^\top\|) = \frac{2}{n}\|D\| + 1. \end{equation*}

Also note that (1) can be simplified without introducing bounds: \begin{align*} \|D-DH_n\| ={}& \frac{1}{n}\|D\mathbf{1}\mathbf{1}^\top\| \\ ={}& \frac{1}{n}\sqrt{\lambda_\text{max}(\mathbf{1}\mathbf{1}^\top D^\top D\mathbf{1}\mathbf{1}^\top)} \\ ={}& \frac{\|D\mathbf{1}\|_2}{n}\sqrt{\lambda_\text{max}(\mathbf{1}\mathbf{1}^\top)} \\ ={}& \frac{\|D\mathbf{1}\|_2}{\sqrt{n}}. \end{align*}