Proof for a bound $\epsilon > m$ such that $A^TXA \preceq \epsilon X$, where $X \succ 0$.

318 Views Asked by At

I'm interested in the following problem:

Let $A,X$ be matrices in $\mathbb{R}^{n\times n}$ with $X$ symmetric and positive definite ($X \succ 0$). Proof a lower bound $0 < m < \epsilon$ so that $$ A^TXA \preceq \epsilon X.$$

As illustrated in an related question $m = \lambda_\max(A)^2$ is not sufficient even for $A \succeq 0$.

I came up with the following bound based on inequalities for the trace of matrix product:

If $ \epsilon > \lambda_\max(AA^T)\frac{\lambda_\max(X)}{\lambda_\min(X)},$ then $ A^TXA \preceq \epsilon X.$

The proof is given below.

Questions:

  • First, I'm happy if anyone could point out flaws in the proof below.
  • Are there better bounds for $\epsilon$ such that the matrix inequality holds?
  • Is there a simpler proof of the given solution?

Proof:

Let $\rho(\cdot)$ denote the spectral radius of a matrix and let $||\cdot||_F$ be the Frobenius norm. Then \begin{align} A^TXA \preceq \epsilon X &\iff \rho (A^TXA (\epsilon X)^{-1}) < 1 \\ &\iff \underset{k\rightarrow \infty}{\lim} ||(A^TXA (\epsilon X)^{-1})^k||_F = 0. \end{align}

We use the following inequality:

Let $A,B$ be matrices in $\mathbb{R}^{n\times n}$ with $B \succeq 0$. Then $$ \text{trace}(AB) \le \lambda_\max \left(\frac{A+A^T}{2}\right) \text{trace}(B) $$

We have \begin{align} ||(A^TXA (\epsilon X)^{-1})^k||_F &= \sqrt{ \text{trace} \left( (A^TXA (\epsilon X)^{-1})^k \right)^T \left( (A^TXA (\epsilon X)^{-1})^k \right)} \\ &= \frac{1}{\epsilon^k} \sqrt{ \text{trace} \left( X^{-1} A^TXA \ldots X^{-1} A^TXAA^T X A X^{-1} \ldots A^T X A X^{-1} \right)} \\ &= \frac{1}{\epsilon^k} \sqrt{ \text{trace} \left( X^{-2} \underbrace{A^TXA \ldots X^{-1} A^TXAA^T X A X^{-1} \ldots A^T X A}_{\text{positive semidefinte by definition}} \right)} \\ &\le \frac{1}{\epsilon^k} \sqrt{ \lambda_\min(X)^2 \text{trace} \left( AA^TXA \ldots X^{-1} A^TXAA^T X A X^{-1} \ldots A^T X \right)} \\ & \ \ \vdots \\ &\le \left(\frac{1}{\epsilon} \lambda_\max(AA^T) \frac{\lambda_\max(X)}{\lambda_\min(X)} \right)^k n \end{align}

  • The first equality applies the definition of the Frobenius norm.
  • In the second equality we use that $X$ is symmetric and so is $X^{-1}$.
  • In the third equality we apply the cyclic property of the trace and observe that the right part is positive semidefinite per construction.
  • The first inequality applies the trace inequality, uses the fact the eigenvalues of the square of a symmetric matrix are the eigenvalues of that matrix squared and that the eigenvalues of the inverse of a positive definite matrix are the reciprocals of the eigenvalues of that matrix.
  • For the last inequality we apply the proceeding approach successively for $X^{-2}, AA^T, X^2$, $A^TA$ and use that the eigenvalues of $AA^T$ and $A^TA$ are equal.

Hence if $\frac{1}{\epsilon} \lambda_\max(AA^T) \frac{\lambda_\max(X)}{\lambda_\min(X)} < 1$, then $\underset{k\rightarrow \infty}{\lim} ||(A^TXA (\epsilon X)^{-1})^k||_F = 0$.

1

There are 1 best solutions below

3
On BEST ANSWER

I haven't looked through your proof, but here is another way to obtain the same bound.

First, note that $A^TXA \preceq \epsilon X$ if and only if $\epsilon X - A^TXA \succeq 0$. Thus, it suffies to show that there exists $\epsilon > 0$ so that the eigenvalues of $\epsilon X - A^TXA$ are all non-negative.

Now note that, $$ \lambda_\min(\epsilon X - A^TXA) \geq \lambda_\min(\epsilon X) - \lambda_\max(A^TXA) \geq \epsilon \lambda_\min(X) - \sigma_\max(A)^2\lambda_\max(X) $$

Thus, if we pick $\epsilon \geq \|A\|^2 \kappa(X) = \lambda_\max(A^TA)\kappa(X)$ we have $A^TXA \preceq \epsilon X$ which is the same as the bound you obtained.

I will leave the proof of the first inequality up to you.

This is the best you can do using only the norms of $X,X^{-1},A$ and no information about the eigenvectors. For instance, let $X$ be any positive definite matrix. Then given any value for $\|A\|$ it is possible to pick $A$ so that this bound is tight. In particular, $A$ will swap the eigenvectors of $X$ corresponding to the smallest and largest eigenvales and scale them appropriately. Again, you should be able to show this based on the tightness of the first and second inequalities.