Properties of orthogonal similarity transformations


I have a general question (1) and a more specific problem (2) with respect to orthogonal similarity transformations.

(1) A similarity transformation $\mathbf{A} = \mathbf{S}^{-1} \mathbf{B} \mathbf{S}$ preserves eigenvalues, trace, determinant, rank, etc. (The Frobenius norm is not preserved by a general $\mathbf{S}$, but it is preserved when $\mathbf{S}$ is orthogonal.) Are there additional invariances or other statements for an orthogonal similarity transformation, where $\mathbf{S}$ is an orthogonal matrix ($\mathbf{S}^{-1} = \mathbf{S}^T$)?
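As a quick sanity check on which quantities survive which kind of transformation, here is a minimal sketch in plain Python. The $2 \times 2$ matrices `B`, `S`, `R` and all helper names are my own illustrative choices, not part of any library. It compares a shear (a general similarity) with a rotation (an orthogonal similarity): trace and determinant should agree with those of `B` in both cases, while the Frobenius norm should agree only for the rotation.

```python
import math

def matmul(X, Y):
    """Plain-Python product of two small dense matrices (lists of rows)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def frobenius(X):
    return math.sqrt(sum(x * x for row in X for x in row))

# Hypothetical example matrix.
B = [[1.0, 2.0], [3.0, 4.0]]

# General (non-orthogonal) similarity: S is a shear, S_inv its exact inverse.
S = [[1.0, 1.0], [0.0, 1.0]]
S_inv = [[1.0, -1.0], [0.0, 1.0]]
A_general = matmul(S_inv, matmul(B, S))

# Orthogonal similarity: R is a rotation, so R^{-1} = R^T.
theta = 0.7
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
A_orth = matmul(transpose(R), matmul(B, R))

# Print trace, determinant and Frobenius norm for each case.
print("reference ", trace(B), det2(B), frobenius(B))
for name, A in [("general   ", A_general), ("orthogonal", A_orth)]:
    print(name, trace(A), det2(A), frobenius(A))
```

Other properties that I believe are specific to the orthogonal case, such as preservation of symmetry and of singular values, could be checked the same way.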

(2) I'm particularly interested in the orthogonal similarity transformation of diagonal matrices: $\mathbf{A} = \mathbf{Q}^T \mathbf{D} \mathbf{Q}$, where $\mathbf{Q}$ is an orthogonal $n \times n$ matrix with column vectors $\mathbf{q}_i$ and $\mathbf{D}$ is an $n \times n$ diagonal matrix with pairwise distinct, positive, sorted entries $d_1 > \ldots > d_n > 0$. Note that $A_{ii} = \mathbf{q}_i^T \mathbf{D} \mathbf{q}_i$. I have several problems like this one:

Determine $\mathbf{Q}$ such that the partial trace $t = \sum_{i=1}^{m} A_{ii}$ is maximal, given $m < n$.

Supposedly one can approach this problem by combining three ingredients: the invariance of the trace (a property of any similarity transformation), the Rayleigh-Ritz theorem, which states that $\mathbf{q}_i^T \mathbf{D} \mathbf{q}_i \in [d_n, d_1]$ (this exploits only the fact that $\|\mathbf{q}_i\| = 1$, not the orthogonality of $\mathbf{Q}$), and some property specific to the orthogonal similarity transformation.
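To make the Rayleigh-Ritz bound concrete, here is a small sketch in plain Python, with hypothetical values $d = (6, 5, 4, 3, 2, 1)$ and $m = 3$ chosen purely for illustration. It samples random unit vectors and evaluates $\mathbf{q}^T \mathbf{D} \mathbf{q} = \sum_k d_k q_k^2$. It also shows why the bound alone is too weak here: applying it to each of the $m$ terms separately only gives $t \le m\, d_1$, which is larger than the conjectured maximum $\sum_{i=1}^{m} d_i$.

```python
import math
import random

def rayleigh_diag(d, q):
    """q^T D q for a diagonal D with diagonal entries d."""
    return sum(dk * qk * qk for dk, qk in zip(d, q))

rng = random.Random(0)                 # fixed seed for reproducibility
d = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]     # hypothetical d_1 > ... > d_n > 0
m = 3

values = []
for _ in range(1000):
    v = [rng.gauss(0.0, 1.0) for _ in d]
    norm = math.sqrt(sum(x * x for x in v))
    q = [x / norm for x in v]          # random unit vector
    values.append(rayleigh_diag(d, q))

naive_bound = m * d[0]                 # from Rayleigh-Ritz applied termwise
conjectured_max = sum(d[:m])           # sum of the m largest entries

print("min / max Rayleigh quotient:", min(values), max(values))
print("termwise bound:", naive_bound, "conjectured maximum:", conjectured_max)
```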

I would expect the solution to be a block-diagonal $\mathbf{Q}$ with an orthogonal $m \times m$ upper block $\hat{\mathbf{Q}}$ (with columns $\hat{\mathbf{q}}_i$) and an orthogonal $(n-m) \times (n-m)$ lower block $\check{\mathbf{Q}}$. This would give $t = \sum_{i=1}^{m} \hat{\mathbf{q}}_i^T \hat{\mathbf{D}} \hat{\mathbf{q}}_i = \textrm{tr}\{ \hat{\mathbf{Q}}^T \hat{\mathbf{D}} \hat{\mathbf{Q}}\} = \textrm{tr}\{\hat{\mathbf{D}}\} = \sum_{i=1}^{m} d_i$, where $\hat{\mathbf{D}}$ is the upper $m \times m$ block of $\mathbf{D}$. However, I could not prove that a $\mathbf{Q}$ that is not block-diagonal, and therefore mixes in the remaining $n-m$ entries of $\mathbf{D}$, cannot increase $t$ beyond the sum of the $m$ largest diagonal elements. For this, I suppose, one would need to exploit the orthogonality of $\mathbf{Q}$, which brings me back to question (1).
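A numerical experiment is of course not a proof, but it can at least probe the conjecture. The sketch below (plain Python; the values of `d` and `m` and all helper names are hypothetical) builds random orthogonal matrices by Gram-Schmidt, evaluates $t = \sum_{i=1}^{m} \sum_{k=1}^{n} d_k Q_{ki}^2$, and compares it with $\sum_{i=1}^{m} d_i$. It also records the weights $w_k = \sum_{i=1}^{m} Q_{ki}^2$, since $t = \sum_k d_k w_k$. If the rows of $\mathbf{Q}$ are orthonormal as well, these weights should lie in $[0, 1]$ and sum to $m$, which may be the place where orthogonality enters.

```python
import math
import random

def random_orthonormal_columns(n, rng):
    """Columns of a random orthogonal matrix via modified Gram-Schmidt."""
    cols = []
    while len(cols) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in cols:                        # remove components along earlier columns
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:                       # skip (nearly) dependent draws
            cols.append([a / norm for a in v])
    return cols

def partial_trace(cols, d, m):
    """t = sum_{i<=m} q_i^T D q_i for diagonal D with entries d."""
    return sum(sum(dk * qk * qk for dk, qk in zip(d, cols[i])) for i in range(m))

def row_weights(cols, m):
    """w_k = sum_{i<=m} Q_{ki}^2, so that t = sum_k d_k w_k."""
    return [sum(cols[i][k] ** 2 for i in range(m)) for k in range(len(cols))]

rng = random.Random(0)                        # fixed seed for reproducibility
d = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]            # hypothetical d_1 > ... > d_n > 0
m = 3
bound = sum(d[:m])                            # sum of the m largest entries

best, max_weight, max_sum_error = -math.inf, 0.0, 0.0
for _ in range(2000):
    cols = random_orthonormal_columns(len(d), rng)
    best = max(best, partial_trace(cols, d, m))
    w = row_weights(cols, m)
    max_weight = max(max_weight, max(w))
    max_sum_error = max(max_sum_error, abs(sum(w) - m))

# Q = I is block-diagonal and should attain the conjectured maximum.
identity_cols = [[1.0 if k == i else 0.0 for k in range(len(d))]
                 for i in range(len(d))]

print("largest t found:", best, "conjectured bound:", bound)
print("t for Q = I:", partial_trace(identity_cols, d, m))
print("largest weight:", max_weight, "max |sum(w) - m|:", max_sum_error)
```

Random sampling rarely gets close to the maximum, so a test like this can only falsify the conjecture, not confirm it.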

Is there maybe something like a generalized Rayleigh-Ritz theorem for orthogonal matrices instead of unit vectors?