Prove $\text{rank}(S)\le k-1$ where $S=\sum_{i=1}^{k}\alpha_iu_iu_i^T$ and $\sum_{i=1}^{k}u_i=0$


I've followed this procedure to solve the problem so far:

We know that $$\text{rank}(\alpha_iu_iu_i^T)\le 1\qquad\forall i=1,2,\cdots,k$$
(with equality when $\alpha_i\neq 0$ and $u_i\neq 0$). Since $$\operatorname{rank}(A+B) \leq \operatorname{rank}(A) + \operatorname{rank}(B)$$
we can write: $$\text{rank}(S)\le\sum_{i=1}^{k}\underbrace{\text{rank}(\alpha_iu_iu_i^T)}_{\le\,1}\le k$$
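As a quick numerical sanity check (a sketch with made-up dimensions, using NumPy), the subadditivity bound $\text{rank}(S)\le k$ is tight when the $u_i$ are unconstrained:

```python
import numpy as np

rng = np.random.default_rng(0)
p, k = 5, 3

# k rank-one terms alpha_i * u_i u_i^T, with no constraint tying the u_i together
alphas = rng.uniform(1, 10, size=k)
us = rng.standard_normal((k, p))
S = sum(a * np.outer(u, u) for a, u in zip(alphas, us))

# without the sum-to-zero constraint, rank(S) generically reaches k
print(np.linalg.matrix_rank(S))  # generically k = 3
```

So the improvement to $k-1$ must come from the constraint $\sum_i u_i = 0$, not from subadditivity alone.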
But the book I'm reading (Statistical Pattern Recognition, 2002 edition, page 146) says the maximum rank is $k-1$. (The context is Linear Discriminant Analysis based on Fisher's criterion $J_F(A)=\frac{A^TS_bA}{A^TS_wA}$ for a multi-class classification problem.)

  • Why should the maximum rank be $k-1$ and not $k$?

Maybe I should explain more about $\alpha_i$ and $u_i$ in the concept:

Suppose that we have $n$ samples and $p$ features for each sample in our data, and the data can be classified into $k$ classes:
$$C_1,C_2,\cdots ,C_j ,\cdots , C_k$$
If we assume the number of samples in class $C_j$ is $n_j$, then we can define: $$\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i\in\mathbb{R}^{p\times 1}\qquad\text{Mean of the whole data}$$
$$\bar{x}_j=\frac{1}{n_j}\sum_{x\in C_j}x\in\mathbb{R}^{p\times 1}\qquad\text{Mean of the data in class $C_j$}$$
Now we can define: $$S_b=\sum_{j=1}^{k}n_j(\bar{x}_j-\bar{x})(\bar{x}_j-\bar{x})^T\in\mathbb{R}^{p\times p}$$
I want to prove that $$\text{rank}(S_b)\le k-1$$
At the beginning of my question, I assumed $$u_j=\bar{x}_j-\bar{x}$$ and $$\alpha_j=n_j\qquad\text{the number of samples in the $j^{\text{th}}$ class}$$
and I've tried to solve the problem.


Some further thoughts I've had about the problem:
Since $$n\bar{x}=\sum_{i=1}^{n}x_i=\sum_{j=1}^{k}\sum_{x\in C_j}x=\sum_{j=1}^{k}n_j\bar{x}_j$$
we can say that $$\sum_{j=1}^k n_ju_j=\sum_{j=1}^k n_j(\bar{x}_j-\bar{x})=\sum_{j=1}^k n_j\bar{x}_j-\Big(\sum_{j=1}^k n_j\Big)\bar{x}=n\bar{x}-n\bar{x}=0$$
(note that the unweighted identity $\sum_{j=1}^k\bar{x}_j=k\bar{x}$ holds only when all classes have the same size, but the weighted identity above always holds). Since every $n_j>0$, this means the $u_j$'s are not linearly independent, and so the dimension of the subspace spanned by these $k$ vectors is less than $k$.
This convinces me that $$\text{rank}(S_b)\le k-1$$ but I'm still looking for a straightforward mathematical proof.
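The claim can also be checked numerically. Here is a minimal sketch (with arbitrary made-up class sizes and dimensions) that builds $S_b$ from the definitions above and verifies both the weighted zero-sum identity and the rank bound:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 6, 4
sizes = [10, 20, 30, 40]                      # unequal class sizes n_j

# random samples per class, each class shifted by its own random mean
classes = [rng.standard_normal((n, p)) + 5 * rng.standard_normal(p) for n in sizes]
x_bar = np.vstack(classes).mean(axis=0)       # overall mean

# the weighted deviations n_j (xbar_j - xbar) always sum to zero
devs = [n * (X.mean(axis=0) - x_bar) for n, X in zip(sizes, classes)]
assert np.allclose(np.sum(devs, axis=0), 0)

# S_b = sum_j n_j (xbar_j - xbar)(xbar_j - xbar)^T
S_b = sum(n * np.outer(X.mean(axis=0) - x_bar, X.mean(axis=0) - x_bar)
          for n, X in zip(sizes, classes))
print(np.linalg.matrix_rank(S_b))             # at most k - 1 = 3
```

Even though the class sizes are unequal here, the rank of $S_b$ never exceeds $k-1$.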

Best answer:

You can see directly that every vector in the range of $S$ belongs to the span of $\{u_1,\ldots,u_k\}$. But the set of vectors $\{u_1,\ldots,u_k\}$ is linearly dependent, because the vectors $u_i$ sum to $0$, so the span of $\{u_1, \ldots, u_k\}$ has dimension at most $k - 1$. Hence, the dimension of the range of $S$ is at most $k-1$.


Here's the original proof I wrote:

Let $U = \begin{bmatrix} u_1 & \cdots & u_k \end{bmatrix}$ and let $D = \begin{bmatrix} \alpha_1 & & \\ & \ddots & \\ & & \alpha_k \end{bmatrix}$, and notice that $$ S = \sum_i \alpha_i u_i u_i^T = U D U^T. $$ Clearly, $\text{rank}(S) \leq \text{rank}(U)$. The null space of $U$ has dimension at least $1$, because $$U \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} = \sum_i u_i = 0. $$ By the rank-nullity theorem, the rank of $U$ is at most $k-1$.
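This factorization argument is easy to check numerically. A minimal sketch (random dimensions, NumPy): construct a $U$ whose columns sum to zero, form $S = UDU^T$, and confirm that both ranks are at most $k-1$:

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 5, 4

# build U whose columns sum to zero: pick k-1 free columns,
# then set the last column to minus their sum
U = rng.standard_normal((p, k - 1))
U = np.hstack([U, -U.sum(axis=1, keepdims=True)])
assert np.allclose(U @ np.ones(k), 0)         # all-ones vector is in the null space

alphas = rng.uniform(1, 10, size=k)
S = U @ np.diag(alphas) @ U.T                 # same as sum_i alpha_i u_i u_i^T

# by rank-nullity, rank(U) <= k - 1, and rank(S) <= rank(U)
print(np.linalg.matrix_rank(U), np.linalg.matrix_rank(S))
```

The all-ones null vector is exactly the rank-nullity step in the proof: $U\mathbf{1} = \sum_i u_i = 0$ forces $\text{rank}(U)\le k-1$.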