Different formulations of within-class scatter matrix

Suppose we have a dataset $X = \{x_1, x_2, \ldots, x_n\}$ whose points lie in a $d$-dimensional feature space, and two classes $c_1$ and $c_2$: $n_1$ points of $X$ belong to class $c_1$ and the remaining $n_2$ points belong to class $c_2$, so $n_1 + n_2 = n$. Each point is projected to $y_i = v^Tx_i$ for some vector $v$, and $y_i$ keeps the class label of $x_i$, so $n_1$ of the projected points belong to $c_1$ and the rest belong to $c_2$.
$m_1$ is the mean vector of class $c_1$ and $m_2$ is the mean vector of class $c_2$. $S_1$ and $S_2$ are the covariance matrices of classes $c_1$ and $c_2$.
In the projected space, $y_i = v^Tx_i$ for all $i = 1, 2, \ldots, n$. In this space, $\mu_1$ is the mean of class $c_1$ and $\mu_2$ is the mean of class $c_2$; $s_1$ and $s_2$ play the role of $S_1$ and $S_2$ (the covariances of classes $c_1$ and $c_2$).

I have to derive three things:
$1)$ The within-class scatter is: $(\mu_1 - \mu_2)^2 + \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$
$2)$ The within-class scatter can also be written as: $\frac{1}{n_1n_2}\sum_{y_i \in \text{class } c_1} \sum_{y_j \in \text{class } c_2} (y_i - y_j)^2$
(Here, $y_i \in \text{class } c_1$ means $y_i = v^Tx_i$ where the class label of $x_i$ is $c_1$, and $y_j \in \text{class } c_2$ means $y_j = v^Tx_j$ where the class label of $x_j$ is $c_2$.)
$3)$ The total scatter is: $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$
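
A quick numerical sanity check can help before attempting the algebra. The sketch below assumes that $s_k^2$ denotes the projected scatter $\sum_{y \in c_k}(y - \mu_k)^2$ (a sum, not an average); that convention is an assumption, and the sample values and function names are made up for illustration. Under that assumption, the expression in $1)$ and the pairwise expression in $2)$ should agree up to rounding.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def scatter(values):
    """Sum of squared deviations from the mean (not divided by n)."""
    m = mean(values)
    return sum((y - m) ** 2 for y in values)


def pairwise_form(y1, y2):
    """Expression 2): average squared difference over all cross-class pairs."""
    total = sum((a - b) ** 2 for a in y1 for b in y2)
    return total / (len(y1) * len(y2))


def mean_plus_scatter_form(y1, y2):
    """Expression 1), assuming s_k^2 is the scatter of class k."""
    return ((mean(y1) - mean(y2)) ** 2
            + scatter(y1) / len(y1)
            + scatter(y2) / len(y2))


# Made-up projected samples for the two classes (assumption, not real data).
y_class1 = [1.0, 2.0, 4.0]
y_class2 = [6.0, 7.0, 9.0, 10.0]

lhs = pairwise_form(y_class1, y_class2)
rhs = mean_plus_scatter_form(y_class1, y_class2)
print(lhs, rhs)
```

If $s_k^2$ is instead taken to be a variance (already divided by $n_k$), the two printed values will generally differ, which is one way to tell which convention the exercise intends.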

According to the Fisher linear discriminant:
A) within-class scatter: $S_w = \sum_{x_i \in c_1}(x_i - m_1)(x_i - m_1)^T + \sum_{x_i \in c_2}(x_i - m_2)(x_i - m_2)^T$
B) $\mu_1 = v^Tm_1$ and $\mu_2 = v^Tm_2$
C) $(n_1 s_1)^2 = v^T(n_1S_1)v$ and $(n_2 s_2)^2 = v^T(n_2S_2)v\;$ where $n_1S_1+n_2S_2 =S_w$
D) $v= S_w^{-1} (m_1 - m_2)$
E) $S_1 = \sum_{x_i \in c_1} (x_i - m_1)(x_i - m_1)^T$ and $S_2 = \sum_{x_i \in c_2} (x_i - m_2)(x_i - m_2)^T$
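
To see how the projected quantities relate to the original ones, here is a minimal sketch in plain Python for $d = 2$. It takes $S_1$ and $S_2$ to be the scatter matrices in E), so that $S_w = S_1 + S_2$ as in A); that choice of convention, the sample points, and all helper names are assumptions for illustration. It checks B) numerically, and also that the projected scatter $\sum_{y_i \in c_k}(y_i - \mu_k)^2$ equals $v^TS_kv$.

```python
def mean_vector(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points):
    """2x2 scatter matrix: sum of (x - m)(x - m)^T, as in E)."""
    m = mean_vector(points)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def dot(u, w):
    """Dot product of two 2-D vectors."""
    return u[0] * w[0] + u[1] * w[1]


def quad_form(v, a):
    """v^T A v for a 2x2 matrix A."""
    return dot(v, [dot(a[0], v), dot(a[1], v)])


def solve_2x2(a, b):
    """Solve A x = b by Cramer's rule (A must be non-singular)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


# Made-up 2-D samples for the two classes (assumption, not real data).
class1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0)]
class2 = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]

m1, m2 = mean_vector(class1), mean_vector(class2)
S1, S2 = scatter_matrix(class1), scatter_matrix(class2)

# A) with E): S_w = S_1 + S_2 when S_k are scatter matrices.
Sw = [[S1[r][c] + S2[r][c] for c in range(2)] for r in range(2)]

# D): v = S_w^{-1} (m_1 - m_2), obtained by solving S_w v = m_1 - m_2.
v = solve_2x2(Sw, [m1[0] - m2[0], m1[1] - m2[1]])

# Project both classes onto v: y_i = v^T x_i.
y1 = [dot(v, x) for x in class1]
y2 = [dot(v, x) for x in class2]

mu1 = sum(y1) / len(y1)
mu2 = sum(y2) / len(y2)
proj_scatter1 = sum((y - mu1) ** 2 for y in y1)
proj_scatter2 = sum((y - mu2) ** 2 for y in y2)

print(mu1, dot(v, m1))                    # B): mu_1 versus v^T m_1
print(mu2, dot(v, m2))                    # B): mu_2 versus v^T m_2
print(proj_scatter1, quad_form(v, S1))    # projected scatter versus v^T S_1 v
print(proj_scatter2, quad_form(v, S2))    # projected scatter versus v^T S_2 v
```

Each printed pair should match up to rounding. The factors of $n_1$ and $n_2$ in C) depend on whether $S_k$ and $s_k$ are meant as scatters or as covariances, which this sketch does not settle.
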
Now, for $1)$
$(\mu_1 - \mu_2)^2 + \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = (v^Tm_1 - v^Tm_2)^2 + \frac{v^TS_1v}{n_1^2}+ \frac{v^TS_2v}{n_2^2}$
Now, how do I introduce $x_i$ here to get $S_w$?
I have tried manipulating these expressions but could not reach the result.
Can anyone please give a hint on how to obtain these derivations? Any help would be appreciated.