Computations for LDA: Eigendecomposition


While reading the book The Elements of Statistical Learning (p. 113), I saw the author use the eigendecomposition of the covariance matrix, $\hat{\Sigma}_k = \mathbf{U}_k\mathbf{D}_k\mathbf{U}_k^T$, where $\mathbf{U}_k$ is a $p \times p$ orthonormal matrix and $\mathbf{D}_k$ is a diagonal matrix of positive eigenvalues $d_{kl}$.

The quadratic discriminant function is given by:

$$ \delta_k(x) = -\frac{1}{2}\log|\mathbf{\Sigma}_k|-\frac{1}{2}(x-\mu_k)^T\mathbf{\Sigma}^{-1}_k(x-\mu_k)+\log \pi_k $$

Using the eigendecomposition, the terms of $\delta_k(x)$ are rewritten as:

1) $(x-\mu_k)^T\mathbf{\Sigma}^{-1}_k(x-\mu_k) = [\mathbf{U}_k^T(x-\mu_k)]^T\mathbf{D}_k^{-1}[\mathbf{U}_k^T(x-\mu_k)]$

2) $\log|\hat{\mathbf{\Sigma}}_k| = \sum_l\log d_{kl}$
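For what it's worth, both identities are easy to check numerically. Here is a minimal NumPy sketch (the covariance matrix, $x$, and $\mu_k$ are made-up random examples, not anything from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3

# Hypothetical symmetric positive definite covariance estimate Sigma_k
A = rng.standard_normal((p, p))
Sigma_k = A @ A.T + p * np.eye(p)

# Eigendecomposition Sigma_k = U D U^T; d holds the eigenvalues d_kl
d, U = np.linalg.eigh(Sigma_k)

x = rng.standard_normal(p)
mu_k = rng.standard_normal(p)

# Identity 1: quadratic form computed directly vs. in the rotated
# coordinates z = U^T (x - mu_k), where Sigma_k^{-1} = U D^{-1} U^T
z = U.T @ (x - mu_k)
quad_direct = (x - mu_k) @ np.linalg.inv(Sigma_k) @ (x - mu_k)
quad_eig = z @ np.diag(1.0 / d) @ z

# Identity 2: log-determinant as the sum of log-eigenvalues
logdet_direct = np.log(np.linalg.det(Sigma_k))
logdet_eig = np.sum(np.log(d))

print(np.allclose(quad_direct, quad_eig))      # True
print(np.allclose(logdet_direct, logdet_eig))  # True
```

The point of the rewrite is that once you have the eigendecomposition, the inverse and the determinant come essentially for free from the diagonal $\mathbf{D}_k$.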

Apparently, the computational steps for the LDA classifier can be implemented by starting with the following step:

*Sphere* the data with respect to the common covariance estimate $\hat{\Sigma}$:

$X^*\leftarrow \mathbf{D}^{-\frac{1}{2}}\mathbf{U}^TX$, where $\hat{\mathbf{\Sigma}} = \mathbf{UDU}^T$. The common covariance estimate of $X^*$ will now be the identity.
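As a sanity check on that claim, here is a small NumPy sketch of the sphering step (the data matrix is a random made-up example, with observations as rows, so the transform is applied as $X^* = X\mathbf{U}\mathbf{D}^{-1/2}$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 3

# Hypothetical data matrix X: n observations (rows), p features (columns),
# given some arbitrary correlation structure
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))

# Common covariance estimate and its eigendecomposition Sigma = U D U^T
Sigma_hat = np.cov(X, rowvar=False)
d, U = np.linalg.eigh(Sigma_hat)

# Sphering: X* <- D^{-1/2} U^T x for each observation x,
# i.e. X_star = X U D^{-1/2} for row-wise data
X_star = (X - X.mean(axis=0)) @ U @ np.diag(d ** -0.5)

# The covariance of the sphered data is the identity (up to float error)
print(np.allclose(np.cov(X_star, rowvar=False), np.eye(p)))  # True
```

Algebraically this is just $\operatorname{Cov}(X^*) = \mathbf{D}^{-1/2}\mathbf{U}^T\hat{\Sigma}\,\mathbf{U}\mathbf{D}^{-1/2} = \mathbf{D}^{-1/2}\mathbf{D}\mathbf{D}^{-1/2} = \mathbf{I}$.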

I do not understand how the author arrived at this step, and I don't know how to read the left arrow in $X^*\leftarrow \mathbf{D}^{-\frac{1}{2}}\mathbf{U}^TX$. Does anyone have an idea where I should start or how I should tackle this? What should I imagine $X^*$ to be?