While reading *The Elements of Statistical Learning* (p. 113), I came across the eigendecomposition of the covariance matrix $\hat{\mathbf{\Sigma}}_k = \mathbf{U}_k\mathbf{D}_k\mathbf{U}_k^T$, where $\mathbf{U}_k$ is a $p \times p$ orthonormal matrix and $\mathbf{D}_k$ is a diagonal matrix of positive eigenvalues $d_{kl}$.
The quadratic discriminant function is:
$$ \delta_k(x) = -\frac{1}{2}\log|\mathbf{\Sigma}_k|-\frac{1}{2}(x-\mu_k)^T\mathbf{\Sigma}^{-1}_k(x-\mu_k)+\log \pi_k $$
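To make the formula concrete, here is a minimal numpy sketch of $\delta_k(x)$ (the function and argument names are my own, not from the book):

```python
import numpy as np

def discriminant(x, mu_k, Sigma_k, pi_k):
    """Quadratic discriminant delta_k(x) for class k."""
    diff = x - mu_k
    # log|Sigma_k| computed stably via slogdet
    _, logdet = np.linalg.slogdet(Sigma_k)
    # solve(Sigma_k, diff) avoids forming the explicit inverse
    return -0.5 * logdet - 0.5 * diff @ np.linalg.solve(Sigma_k, diff) + np.log(pi_k)
```

For a quick sanity check, with $\mu_k = 0$, $\mathbf{\Sigma}_k = I$, $\pi_k = 1$, and $x = 0$, every term vanishes and $\delta_k(0) = 0$.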
Using the eigendecomposition, the terms of $\delta_k(x)$ are rewritten as:
1) $(x-\mu_k)^T\mathbf{\Sigma}^{-1}_k(x-\mu_k) = [\mathbf{U}_k^T(x-\mu_k)]^T\mathbf{D}_k^{-1}[\mathbf{U}_k^T(x-\mu_k)]$
2) $\log|\hat{\mathbf{\Sigma}}_k| = \sum_l\log d_{kl}$
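Both identities can be verified numerically; the sketch below (my own example data, not from the book) checks 1) and 2) against direct computation:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
Sigma_k = A.T @ A / 5           # a positive-definite covariance estimate
d, U = np.linalg.eigh(Sigma_k)  # Sigma_k = U diag(d) U^T

x = rng.normal(size=3)
mu_k = rng.normal(size=3)
diff = x - mu_k

# 1) Mahalanobis term via the eigendecomposition
lhs = diff @ np.linalg.solve(Sigma_k, diff)
z = U.T @ diff                  # coordinates of (x - mu_k) in the eigenbasis
rhs = z @ (z / d)               # z^T D^{-1} z
assert np.isclose(lhs, rhs)

# 2) log-determinant as the sum of log-eigenvalues
assert np.isclose(np.linalg.slogdet(Sigma_k)[1], np.sum(np.log(d)))
```

The point of both identities is that once $\mathbf{U}_k$ and $\mathbf{D}_k$ are available, the inverse and the determinant come essentially for free.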
Apparently, the computational steps for the LDA classifier can be implemented starting with the following step:
*Sphere* the data with respect to the common covariance estimate $\hat{\mathbf{\Sigma}}$:
$X^*\leftarrow \mathbf{D}^{-\frac{1}{2}}\mathbf{U}^TX$, where $\hat{\mathbf{\Sigma}} = \mathbf{UDU}^T$. The common covariance estimate of $X^*$ will now be the identity.
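The book writes the transform for a column vector; assuming the rows of $X$ are observations (my convention here, with centered data), a sketch of the sphering step and a check that the resulting covariance is the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 3
# correlated sample data, rows = observations
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
Xc = X - X.mean(axis=0)

Sigma_hat = Xc.T @ Xc / n        # common covariance estimate
d, U = np.linalg.eigh(Sigma_hat) # Sigma_hat = U diag(d) U^T

# X* <- D^{-1/2} U^T x applied to every observation (row-wise form)
X_star = Xc @ U / np.sqrt(d)     # divide each rotated column by sqrt(d_l)

# covariance of the sphered data is (numerically) the identity
cov_star = X_star.T @ X_star / n
assert np.allclose(cov_star, np.eye(p))
```

Algebraically: the covariance of $X^*$ is $\mathbf{D}^{-1/2}\mathbf{U}^T\hat{\mathbf{\Sigma}}\mathbf{U}\mathbf{D}^{-1/2} = \mathbf{D}^{-1/2}\mathbf{D}\mathbf{D}^{-1/2} = \mathbf{I}$, which is what "sphering" means here.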
I do not understand how the author arrived at this step, and I am not sure how to read the left arrow in $X^*\leftarrow \mathbf{D}^{-\frac{1}{2}}\mathbf{U}^TX$. Does anyone have an idea where I should start, or how I should tackle this? What should I picture when reading $X^*$?