According to Wikipedia, the leverage of a point measures how far the independent-variable values of an observation are from those of the other observations. Mathematically, for a point (observation) $x_i$ it is given by $h_{ii}=x_i^T(X^TX)^{-1}x_i$, where the rows of the matrix $X \in \mathbb{R}^{n \times d}$ are the points. My question is: how does this formula lead to the interpretation given in the first sentence?

This is how I see it. If we compute the covariance matrix of the observations in order to get a Mahalanobis-type distance, the covariance is $\frac{1}{n}X^TX-\frac{1}{n^2}X^T \mathbf{1} X$ (note that the rows of $X$ are our points), where $\mu^T=\frac{1}{n}[1,1,\dots,1]X$ is the mean and $\mathbf{1} \in \mathbb{R}^{n \times n}$ is the matrix of all $1$'s. If we use this as the distance metric, shouldn't the leverage be defined as $\hat{h}_{ii}=(x_i-\mu)^T\left(\frac{1}{n}X^TX-\frac{1}{n^2}X^T \mathbf{1} X\right)^{-1}(x_i-\mu)$? How does one arrive at the interpretation?
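To make the comparison concrete, here is a minimal NumPy sketch that computes both quantities on synthetic data. The data, the seed, and the variable names are all illustrative assumptions, and the covariance uses the $\frac{1}{n}$ normalization. The last part prepends a column of ones to $X$ (an intercept column, which the leverage formula as stated does not assume) and checks numerically whether the uncentered leverage then equals $\frac{1}{n}$ plus the centered quadratic form divided by $n$.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, synthetic data only
n, d = 50, 3
X = rng.normal(size=(n, d)) + 2.0       # rows are the points x_i; shifted so the mean is nonzero

# Leverage as in the formula: h_ii = x_i^T (X^T X)^{-1} x_i
G_inv = np.linalg.inv(X.T @ X)
h = np.einsum("ij,jk,ik->i", X, G_inv, X)

# Centered, Mahalanobis-type alternative with covariance (1/n) Xc^T Xc
mu = X.mean(axis=0)
Xc = X - mu
cov = Xc.T @ Xc / n                     # same matrix as X^T X / n - mu mu^T
h_hat = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(cov), Xc)

# Assumption for illustration: add an intercept column of ones to X
X1 = np.column_stack([np.ones(n), X])
h1 = np.einsum("ij,jk,ik->i", X1, np.linalg.inv(X1.T @ X1), X1)

print("uncentered leverage :", h[:3])
print("centered quadratic  :", h_hat[:3])
print("intercept identity holds:", np.allclose(h1, 1.0 / n + h_hat / n))
```

Without the intercept column the two quantities generally differ, because $h_{ii}$ measures distance from the origin rather than from the mean $\mu$.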
Also, I am curious: what properties of the column space of $X$ do the leverage scores (the leverage of each point) help us understand? Any help, comments, or hints are greatly appreciated. Thanks.
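As a starting point for the column-space question, the sketch below (again on synthetic data, with illustrative names that are my own assumptions) checks two facts numerically. First, the leverages are the diagonal of the orthogonal projector $H = X(X^TX)^{-1}X^T$ onto the column space of $X$, so they equal the squared row norms of any orthonormal basis $Q$ of that space. Second, they do not change when $X$ is replaced by $XA$ for an invertible $A$, which has the same column space.

```python
import numpy as np

rng = np.random.default_rng(1)          # arbitrary seed, synthetic data only
n, d = 40, 4
X = rng.normal(size=(n, d))             # full column rank with probability 1

# Projector onto colspace(X); its diagonal holds the leverages
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# Orthonormal basis of the same column space via reduced QR (Q is n x d)
Q, _ = np.linalg.qr(X)
h_qr = np.sum(Q**2, axis=1)             # squared row norms of Q

# Change of basis: X @ A spans the same column space when A is invertible
A = np.triu(np.ones((d, d)))            # upper triangular, unit diagonal, det = 1
XA = X @ A
h_XA = np.diag(XA @ np.linalg.inv(XA.T @ XA) @ XA.T)

print("matches QR row norms:", np.allclose(h, h_qr))
print("invariant under X -> XA:", np.allclose(h, h_XA))
print("sum of leverages:", h.sum())     # trace of a rank-d projector
```

So the scores depend only on the subspace spanned by the columns, not on the particular basis chosen for it, and they sum to the dimension of that subspace.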