I am having a hard time understanding the concept of standardizing residuals and how the variance of a residual is decomposed. In a linear model, we defined residuals as:
$e = y - \hat{y} = (I-H)y$ where H is the hat matrix $X(X^TX)^{-1}X^T$
and we defined standardized residuals as:
$r_i = \frac{e_i}{s\sqrt{1-h_{ii}}}$, $i = 1,...,n$
where $s^2$ is the usual estimate of $\sigma^2$, $var(e_i) = \sigma^2(1-h_{ii})$, and $h_{ii}$ is the diagonal entry of $H$ in the $i^{th}$ row and $i^{th}$ column.
However, I am not sure why $r_i$ and $e_i$ are functions of $h_{ii}$ rather than the whole row $h_i$. Basically I am confused about what $h_{ii}$ stands for as opposed to row $h_{i}$.
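Let's look at the covariance matrix of the residual vector $e$. Since $Var(y) = \sigma^2 I$ and $I-H$ is symmetric and idempotent, $$ Var(e) = Var( (I-H)y ) = (I-H)Var(y)(I-H)^T = \sigma^2(I-H)(I-H)^T = \sigma^2(I-H). $$ The diagonal entries of $\sigma^2(I-H)$ are the variances of the individual residuals. In particular, the variance of $e_i$ is the $i$th diagonal entry of $\sigma^2(I-H)$, that is, $\sigma^2(1 - h_{ii})$. The residual $e_i = y_i - \sum_j h_{ij}y_j$ itself does depend on the whole row $h_i$, but its variance reduces to a function of $h_{ii}$ alone: because $H$ is symmetric and idempotent, $\sum_j h_{ij}^2 = h_{ii}$, so $Var(e_i) = \sigma^2\left[(1-h_{ii})^2 + \sum_{j \neq i} h_{ij}^2\right] = \sigma^2(1-h_{ii})$. Since $\sigma^2$ is unknown and is estimated by $s^2$, the estimated variance of $e_i$ is $s^2(1-h_{ii})$, and dividing $e_i$ by its square root gives $r_i$.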
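
If it helps to see this numerically, here is a minimal NumPy sketch. The design matrix, coefficients, and noise level are made up purely for illustration (they are assumptions, not part of the question); the point is only to check that $I-H$ is symmetric and idempotent, that each row's sum of squares of $H$ equals $h_{ii}$, and to compute $r_i$ from the formula above.

```python
import numpy as np

# Hypothetical data: n observations, p columns (intercept + 2 predictors).
rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Hat matrix H = X (X^T X)^{-1} X^T and residual maker M = I - H.
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H
h = np.diag(H)  # leverages h_ii

# M is symmetric and idempotent, so Var(e) = sigma^2 * M.
assert np.allclose(M, M.T)
assert np.allclose(M @ M, M)

# The whole row of H enters e_i, but its sum of squares collapses to h_ii.
assert np.allclose((H ** 2).sum(axis=1), h)

# Simulate a response (assumed beta and sigma) and standardize the residuals.
beta = np.array([1.0, -0.5, 0.25])
sigma = 2.0
y = X @ beta + sigma * rng.normal(size=n)

e = M @ y                      # residuals e = (I - H) y
s2 = e @ e / (n - p)           # usual estimate of sigma^2
r = e / np.sqrt(s2 * (1 - h))  # standardized residuals r_i
```

Each $e_i$ is computed from the full $i$th row of $I-H$, yet the scaling in `r` only needs the diagonal `h`, which is exactly what the derivation says.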