I have a problem with calculating Cook's distance (I'm trying to understand it).
Here is the task and my 'solution'. I'd like comments on whether it is correct, or what I am doing wrong.
We have a linear model fitted to $10$ observations, $Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3$, where $X_1$ and $X_2$ are numeric and $X_3$ is a categorical variable that can be $Z$ or $D$.
The fitted coefficients are:
(intercept) $18,1422$
$X_1$: $-0,6349$
$X_2$: $-0,1577$
$X_3 Z$: $-0,3828$
We also know $RSS=0,81632$ and the leverages $h_{ii} = (0,68894\; 0,45208\; 0,53916\; 0,26537\; 0,32897\; 0,34334\; 0,28986\; 0,23839\; 0,58853\; 0,26537)$.
My task is to calculate Cook's distance for the first observation: $Y_1=6$ with $X_1=12$, $X_2=25$, $X_3=Z$.
I started by looking up the formula for Cook's distance: $D_1=\frac{(y_1-\hat{y}_1)^2}{k \cdot MSE}\cdot \frac{h_{11}}{(1-h_{11})^2}$.
Then I calculated $\hat{y}_1=18,1422 -0,6349 \cdot 12 -0,1577 \cdot 25 -0,3828 = 6,1981$, and next $MSE=\frac{RSS}{10}=0,081632$.
We have $k=3$ and $h_{11}=0,68894$.
Using the formula above, we finally get $D_1=1,1408$.
My questions are:
Is this solution correct?
Can Cook's distance be greater than $1$?
$k$ should be $4$, not $3$: it is the number of parameters in the full model. The estimate of $Y_1$ should be obtained from the model fitted without the first observation, since you are interested in the influence of that observation.
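As a rough check of the numbers, here is a minimal Python sketch that uses the leverage form of the formula from the question with $k=4$. It rests on two assumptions. First, $MSE$ is taken as the usual estimate $RSS/(n-k)$ with $n=10$, whereas the question divides by $10$. Second, the full-model residual is used, on the understanding that the factor $h_{11}/(1-h_{11})^2$ already accounts for deleting the observation, so this form needs no refit. The function name `cooks_distance` and the variable names are my own and do not come from any library.

```python
def cooks_distance(residual, leverage, n_params, mse):
    """Leverage form of Cook's distance for a single observation."""
    return residual**2 / (n_params * mse) * leverage / (1.0 - leverage) ** 2


# Values copied from the question (decimal commas written as points)
coef = {"intercept": 18.1422, "x1": -0.6349, "x2": -0.1577, "x3_Z": -0.3828}
rss = 0.81632
n_obs = 10
leverages = [0.68894, 0.45208, 0.53916, 0.26537, 0.32897,
             0.34334, 0.28986, 0.23839, 0.58853, 0.26537]

# First observation: Y = 6, X1 = 12, X2 = 25, X3 = Z
y1, x1, x2 = 6.0, 12.0, 25.0
y1_hat = coef["intercept"] + coef["x1"] * x1 + coef["x2"] * x2 + coef["x3_Z"]
residual = y1 - y1_hat  # residual from the full model

n_params = 4                    # intercept plus three slopes
mse = rss / (n_obs - n_params)  # assumption: MSE = RSS / (n - k)

d1 = cooks_distance(residual, leverages[0], n_params, mse)
print(round(y1_hat, 4), round(sum(leverages), 2), round(d1, 3))
```

Under these assumptions the script gives $\hat{y}_1 \approx 6,1981$ and $D_1 \approx 0,51$. The leverages sum to about $4$, which is consistent with four fitted parameters.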