Cook's Distance

I have a problem with calculating Cook's distance (I'm trying to understand it).

Here is the task and my 'solution'. I'd appreciate comments: is it correct, or what am I doing wrong?

We have a linear model fitted to $10$ observations, $Y =b_0+ b_1X_1 + b_2X_2 + b_3X_3$, where $X_1$ and $X_2$ are numeric and $X_3$ is categorical with levels $Z$ and $D$.

The fitted coefficients are:

(Intercept): $18,1422$

$X_1$: $-0,6349$

$X_2$: $-0,1577$

$X_3 Z$: $-0,3828$

We also know $RSS=0,81632$ and leverages $h_{ii} = (0,68894\; 0,45208\; 0,53916\; 0,26537\; 0,32897\; 0,34334\; 0,28986\; 0,23839\; 0,58853\; 0,26537)$

My task is to calculate Cook's distance for the first observation: $Y_1=6$, $X_1=12$, $X_2=25$, $X_3=Z$.

So I started by looking up the formula for Cook's distance: $D_1=\frac{(y_1-\hat{y}_1)^2}{k \cdot MSE}\cdot \frac{h_{11}}{(1-h_{11})^2}$.

Then I calculated $\hat{y}_1=18,1422 -0,6349 \cdot 12 -0,1577 \cdot 25 -0,3828 = 6,1981$. Next, $MSE=\frac{RSS}{10}=0,081632$.
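As a quick check of the fitted value, here is a minimal Python sketch. It assumes that $D$ is the reference level of $X_3$, so the $X_3 Z$ coefficient is added only when $X_3=Z$; the names `coef` and `predict` are made up for illustration.

```python
# Coefficients as reported above. Assumption: "D" is the reference level of X3,
# so the X3_Z coefficient enters only when X3 == "Z".
coef = {"intercept": 18.1422, "X1": -0.6349, "X2": -0.1577, "X3_Z": -0.3828}


def predict(x1, x2, x3):
    """Fitted value for one observation, with X3 dummy-coded against level D."""
    is_z = 1.0 if x3 == "Z" else 0.0
    return coef["intercept"] + coef["X1"] * x1 + coef["X2"] * x2 + coef["X3_Z"] * is_z


y1 = 6.0
y1_hat = predict(12, 25, "Z")
residual = y1 - y1_hat  # ordinary residual from the full fit

print(round(y1_hat, 4), round(residual, 4))  # 6.1981 -0.1981
```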

We have $k=3$ and $h_{11}=0,68894$.

Using the formula above, we finally get $D_1=1,1408$.

My questions are:

  1. Is this solution correct?

  2. Can Cook's distance be greater than $1$?

There is 1 best solution below

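$k$ should be $4$, not $3$: it is the number of parameters in the full model, including the intercept. The leverage-based formula you quote uses the residual $y_1-\hat{y}_1$ from the full fit. Refitting the model without the first observation is needed only if you compute $D_1$ directly from its definition, as the change in the fitted values when that observation is dropped, since you are interested in the influence of the first observation.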
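To make this concrete, here is a minimal Python sketch of the leverage-based calculation with $k=4$. Two things in it go beyond what is stated above and should be read as assumptions: it takes $MSE = RSS/(n-k)$, the usual residual variance estimate used in Cook's distance (rather than $RSS/n$), and it recovers $k$ from the fact that the leverages of a linear model sum to the number of parameters. The variable names are illustrative only.

```python
# Leverages and RSS as given in the question.
h = [0.68894, 0.45208, 0.53916, 0.26537, 0.32897,
     0.34334, 0.28986, 0.23839, 0.58853, 0.26537]
rss = 0.81632

n = len(h)
# The leverages sum to the number of fitted parameters (trace of the hat matrix),
# which confirms k = 4 here (the sum is 4.00001 because of rounding).
k = round(sum(h))

# Assumption: MSE is the residual variance estimate RSS / (n - k).
mse = rss / (n - k)

y1, y1_hat = 6.0, 6.1981   # observed and full-fit value for observation 1
e1 = y1 - y1_hat           # residual from the full fit, no refit needed
h11 = h[0]

d1 = e1**2 / (k * mse) * h11 / (1 - h11) ** 2

print(k, round(mse, 4), round(d1, 3))  # 4 0.1361 0.513
```

Under these assumptions $D_1 \approx 0,513$ instead of $1,1408$. As for the second question, nothing in the formula bounds $D_1$ by $1$: the factor $\frac{h_{11}}{(1-h_{11})^2}$ grows without limit as $h_{11}$ approaches $1$, so values above $1$ are possible.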