How do I determine if the standard deviation is known or unknown?

I am trying to construct a prediction interval, but I cannot tell when to use the formula for the prediction interval of a future observation where $σ^2$ is known and when to use the formula where $σ^2$ is unknown. I did a problem in my textbook, which is as follows:

5.5 A random sample of 100 automobile owners in the state of Virginia shows that an automobile is driven on average 23,500 km per year with a standard deviation of 3900 km. Assume the distribution of measurements to be approximately normal. (a) Construct a 99% confidence interval for the average number of km an automobile is driven annually in Virginia.

I found the interval to be 22 496 < μ < 24 504 by using the formula for a confidence interval on μ where $σ^2$ is known (i.e., I used $z_{0.005}$ and $σ = 3900$). However, another problem I attempted references problem 5.5 and is as follows:

Referring to Exercise 5.5, construct a 99% prediction interval for the km traveled annually by an automobile owner in Virginia.

I used the equation for the prediction of a future observation where $σ^2$ is known, so I used $z_{0.005} = 2.575$ and $σ = 3900$, which gave (13 407, 33 593). In the solution manual, they used the equation for when $σ^2$ is unknown, so they used $t_{0.005} = 2.626$ and $s = 3900$, which gives (13 075, 33 925) instead. I do not understand why, in the second problem, I am supposed to use $t_{0.005}$ and $s = 3900$ instead of $z_{0.005}$ and $σ = 3900$. Where in the second problem does it indicate that $σ^2$ is unknown?

There is 1 best solution below

In fact, both questions require the use of the $t$-distribution critical value for $\nu = n-1 = 99$ degrees of freedom, namely $$t_{99,0.005} \approx 2.62641.$$

The reason is that the standard deviation provided in the problem, $3900$, is clearly stated to have been obtained from the sample of $n = 100$ automobile owners. Therefore, it is an estimate $s$ of the true parameter $\sigma$, not $\sigma$ itself. The approximate $99\%$ confidence interval is then $$23500 \pm t_{99,0.005} \frac{3900}{\sqrt{100}} \approx [22475.7, 24524.3].$$

What aspect(s) of this interval estimate are approximate and what are the underlying assumptions?

  • It is assumed that the annual mileage of a randomly chosen driver in Virginia is an approximately normally distributed random variable.
  • It is assumed that the mileage of any driver is independent of the mileage of any other driver.
  • This interval estimate is only an approximation to the extent that the above assumptions are not met.

It would be inconsistent to apply the $z$-score critical value $z_{0.005}$ to the interval estimate, and then in the subsequent part of the question, apply the $t$-score.