How do I find the missing sample points with the Mean and Sample Standard Deviation?


I am completely lost on this statistics question:

A statistician had a data set containing 13 data points written in his research notebook. He spilled coffee on his notebook and now he cannot read two of the data values. He remembers that the sample mean of the original data set was 26.692 and the sample standard deviation was 10.177. Use your power of deduction and the 11 still readable data values given below to determine the two lost data values.


There are 2 best solutions below

On BEST ANSWER

$$ \frac{\underbrace{x_1+\cdots+x_{11}}_{\text{known}} + x_{12}+x_{13}}{13} = \bar x =26.692 $$ Thus you know the sum of the two missing observations; it is $13\cdot 26.692 -(x_1+\cdots+x_{11})$.

The mean of the two missing observations is half of that.

Often "variance" in this context means $\dfrac1{n-1}\sum\limits_{i=1}^n (x_i-\bar x)^2$ (the square of the sample standard deviation), but here it will be useful to use $\dfrac 1 n \sum\limits_{i=1}^n (x_i-\bar x)^2$, since that makes the next step possible. If $10.177^2=\left(\frac1{n-1}\cdot \text{sum of squares}\right)$ then $\left(\frac1n\cdot\text{sum of squares}\right)$ is what we will call the variance for now, and that is $\frac{n-1}n\cdot10.177^2=\frac{12}{13}\cdot10.177^2\approx95.60$.

Now let $D$ be the difference between the mean of the $11$ known observations and the mean of the two missing observations. With every variance computed using the $\frac1n$ convention, we have \begin{align} \tfrac{12}{13}\cdot 10.177^2 \approx 95.60 \approx {} & \frac{11}{13}\cdot(\text{known variance of 11 observations}) \\[8pt] & {} + \frac 2 {13}\cdot(\text{unknown variance of the two unknown observations}) \\[8pt] & {} + {} \frac{11}{13}\cdot\frac2{13}\cdot D^2. \end{align} Everything in this equation is known except the variance of the two missing observations, so we can solve for it. Then we have the mean and variance of the two unknown observations. Their mean is $\dfrac{x_{12}+x_{13}}2$ and their variance is $\dfrac{(x_{12}-x_{13})^2}4$, and given the actual numbers we can solve for $x_{12}$ and $x_{13}$.
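The decomposition above can be sketched in Python. This is only an illustration under stated assumptions: the 11 readable values are not reproduced in this post, so the data set below is made up, and the function name `recover_two_by_decomposition` is hypothetical.

```python
import math
import statistics


def recover_two_by_decomposition(known, n, mean, sample_sd):
    """Recover two missing values from the overall mean and sample SD.

    known     -- the readable observations (11 of them in the question)
    n         -- size of the full data set (13 in the question)
    mean      -- sample mean of the full data set
    sample_sd -- sample standard deviation (1/(n-1) convention)
    """
    m = len(known)
    k = n - m
    if k != 2:
        raise ValueError("this sketch only handles exactly two missing values")

    # Mean of the two missing observations, from the overall mean.
    missing_mean = (n * mean - sum(known)) / k

    # Convert the sample variance to the 1/n convention.
    total_var = (n - 1) / n * sample_sd ** 2

    # 1/n variance of the known group, and the gap D between group means.
    known_var = statistics.pvariance(known)
    d = statistics.fmean(known) - missing_mean

    # Solve the decomposition for the 1/n variance of the missing pair.
    missing_var = (total_var - (m / n) * known_var
                   - (m / n) * (k / n) * d ** 2) / (k / n)

    # For two points the 1/n variance is ((x12 - x13) / 2)^2.
    half_gap = math.sqrt(max(missing_var, 0.0))
    return missing_mean - half_gap, missing_mean + half_gap


# Hypothetical data, NOT the values from the question.
full = [14, 35, 22, 41, 18, 27, 30, 9, 38, 25, 33, 19, 36]
known = full[:11]  # pretend the last two values are unreadable
low, high = recover_two_by_decomposition(
    known, len(full), statistics.fmean(full), statistics.stdev(full)
)
print(low, high)
```

With the rounded figures from the question (26.692 and 10.177) the result will only be approximate, so rounding to the nearest plausible data values is probably needed.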


You can set up a system of two equations. Denote the two missing data points by $x$ and $y$; the formula for the mean gives one equation involving $x$ and $y$, and the formula for the standard deviation gives a second one. Then solve this system of two equations for $x$ and $y$.
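One way to carry this out, sketched below, is to turn the two equations into $x+y=S$ and $x^2+y^2=Q$, where $S = n\bar x - \sum \text{known}$ and $Q = (n-1)s^2 + n\bar x^2 - \sum \text{known}^2$. Then $xy = (S^2-Q)/2$, and $x$ and $y$ are the roots of $t^2 - St + xy = 0$. The function name `solve_two_missing` and the sample data are assumptions for illustration; the readable values from the question are not reproduced here.

```python
import math
import statistics


def solve_two_missing(known, n, mean, sample_sd):
    """Solve x + y = S and x^2 + y^2 = Q for the two missing values."""
    # Equation 1 (mean): the sum of the two missing values.
    s = n * mean - sum(known)

    # Equation 2 (standard deviation): the sum of their squares, using
    # sample_sd^2 = (sum of squares - n * mean^2) / (n - 1).
    q = (n - 1) * sample_sd ** 2 + n * mean ** 2 - sum(v * v for v in known)

    # From (x + y)^2 = x^2 + y^2 + 2xy we get the product xy.
    p = (s * s - q) / 2

    # x and y are the roots of t^2 - s*t + p = 0.
    disc = s * s - 4 * p
    if disc < 0:
        raise ValueError("inputs are inconsistent: no real solution")
    root = math.sqrt(disc)
    return (s - root) / 2, (s + root) / 2


# Hypothetical data, NOT the values from the question.
full = [14, 35, 22, 41, 18, 27, 30, 9, 38, 25, 33, 19, 36]
known = full[:11]  # pretend the last two values are unreadable
x, y = solve_two_missing(
    known, len(full), statistics.fmean(full), statistics.stdev(full)
)
print(x, y)
```

Because the mean and standard deviation in the question are rounded to three decimals, the computed roots would likely land near, rather than exactly on, the original values.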