Probability measurement misconception

Suppose I have data on students' performance on a test (scored out of 100). The average grade over 30 tests is 80, and I want to compute the probability of getting a grade above 80 on the next test. I have no difficulty solving such a problem with either a Poisson or a normal distribution. What puzzles me is how to adjust the probability distribution so that it accounts for the fact that a grade can't exceed 100. (It seems I won't get the right answer if I simply take the probability that the distribution assigns to the range between 80 and 100.)

Best answer

Lots of useful speculation here, but let's return to your testing example. Test scores are often assumed to be normal. (Scores are sums of many sub-scores that may be, roughly speaking, iid. Also, the population of students taking the exam may have normally distributed abilities. For commercial tests, such as GRE and SAT, test makers sometimes go to considerable trouble to design tests so that scores will turn out to have close to a normal distribution.)

So let's suppose test scores in your example are distributed $Norm(\mu = 80, \sigma = 5).$ The sample mean was given as 80. I'm using that as the population mean, and just arbitrarily picking $\sigma = 5$ as a reasonable population SD for a concrete example.

Then the chance an individual gets more than 100 points is quite small: the Empirical Rule says that 99.7% (or "almost all") scores are in the interval $80 \pm 15$. The probability of a score over 100 is tiny:

 1 - pnorm(100, 80, 5)
 ## 3.167124e-05
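If you do want the model to respect the cap at 100 explicitly, one option (an assumption on my part, not something the data force) is to condition the normal distribution on $X \le 100$, that is, to use a truncated normal. A minimal sketch in R, reusing the arbitrary $\sigma = 5$ from above; the variable names are only illustrative:

```r
# Assumed model: scores ~ Norm(mu = 80, sigma = 5), capped at 100.
mu <- 80
sigma <- 5      # arbitrary SD, as in the example above
upper <- 100    # maximum possible score

# P(X > 80) ignoring the cap
p_untruncated <- 1 - pnorm(80, mu, sigma)

# P(X > 80 | X <= 100): renormalize by the mass at or below the cap
p_truncated <- (pnorm(upper, mu, sigma) - pnorm(80, mu, sigma)) /
  pnorm(upper, mu, sigma)

p_untruncated
p_truncated
```

Because so little probability lies above 100 under these assumptions, the two values differ only in the fifth decimal place, so the cap hardly matters here. It would matter more if the mean were closer to 100 or the SD were larger.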

Under the assumption that a successor test has the same distribution, the probability that a randomly chosen individual scores over 80 is 1/2, and the probability that the average score of 30 randomly chosen students exceeds 80 is also 1/2. The probability that the class average $\bar X$ of 30 students is above, say, 85 is very small, and easy to compute using $SD(\bar X) = 5/\sqrt{30}.$

 1 - pnorm(85, 80, 5/sqrt(30))
 ## 2.160232e-08
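Written out, the standardization behind that probability, under the same assumed $\mu = 80$ and $\sigma = 5$, is:

$$P(\bar X > 85) = P\left(Z > \frac{85 - 80}{5/\sqrt{30}}\right) = P\left(Z > \sqrt{30}\right) \approx P(Z > 5.48),$$

where $Z$ is standard normal. A class average of 85 would sit about 5.5 standard errors above the mean, which is why the probability is so small.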

However, from your question, it seems you are trying to predict the score of a randomly chosen student on the next exam, given the class sample mean on a current exam. For that you need to assume that the next exam is of equal difficulty to the current one and that the student is chosen from the same population as the current class of 30, and you need to know (or have an estimate of) the SD of exam scores for both exams in that population.

If that is really your question, then you need to look at such topics as 'prediction intervals', accounting for both the variance of the current $\bar X$ and the variance of the score of a future randomly chosen student. There are too many speculative steps in that for me to try to give a numerical example. (The Wikipedia article on 'prediction interval' is, admittedly, a work in progress, but it finally gets around to a reasonable interval; many of the other articles from a Google search are for prediction in a context of regression, and so not directly applicable to your question.)
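Purely to show the shape of such a calculation, here is a rough sketch of the usual textbook prediction interval for one future observation from a normal population, $\bar X \pm t^*\, s\sqrt{1 + 1/n}$. The sample SD `s = 5` is hypothetical (your question gives no SD), and the sketch assumes the same normal population and equal exam difficulty discussed above:

```r
# Hypothetical inputs: only n and xbar come from the question.
n <- 30       # number of observed scores
xbar <- 80    # observed sample mean
s <- 5        # ASSUMED sample SD, not given in the question

# 95% prediction interval for one future score:
# xbar +/- t * s * sqrt(1 + 1/n), with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)
half_width <- t_crit * s * sqrt(1 + 1/n)
interval <- c(lower = xbar - half_width, upper = xbar + half_width)
interval
```

The factor $\sqrt{1 + 1/n}$ is where the two sources of variability enter: the 1 is for the future student's own score, and the $1/n$ is for the uncertainty in $\bar X$. With a different assumed `s`, the interval scales proportionally.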