I really need your help. Here's the problem:
A rookie is brought to a baseball club on the assumption that he will have a .300 batting average. (Batting
average is the ratio of the number of hits to the number of times at bat.) In the first year, he comes to bat
300 times and his batting average is .267. Assume that his at bats can be considered Bernoulli trials with
probability .3 for success. Could such a low average be considered just bad luck or should he be sent back
to the minor leagues?
Since his batting average was .267 over 300 at bats, I know he got only 80 hits.
I know I have to use the Central Limit Theorem, but how?
First I thought I had to calculate, for example, the probability that he got between 75 and 85 hits, modeling the number of hits as Binomial(300, 0.3) and approximating it by a Normal distribution (because n=300 is large). So I could use the CLT that way.
But I think that what I really need to do is to give an interval that, with high probability (90% for example), contains the real expectation.
So, how do I bound the real expectation using the CLT?
(Sorry if I wrote something wrong, English is not my first language)
A batting average of $.267$ is pretty good in the major leagues.
But for your probability problem, I think you want to do something like this: Let $X_i$ be the random Bernoulli outcome for batting attempt $i$, for $i \in \{1, \ldots, 300\}$. Assume $\{X_1, \ldots, X_{300}\}$ are mutually independent Bernoulli with $Pr[X_i=1]=0.3$. So you want to compute $Pr[\sum_{i=1}^{300} X_i \leq 80]$. You can translate this into a CLT problem by:
\begin{align} Pr\left[\sum_{i=1}^{300} X_i \leq 80\right] &= Pr\left[\frac{1}{\sqrt{300 Var(X_1)}}\sum_{i=1}^{300} (X_i-.3)\leq \frac{1}{\sqrt{300 Var(X_1)}}(80-90)\right]\\ &\approx Pr\left[G \leq \frac{-10}{\sqrt{300 Var(X_1)}}\right] \end{align} where $G$ is a standard Gaussian random variable (zero mean, unit variance), $Var(X_1) = (0.3)(0.7) = 0.21$, and the approximation assumes that 300 samples is "large enough" for the CLT to be accurate. So now use a "lookup table approach." By symmetry of the Gaussian, this probability equals the $Q(\cdot)$ function evaluated at $10/\sqrt{300 Var(X_1)} = 10/\sqrt{63} \approx 1.26$, where $Q(x) = \int_{x}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}dt$. This gives a probability of roughly $0.10$.
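If you would rather check the number on a computer than in a table, here is a minimal Python sketch of the CLT calculation above, with the exact binomial sum next to it for comparison. The function names (`normal_cdf`, `clt_tail`, `exact_tail`) are just illustrative choices of mine, and the approximation uses no continuity correction, matching the derivation above.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_tail(n, p, k):
    """CLT approximation of P(S <= k) for S ~ Binomial(n, p).

    No continuity correction is applied (an assumption made to match
    the derivation in the text).
    """
    mean = n * p
    std = math.sqrt(n * p * (1.0 - p))
    return normal_cdf((k - mean) / std)

def exact_tail(n, p, k):
    """Exact P(S <= k), obtained by summing the binomial pmf."""
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j)
               for j in range(k + 1))

approx = clt_tail(300, 0.3, 80)
exact = exact_tail(300, 0.3, 80)
print(f"CLT approximation: {approx:.4f}")
print(f"Exact binomial:    {exact:.4f}")
```

Comparing the two printed values gives a rough sense of whether 300 trials is "large enough" for the approximation in this particular problem.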
"Lookup tables" are so outdated, I don't know why they are still taught when everyone has a computer at their disposal. Also, I don't know why computers and calculators do not have a standard $Q(\cdot)$ function built in, they have the much less useful $erf(\cdot)$ function, and I always have to try to remember how that one is defined and how to derive one from the other.
Okay, I looked it up. We have the following definitions for $Q(\cdot)$ and $erf(\cdot)$:
\begin{align} Q(x) &= \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}dt\\ erf(x) &= \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}dt \end{align}
and so: $$ Q(x) = \frac{1}{2}-\frac{1}{2}erf\left(\frac{x}{\sqrt{2}}\right) $$
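As a quick sanity check of that identity, here is a small Python sketch that builds $Q(\cdot)$ from the standard library's `math.erf`, and also from the complementary error function `math.erfc`, which satisfies $Q(x) = \frac{1}{2}erfc(x/\sqrt{2})$. The names `q_from_erf` and `q_from_erfc` are my own; neither is a built-in.

```python
import math

def q_from_erf(x):
    """Gaussian tail Q(x) = 1/2 - 1/2 * erf(x / sqrt(2))."""
    return 0.5 - 0.5 * math.erf(x / math.sqrt(2.0))

def q_from_erfc(x):
    """Same tail probability via erfc, since erfc(y) = 1 - erf(y)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Argument from the batting problem: 10 / sqrt(300 * Var(X_1)),
# with Var(X_1) = 0.3 * 0.7 for a Bernoulli(0.3) trial.
x = 10.0 / math.sqrt(300 * 0.3 * 0.7)
print(q_from_erf(x), q_from_erfc(x))
```

The two versions agree for moderate $x$; the `erfc` form is usually the safer choice far out in the tail, where subtracting from $1/2$ loses precision.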