What is the difference between theoretical binomial coin-flip outcomes and statistical observation?

I am trying to get my head around the normal distribution, which emerges as a good approximation for a binomial problem (like coin flips).

Theoretical outcome:

With, say, 50 flips, there is a fixed number of binomial outcomes (X = number of heads in any outcome sequence), which can be counted and graphed. I graphed the frequency of X, that is, $n(X_k)$ vs $X_k$, where the $n(X_k)$ are the binomial coefficients, and also the resulting probability $p(X_k)$ vs $X_k$. I then tried to overlay a normal approximation on both graphs.

No of flips: 50
p = 0.75 (probability of heads per flip)

[Figure: theoretical $n(X_k)$ vs $X_k$ (LHS) and $p(X_k)$ vs $X_k$ (RHS), each with a normal approximation overlaid]

Besides applying the regular normal distribution formula to the probability curve, I also tried a normal approximation of the frequency curve $n(X_k)$ on the LHS of the plot. That is why the red curve is $\max(n(X_k))$ multiplied by the exponential term. It is obviously misplaced, and that is what my questions below are about.
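For reference, here is a minimal sketch of how the two theoretical curves described above could be computed with the Python standard library. The names `counts`, `pmf` and `normal_pdf` are illustrative assumptions, not the code behind the plots. The weighting of each sequence by $p^k (1-p)^{n-k}$ is the standard binomial formula.

```python
import math

n = 50      # number of flips (from the question)
p = 0.75    # probability of heads per flip (from the question)

ks = range(n + 1)

# n(X_k): number of distinct flip sequences with exactly k heads.
# These are the binomial coefficients and do not depend on p.
counts = [math.comb(n, k) for k in ks]

# p(X_k): every sequence with k heads has probability p^k * (1 - p)^(n - k),
# so the probability of k heads is the count times that weight.
pmf = [c * p**k * (1 - p) ** (n - k) for k, c in zip(ks, counts)]


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Normal approximation of the probability curve: mean n*p, variance n*p*(1-p).
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
normal_curve = [normal_pdf(k, mu, sigma) for k in ks]

# Location of the peak of each curve.
peak_counts = max(ks, key=lambda k: counts[k])
peak_pmf = max(ks, key=lambda k: pmf[k])

print("peak of n(X_k):", peak_counts)
print("peak of p(X_k):", peak_pmf)
print("n(X_k) sums to 2^n:", sum(counts) == 2**n)
```

In this sketch the raw counts peak at 25 whatever the value of `p`, while the probability curve peaks near $np = 37.5$, which matches the two positions seen in the plots.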

Statistical outcome:

I also ran a statistical simulation with p(H) = 0.75. Since in practice each run yields only one outcome out of the $2^{50}$ possible sequences, I repeated the experiment 2000 times. I then collected the counts $n(X_k)$ and computed the statistical $p(X_k)$ by dividing each $n(X_k)$ by the total number of outcomes (the sum of all $n(X_k)$), and plotted both. I also tried to approximate them with normal curves. This is what I get.

No of flips: 50
No of experiments : 2000 ( 1 experiment $\rightarrow$ 50 flips $\rightarrow$ 1 output sequence )
p = 0.75

[Figure: simulated $n(X_k)$ vs $X_k$ (LHS) and statistical $p(X_k)$ vs $X_k$ (RHS), each with a normal approximation overlaid]
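The simulation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the fixed seed, the helper `heads_in_one_experiment` and the variable names are hypothetical and not taken from the original experiment.

```python
import random
from collections import Counter

n_flips = 50          # flips per experiment (from the question)
n_experiments = 2000  # number of repeated experiments (from the question)
p = 0.75              # probability of heads per flip (from the question)

rng = random.Random(0)  # fixed seed: an assumption, only for reproducibility


def heads_in_one_experiment():
    """Flip a biased coin n_flips times and return the number of heads."""
    return sum(rng.random() < p for _ in range(n_flips))


# Observed n(X_k): how many of the experiments produced exactly k heads.
observed = Counter(heads_in_one_experiment() for _ in range(n_experiments))
freq = [observed.get(k, 0) for k in range(n_flips + 1)]

# Statistical p(X_k): relative frequency of each head count.
rel_freq = [f / n_experiments for f in freq]

# Sample mean of the number of heads across all experiments.
sample_mean = sum(k * f for k, f in enumerate(freq)) / n_experiments
print("sample mean of X:", sample_mean)
```

Here the observed counts are already weighted by `p`, because sequences with more heads are drawn more often, so both the frequency and the relative-frequency curves are centered near $np$.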

My questions:

  1. I vaguely understand why $n(X_k)$ in the theoretical outcome still has its mean at 25: is it because no probability has been associated with it yet, or because we implicitly assume equal probability for all sequences? In the statistical outcome, $n(X_k)$ has rightly shifted its mean to around 37, and so has the pdf on the RHS. The theoretical RHS pdf also sits on the proper mean of 37, but the LHS looks misplaced.
  2. If my inference is correct, that the possible outcomes $n(X_k)$ are by nature probability-neutral (independent of p), how can I apply a normal approximation to such theoretical outcomes when the probability varies?
  3. If my inference is wrong, what is the missing piece that prevents my binomial outcomes (not their probabilities, which are already centered at 37) from shifting to a mean of 37?
  4. What is the proper terminology for the two types of probability above? The top one comes from the theoretical outcome, while the bottom one is statistical. People use one of these to explain the normal approximation, so which one is correct, or better?

In the big picture, I am trying to understand how the normal distribution emerges from a sample distribution, and the underlying cause for it.