Simple Probability Formula and Convergence


I have the probability of an event A defined as $P(A) = \frac{N_{a}}{N}$. To understand this concept precisely, I hypothesized that the formula means the following:

As I repeat the observation of the event A under practically the same conditions, I find that, as $i$ grows, $\frac{\sum N_{a_{i}}}{\sum N_{i}}\Rightarrow \frac{N_{a}}{N}$, where $N_{a_{i}}$ is a specific occurrence of event A.

Is this correct?




You mention a limiting process. My guess is you are looking at the so-called 'frequentist' formulation of probability.

This formulation of probability requires a repeatable experiment. Suppose you are tossing a fair coin and are interested in the probability of getting a Head. For simplicity we denote Heads as $1$'s and Tails as $0$'s. At the $n$th toss of the coin we get a number $T_n$ which is $0$ or $1.$

After each toss, we look at the number $A_n = T_1 + \cdots + T_n$ of $1$'s up through the $n$th toss and the ratio $R_n = A_n/n,$ which is the proportion of $1$'s up through the $n$th toss.

If I simulate 4000 tosses of such a coin using a computer, then I might get a data table with the first 10 rows looking like this (the columns n, t, a, and r hold $n,$ $T_n,$ $A_n,$ and $R_n$):

   n t a         r
   1 1 1 1.0000000
   2 1 2 1.0000000
   3 0 2 0.6666667
   4 0 2 0.5000000
   5 0 2 0.4000000
   6 1 3 0.5000000
   7 0 3 0.4285714
   8 0 3 0.3750000
   9 1 4 0.4444444
  10 1 5 0.5000000

I happened to get Heads ($1$'s) on tosses 1, 2, 6, 9, and 10. Up to the 5th toss I had two $1$'s for a ratio $R_5 = 0.4$ and up to the 10th toss I had five $1$'s for a ratio $R_{10} = 0.5.$ At the beginning, the ratios $R_n$ fluctuate greatly.
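A table like the one above can be reproduced with a few lines of code. The following is only a minimal sketch in Python, not the program used for this answer: the function name `running_proportions`, its parameters, and the seed are assumptions made for illustration, and a different random number generator will give different tosses than the ones shown.

```python
import random


def running_proportions(n_tosses, p_heads=0.5, seed=None):
    """Simulate coin tosses and return one row (n, t, a, r) per toss.

    n: toss number, t: outcome (1 = Heads, 0 = Tails),
    a: number of Heads up through toss n, r: the ratio a / n.
    """
    rng = random.Random(seed)  # private generator, so a seed makes runs repeatable
    rows = []
    heads_so_far = 0
    for n in range(1, n_tosses + 1):
        t = 1 if rng.random() < p_heads else 0
        heads_so_far += t
        rows.append((n, t, heads_so_far, heads_so_far / n))
    return rows


if __name__ == "__main__":
    rows = running_proportions(4000, seed=2024)  # the seed is an arbitrary choice
    print(f"{'n':>5} {'t':>1} {'a':>4} {'r':>9}")
    # Show the first ten and the last ten rows, as in the tables in the text.
    for n, t, a, r in rows[:10] + rows[-10:]:
        print(f"{n:>5} {t:>1} {a:>4} {r:>9.7f}")
```

Running it with several different seeds is a quick way to see that the early ratios jump around while the late ones move very little.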

However, if we look at the last ten entries in the data table, here is what we see:

     n t   a         r
  3991 0 2019 0.5058882
  3992 0 2019 0.5057615
  3993 0 2019 0.5056349
  3994 1 2020 0.5057586
  3995 1 2021 0.5058824
  3996 1 2022 0.5060060
  3997 1 2023 0.5061296
  3998 0 2023 0.5060030
  3999 1 2024 0.5061265
  4000 1 2025 0.5062500

The values of $R_n$ have stabilized very near to $1/2$, the probability of getting Heads. A plot of the ratio $R_n$ against $n$ is shown below.

[Figure: plot of the ratio $R_n$ against $n$ for the 4000 simulated tosses, drawn as a blue line.]

Because coin tossing is a random process, there is no guarantee what value $R_{4000}$ will take. A theorem you will likely see a little later in your course, called the Law of Large Numbers, says $R_n$ tends to get ever closer to $1/2$ as $n$ increases. For a fair coin, by about $n = 4000$ we usually (in roughly 95% of runs) get $R_n$ in the range $0.5 \pm 0.016.$
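One plausible reading of the $\pm 0.016$ figure, and this is an assumption rather than something stated above, is that it is about two standard deviations of $R_n$. For independent tosses with Heads probability $p,$ the standard deviation of $R_n$ is $\sqrt{p(1-p)/n}.$ The sketch below computes that half-width and estimates by simulation how often $R_{4000}$ lands inside it. The function names and the number of repeated runs are hypothetical choices for illustration.

```python
import math
import random


def two_sd_half_width(n, p=0.5):
    """Two standard deviations of the proportion of Heads in n independent tosses."""
    return 2 * math.sqrt(p * (1 - p) / n)


def fraction_within(n, half_width, p=0.5, runs=500, seed=None):
    """Estimate how often the final ratio R_n falls within p +/- half_width.

    Each run tosses the coin n times; `runs` is an arbitrary illustrative choice.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        heads = sum(1 for _ in range(n) if rng.random() < p)
        if abs(heads / n - p) <= half_width:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    w = two_sd_half_width(4000)
    print(f"two-SD half-width for n = 4000: {w:.4f}")
    print(f"estimated fraction of runs inside: {fraction_within(4000, w, seed=1):.3f}")
```

The estimated fraction varies from seed to seed, which is itself a small demonstration that statements about $R_n$ are statements about probabilities and not guarantees.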

(Somewhere around $n = 300$ we happened to get a lot of Tails, but eventually those 'extra' Tails were balanced by the large patches of results where Heads and Tails happened to be nearly equal. The blue line will often behave 'badly', with a lot of variability, near the left of the graph where $n$ is small. Certainly, the coin is not consciously keeping track to compensate for patches of strange behavior early on. It's called the law of large numbers because random anomalies early on get 'swamped out' by huge numbers of tosses later on.)

However, this is not a deterministic limiting process. For the sequence $D_n = \frac{n+5}{2n-7}$ we know exactly what $D_n$ will be for each value of $n$ and we can write $\lim_{n \rightarrow \infty} D_n = 1/2,$ without mentioning anything about probability.
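To make the contrast concrete, the deterministic limit can be verified by dividing numerator and denominator by $n.$ This is a routine worked step added for illustration:

$$D_n = \frac{n+5}{2n-7} = \frac{1 + 5/n}{2 - 7/n} \longrightarrow \frac{1 + 0}{2 - 0} = \frac{1}{2} \quad \text{as } n \rightarrow \infty.$$

Every term here is a fixed number, so for any tolerance we can say exactly from which $n$ onward $D_n$ stays within it. No such guarantee is available for a particular run of coin tosses.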

For the coin-tossing limit, we sometimes write $\text{plim}_{n \rightarrow \infty} R_n = 1/2$ and say "$R_n$ converges in probability to $1/2.$" But that expression needs a careful definition before we can prove theorems about it.
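For reference, the usual textbook definition of that expression is sketched here, together with one standard bound (Chebyshev's inequality, which is not invoked in the answer above) that makes the claim plausible. The bound assumes independent tosses of a fair coin, so that the variance of $R_n$ is $1/(4n)$:

$$\text{plim}_{n \rightarrow \infty} R_n = \frac{1}{2} \quad \text{means} \quad \lim_{n \rightarrow \infty} P\left(\left|R_n - \tfrac{1}{2}\right| > \varepsilon\right) = 0 \ \text{ for every } \varepsilon > 0,$$

$$P\left(\left|R_n - \tfrac{1}{2}\right| \ge \varepsilon\right) \le \frac{\mathrm{Var}(R_n)}{\varepsilon^2} = \frac{1}{4n\varepsilon^2}.$$

The right-hand side tends to $0$ as $n$ grows for any fixed $\varepsilon > 0.$ Note that the limit is taken of a sequence of probabilities, which are ordinary numbers, and not of the random ratios $R_n$ themselves.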