Here is what I have for it:
Laplace's theorem: Let $S_n$ denote the number of "successes" in $n$ Bernoulli trials, each with probability of success $p\in(0,1)$. Then
for all $a, b \in \mathbb{R}$ with $a < b$,
$$ \lim_{n\to\infty} P\left( a < \frac{S_n - np}{\sqrt{np(1-p)}} < b \right) = \frac{1}{\sqrt{2\pi}}\int_{a}^{b}e^{-\frac{t^2}{2}}\,dt $$
I have difficulty understanding what the condition $a < \frac{ S_n - np }{\sqrt{np(1-p)} } < b$ means. Basically I'm looking for a way to state this theorem informally (in words). Please keep in mind that in my book this is introduced well before random variables, variance, expected value and the normal distribution, so those should not be used.
Thanks for your help
Google Books identifies your textbook as "Note di Calcolo della Probabilità" by Giuseppe Modica and Laura Poggiolini. No preview is available there, but we may guess that the proposition asked about appears in the book's discussion of the Bernoulli process. Although attributed here to Laplace, the idea was proposed by Abraham de Moivre as early as 1733, to approximate a sum of consecutive terms of a binomial expansion by an integral. This is quite early in the development of calculus; de Moivre was a younger contemporary and friend of Isaac Newton himself.
The setup here is that of $n$ Bernoulli trials, each with an independent chance of "success" $p\in (0,1)$ or of "failure" $1-p$. The random variable $S_n$ counts the total number of successes, so that $0 \le S_n \le n$ is certain.
Intuitively the average number of successes will be $np$, since each trial has chance $p$ of succeeding independently of the others. You mention not yet having been introduced to the notion of expected value, but in this context it is the same as the average value of $S_n$ if we were to repeat those $n$ Bernoulli trials over and over again. So the outcome described in your Question:
$$ a \lt \frac{S_n - np}{\sqrt{np(1-p)}} \lt b $$
says that in a particular run, $S_n$ differed from its average value $np$ in such a way that the ratio shown falls into the interval $(a,b)$. It may be easier to interpret if we clear the denominator, obtaining this equivalent outcome:
$$ a\sqrt{np(1-p)} \lt S_n - np \lt b\sqrt{np(1-p)} $$
Asking about $P\left(a \lt \frac{S_n - np}{\sqrt{np(1-p)}} \lt b \right)$ means asking for the probability that the ratio falls into the range $(a,b)$. Intuitively, the shorter the interval $(a,b)$, the smaller this probability should be.
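If it helps to see this event concretely, here is a small Python sketch (my own illustration, not from the book) that computes the probability exactly for finite $n$ by summing the binomial terms $\binom{n}{k}p^k(1-p)^{n-k}$ over the integers $k$ that land inside the window:

```python
from math import comb, sqrt

def prob_in_window(n, p, a, b):
    """Exact P(a < (S_n - n*p)/sqrt(n*p*(1-p)) < b): sum the binomial
    probabilities P(S_n = k) over those integers k for which the
    standardized value (k - n*p)/sqrt(n*p*(1-p)) lies in (a, b)."""
    s = sqrt(n * p * (1 - p))
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1)
               if a < (k - n * p) / s < b)

# 100 tosses of a fair coin, window (-1, 1): this counts the runs with
# 46 to 54 heads, i.e. strictly between 50 - 5 and 50 + 5.
print(prob_in_window(100, 0.5, -1.0, 1.0))
```

Note how shrinking the window, say to $(-0.5, 0.5)$, strictly shrinks this sum, matching the intuition above.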
We find by experience that $S_n - np$ is more likely to be close to zero than far from it. That is, among intervals $(a,b)$ of any specified length $b-a$, the probability is highest when the interval is centered around the origin $0$. A picture contributed to Wikimedia by user Cflm001 for $n=6$ and $p=0.5$, i.e. flipping a fair coin six times, illustrates the strong peaking that occurs around the middle (average) of the outcomes.
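For readers who cannot see that picture, a few lines of Python (again my own illustration) reproduce its bar heights, the binomial probabilities for six fair-coin flips; the mass peaks at the average $np = 3$ and falls off symmetrically:

```python
from math import comb

# Binomial probabilities P(S_6 = k) for n = 6 fair-coin flips (p = 0.5),
# printed as a crude text bar chart.
n, p = 6, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
for k, q in enumerate(pmf):
    print(f"k = {k}: {q:.4f}  {'#' * round(40 * q)}")
```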
The de Moivre-Laplace theorem says that as $n$ increases to infinity, the chunky bar graph of the binomial probabilities comes to resemble a smooth function familiar to many as "the bell curve", but called by mathematicians a Gaussian curve or the normal distribution.
For an introductory study of mathematical probability, especially one aimed not at "theorem and proof" understanding but at applications useful to engineering undergraduates, any attempt to say exactly how the chunky (discrete) graphs resemble (converge to) a smooth (continuous) graph will be more than a little awkward. The version you ask about:
$$ \lim_{n\to \infty} P\left(a \lt \frac{S_n - np}{\sqrt{np(1-p)}} \lt b \right) = \frac{1}{\sqrt{2\pi}}\int_{a}^{b}e^{-\frac{t^2}{2}}dt $$
seems to me to strike an admirable balance between precision and simplicity, but at the expense of confronting the student with an expression that is difficult to parse because of its unfamiliarity.
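To make the limit statement less abstract, one can check it numerically. The sketch below (mine, not the book's) compares the exact binomial probability of the event with the Gaussian integral on the right-hand side, which for these limits equals $\tfrac12\left(\operatorname{erf}(b/\sqrt{2}) - \operatorname{erf}(a/\sqrt{2})\right)$; the binomial terms are evaluated in log space via `lgamma` so that large $n$ does not overflow:

```python
from math import lgamma, log, exp, sqrt, erf

def binom_window(n, p, a, b):
    # Exact P(a < (S_n - n*p)/sqrt(n*p*(1-p)) < b), with each term
    # C(n, k) p^k (1-p)^(n-k) computed in log space for stability.
    s = sqrt(n * p * (1 - p))
    def log_pmf(k):
        return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(p) + (n - k) * log(1 - p))
    return sum(exp(log_pmf(k)) for k in range(n + 1)
               if a < (k - n * p) / s < b)

def gauss_integral(a, b):
    # (1/sqrt(2*pi)) * integral from a to b of exp(-t^2/2) dt
    return 0.5 * (erf(b / sqrt(2)) - erf(a / sqrt(2)))

a, b, p = -1.5, 1.5, 0.3
for n in (10, 100, 1000, 10000):
    print(n, round(binom_window(n, p, a, b), 4))
print("limit:", round(gauss_integral(a, b), 4))
```

The printed probabilities drift toward the fixed value of the integral as $n$ grows, which is exactly what the limit asserts.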
To finish, let's note that the characteristics discussed above for the Bernoulli outcomes are shared by the limiting integral on the right-hand side. First, the closer $a$ and $b$ are, the shorter the interval of integration, and thus the smaller the value of "probability" it gives us. Second, for any fixed length of that interval of integration, the largest value is obtained by centering the interval $(a,b)$ around the largest values of the integrand:
$$ e^{-\frac{t^2}{2}} $$
which is an even function of $t$ that "peaks" (reaches its maximum) at the origin $t=0$.
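Both observations can be checked with a few more lines of Python (again my own illustration): sliding an interval of fixed length $2$ along the axis shows that the centered position $(-1,1)$ captures the most area under the curve.

```python
from math import exp, sqrt, erf

def gauss_prob(a, b):
    # (1/sqrt(2*pi)) * integral from a to b of exp(-t^2/2) dt
    return 0.5 * (erf(b / sqrt(2)) - erf(a / sqrt(2)))

# Intervals of the same length b - a = 2, slid along the axis:
for a in (-3.0, -2.0, -1.0, 0.0, 1.0):
    print(f"({a:+.0f}, {a + 2:+.0f}): {gauss_prob(a, a + 2):.4f}")

# The integrand exp(-t^2/2) is even and maximal at t = 0:
f = lambda t: exp(-t * t / 2)
print(f(-1.0), f(0.0), f(1.0))
```

The symmetric pairs of intervals, such as $(-2,0)$ and $(0,2)$, give equal values because the integrand is even, and the centered interval $(-1,1)$ beats them all.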
Leave a Comment if what I've said needs further clarification.