Why is the normal distribution a distribution?


Wikipedia describes the normal distribution as follows:

"A very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known."

But why is it a kind of probability distribution and not a kind of probability density?

The shape of the probability density function's curve shows where the random variable's probability is concentrated. For example, for the normal distribution, the values the variable is most likely to fall near lie under the highest part of the curve.

Therefore, the PDF tells us how the possible values of the random variable are spread out, while the CDF gives the probability that the random variable takes a value less than or equal to a given value $ n $. This makes me wonder:

Why is it a type of distribution and not a type of density? By this reasoning, it should be called the normal density.

EDIT: I think that something called a "distribution" should tell me how the values are distributed. I can see this just by looking at the graph, and it is precisely the information I get from the density function. So, what is my conceptual mistake?

There are 4 best solutions below

5
On

The distribution of a given random variable is an assignment of a probability to every possible event related to that variable.

For random variables that take real numbers as values, an "event" is a statement of the form "The variable is contained in $A$" where $A\subseteq \Bbb R$. One can often find a pdf or cdf which may be used to describe the distribution (through integration for the pdf or through subtraction for the cdf).

Fundamentally, it is a distribution which is attributed to a random variable, not a density function. The pdf and cdf are just handy tools for doing calculations on the distributions that are nice enough to have them.
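As a rough illustration of the two routes mentioned above, here is a minimal Python sketch that computes the same probability $P(a < X \le b)$ for a standard normal variable, once by integrating the pdf and once by subtracting cdf values. The helper names `prob_via_pdf` and `prob_via_cdf` are made up for this example, and the midpoint rule is just one simple choice of numerical integration.

```python
from statistics import NormalDist


def prob_via_pdf(dist, a, b, steps=10_000):
    """Approximate P(a < X <= b) by midpoint-rule integration of the pdf."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width


def prob_via_cdf(dist, a, b):
    """Compute P(a < X <= b) by subtracting two cdf values."""
    return dist.cdf(b) - dist.cdf(a)


# Standard normal random variable; the event is "X lies in (-1, 1]".
X = NormalDist(mu=0.0, sigma=1.0)
p_pdf = prob_via_pdf(X, -1.0, 1.0)
p_cdf = prob_via_cdf(X, -1.0, 1.0)
print(p_pdf, p_cdf)
```

Both numbers describe the same thing: the probability that the distribution assigns to the event. The pdf and cdf are two different tools for reading it off.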

2
On

A distribution function defines a particular probability distribution. Depending on the context, the term might refer to a pmf, a pdf, and/or a cdf. So it is a single concept that is applied in different ways depending on the case at hand (discrete or continuous, for example).

0
On

The difference between a probability distribution and a probability density is that the latter is a special case of the former: a density is one way of describing a distribution, and not every distribution has one. In fact, the reason the normal distribution is so common is that it is the distribution that arises in the central limit theorem. In general, a probability distribution need not have a density (the precise condition is that the distribution be absolutely continuous with respect to Lebesgue measure). It just turns out that the distribution arising from the central limit theorem has this property, and therefore the normal density exists.

There is one caveat: there is such a thing as a normal distribution with variance zero. It describes a deterministic number. This distribution is known as the Dirac delta "function", and it has no true density. If it did have a density, that density would spike to infinity at the deterministic number and be zero everywhere else.
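The central limit theorem remark above can be checked numerically. The sketch below is my own illustration, not part of the original argument: it takes a Binomial($n$, 1/2) variable, which is discrete and so has no density, standardizes it, and measures the largest gap between its CDF and the standard normal CDF. The function name `distance_to_normal` is hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def distance_to_normal(n):
    """Largest gap between the CDF of a standardized Binomial(n, 1/2)
    and the standard normal CDF, checked on both sides of every jump."""
    phi = NormalDist().cdf
    mean, sd = n / 2, sqrt(n) / 2
    total = 2 ** n
    cum_before = 0.0  # P(S < k), the CDF just before the jump at k
    worst = 0.0
    for k in range(n + 1):
        cum_after = cum_before + comb(n, k) / total  # P(S <= k)
        target = phi((k - mean) / sd)
        worst = max(worst, abs(cum_before - target), abs(cum_after - target))
        cum_before = cum_after
    return worst


gaps = {n: distance_to_normal(n) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as $n$ grows, even though none of the binomial distributions has a density. The convergence is a statement about distributions, which is one reason the limiting object is named as a distribution.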

3
On

The term "probability distribution" is often used loosely or inconsistently, but I think this is the most standard definition: If $X$ is a real-valued random variable, then the distribution of $X$ is the function $\mu$ which takes a set $A \subset \mathbb R$ as input and returns the number $$ \mu(A) = P(X \in A) $$ as output. (Technically, I should assume that $A$ is measurable.)
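To connect this with the question, here is how the definition looks for a normal random variable with mean $m$ and variance $\sigma^2 > 0$ (the symbols $m$ and $\sigma$ are introduced here only for illustration, with $m$ chosen to avoid clashing with the measure $\mu$):

$$ \mu(A) = P(X \in A) = \int_A \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-m)^2}{2\sigma^2}\right)\,dx $$

Under this reading, the integrand is the normal density, while the set function $\mu$ is the normal distribution.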


Comments:

Note that $\mu$ is a probability measure on $\mathbb R$. Conversely, it can be shown that any probability measure on $\mathbb R$ is the distribution of some random variable $X$.

The above definition is used in Folland, for example, so I think it's quite standard. The definition can be generalized to the case where $X$ takes values in a measurable space other than $\mathbb R$. (As a result, it can be shown that any probability measure whatsoever is the distribution of some random variable.)

Some authors (notably, Sheldon Ross) use the term "distribution" to mean "cumulative distribution function" (CDF), which is a different (but related) mathematical object.

Some people use the term "probability distribution" to mean "either a PMF or a PDF", but my impression is that probabilists would object to this use of the term, or say that it's not strictly correct. One might ask, what about random variables that are neither discrete nor continuous? (Here PMF stands for "probability mass function" and PDF stands for "probability density function".)
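One concrete example of a random variable that is neither discrete nor continuous, offered here as an assumption-laden sketch rather than anything from the cited texts, is $X = \max(Z, 0)$ with $Z$ standard normal. It puts probability 1/2 on the single point 0 and spreads the rest continuously over the positive reals, so neither a PMF nor a PDF alone describes it. Its distribution is still perfectly well defined. The names `cdf_x` and `mu_interval` below are hypothetical helpers.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def cdf_x(x):
    """CDF of X = max(Z, 0), where Z is standard normal."""
    return 0.0 if x < 0 else PHI(x)


def mu_interval(a, b):
    """mu((a, b]) = P(a < X <= b), read off from the CDF."""
    return cdf_x(b) - cdf_x(a)


# The jump of the CDF at 0 is the probability of the single point {0}.
atom_at_zero = cdf_x(0.0) - cdf_x(-1e-12)

# Probability spread continuously over (0, 10]; the tail beyond 10 is negligible.
continuous_part = mu_interval(0.0, 10.0)

print(atom_at_zero, continuous_part)
```

Here the measure $\mu$ and the CDF both exist, which is consistent with treating the distribution, rather than a PMF or PDF, as the fundamental object.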

I think that there has been some genuine confusion caused by inconsistent use of the term "probability distribution", and I'd be happy if anyone who's an expert on probability would let me know if they disagree with my comments here.