I learned in class that a distribution $f$ is a continuous linear mapping from the space of smooth test functions $\phi$ to the real numbers.
Why is this mapping called a "distribution"? Can you explain to me the intuition/motivation for the name?
What is $f$ a distribution of?
My personal guess is that distributions were introduced historically around the same time that probability was extended into analysis territory. (I think this was as late as the early 1900s; please correct me if I am wrong.)
You may be aware of the probability density function (PDF) for continuous variables:
$$pdf(t) \text{ such that } \int_{-\infty}^{\infty}pdf(\tau)d\tau = 1$$
and of the cumulative distribution function (CDF):
$$cdf(t) = \int_{-\infty}^{t}pdf(\tau)d\tau$$
To describe discrete distributions with integrals within this same framework, Riemann integrable functions are not enough.
For a pdf to describe the behaviour of a fair six-sided die, we need "infinitely thin" slices of probability density, each carrying probability 1/6, concentrated at the integers $\{1,2,3,4,5,6\}$. This is not possible with an ordinary Riemann integrable function $f$, because no such function can have the property $$\lim_{\epsilon \to 0}\int_{t-\epsilon}^{t+\epsilon}f(\tau)d\tau \neq 0$$
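A short sketch of why that limit must vanish: a Riemann integrable function is bounded on any bounded interval, so assume $|f(\tau)| \le M$ near $t$ (the constant $M$ is introduced here only for illustration). Then
$$\left|\int_{t-\epsilon}^{t+\epsilon}f(\tau)d\tau\right| \le \int_{t-\epsilon}^{t+\epsilon}|f(\tau)|d\tau \le 2\epsilon M \to 0 \quad \text{as } \epsilon \to 0$$
so an ordinary function cannot place a positive amount of probability on a single point.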
This is a good reason to introduce "something" $\delta(t)$ with the property
$$\int_{-\infty}^{\infty}\delta(t)\phi(t) dt = \phi(0)$$
for every test function $\phi$, where $\delta$ is the Dirac delta distribution. We need it in order to treat discrete probability distributions within the same framework as continuous ones.
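To make this concrete, here is a minimal numerical sketch, not a definition. It assumes we stand in for $\delta$ with a narrow Gaussian bump of width `eps` and use a simple midpoint rule; the helper names (`gaussian`, `integrate`, `pair_with_delta`, `die_pdf`) are made up for this illustration. As `eps` shrinks, the pairing with a test function should approach $\phi(0)$, and the smoothed die "pdf" should behave like six point masses of 1/6 each.

```python
import math

def gaussian(t, eps):
    """Narrow bump of width eps with total integral 1 (a stand-in for delta)."""
    return math.exp(-0.5 * (t / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def pair_with_delta(phi, eps):
    """Approximate the pairing of delta with phi by integrating bump * phi."""
    return integrate(lambda t: gaussian(t, eps) * phi(t), -1.0, 1.0)

def die_pdf(t, eps):
    """Smoothed pdf of a fair die: six bumps of mass 1/6 centred at 1..6."""
    return sum(gaussian(t - k, eps) for k in range(1, 7)) / 6.0

phi = math.cos  # an arbitrary smooth function with phi(0) = 1
for eps in (0.1, 0.01):
    # The printed value should move towards phi(0) as eps shrinks.
    print(eps, pair_with_delta(phi, eps))

eps = 0.01
total = integrate(lambda t: die_pdf(t, eps), 0.0, 7.0)      # total probability
mean = integrate(lambda t: t * die_pdf(t, eps), 0.0, 7.0)   # expected roll
cdf_at_3_5 = integrate(lambda t: die_pdf(t, eps), 0.0, 3.5) # P(roll <= 3.5)
print(total, mean, cdf_at_3_5)
```

With a small `eps`, the last three numbers should come out close to 1, 3.5 and 0.5, which is what the discrete die gives; in the limit the bumps become the deltas and the cdf becomes a staircase.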