An exponential family of probability distributions has densities that are defined relative to a measure. What is a measure in this context?

I would like to understand the nature of measure, using the exponential family of probability distributions as a context (because I understand the latter well). I understand that we want equation (8.1) to integrate (over $x$) to $1$ - otherwise it would not be a probability distribution. Therefore we set that integral to one, and take the $e^{-A(\eta)}$ term out of the integral (since it is not a function of $x$), and rearrange to get equation (8.2).

My point is, I understand what's going on algebraically, but I have no idea what role the measure is playing conceptually. I have tried to learn measure theory many times (i.e., on my own), but I just don't seem to get what it is or what motivates it. When the word measure is used, I have no idea what it refers to. The Lebesgue measure is supposed to be a generalization of length to sets that are more complicated than intervals. I understand it somehow relates to probability. But what is a measure? In the particular context below, for example, what does it contribute to the description of the exponential family?

[Image: equations (8.1) and (8.2)] Source: https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter8.pdf
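
For reference, here is a sketch of the algebra described in the question. It assumes equations (8.1) and (8.2) take the standard exponential-family form. The symbols $h$ (base function), $T$ (sufficient statistic), and $\nu$ (the measure) are assumptions, since the equations themselves are not reproduced in the text above:

$$
\begin{aligned}
p(x \mid \eta) &= h(x)\,\exp\{\eta^{\top} T(x) - A(\eta)\} && \text{(8.1)} \\
1 &= \int p(x \mid \eta)\,\nu(dx) = e^{-A(\eta)} \int h(x)\,\exp\{\eta^{\top} T(x)\}\,\nu(dx) \\
A(\eta) &= \log \int h(x)\,\exp\{\eta^{\top} T(x)\}\,\nu(dx) && \text{(8.2)}
\end{aligned}
$$

Under this reading, $\nu$ is what the integral is taken against: with Lebesgue measure, $\nu(dx)$ is the ordinary $dx$, and with counting measure on a discrete set, the integral becomes a sum over $x$.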

There is 1 solution below

You can think of a measure, in general, as providing a notion of 'volume' or 'area'. Why is this useful in probability? Because the volume or area of a subset often corresponds to the probability of an event occurring. For example, what is the probability that you land on the red squares of a dart board? In particular, whenever you need to integrate to calculate a probability, you are using some notion of a measure.
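
The dart board idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the board is a hypothetical 4-by-4 checkerboard, the dart lands uniformly at random on it, and the helper names `area` and `probability` are made up for this example. The probability of hitting red is then the measure (area) of the red region divided by the measure of the whole board.

```python
from fractions import Fraction

def area(rect):
    """Area of an axis-aligned rectangle given as ((x0, x1), (y0, y1))."""
    (x0, x1), (y0, y1) = rect
    return Fraction(x1 - x0) * Fraction(y1 - y0)

def probability(region, board):
    """P(dart lands in region) for a uniform throw on the board.

    The region is a list of non-overlapping rectangles, so its measure
    is the sum of the individual areas (additivity of the measure).
    """
    return sum(area(r) for r in region) / area(board)

# Hypothetical board: a 4x4 checkerboard of unit squares.
board = ((0, 4), (0, 4))
red = [((i, i + 1), (j, j + 1))
       for i in range(4) for j in range(4)
       if (i + j) % 2 == 0]

p_red = probability(red, board)
print(p_red)
```

Exact fractions are used instead of floats so that the additivity of area shows up without rounding noise.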

In the early 20th century there was a rebirth of probability, led by Kolmogorov and others, that used Lebesgue's new measure theory to formalise probability. It works as follows: you have a set $\Omega$ of occurrences (outcomes) $\omega$, and a collection of subsets of $\Omega$, each of which corresponds to an event; you can view this literally as the collection of things that could happen in real life. Random variables are functions on $\Omega$ that take real values, so when you talk about a random variable following a distribution, you are talking about the values the random variable takes on certain sets of outcomes, and the measure of those sets. We have a function $\mathbb{P}$ that takes an event and gives the probability of that event happening. We call this a probability measure, and it has to satisfy the rules that Lebesgue laid out for general measures, together with $\mathbb{P}(\Omega) = 1$.
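
To make this setup concrete, here is a minimal sketch on a finite sample space. The example (two fair dice) and the names `omega`, `P`, `X`, and `law_of_X` are assumptions chosen for illustration, not part of the original discussion. Events are subsets of `omega`, `P` assigns each event its probability, and the distribution of the random variable `X` is obtained by measuring the set of outcomes on which `X` takes a given value.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability measure: each outcome carries mass 1/36."""
    return Fraction(len(set(event) & set(omega)), len(omega))

def X(w):
    """Random variable: a function from outcomes to real values (the sum)."""
    return w[0] + w[1]

def law_of_X(k):
    """Distribution of X: the measure of the event {w : X(w) = k}."""
    return P({w for w in omega if X(w) == k})

total = P(omega)        # the whole space has measure 1
p_seven = law_of_X(7)   # six of the 36 outcomes sum to 7

# Additivity on disjoint events: P(A or B) = P(A) + P(B).
a = {w for w in omega if X(w) == 2}
b = {w for w in omega if X(w) == 12}
additive = P(a | b) == P(a) + P(b)
```

On a finite space like this one, every subset can serve as an event. The measure-theoretic machinery matters once $\Omega$ is uncountable, as with the dart board.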