Dirac delta in Dirichlet Distribution explanation


I am familiar with the following definition of the Dirac delta function: $$ \delta(x) = \begin{cases} \infty, & \text{if } x=0 \\ 0, & \text{if } x \neq 0 \end{cases} $$

Now, I am reading this paper, and Appendix A defines Dirichlet probability density function as:

$$\text{Dir}(\{p_i\}) = \frac{\Gamma(\alpha)}{\prod_{i=0}^{q-1} \Gamma(\alpha_i)} \delta\left( 1-\sum_{i=0}^{q-1} p_i \right) \prod_{i=0}^{q-1}p_i^{\alpha_i - 1} \tag{1}$$

such that $p_i \in [0,1]$ and $\sum_{i=0}^{q-1} p_i = 1$. The hyperparameters $\{ \alpha_i\}$ are real and positive, and $\alpha = \sum_{i=0}^{q-1} \alpha_i$.

I am fairly certain I am misunderstanding the meaning of the Dirac delta here. I understand that it is used to normalize the pdf. But by the definition of the Dirac delta I am familiar with, the middle term of equation (1) should return infinity whenever $1-\sum_{i=0}^{q-1}p_i =0$, and we are told that this always holds. So would we get infinity for every collection $\{ p_i \}$ that sums to $1$? (Clearly not.)

Another conundrum for me is that you supply a collection of probabilities to the Dirichlet pdf, instead of some $x$ for which you get a probability. I am a bit confused by this equation.


There is 1 best solution below


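The set $\{(p_0,\ldots,p_{q-1}):\sum p_i=1\}$ is not intended to be the domain on which the Dirichlet probability density is defined. Instead, the density is defined on the whole cube $[0,1]^q$. The Dirac delta is not an ordinary function to be evaluated pointwise (so "infinity" is not a value the density takes); it only has meaning under an integral. Since the Dirac delta in the expression vanishes on $\{(p_0,\ldots,p_{q-1}):\sum p_i\ne1\}$, all of the probability mass sits where the constraint holds, that is, $$ \mathsf P\left(\sum_{i=0}^{q-1} p_i=1\right)=1, $$ which is the exact meaning of "the variates must satisfy... $\sum p_i=1$" in the paper.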
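
As a small worked case (my own illustration, not taken from the paper), take $q=2$. The density reads $$\text{Dir}(p_0,p_1)=\frac{\Gamma(\alpha_0+\alpha_1)}{\Gamma(\alpha_0)\,\Gamma(\alpha_1)}\,\delta(1-p_0-p_1)\,p_0^{\alpha_0-1}\,p_1^{\alpha_1-1}.$$ Integrating out $p_1$ over $[0,1]$, the delta simply substitutes $p_1=1-p_0$: $$\int_0^1 \text{Dir}(p_0,p_1)\,dp_1=\frac{\Gamma(\alpha_0+\alpha_1)}{\Gamma(\alpha_0)\,\Gamma(\alpha_1)}\,p_0^{\alpha_0-1}\,(1-p_0)^{\alpha_1-1}.$$ This is the ordinary $\text{Beta}(\alpha_0,\alpha_1)$ density in $p_0$, which is finite on $(0,1)$ and integrates to $1$. The same substitution for general $q$ removes one coordinate and leaves a regular density in the remaining $q-1$ coordinates.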

For the second question, $\{p_i\}$ is not a "collection of probabilities" that you supply in place of an $x$. It stands for a value of the random variables whose joint distribution is being described, so it plays exactly the role of $x$. This is indicated in the paper in the sentence before (A1): "the probability density function for $q$ elements". If you're uncomfortable with this notation, you may just change $\{p_i\}$ to $\{x_i\}$.
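
If a numerical check helps, here is a minimal sketch in plain Python (my own, not from the paper) that treats $\{x_i\}$ as a random vector. It relies on a standard construction that the paper does not state: draw independent $\text{Gamma}(\alpha_i, 1)$ variates and divide by their sum. The function name `sample_dirichlet`, the hyperparameters, and the seed are all illustrative assumptions.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one point (x_0, ..., x_{q-1}) from Dir(alphas).

    Assumed construction: independent Gamma(alpha_i, 1) draws,
    normalized by their sum.
    """
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(0)        # fixed seed, chosen arbitrarily
alphas = [2.0, 3.0, 5.0]      # illustrative hyperparameters
samples = [sample_dirichlet(alphas, rng) for _ in range(20000)]

# Every draw lies on the simplex: its coordinates sum to 1 up to rounding.
max_deviation = max(abs(sum(x) - 1.0) for x in samples)

# The empirical mean of coordinate i should be close to alpha_i / alpha.
alpha_total = sum(alphas)
means = [sum(x[i] for x in samples) / len(samples) for i in range(len(alphas))]
expected = [a / alpha_total for a in alphas]

print("largest deviation of a row sum from 1:", max_deviation)
print("empirical means:", means)
print("alpha_i / alpha:", expected)
```

Each sample is one realization of $(x_0, x_1, x_2)$. The constraint $\sum x_i = 1$ holds for every draw, which is what the delta encodes, while the individual coordinates vary from draw to draw like any other random variable.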