Question on understanding Dirichlet process

I have questions about this article on the Dirichlet process. The beginning of section 2.1 shows three equations: 2.1, 2.2, and 2.3. I don't understand what exactly those probabilities represent or why we need them. One thing in the article that confuses me is that the subscripts are dropped in equations 2.1 and 2.2. The article says that $L_i=k$ implies that $X_i$ belongs to cluster $k$, so if $L_1 = 1$, then $X_1$ belongs to cluster $1$. How should I interpret these two equations? Also, can anyone suggest articles about Bayesian nonparametrics, the Dirichlet process, and the Indian buffet process? I am trying to understand unsupervised clustering methods that use these processes. Thank you!

There are 1 best solutions below

On BEST ANSWER

Under this model, the data are assumed to be generated in the following way.

  • First, pick a cluster according to the distribution $c_k:=\mathbb{P}\{L=k\}$ (for example, maybe there are $6$ clusters, and you roll a biased die to choose the cluster).
  • Having chosen the cluster $L=k$, pick a point $X$ by drawing from the distribution $P_k(\cdot):= \mathbb{P}[X \in \cdot \mid L=k]$ (a classic choice is a multivariate Gaussian, so that most points fall near the mean and form a "cluster"). Each cluster has its own distribution (so maybe one cluster is a Gaussian centered over here, another is centered over there, and so on).
  • Plot this point. Repeat these steps over and over to generate a dataset. You can replace the $L$ and $X$ with $L_1$ and $X_1$, then $L_2$ and $X_2$ for the next point, and so on.
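The steps above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the article: three clusters, one-dimensional Gaussians for the $P_k$, and made-up values for the weights, means, and standard deviations. The function name `sample_dataset` is hypothetical, and the clusters are numbered from 0 rather than 1.

```python
import random

def sample_dataset(n, weights, means, stds, seed=0):
    """Draw n (label, point) pairs from a finite mixture of 1-D Gaussians.

    weights[k] plays the role of c_k = P{L = k}; (means[k], stds[k])
    describe the assumed Gaussian P_k for cluster k.
    """
    rng = random.Random(seed)
    clusters = list(range(len(weights)))
    data = []
    for _ in range(n):
        # Step 1: roll the "biased die" to pick the cluster label L.
        label = rng.choices(clusters, weights=weights, k=1)[0]
        # Step 2: draw the point X from the chosen cluster's distribution.
        x = rng.gauss(means[label], stds[label])
        # Step 3: record the pair (L_i, X_i) and repeat.
        data.append((label, x))
    return data

# Illustrative (made-up) parameters: three well-separated clusters.
weights = [0.5, 0.3, 0.2]
means = [-5.0, 0.0, 5.0]
stds = [1.0, 1.0, 1.0]

dataset = sample_dataset(1000, weights, means, stds)
```

In a real clustering problem only the points would be observed; the labels are kept here just to make the two-step procedure visible.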

These are our assumptions, that is, somebody had distributions $\mathbb{P}\{L=k\}$ and $\mathbb{P}[X \in \cdot \mid L=k]$, and followed this procedure to create a dataset.

The usual task is to go backwards: take a given dataset and, assuming it was created in this manner, figure out what these two distributions are.
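To make "going backwards" a little more concrete, here is a short worked formula. It assumes, which the model above does not require, that each $P_k$ has a density $p_k$. Bayes' rule then gives the probability that an observed point $x$ came from cluster $k$:

$$\mathbb{P}\{L=k \mid X=x\} = \frac{c_k\, p_k(x)}{\sum_j c_j\, p_j(x)}.$$

The denominator is the density of $X$ with the label averaged out, that is, a mixture of the cluster densities weighted by the $c_j$. Since only the points are observed, neither the $c_k$ nor the $p_k$ are known in advance, and both have to be inferred from the data.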