Understanding when probability distributions are in the exponential family.


I'm starting to study Generalized Linear Models and I need help understanding how to show that a distribution is part of the exponential family. I know that in general, a distribution is a member of the exponential family if it can take on the following form.

$$p(x | \eta) = h(x) \exp(\eta T(x) - A(\eta))$$

I get that the basic idea is to take the exponential of the logarithm of the distribution then try to get things to match up.

When I look at some examples like Poisson distribution, I kind of get it, but I'm still left with a lot of questions.

Given $$p(x | \lambda) = (\lambda^x e^{-\lambda}) / x!,$$

it can be rewritten as follows

$$p(x | \lambda) = 1/x! \exp(x \log(\lambda) - \lambda)$$

I understand most of this.

$\log(\lambda^x) = x \log(\lambda)$ and $\log(e^{-\lambda})$ simplifies to $-\lambda$. But I don't understand why the $1/x!$ doesn't become $-\log(1/x!)$.

I also don't understand how to assign the different values to their respective parts

  • $\eta = \log(\lambda)$

  • $T(x) = x$

  • $A(\eta) = \lambda = e^{\eta}$

  • $h(x) = 1/x!$

I'm familiar with the rules for dealing with logarithms

  • log(mn) = log(m) + log(n)
  • log(m/n) = log(m) - log(n)
  • log(m^n) = nlog(m)

But it seems like I'm still missing something.

Best answer

Suppose $X$ is a member of an exponential family. A standard form for the pdf/pmf of a member of an exponential family is $$f_X(x|\eta) = h(x)\exp\bigl(T(x)\eta-A(\eta)\bigr).$$ We can rewrite this as $$h(x)\cdot e^{T(x)\eta}\cdot e^{-A(\eta)}.$$ This is how we'll assign the different parts: by writing the pdf/pmf as a product of one factor that depends only on $x$, one that depends only on $\eta$, and one that depends on both (in a very specific way). Note that $h(x)$ already sits outside the exponential, just as the $1/x!$ does in the Poisson pmf, so we don't need to do anything with that factor. However, we need $\exp(T(x)\eta)=\lambda^x$, so we have to rewrite $\lambda^x$ as $$\lambda^x = \exp(\log(\lambda^x)).$$ This is why we have to do something to the $\lambda^x$ term that we don't have to do to $h(x)$. Similarly, we have to match the remaining factor to $e^{-A(\eta)}$. In this case, it already appears as $e^{-A(\eta)}=e^{-\color{red}{\text{some junk}}},$ so we won't have to do much to find $A(\eta)$.

Generally, if you actually have a member of the exponential family, you have to play around with it until you see how to separate it into the three factors above. The factor involving only $x$ tells you what $h(x)$ is. The factor involving both $x$ and the parameter tells you the relationship between $\lambda$ and $\eta$. The last factor, combined with the relationship between $\lambda$ and $\eta$, gives you $e^{-A(\eta)}$.

The $h$ part is essentially a change of measure/density, the $A(\eta)$ is a normalizing factor, and the "interesting" part of the distribution comes from the term $T(x)\eta$.
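To make the "normalizing factor" remark concrete, here is a short worked derivation for the discrete case. It assumes the pmf sums to one over the support and uses the Poisson pieces from the question, $h(x)=1/x!$ and $T(x)=x$, purely as an illustration.

$$1=\sum_{x} h(x)\exp\bigl(T(x)\eta-A(\eta)\bigr) \quad\Longrightarrow\quad A(\eta)=\log\sum_{x} h(x)\,e^{T(x)\eta}.$$

$$\text{Poisson: }\quad \sum_{x=0}^{\infty}\frac{e^{\eta x}}{x!}=\sum_{x=0}^{\infty}\frac{(e^{\eta})^{x}}{x!}=\exp(e^{\eta}) \quad\Longrightarrow\quad A(\eta)=e^{\eta}.$$

So once $h$ and $T$ are fixed, $A(\eta)$ is not a free choice: it is whatever makes the total probability equal to one.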

The parameter $\eta$ is known as the canonical parameter. We can identify it by noting that inside the exponential, there will be a product of a function of $x$ (which is our $T(x)$), and a function of the parameter $\lambda$ (which is our $\eta=\log(\lambda)$). The reason this parameter $\eta$ is canonical is that it (as opposed to merely a function of it) is multiplied by $T(x)$.

The remaining parts will depend on $x$ only or $\lambda$ only. The part that depends on $x$ only will be collected to become $h(x)$ outside the exponential. The part that depends on $\lambda$ only will first be rewritten to depend on $\eta$ only, and then it will become $A(\eta)$ inside the exponential. The fact that $h(x)$ is outside the exponential and $A(\eta)$ is inside is basically the answer to one of your points of confusion (about why we didn't take the logarithm of $1/x!$).
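As a side note on the $1/x!$ question, the following identity may help. It shows that one could move $h(x)$ inside the exponential by taking its logarithm; the standard form simply chooses not to. This is only a restatement of the Poisson pmf from the question, not a different model.

$$\frac{1}{x!}\exp\bigl(x\log\lambda-\lambda\bigr)=\exp\bigl(x\log\lambda-\lambda+\log(1/x!)\bigr)=\exp\bigl(x\log\lambda-\lambda-\log(x!)\bigr).$$

Note the sign: inside the exponential the factor appears as $+\log(1/x!)=-\log(x!)$, not as $-\log(1/x!)$. Both sides are the same function; keeping $h(x)$ outside just keeps the term that does not involve the parameter visibly separate.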

In your example, we have $$p(x|\lambda)=\frac{1}{x!}\cdot \lambda^x \cdot e^{-\lambda}.$$ We can already spot these as multiplicative parts. If this is actually a member of an exponential family (which it is), we will have $$h(x)=\frac{1}{x!},$$ $$\exp(T(x)\eta)=\lambda^x,$$ and $$\exp(-A(\eta))=e^{-\lambda}.$$

The expression for $h(x)$ is already in a good final form, so we don't do anything to it.

We can take logarithms to get $$T(x)\eta=\log(\lambda^x)=x\log(\lambda),$$ so $T(x)=x$ and $\eta=\log(\lambda)$. Inverting yields $\lambda=e^\eta$.

Finally, we have $$\exp(-A(\eta))=e^{-\lambda},$$ so $$-A(\eta)=-\lambda=-e^\eta,$$ and $A(\eta)=e^\eta$.
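If you want to sanity-check the identification numerically, a minimal Python sketch is below. It compares the usual Poisson pmf with $h(x)\exp(T(x)\eta-A(\eta))$ using the pieces derived above. The function names and the test value `lam = 3.5` are made up for this illustration; only the standard-library `math` module is used.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson pmf in its usual form: lam^x * e^(-lam) / x!."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


def exp_family_pmf(x: int, eta: float) -> float:
    """The same pmf written as h(x) * exp(T(x) * eta - A(eta))."""
    h = 1.0 / math.factorial(x)  # h(x) = 1/x!, stays outside the exponential
    t = x                        # T(x) = x
    a = math.exp(eta)            # A(eta) = e^eta
    return h * math.exp(t * eta - a)


lam = 3.5            # arbitrary illustrative rate
eta = math.log(lam)  # canonical parameter eta = log(lambda)

# Largest absolute disagreement between the two forms over x = 0, ..., 29.
max_abs_diff = max(
    abs(poisson_pmf(x, lam) - exp_family_pmf(x, eta)) for x in range(30)
)
print(max_abs_diff)
```

The printed difference should be at the level of floating-point rounding, which is what you would expect if the two expressions are the same function written in two ways.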