What does it mean that "the natural parameter space is convex"?


In a lecture this week my professor stated that

exponential families have convenient mathematical properties due to their natural parameterization such as the natural parameter space being convex.

Question: What does it mean that "the natural parameter space is convex"?


Some "thoughts": Does this suggest maximum likelihood estimators of the parameters always exist? What other mathematical properties of this result are useful?


There are 2 best solutions below

On BEST ANSWER

That the natural parameter space is convex means that if $\alpha,\beta$ are two points in the natural parameter space, then every point on the line segment between them, i.e. every $\lambda\alpha+(1-\lambda)\beta$ with $0\le\lambda\le1$, is also in the natural parameter space.
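To make the definition concrete, here is a minimal sketch that checks it on one example. The choice of family is an assumption made for illustration and is not part of the question: the univariate Gaussian with $T(x) = (x, x^2)$, whose natural parameter space is $\{(\theta_1,\theta_2): \theta_2 < 0\}$. The helper names `in_natural_space`, `convex_combination` and `segment_stays_inside` are hypothetical.

```python
def in_natural_space(theta):
    """Membership test for the assumed Gaussian example.

    With T(x) = (x, x^2), the integral of exp(theta1*x + theta2*x^2)
    over the real line is finite exactly when theta2 < 0.
    """
    _theta1, theta2 = theta
    return theta2 < 0


def convex_combination(alpha, beta, lam):
    """Return lam*alpha + (1 - lam)*beta, computed componentwise."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(alpha, beta))


def segment_stays_inside(alpha, beta, steps=101):
    """Check a grid of points on the segment from beta (lam=0) to alpha (lam=1)."""
    return all(
        in_natural_space(convex_combination(alpha, beta, k / (steps - 1)))
        for k in range(steps)
    )


alpha = (3.0, -0.5)
beta = (-7.0, -4.0)
print(segment_stays_inside(alpha, beta))
```

This only samples finitely many points on one segment, so it illustrates the definition rather than proving convexity.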

On

It's an old question, but ...

Let us consider the exponential family $$ f_\mathbf{X}(x;\theta) = h(x)\exp\{\langle\theta, T(x)\rangle-A(\theta)\} $$ where $$ A(\theta) = \log\left(\int_X h(x)\exp\{\langle\theta, T(x)\rangle\}~{\rm d}x\right) $$ is the log-partition function.

We can rewrite this density as $$ f_\mathbf{X}(x;\theta) = \frac{1}{Z(\theta)}h(x)\exp\{\langle\theta, T(x)\rangle\} $$ where $Z(\theta) = \exp\{A(\theta)\}$ is the partition function: $$Z(\theta) = \int_X h(x)\exp\{\langle\theta, T(x)\rangle\}~{\rm d}x$$

The natural parameter space is defined as: $$ \mathcal N = \left\{\theta:\int_X h(x)\exp\{\langle\theta, T(x)\rangle\}~{\rm d}x \lt \infty\right\} = \left\{\theta:Z(\theta) \lt \infty\right\} $$ Now we prove that the natural parameter space is a convex set:

Consider two distinct parameters $\theta_1,\theta_2\in \mathcal N$ and $0\lt\lambda\lt1$. Let $\theta = \lambda\theta_1+(1-\lambda)\theta_2$ be a convex combination of $\theta_1$ and $\theta_2$. Since $h(x)\geq 0$, we can split $h(x) = h(x)^{\lambda}h(x)^{1-\lambda}$: $$ \begin{aligned} Z(\theta) &= \exp\{A(\theta)\} = \exp\{A(\lambda\theta_1+(1-\lambda)\theta_2)\}\\ &=\int_X h(x)\exp\{\langle(\lambda\theta_1+(1-\lambda)\theta_2), T(x)\rangle\}~{\rm d}x \\ & = \int_X \left(h(x)^{\lambda}\exp\{\langle\lambda\theta_1, T(x)\rangle \}\right)\left(h(x)^{1-\lambda}\exp\{\langle(1-\lambda)\theta_2, T(x)\rangle\}\right)~{\rm d}x \\ &\leq \left(\int_X h(x)\exp\{\frac1\lambda\langle\lambda\theta_1, T(x)\rangle\} ~{\rm d}x \right)^\lambda \left(\int_X h(x)\exp\{\frac1{1-\lambda}\langle(1-\lambda)\theta_2, T(x)\rangle\} ~{\rm d}x \right)^{1-\lambda} \\ &=Z(\theta_1)^\lambda \cdot Z(\theta_2)^{1-\lambda} \end{aligned} $$ Since $Z(\theta_1), Z(\theta_2)\lt\infty$, it follows that $Z(\theta)\lt\infty$, so $\theta\in\mathcal N$. Hence the natural parameter space is a convex set.

The inequality step is Hölder's inequality with conjugate exponents $p = 1/\lambda$ and $q = 1/(1-\lambda)$, so that $1/p + 1/q = 1$. See https://mathworld.wolfram.com/HoeldersInequalities.html for more information.

Taking the logarithm of the inequality above shows that the log-partition function $A(\theta)$ is a convex function:

$$ A(\theta) = A(\lambda\theta_1+(1-\lambda)\theta_2) \leq \lambda A(\theta_1) + (1-\lambda)A(\theta_2) $$
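As a numerical sanity check of both inequalities, here is a small sketch for one concrete family. The family is an assumption chosen for illustration: the exponential distribution on $x > 0$ with $h(x) = 1$ and $T(x) = x$, for which $Z(\theta) = -1/\theta$ when $\theta < 0$ and the integral diverges otherwise, so $\mathcal N = (-\infty, 0)$. The function names are hypothetical.

```python
import math
import random


def Z(theta):
    """Partition function of the assumed example: integral of exp(theta*x) over x > 0."""
    if theta >= 0:
        return math.inf  # the integral diverges, so theta is outside N
    return -1.0 / theta


def A(theta):
    """Log-partition function A(theta) = log Z(theta)."""
    return math.log(Z(theta))


def check_holder_and_convexity(trials=1000, seed=0, tol=1e-9):
    """Test Z(theta) <= Z(theta1)^lam * Z(theta2)^(1-lam) and the convexity of A
    on randomly drawn theta1, theta2 in N and lam in [0, 1]."""
    rng = random.Random(seed)
    for _ in range(trials):
        theta1 = -rng.uniform(0.01, 10.0)
        theta2 = -rng.uniform(0.01, 10.0)
        lam = rng.uniform(0.0, 1.0)
        theta = lam * theta1 + (1 - lam) * theta2
        # Bound on the partition function (small relative tolerance for rounding).
        if Z(theta) > Z(theta1) ** lam * Z(theta2) ** (1 - lam) * (1 + tol):
            return False
        # The same bound after taking logarithms: convexity of A.
        if A(theta) > lam * A(theta1) + (1 - lam) * A(theta2) + tol:
            return False
    return True


print(check_holder_and_convexity())
```

Random trials like this can only fail to find a counterexample; the proof above is what establishes the result in general.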