Maximum entropy for continuous distributions, optimization problem


Reading E. T. Jaynes's book. In chapter 12 he introduces an extension of Shannon's entropy to continuous distributions:

(*) $H_{I}^{c} = - \int \text{d}z \, p(z|I) \log{\frac{p(z|I)}{m(z)}}$

which has the nice feature of being invariant under changes of parameters, provided the "limiting density of discrete points" $m(z)$ (**) transforms in the same way as the probability density $p(z|I)$. The question: if the prior knowledge $I$ does not completely fix the form of $p(z|I)$, how do we use the principle of maximum entropy to work out the most ignorant form of $p(z|I)$?
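For concreteness, the invariance follows because both densities pick up the same Jacobian factor under a change of variables $w = w(z)$, so the ratio inside the logarithm is unchanged:

$$p(w|I) = p(z|I)\left|\frac{\text{d}z}{\text{d}w}\right|, \qquad m(w) = m(z)\left|\frac{\text{d}z}{\text{d}w}\right|$$

$$\Rightarrow \; - \int \text{d}w \, p(w|I) \log{\frac{p(w|I)}{m(w)}} = - \int \text{d}z \, p(z|I) \log{\frac{p(z|I)}{m(z)}} = H_{I}^{c}$$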

For example: say we are considering a two-dimensional distribution and, as a result of considering $I$, obtain the functional form $p(x , y | I) \propto f_{I}(\sqrt{x^2 + y^2})$. It is tempting to plug $f_{I}(r)$ into (*) and run some optimization algorithm to find the highest-entropy $f_{I}^{\text{best}}$. The problem is that even if $m(x , y) \propto g(\sqrt{x^2 + y^2})$ as well, the outcome of the optimization will depend on our choice of the function $g$, so the result will not be unique.
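The dependence on $m$ can be made explicit numerically. With only a normalization constraint, the maximizer of (*) is known in closed form: $p \propto m$ (the relative entropy is $-D_{KL}(p \| m)$, which vanishes exactly when $p$ equals the normalized $m$). The sketch below, with two hypothetical choices of $g$ picked purely for illustration, shows that the two "maximum entropy" radial densities differ:

```python
import numpy as np

def maxent_density(m_vals, dr):
    """Maximizer of -sum p*log(p/m)*dr subject only to normalization:
    the normalized limiting density m itself."""
    return m_vals / (m_vals.sum() * dr)

r = np.linspace(1e-3, 5.0, 500)
dr = r[1] - r[0]

# Two hypothetical choices of g(r); in 2D the radial measure is
# m(x, y) dx dy -> 2*pi*r*g(r) dr after integrating out the angle.
g_uniform = np.ones_like(r)      # g(r) = const
g_gauss   = np.exp(-r**2)        # g(r) = exp(-r^2)

p1 = maxent_density(2 * np.pi * r * g_uniform, dr)
p2 = maxent_density(2 * np.pi * r * g_gauss, dr)

# Both are properly normalized, yet they disagree substantially:
print(np.max(np.abs(p1 - p2)))
```

This is just the non-uniqueness described above in miniature: the optimizer returns whatever measure you fed in, so the answer is only as unique as the choice of $g$.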

Is there a way around this, or am I misunderstanding something? Cheers

(**) $\lim_{n \rightarrow \infty}\frac{1}{n} \, (\text{number of discrete points } x_i \text{ in the interval } a < x_i < b) = \int_{a}^{b} m(x) \, \text{d}x$