In the discrete case (or with bounded support?), why is the entropy-maximising distribution uniform, while it is the Gaussian in the class of distributions with fixed variance?


Let $X$ be a random variable with probability mass or density function $f$ supported on a set $\mathcal{X}$. The entropy of $X$ (written $H(X)$ in the discrete case and $h(X)$ in the continuous case) is defined as

$$ H(X)=\mathrm{E}[-\log f(X)]=-\int_{\mathcal{X}} f(x) \log f(x) \, dx, $$

where $dx$ denotes the counting measure in the discrete case and the Lebesgue measure in the continuous case.

From T. M. Cover, J. A. Thomas - Elements of Information Theory (2006) :

We have for the uniform distribution the following:

Theorem 2.6.4 $H(X) \leq \log |\mathcal{X}|$, where $|\mathcal{X}|$ denotes the number of elements in the range of $X$, with equality if and only if $X$ has a uniform distribution over $\mathcal{X}$.

Proof: Let $u(x)=\frac{1}{|\mathcal{X}|}$ be the uniform probability mass function over $\mathcal{X}$, and let $p(x)$ be the probability mass function for $X$. Then $$ D(p \| u)=\sum p(x) \log \frac{p(x)}{u(x)}=\log |\mathcal{X}|-H(X) . $$ Hence by the nonnegativity of relative entropy, $$ 0 \leq D(p \| u)=\log |\mathcal{X}|-H(X) $$
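The identity in the proof is easy to check numerically. The sketch below (the pmf is an arbitrary illustrative example, not from the text) verifies that $D(p \| u) = \log|\mathcal{X}| - H(X) \geq 0$ on a 4-symbol alphabet:

```python
import numpy as np

# Check D(p || u) = log|X| - H(X) >= 0 for an arbitrary pmf p on |X| = 4 symbols.
p = np.array([0.5, 0.25, 0.15, 0.10])   # an arbitrary pmf (illustrative example)
u = np.full(4, 0.25)                    # uniform pmf u(x) = 1/|X|

H = -np.sum(p * np.log(p))              # entropy H(X), in nats
D = np.sum(p * np.log(p / u))           # relative entropy D(p || u)

print(H, D)                             # D coincides with log(4) - H, and D >= 0
```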

We have for the normal distribution the following:

Theorem 8.6.5 Let the random vector $\mathbf{X} \in \mathbf{R}^n$ have zero mean and covariance $K=E \mathbf{X X}^t$ (i.e., $K_{i j}=E X_i X_j, 1 \leq i, j \leq n$ ). Then $h(\mathbf{X}) \leq$ $\frac{1}{2} \log (2 \pi e)^n|K|$, with equality iff $\mathbf{X} \sim \mathcal{N}(0, K)$.
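A quick one-dimensional sanity check of Theorem 8.6.5 (an illustrative comparison, not from the text): among distributions with variance $\sigma^2$, the Gaussian entropy $\frac{1}{2}\log(2\pi e \sigma^2)$ exceeds the entropy of a uniform distribution with the same variance, namely $\log(2a)$ for the uniform on $[-a, a]$ with $a = \sigma\sqrt{3}$:

```python
import numpy as np

# For fixed variance sigma^2 = 1, compare the differential entropy of
# N(0, sigma^2) with that of Uniform[-a, a], where a^2 / 3 = sigma^2.
sigma2 = 1.0
h_gauss = 0.5 * np.log(2 * np.pi * np.e * sigma2)   # Gaussian entropy, nats
a = np.sqrt(3 * sigma2)                              # matching uniform half-width
h_unif = np.log(2 * a)                               # entropy of Uniform[-a, a]

print(h_gauss, h_unif)   # the Gaussian entropy is strictly larger
```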

Does Theorem 2.6.4 also hold for continuous uniform distributions (bounded support)?

Theorem 8.6.5, by letting the support be all of $\mathbf{R}^n$, enlarges the set of admissible distributions; still, intuitively I would expect the continuous uniform distribution to be the extension of the discrete case. Why does the Gaussian appear instead?


Yes, Theorem 2.6.4 holds more generally. It's known as Gibbs' inequality, and is true e.g. for the continuous uniform distribution on any set of finite measure in $\mathbb R^n$.
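As a numerical illustration (my own example, not from the answer): on $[0,1]$, a set of Lebesgue measure 1, the uniform density has entropy $\log 1 = 0$, and any other density has strictly less. The triangular density $f(x) = 2x$ has $h(f) = \frac{1}{2} - \log 2 \approx -0.193 < 0$:

```python
import numpy as np

# Differential entropy of the triangular density f(x) = 2x on [0, 1],
# approximated by a midpoint Riemann sum (interval length 1, so the sum
# is just the mean of the integrand over the midpoints).
N = 1_000_000
x = (np.arange(N) + 0.5) / N        # midpoints of a uniform grid on [0, 1]
f = 2 * x                           # triangular density
h_tri = -np.mean(f * np.log(f))     # h(f) = -integral of f log f

print(h_tri)                        # about -0.193, below the uniform's 0
```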

Theorem 8.6.5 is different in that it fixes the covariance $K$. If $K$ were allowed to vary, there would be no maximum-entropy distribution; you can check that e.g. for $X \sim N(0, \sigma^2 I)$, the entropy $h(X) \to \infty$ as $\sigma^2 \to \infty$.
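The divergence is immediate from the closed-form entropy of a Gaussian; the sketch below just evaluates $h(X) = \frac{n}{2}\log(2\pi e \sigma^2)$ for growing $\sigma^2$:

```python
import numpy as np

# Entropy of X ~ N(0, sigma^2 I) in R^n: h(X) = (n/2) log(2*pi*e*sigma^2).
# With K unconstrained, this grows without bound as sigma^2 -> infinity.
def gauss_entropy(sigma2, n=1):
    return 0.5 * n * np.log(2 * np.pi * np.e * sigma2)

entropies = [gauss_entropy(s2) for s2 in (1.0, 1e2, 1e4, 1e6)]
print(entropies)   # strictly increasing, unbounded
```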

Asking what maximises entropy for a fixed covariance $K$ is a fundamentally different question from the unrestricted maximum, which is why you get a qualitatively different result. Even if the support were bounded, fixing $K$ would give you a Gaussian-like distribution, rather than a uniform one.
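A standard Lagrange-multiplier sketch (my addition, not part of the original answer) makes the "Gaussian-like" claim concrete: maximising $h(f) = -\int f \log f \, dx$ subject to $\int f \, dx = 1$ and $\int x x^t f \, dx = K$ gives the stationarity condition

$$ -\log f(x) - 1 + \lambda_0 + x^t \Lambda x = 0, $$

so $f(x) \propto e^{x^t \Lambda x}$ on the support: a Gaussian when the support is $\mathbf{R}^n$ (with $\Lambda = -\frac{1}{2} K^{-1}$), and a truncated Gaussian when the support is bounded.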