Maximizing Differential Entropy with Limited Support, a Power Budget Constraint, and an Expected-Sigmoid Constraint


Context

I am delving into the study of probability distributions and their entropy, with a particular focus on identifying a distribution that maximizes entropy under specific constraints. An area of interest is the application of the sigmoid function, widely recognized in logistic regression and neural networks, to this problem. The sigmoid function is defined as:

$$ \sigma(x) = \frac{1}{1 + e^{-x}} $$

Problem Statement

Given a random variable $X$, I am faced with the following constraints:

  • The expectation of $X^2$, denoted $E[X^2]$, is bounded above by a constant $E$ (a power budget).
  • The expectation of the sigmoid of $X$, $E[\sigma(X)]$, is limited to a maximum value $D$, where $D$ is within the range $(0, 1)$.

The goal is to find the probability distribution of $X$ that achieves maximum differential entropy under these constraints.
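For the finite-support version of this problem, one can get numerical intuition by maximizing the (discrete) entropy of a pmf over a fixed constellation subject to the two moment constraints. The sketch below does this with `scipy.optimize.minimize`; the support points and the constants `E_max`, `D_max` are hypothetical choices for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical setup: a fixed finite constellation and constraint levels.
support = np.linspace(-4, 4, 9)   # candidate values of X
E_max, D_max = 4.0, 0.4           # E[X^2] <= E_max, E[sigmoid(X)] <= D_max

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def neg_entropy(p, eps=1e-12):
    # Shannon entropy of the pmf p, negated because we minimize.
    return np.sum(p * np.log(p + eps))

constraints = [
    {"type": "eq",   "fun": lambda p: np.sum(p) - 1.0},          # normalization
    {"type": "ineq", "fun": lambda p: E_max - p @ support**2},   # power budget
    {"type": "ineq", "fun": lambda p: D_max - p @ sigmoid(support)},
]
p0 = np.full(len(support), 1.0 / len(support))  # start from uniform
res = minimize(neg_entropy, p0, bounds=[(0, 1)] * len(support),
               constraints=constraints, method="SLSQP")
p = res.x
print("E[X^2] =", p @ support**2, " E[sigma(X)] =", p @ sigmoid(support))
```

By the standard maximum-entropy argument (Lagrange duality), the optimizer has the exponential-family form $p(x) \propto \exp(-\lambda_1 x^2 - \lambda_2 \sigma(x))$ on the support, which the numerical solution can be checked against.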

Hypothesis

My initial conjecture is that a binomial distribution with equal probabilities of success and failure might maximize entropy subject to the aforementioned constraints. This intuition is grounded in the discrete nature of the binomial distribution and the symmetry of its probabilities, which suggest it might attain higher entropy under such conditions.

Questions

  1. What is the specific form of the probability distribution that maximizes entropy subject to the constraints $E[X^2] \leq E$ and $E[\sigma(X)] \leq D$, when the support of $X$ is restricted to a finite constellation?
  2. What methodologies or approaches exist for identifying such a distribution?
  3. Does my hypothesis regarding the binomial distribution hold any merit in this context? If not, which distribution could potentially be more suitable?
  4. I would greatly appreciate any references to literature or theoretical discussions relevant to information theory and constrained optimization in relation to this problem.

Your expertise and insights on this intriguing issue would be incredibly valuable to me. Thank you in advance for your guidance and assistance.


It’s not entirely clear to me whether you mean discrete or differential entropy, but in either case there is no maximum: a uniform distribution keeping $E[\sigma(X)]$ below $D$ can be chosen with arbitrarily large entropy by extending its support towards $-\infty$, increasing the number of points (or the length of the interval) and hence the entropy.
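The role of the sigmoid constraint in this argument can be checked numerically (an illustration of the answer's reasoning, considering that constraint on its own): a uniform pmf on $\{-1, -2, \dots, -n\}$ has entropy $\log n$, while $E[\sigma(X)]$ shrinks towards $0$ as $n$ grows, so the sigmoid constraint never binds in the negative tail:

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

for n in [10, 100, 1000]:
    support = -np.arange(1, n + 1, dtype=float)  # {-1, -2, ..., -n}
    p = np.full(n, 1.0 / n)                      # uniform pmf
    entropy = -np.sum(p * np.log(p))             # equals log n
    e_sig = p @ sigmoid(support)                 # E[sigmoid(X)], shrinks ~ 1/n
    print(f"n={n:5d}  entropy={entropy:.3f}  E[sigma(X)]={e_sig:.5f}")
```

Note that this construction ignores the power budget: $E[X^2]$ also grows as the support is pushed left, so whether the entropy is truly unbounded depends on which constraints are imposed.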