How to make a low-entropy distribution and push it to max entropy?


I have a categorical distribution that can take values $A_1, \ldots,A_n$ with probabilities $p_1,\ldots,p_n$, respectively.

The entropy of a source that generates a sequence of random numbers from the above distribution is defined as:

$H(X)=-\sum_{i=1}^{n} p_i \log_2(p_i)$
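For reference, this entropy can be computed directly from a probability vector; a minimal sketch in plain Python (the function name `entropy` is my own):

```python
import math

def entropy(p):
    """Shannon entropy H = -sum(p_i * log2(p_i)), skipping zero entries,
    since lim p->0 of p*log2(p) is 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)
```

For a uniform distribution over $n$ values this returns $\log_2(n)$, the maximum mentioned below.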

For simulation, I want to set the $p_i$ such that they yield a small entropy. Then, in a few steps, I increase the entropy. At the last step, I want to reach $p_1=\ldots=p_n=1/n$ to get the maximum entropy, which is $\log_2(n)$. How can I do that?

Here is my current solution, but it is rather convoluted; I would like something simpler:

Generate $n$ independent random variables $r_1,\ldots,r_n$ according to a Gaussian distribution, then shift them to get positive values $y_i=r_i-\min_j(r_j)+0.1$. Then normalize them to make their sum equal to one: $p_i=y_i/(\sum_{i=1}^{n}y_i)$. This gives a set of probabilities $p_1,\ldots,p_n$ that are roughly Gaussian-shaped and have relatively small entropy.
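A sketch of this construction in plain Python (function and parameter names are my own; the `0.1` shift matches the description above):

```python
import random

def gaussian_probs(n, seed=None):
    """Draw n Gaussians, shift so the minimum maps to 0.1 (all values
    positive), then normalize so the result sums to 1."""
    rng = random.Random(seed)
    r = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m = min(r)
    y = [ri - m + 0.1 for ri in r]
    s = sum(y)
    return [yi / s for yi in y]
```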

Now, in each step, we update the $p_i$: $p_i^{(new)}=(p_{i-1}+p_{i}+p_{i+1})/3$ in a circular fashion, i.e., $p_{n+1}=p_1$ and $p_{0}=p_{n}$. After a few steps, all the values approach the average of the $p_i$, which is $1/n$.
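One smoothing pass of this update can be sketched as follows (the function name `smooth_step` is my own; `% n` implements the circular wrap-around):

```python
def smooth_step(p):
    """One circular three-point averaging pass: each p_i becomes the
    mean of itself and its two neighbours, with indices wrapping."""
    n = len(p)
    return [(p[(i - 1) % n] + p[i] + p[(i + 1) % n]) / 3 for i in range(n)]
```

Each pass preserves the sum (so the result stays a probability vector) and shrinks the spread between the largest and smallest entries.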

Is there a better solution? Perhaps a probability distribution with a single parameter that tunes its entropy and yields the uniform distribution at one extreme?

1 Answer
A simpler way to create the $p_i$ is to just set $p_i = r_i^2/\sum_j r_j^2$. This has significantly lower entropy than your current method ($5.6$ vs $6.4$ for $n =100$), and has the interesting property that the vector $[\sqrt{p_i}]$ is uniformly distributed on the unit sphere in $\mathbb R^n$. It's worth noting that your expected entropy can't be all that low if you build the $p_i$ from IID draws, since they'll cluster around the mean of whatever distribution you're using.
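A minimal sketch of this construction, again in plain Python (the function name `sphere_probs` is my own):

```python
import random

def sphere_probs(n, seed=None):
    """Square IID Gaussians and normalize: p_i = r_i^2 / sum_j r_j^2.
    The vector of sqrt(p_i) is then uniform on the unit sphere in R^n."""
    rng = random.Random(seed)
    r = [rng.gauss(0.0, 1.0) for _ in range(n)]
    s = sum(ri * ri for ri in r)
    return [ri * ri / s for ri in r]
```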

That method of updating the $p_i$ has the potential problem that once it smooths out the local maxima, it's very slow to reach uniform $p_i$. If this is undesirable, you can speed things up by averaging the $p_i$ with a random permutation of them, which quickly mixes the values together. If random permutations aren't available, averaging with two medium-sized, relatively prime offsets seems to produce the fastest mixing.
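The permutation-averaging step can be sketched as follows (the function name `permutation_mix_step` is my own). Each step preserves the sum, and the extremes can only move inward, so the vector drifts toward uniform:

```python
import random

def permutation_mix_step(p, rng=None):
    """Average p with a random permutation of itself. Repeated steps
    drive the vector toward uniform much faster than local smoothing,
    because distant entries get mixed directly."""
    rng = rng or random.Random()
    q = p[:]
    rng.shuffle(q)
    return [(a + b) / 2 for a, b in zip(p, q)]
```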