What distribution do you get by repeatedly applying beta distribution?


So I know that repeatedly adding independent zero-mean Gaussians gives another zero-mean Gaussian, with the variances adding.
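A quick Monte Carlo sanity check of the Gaussian fact (the standard deviations 1 and 2 are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Sum of independent zero-mean Gaussians: variances add (here 1.0 + 4.0 = 5.0)
x = rng.normal(0.0, 1.0, N) + rng.normal(0.0, 2.0, N)
print(x.mean())  # ~ 0
print(x.var())   # ~ 5.0
```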

I wanted to get a better understanding of whether there's an analogue to this for variables ranging from 0 to 1.

Start with a random variable $X$ with $0 < p = E[X] < 1$.

Say $X \sim \text{Beta}(np, n(1-p))$. Then the variance of $X$ is $\operatorname{var}(X) = \frac{p(1-p)}{n+1}$, and the unexplained variance for a Bernoulli with mean $X$ is $E[X(1-X)] = \frac{np(1-p)}{n+1}$.
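Both moment identities can be checked numerically; a minimal Monte Carlo sketch (the values $n = 5$, $p = 0.3$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5.0, 0.3          # arbitrary illustrative values
N = 1_000_000

# X ~ Beta(np, n(1-p))
x = rng.beta(n * p, n * (1 - p), size=N)

var_x = x.var()                      # should be ~ p(1-p)/(n+1) = 0.035
unexplained = (x * (1 - x)).mean()   # should be ~ n*p*(1-p)/(n+1) = 0.175
print(var_x, unexplained)
```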

Start with $n = \infty$, so that there is no variance in $X$ and the unexplained variance is $p(1-p)$.

Repeatedly apply $F(X, m) = \text{Beta}(mX, m(1-X))$, and notice that with each application of $F$ the unexplained variance gets multiplied by a factor of $\frac{m}{m+1}$ while the mean stays the same, so the unexplained variance plus the variance of $X$ must always equal $p(1-p)$.
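One application of $F$ can be simulated directly to check the $\frac{m}{m+1}$ factor and the conservation of $p(1-p)$ (again $n = 5$, $p = 0.3$, $m = 10$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 5.0, 0.3, 10.0   # arbitrary illustrative values
N = 1_000_000

x = rng.beta(n * p, n * (1 - p), size=N)   # X ~ Beta(np, n(1-p))
y = rng.beta(m * x, m * (1 - x))           # one application of F(X, m)

print(y.mean())                            # mean stays ~ p
print((y * (1 - y)).mean())                # ~ (m/(m+1)) * E[X(1-X)]
print(y.var() + (y * (1 - y)).mean())      # ~ p(1-p)
```

NumPy's `Generator.beta` broadcasts array-valued shape parameters, so each sample of `y` is drawn conditionally on its own `x`.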

So my question is: can we prove that multiple applications of $F$ result in another Beta distribution? I showed that the first and second moments (mean and variance) are consistent with a Beta distribution, and by definition the first application of $F$ on $X$ results in a Beta, but I'm not sure whether more than one application does.

As a follow-up: why does the Beta distribution work so nicely with variances, $E[X(1-X)]$, but not with Shannon entropy, $E[-X\ln(X) - (1-X)\ln(1-X)]$? Is there a different distribution that works better in terms of entropy?
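For concreteness, the expected Bernoulli entropy under a Beta prior has no factor like $\frac{m}{m+1}$, but it can still be evaluated; a sketch comparing numerical quadrature against Monte Carlo (the parameters $a = 1.5$, $b = 3.5$ correspond to the illustrative $n = 5$, $p = 0.3$ above):

```python
import math
import numpy as np

def beta_pdf(x, a, b):
    # Beta(a, b) density, using log-gamma for numerical stability
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return np.exp((a - 1) * np.log(x) + (b - 1) * np.log(1 - x) - log_norm)

def bernoulli_entropy(x):
    # -x ln x - (1-x) ln(1-x), the Shannon entropy of Bernoulli(x)
    return -x * np.log(x) - (1 - x) * np.log(1 - x)

a, b = 1.5, 3.5  # illustrative parameters (n = 5, p = 0.3)

# Expected entropy via midpoint Riemann sum on (0, 1)
M = 200_000
dx = 1.0 / M
xs = (np.arange(M) + 0.5) * dx
quad = (bernoulli_entropy(xs) * beta_pdf(xs, a, b)).sum() * dx

# Same quantity via Monte Carlo (clip to avoid log(0) at the endpoints)
rng = np.random.default_rng(3)
samples = np.clip(rng.beta(a, b, size=1_000_000), 1e-12, 1 - 1e-12)
mc = bernoulli_entropy(samples).mean()

print(quad, mc)   # the two estimates should agree closely
```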

I'm generally confused about why entropy makes more intuitive sense for Bernoulli variables (since log loss corresponds to an actual probability, whereas squared error is relatively meaningless) but is harder to work with than variance, which corresponds more to Gaussians, which are in turn a bad approximation for Bernoullis. Is it because the derivative of a quadratic is linear, which is much easier to solve/optimize than an equation involving logs?