I want to know the expectation of a product of independent beta distributed random variables.


I have an equation of the form $$Z = \frac{\prod_{i=1}^p X_i}{\prod_{i=1}^p Y_i}$$ where $$X_i \sim \mathcal{B}(\alpha_{x_i},\beta_{x_i})$$ and $$Y_i \sim \mathcal{B}(\alpha_{y_i},\beta_{y_i})$$

$X_i$ and $Y_i$ are independent beta distributed random variables.

I want to know $$E[Z]$$ in terms of $\alpha_{x_i}$, $\beta_{x_i}$, $\alpha_{y_i}$, and $\beta_{y_i}$ for all $i \in \{1, \ldots, p\}$.

Taking the product of point estimates is problematic from a computational perspective because the estimates of the $X_i$'s and $Y_i$'s can be 0. Also, $X_i = P(U_i=a_i \mid V=0)$, where $U_i$ and $V$ are binary variables and $a_i$ is either 0 or 1, so $X_i$ is itself a probability. I used the beta distribution because this conditional probability $X_i$ can be estimated from the counts of instances of the events $\{U_i=a_i \text{ and } V=0\}$ and $\{U_i=\neg a_i \text{ and } V=0\}$.

Formally, let $C(A)$ be the number of observed instances of event $A$. Then $$X_i \sim \mathcal{B}(C(U_i=a_i \text{ and } V=0)+1,\; C(U_i=\neg a_i \text{ and } V=0)+1)$$ and $$Y_i \sim \mathcal{B}(C(U_i=a_i \text{ and } V=1)+1,\; C(U_i=\neg a_i \text{ and } V=1)+1)$$

This problem came from trying to create a prediction algorithm for a binarized version of classical time series analysis.

One of its applications is a model for predicting an adversary's next move in a rock-paper-scissors game.

First, gather a list of the adversary's past moves, ordered from earliest to latest. Second, binarize the moves as rock = 0b00, paper = 0b01, scissors = 0b10, and concatenate them; the result is a stream of bits. Decide on an order $q$. Take the latest $q$ bits, $U_i$ for $i=1$ to $i=q$, from the end of the stream. Apply a formula involving the conditionals of the $U_i$ and $V$, where $V$ is the prediction of the future bit, which is not yet in the stream. After a few more steps, you obtain a prediction of the adversary's next move.
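The encoding step described above can be sketched as follows. This is only an illustration of the binarize-and-concatenate idea; the move-to-bits mapping is the one stated in the post, and the helper names are my own.

```python
# Sketch of the encoding step: binarize a move history into a bitstream
# and extract the latest q bits as the U_i.
MOVE_BITS = {"rock": "00", "paper": "01", "scissors": "10"}

def move_stream(moves):
    """Concatenate the 2-bit codes of past moves into one bit string."""
    return "".join(MOVE_BITS[m] for m in moves)

def latest_q_bits(stream, q):
    """Return the last q bits of the stream as a list of ints (U_1..U_q)."""
    return [int(b) for b in stream[-q:]]

history = ["rock", "paper", "scissors", "paper"]
stream = move_stream(history)
print(stream)                    # "00011001"
print(latest_q_bits(stream, 4))  # [1, 0, 0, 1]
```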

The applications are not limited to this. The problem originally came from trying to improve $p$th-difference models in time series analysis.

I already have some progress on this problem, but it is a bit of a cheat: I used Monte Carlo simulation. I simply generate a batch of beta random variates, plug them into the formula, and take the average.
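A minimal sketch of that Monte Carlo approach, using only the standard library's `random.betavariate`; the parameter values below are illustrative, not from the post's data.

```python
import random

def mc_estimate(alpha_x, beta_x, alpha_y, beta_y, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[1/(1+Z)] with Z = prod(X_i)/prod(Y_i)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        num, den = 1.0, 1.0
        for ax, bx, ay, by in zip(alpha_x, beta_x, alpha_y, beta_y):
            num *= rng.betavariate(ax, bx)  # draw X_i
            den *= rng.betavariate(ay, by)  # draw Y_i
        total += 1.0 / (1.0 + num / den)
    return total / n_samples

# Illustrative parameters (counts + 1, as in the post), with p = 2:
print(mc_estimate([3, 5], [4, 2], [2, 6], [5, 3]))
```

Since $Z > 0$, the quantity $1/(1+Z)$ always lies strictly between 0 and 1, so the estimate must as well.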

The results are promising. For $q \approx 10$, accuracy is about 55%, meaning the prediction is right 55% of the time and wrong 45% of the time. For $q \approx 100$, accuracy is about 70%, and for $q \approx 200$, about 90%. The highest accuracy I have seen is 95%. Note that my input data did not come from rock-paper-scissors, but from transformed time series data of the real GDP of the United States.

Your help is greatly appreciated.

Update: I got my problem wrong. What I really want to solve is $E\left[\frac{1}{1+Z}\right]$, and I wrongly thought it equals $\frac{1}{1+E[Z]}$. But don't worry, I am not changing the question entirely, because I found a workaround: $$\frac{1}{1+Z}=\sum_{i=0}^\infty (-1)^i Z^i.$$ This may fail because the series only converges when $Z < 1$, which is not guaranteed here, but I have a hard time finding another way. Taking expectations term by term gives $$E\left[\frac{1}{1+Z}\right]=\sum_{i=0}^\infty (-1)^i E[Z^i].$$

Therefore, I only need the moments $E[Z^i]$ of $Z$.

In case you are wondering how I got such a simple fact wrong: I am quite new to this, and it is a hobby, so my expertise is all over the place. The current high school curriculum does not give me anything to further these skills.

Update: I just derived a stronger approximation: $$E\left[\frac{1}{1+Z}\right]\approx\sum_{i=0}^\infty \left(\left\lfloor\frac{i}{n+1}\right\rfloor+1\right)(-1)^i E[Z^i]$$

This approximation gets better as $n$ approaches infinity; however, the computational demand scales likewise.
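The plain alternating series (not the floor-weighted variant above) can be checked numerically for a scalar $z$ with $0 \le z < 1$, where it converges to $1/(1+z)$. A quick sketch:

```python
def series_approx(z, n_terms):
    """Partial sum of sum_{i>=0} (-1)^i z^i, equal to 1/(1+z) for |z| < 1."""
    return sum((-1) ** i * z ** i for i in range(n_terms))

z = 0.6
exact = 1 / (1 + z)
approx = series_approx(z, 50)
# Truncation error for an alternating geometric series is at most z**50.
print(exact, approx, abs(exact - approx))
```

For $z \ge 1$ the partial sums oscillate without converging, which is exactly the caveat raised in the update above.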

Answer:

Let $$X \sim \operatorname{Beta}(a,b), \quad a, b > 0.$$ Then $$\begin{align} \operatorname{E}[X^k] &= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \int_{x=0}^1 x^{a+k-1}(1-x)^{b-1} \, dx \\ &= \frac{\Gamma(a+b)\Gamma(a+k)}{\Gamma(a+b+k)\Gamma(a)} \frac{\Gamma(a+b+k)}{\Gamma(a+k)\Gamma(b)} \int_{x=0}^1 x^{a+k-1} (1-x)^{b-1} \, dx \\ &= \frac{\Gamma(a+b)\Gamma(a+k)}{\Gamma(a+b+k)\Gamma(a)}. \tag{1}\end{align}$$

For $k = 1$, this yields $$\operatorname{E}[X] = \frac{a}{a+b}. \tag{2}$$

For $k = -1$, this yields $$\operatorname{E}[1/X] = \frac{a+b-1}{a-1}, \quad a > 1, \tag{3}$$ where the condition arises since the integral $(1)$ diverges if $a \le 1$.
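Formula $(1)$ can be sanity-checked numerically using log-gamma to avoid overflow; the function name below is my own. For $k = 1$ it should reduce to $(2)$, and for $k = -1$ (with $a > 1$) to $(3)$.

```python
from math import lgamma, exp

def beta_moment(a, b, k):
    """E[X^k] for X ~ Beta(a, b) via formula (1); requires a + k > 0."""
    return exp(lgamma(a + b) + lgamma(a + k) - lgamma(a + b + k) - lgamma(a))

# k = 1 reduces to the mean a / (a + b):
print(beta_moment(2.0, 3.0, 1))   # ≈ 0.4
# k = -1 matches (a + b - 1) / (a - 1) for a > 1:
print(beta_moment(2.0, 3.0, -1))  # ≈ 4.0
```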

Consequently, $$\operatorname{E}[Z] \overset{\text{ind}}{=} \prod_{i=1}^p \operatorname{E}[X_i]\operatorname{E}[1/Y_i] = \prod_{i=1}^p \frac{\alpha_{x_i}}{\alpha_{x_i} + \beta_{x_i}} \frac{\alpha_{y_i} + \beta_{y_i} - 1}{\alpha_{y_i} - 1}, \tag{4}$$ where we require $\alpha_{y_i} > 1$ for all $i \in \{1, \ldots, p\}$.