When working on a probability problem involving Binomial distributions I came across this integral:
$P = \int_0^1 dp_1\,\left(p_1^{k_1}(1-p_1)^{n-k_1} \int_{\max(0,p_1-\epsilon)}^{p_1}dp_2\,p_2^{k_2}(1-p_2)^{n-k_2} \right) \tag{1}$
Using Wolfram Alpha I found that the integral inside the parentheses can be written as
$G(p_1) = \left[\frac{{}_2F_1(1 + k_2, k_2 - n, 2 + k_2, p)\cdot p^{1 + k_2}}{(1 + k_2)}\right]_{p=\max(0,p_1-\epsilon)}^{p=p_1}. \tag{2}$
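As a sanity check on $(2)$, the hypergeometric antiderivative can be compared against direct quadrature and against the incomplete beta function. The following is a minimal sketch, assuming SciPy is available; the helper names `antiderivative` and `inner_integral` and the numerical values are purely illustrative and not part of the original problem.

```python
from scipy import integrate, special


def antiderivative(p, k2, n):
    """Bracketed term of (2): p^(k2+1) / (k2+1) * 2F1(k2+1, k2-n; k2+2; p)."""
    return special.hyp2f1(k2 + 1, k2 - n, k2 + 2, p) * p ** (k2 + 1) / (k2 + 1)


def inner_integral(p1, eps, k2, n):
    """G(p1): integral of p2^k2 * (1-p2)^(n-k2) over [max(0, p1-eps), p1]."""
    lower = max(0.0, p1 - eps)
    return antiderivative(p1, k2, n) - antiderivative(lower, k2, n)


# Illustrative values only; they are not taken from the question.
n, k2, eps, p1 = 10, 3, 0.1, 0.4

closed_form = inner_integral(p1, eps, k2, n)

# Direct quadrature of the same integrand as an independent check.
numeric, _ = integrate.quad(
    lambda p: p**k2 * (1 - p) ** (n - k2), max(0.0, p1 - eps), p1, epsabs=0.0
)

# Same quantity via the regularized incomplete beta function I_x(a, b).
a, b = k2 + 1, n - k2 + 1
via_beta = special.beta(a, b) * (
    special.betainc(a, b, p1) - special.betainc(a, b, max(0.0, p1 - eps))
)

print(closed_form, numeric, via_beta)
```

If the three printed numbers agree, $G(p_1)$ is just a difference of incomplete beta functions, which may be easier to handle numerically than the ${}_2F_1$ form.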
However, I can see no way of integrating the resulting product over $p_1$, and I tried Wolfram Alpha to no avail.
What I tried
I tried tackling the problem in a more general way. I recognize that, if $k_1=k_2$, the integrand in $(1)$ is the product of a function with its own definite integral. However, even that would only help me for one of the summands, and I have no idea how to handle the $p_1-\epsilon$ bound. Furthermore, I would very much like to have an answer for the general case.
Approximations?
I also thought about approximating $G(p_1)$ for small $\epsilon$: to first order it becomes $\epsilon$ times the derivative of the expression inside the square brackets. That derivative is just the function I integrated, so
$$G(p_1)\approx \epsilon p_1^{k_2}(1-p_1)^{n-k_2}$$
if I am not mistaken. So my original integral could be approximated as:
$$P\approx \epsilon \cdot \int_0^1 dp_1\,p_1^{(k_1+k_2)}(1-p_1)^{2n-(k_1+k_2)}$$
which can be solved analogously to the inner integral above. But this is probably only useful for very (very) small $\epsilon$.
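To get a feeling for how small $\epsilon$ has to be, the approximation can be compared numerically with integral $(1)$. Multiplying the two integrands gives $p_1^{k_1+k_2}(1-p_1)^{2n-(k_1+k_2)}$, whose integral over $[0,1]$ is the beta function $B(k_1+k_2+1,\,2n-k_1-k_2+1)$. The sketch below assumes SciPy; the function names `exact_P` and `approx_P` and the values of $n$, $k_1$, $k_2$ are illustrative choices, not part of the original problem.

```python
from scipy import integrate, special


def inner_integral(p1, eps, k2, n):
    """G(p1) from the incomplete beta function, lower bound clipped at 0."""
    a, b = k2 + 1, n - k2 + 1
    lower = max(0.0, p1 - eps)
    return special.beta(a, b) * (
        special.betainc(a, b, p1) - special.betainc(a, b, lower)
    )


def exact_P(eps, k1, k2, n):
    """Integral (1) by one-dimensional quadrature over p1."""
    def integrand(p1):
        return p1**k1 * (1 - p1) ** (n - k1) * inner_integral(p1, eps, k2, n)

    # The integrand is tiny in absolute terms, so use a relative tolerance.
    value, _ = integrate.quad(integrand, 0.0, 1.0, epsabs=0.0, epsrel=1e-8)
    return value


def approx_P(eps, k1, k2, n):
    """First-order approximation: eps * B(k1+k2+1, 2n-k1-k2+1)."""
    return eps * special.beta(k1 + k2 + 1, 2 * n - k1 - k2 + 1)


n, k1, k2 = 10, 4, 3  # illustrative values only
ratios = {
    eps: exact_P(eps, k1, k2, n) / approx_P(eps, k1, k2, n)
    for eps in (1e-3, 1e-2, 1e-1)
}
for eps, ratio in ratios.items():
    print(f"eps={eps:g}  exact/approx={ratio:.4f}")
```

A ratio close to 1 indicates that the first-order approximation is adequate for that $\epsilon$; the deviation from 1 should shrink roughly in proportion to $\epsilon$.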
I would be very thankful for any comments on the approximation as well as a general solution.
Ranges for the variables
- $n$ and $k_1$, $k_2$ are positive integers
- $n$ is the number of tries for two binomially distributed random variables. The number of tries is the same for both distributions
- $k_1$, $k_2$ are the numbers of successes in the respective experiments. There is no requirement on their sum; the only constraints are $k_1 \leq n$ and $k_2 \leq n$ individually
- $p_1$, $p_2$ are probabilities in $[0,1]$
- $\epsilon$ is in $(0,1)$
Context: The integral arose as follows. Say we have two random variables $K_1$ and $K_2$ (numbers of successes), each following a binomial distribution with $n$ tries. Given the observed values $k_1$ and $k_2$, I am interested in how close the success probabilities $p_1$, $p_2$ of the two distributions are. Specifically, I want $P(|p_1-p_2| < \epsilon \mid k_1,k_2,n)$.
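For a quick numerical reference value, the target probability can also be estimated by Monte Carlo. The sketch below rests on an assumption not stated above: independent uniform priors on $p_1$ and $p_2$, so that each posterior is $\mathrm{Beta}(k_i+1,\,n-k_i+1)$. The function name `prob_close` and its default arguments are hypothetical.

```python
import numpy as np


def prob_close(k1, k2, n, eps, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(|p1 - p2| < eps | k1, k2, n).

    Assumes independent uniform priors, so p_i | k_i ~ Beta(k_i + 1, n - k_i + 1).
    """
    rng = np.random.default_rng(seed)
    p1 = rng.beta(k1 + 1, n - k1 + 1, size=n_samples)
    p2 = rng.beta(k2 + 1, n - k2 + 1, size=n_samples)
    # Fraction of posterior draws whose success probabilities differ by less than eps.
    return float(np.mean(np.abs(p1 - p2) < eps))


# Illustrative values only.
print(prob_close(k1=4, k2=3, n=10, eps=0.1))
```

Under that prior assumption, any closed-form or approximate result for the normalized probability should agree with this estimate up to sampling noise.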