Integral involving Hypergeometric Function 2F1 and its derivative


When working on a probability problem involving Binomial distributions I came across this integral:

$P = \int_0^1 dp_1\,\left(p_1^{k_1}(1-p_1)^{n-k_1} \int_{\max(0,p_1-\epsilon)}^{p_1}dp_2\,p_2^{k_2}(1-p_2)^{n-k_2} \right) \tag{1}$

Using Wolfram Alpha I found that the integral inside the parentheses can be written as

$G(p_1) = \left[\frac{{}_2F_1(1 + k_2, k_2 - n, 2 + k_2, p)\cdot p^{1 + k_2}}{1 + k_2}\right]_{p=\max(0,p_1-\epsilon)}^{p=p_1}. \tag{2}$

However, I can see no way of solving the outer integral over the resulting product, and I tried Wolfram Alpha to no avail.
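Because $k_2$ and $n$ are integers with $k_2 \leq n$, the second parameter $k_2 - n$ of the ${}_2F_1$ is a non-positive integer, so the hypergeometric series terminates and the antiderivative is a polynomial. The following is a minimal sketch of that finite sum in exact rational arithmetic, useful for checking values of $G(p_1)$. The function names `inner_antiderivative`, `G`, and `beta_int` are made up for this illustration.

```python
from fractions import Fraction
from math import comb, factorial


def inner_antiderivative(p, k, n):
    """Antiderivative of p**k * (1-p)**(n-k) that vanishes at p = 0.

    Assumes integers 0 <= k <= n. Expands (1-p)**(n-k) with the binomial
    theorem and integrates term by term, which is the terminating series
    behind the 2F1 expression.
    """
    p = Fraction(p)
    return sum(
        Fraction((-1) ** j * comb(n - k, j), k + j + 1) * p ** (k + j + 1)
        for j in range(n - k + 1)
    )


def G(p1, eps, k2, n):
    """Inner integral from max(0, p1 - eps) to p1."""
    lower = max(Fraction(0), Fraction(p1) - Fraction(eps))
    return inner_antiderivative(p1, k2, n) - inner_antiderivative(lower, k2, n)


def beta_int(a, b):
    """Beta function B(a, b) for positive integers a and b."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


if __name__ == "__main__":
    # At p = 1 the antiderivative must equal B(k + 1, n - k + 1).
    print(inner_antiderivative(1, 4, 10) == beta_int(5, 7))
    print(float(G(Fraction(1, 2), Fraction(1, 10), 5, 10)))
```

Using `Fraction` avoids the cancellation that an alternating sum can suffer in floating point, at the cost of speed.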

What I tried

I tried tackling the problem in a more general way. I recognize that, if $k_1=k_2$, the integrand in $(1)$ is the product of a function with its own definite integral. However, even that would only help me for one of the two summands in $(2)$, and I have no idea how to tackle the $p_1-\epsilon$ bound. Furthermore, I would very much like to have an answer for the general case.
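To make the $k_1=k_2$ observation concrete, here is the one piece that does close, written with the shorthand $f(p)=p^{k}(1-p)^{n-k}$ and $F(p)=\int_0^p f(t)\,dt$ for $k=k_1=k_2$ (this notation is introduced only for this illustration). It covers only the upper-limit summand, which is the same as fixing the lower bound at $0$:

$$\int_0^1 f(p_1)\,F(p_1)\,dp_1 = \left[\tfrac{1}{2}F(p_1)^2\right]_0^1 = \tfrac{1}{2}\,B(k+1,\,n-k+1)^2$$

Here $B$ is the Beta function, since $F(1)=\int_0^1 p^{k}(1-p)^{n-k}\,dp = B(k+1,n-k+1)$. The summand evaluated at $p_1-\epsilon$ is not covered by this identity.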

Approximations?

I also thought about approximating $G(p_1)$ for small $\epsilon$, because it would then become $\epsilon$ times the derivative of the expression inside the square brackets. That derivative is just the integrand of the inner integral, so

$$G(p_1)\approx \epsilon p_1^{k_2}(1-p_1)^{n-k_2}$$

if I am not mistaken. So my original integral could be approximated as:

$$P\approx \epsilon \cdot \int_0^1 dp_1\,p_1^{(k_1+k_2)}(1-p_1)^{2n-(k_1+k_2)}$$

which can be solved analogously to the inner integral above. But this is probably only useful for very small $\epsilon$.
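One way to see how small $\epsilon$ has to be is to compare the approximation with a direct numerical evaluation of $(1)$. The sketch below is only an illustration under stated assumptions: it uses a composite Simpson rule for the outer integral (my choice, not part of the question), the finite-sum antiderivative for the inner one, and the exponent $2n-(k_1+k_2)$ that results from multiplying $(1-p_1)^{n-k_1}$ by $(1-p_1)^{n-k_2}$. The names `F`, `P_numeric`, and `P_small_eps` are hypothetical.

```python
from math import comb, factorial


def F(p, k, n):
    """Antiderivative of p**k * (1-p)**(n-k) with F(0) = 0 (finite sum)."""
    return sum(
        (-1) ** j * comb(n - k, j) * p ** (k + j + 1) / (k + j + 1)
        for j in range(n - k + 1)
    )


def P_numeric(k1, k2, n, eps, steps=2000):
    """Composite Simpson estimate of integral (1); steps must be even."""

    def integrand(p1):
        lower = max(0.0, p1 - eps)
        inner = F(p1, k2, n) - F(lower, k2, n)
        return p1 ** k1 * (1.0 - p1) ** (n - k1) * inner

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def P_small_eps(k1, k2, n, eps):
    """First-order approximation: eps * B(k1 + k2 + 1, 2n - k1 - k2 + 1)."""
    a, b = k1 + k2, 2 * n - (k1 + k2)
    return eps * factorial(a) * factorial(b) / factorial(a + b + 1)


if __name__ == "__main__":
    for eps in (0.001, 0.01, 0.1, 0.3):
        exact = P_numeric(4, 5, 10, eps)
        approx = P_small_eps(4, 5, 10, eps)
        print(eps, exact, approx, approx / exact)
```

The printed ratio shows how quickly the first-order approximation drifts away from the numerical value as $\epsilon$ grows. For large $n$ the alternating sum in `F` may lose precision in floating point, so exact rational arithmetic would be safer there.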

I would be very thankful for any comments on the approximation as well as a general solution.

Ranges for the variables

  • $n$ and $k_1$, $k_2$ are positive integers
  • $n$ is the number of trials for two binomially distributed random variables. The number of trials is the same for both distributions
  • $k_1$, $k_2$ are the numbers of successes in the respective experiments. There is no requirement on their sum; the only constraint is that individually $k_1, k_2 \leq n$
  • $p_1$, $p_2$ are probabilities in $[0,1]$
  • $\epsilon$ is in $(0,1)$

Context: The integral arose in the following setting. Say we have two random variables $K_1$ and $K_2$ (numbers of successes) that each follow a binomial distribution with $n$ trials. Given the observed values $k_1$ and $k_2$, I am interested in how close the success probabilities $p_1$, $p_2$ of the two distributions are, given the data. Specifically, I want $P(|p_1-p_2| < \epsilon \mid k_1,k_2,n)$.
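As a cross-check for any closed form, the target probability can also be estimated by simulation. The sketch below rests on an assumption that is not stated above: independent uniform priors on $p_1$ and $p_2$, under which the posteriors are $\mathrm{Beta}(k_1+1,\,n-k_1+1)$ and $\mathrm{Beta}(k_2+1,\,n-k_2+1)$. The function name `posterior_closeness` and its parameters are hypothetical. Note that integral $(1)$ as written covers only $p_2 \leq p_1$ and is not normalized, so it is not directly the same number as this estimate.

```python
import random


def posterior_closeness(k1, k2, n, eps, samples=100_000, seed=0):
    """Monte Carlo estimate of P(|p1 - p2| < eps | k1, k2, n).

    Assumes independent uniform priors, so each posterior is a Beta
    distribution that the standard library can sample directly.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        p1 = rng.betavariate(k1 + 1, n - k1 + 1)
        p2 = rng.betavariate(k2 + 1, n - k2 + 1)
        if abs(p1 - p2) < eps:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    for eps in (0.01, 0.05, 0.1, 0.2):
        print(eps, posterior_closeness(4, 5, 10, eps))
```

The estimate carries sampling noise of order $1/\sqrt{\text{samples}}$, so it is suitable for sanity checks rather than as a replacement for an exact expression.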