CDF of the maximum of i.i.d. gamma random variable in closed form (or in power series)

819 Views Asked by At

Suppose that $X_i$ is an i.i.d. Gamma random variable for $ \forall i \in \{ 1, \dots, N \}$ having CDF as follows: \begin{align} F_{X_i}(x; \alpha, \beta) = \frac{\gamma(\alpha, \beta x) }{\Gamma(\alpha)} \end{align} where $\Gamma(\cdot)$ is the Gamma function and $\gamma(\cdot,\cdot)$ is the lower incomplete gamma function (https://en.wikipedia.org/wiki/Incomplete_gamma_function).

I want to get the CDF of the maximum of $X_i$, i.e., $Z = \max \{ X_1, \cdots, X_N \}$. I know that $F_Z(z) = \left( F_{X_i}(z; \alpha, \beta) \right)^N$.

Here is my question. Does anyone know the closed form (or at least form of power series) of the CDF of $Z$? (Or does $Z$ follow any famous distribution?)

*What I tried is to use the followings: \begin{align} \gamma(s,x) &= \Gamma(s) - \Gamma(s,x), \\ \Gamma(s, x) &= (s-1)!\exp(-x) \sum_{k=0}^{s-1} \frac{x^k}{k!}, \\ F_{X_i}(z; \alpha, \beta) &= \frac{\gamma(\alpha, \beta z) }{\Gamma(\alpha)} = 1-\frac{\Gamma(\alpha, \beta z) }{\Gamma(\alpha)}, \end{align} where $\Gamma(s,x)$ is the upper incomplete Gamma function.

1

There are 1 best solutions below

1
On BEST ANSWER

Here's an expression for the power series, although the power series coefficients themselves end up containing a difficult probability. The following is valid for the case of $\beta=1$ and integer $\alpha$. Application to other $\beta$ is straightforward since, if $X$ has a gamma distribution with parameters $\alpha$ and $1$, then $P[X\le m] = P[\beta X \le \beta m]$ and $\beta X$ is has a gamma distribution with parameters $\alpha$ and $\beta$. However, integer $\alpha$ seems like a harder restriction to remove, as you'll see.

I derive the power series using the properties of a Poisson process. Suppose we have a Poisson process with rate parameter 1. Then, the time $T$ until $\alpha$ events has a gamma distribution with parameters $\alpha$ and $1$. The statement $T\le m$, then, means that there were $\alpha$ events in the time interval $[0,m]$.

We want $T_1\le m, T_2 \le m, \dots , T_k \le m$ to be simultaneously true. So, that's $k$ intervals of size $m$, each of which must have $\alpha$ or more events. Let $N_i$ be the number of events in interval $i$. Our event of interest can then be re-expressed as $P[N_1 \ge \alpha, N_2 \ge \alpha, \dots, N_k \ge \alpha]$.

Let $S=\sum_{i=1}^k N_i$ be the total number of events. We can expand this using the law of total probability:

$$P[N_1 \ge \alpha, N_2 \ge \alpha, \dots, N_k \ge \alpha] = \sum_{s=k\alpha}^\infty P[N_1 \ge \alpha, N_2 \ge \alpha, \dots, N_k \ge \alpha|S=s]P[S=s]$$

$N_i$ are Poisson with rate $m$, so $S$ is Poisson with rate $km$. Conditional on $S=s$, all the $N_i's$ jointly follow a multinomial distribution with total $s$ and cell probabilities $1/k$. So this sum becomes

$$\sum_{s=k\alpha}^\infty M_{s,k,\alpha} e^{-km} \frac{(km)^s}{s!}$$

where $M_{s,k,\alpha}$ is the probability that, in a multinomial distribution with $k$ equiprobable cells and a total of $s$, you get $\alpha$ or more in each cell. Some cosmetic rearrangement to make it look more like a power series: $$e^{-km} \sum_{s=k\alpha}^\infty M_{s,k,\alpha} k^s \frac{m^s}{s!}$$ And that's our expression for the CDF of the maximum of $k$ gamma distributions with shape parameter $\alpha$ and rate parameter $1$, as a function of $m$.

I never trust my math, so here's some R code I used to test it. First, I wrote and tested some code for those multinomial probabilities:

# Probability that k multinomial cells will be each have alpha or more items,
# with a total of t items and equal probabilities for each cell
fillprob <- function(t, alpha, k)
{
  if (t < alpha*k) return(0)
  if (k==0) return(1)
  # First cell values
  fsv <- alpha:t
  sum(dbinom(fsv, t, 1/k) * sapply(fsv, function(taken)
      fillprob(t-taken, alpha, k-1)))
}
fillprob(30, 4, 5)
# [1] 0.4607502
set.seed(326)
mean(apply(rmultinom(10^4, 30, rep.int(1/5, 5)), 2, function(x) all(x>=4)))
# [1] 0.4576

Then, comparing the power series to the "raw" calculation of the CDF:

raw <- function(alpha, k, m)
{
  pgamma(m, alpha, 1)^k
}
expansion <- function(alpha, k, m, order=30)
{
  sapply((k*alpha):order, function(t) fillprob(t, alpha, k) * dpois(t, k*m))
}
raw(2, 3, 5)
# 0.8835541
sum(expansion(2, 3, 5))
# 0.8833568

I could only test for pretty small alpha and k, since this algorithm for the multinomial probabilities scales really poorly, so I can't calculate that many terms in the expansion.