Computing the distribution of the sum of independent gamma random variables

127 Views Asked by At

I am trying to implement the analytic expression for the distribution of the sum of independent gamma random variables using the expression given in Moschopoulos (1985). More specifically, I would like to implement the analytic expression for the distribution of $Y = \sum_{i=1}^{n} X_i$, where $X_i \sim \text{Gamma}(\alpha_i, \beta_i) \left(i=1 \cdots, n\right)$ are the independent gamma random variables and $\alpha_i$ and $\beta_i$ are respectively the shape and scale parameters. The analytic expression is a weighted linear sum of gamma series and can be described as

\begin{equation} p_Y(y) = \sum_{k=0}^{\infty} w_k \frac{y^{\rho+k-1}\exp{\left(-y/\beta_\text{min}\right)}} {\beta_\text{min}^{\rho+k}\Gamma\left(\rho+k\right)}, \end{equation}

where $\rho = \sum_{i=1}^{n}\alpha_i>0$, $\beta_\text{min} = \min_i{\left(\beta_i\right)}$. The weight of each gamma series, $\omega_k = Cd_k, k=0, 1, 2, \cdots, \infty$, where $C = \prod_{i=1}^{n}\left(\beta_\text{min}/\beta_i\right)^{\alpha_i}$. The other term that consists the weight, $d_k = \left(1/k\right)\sum_{i=1}^{k} i g_i d_{k-i}$ can be calculated recursively with $d_0=1$ and $g_i = \left(1/i\right)\sum_{j=1}^{n}\alpha_j\left(1-\beta_\text{min}/\beta_j\right)^i$. This expression is directly adopted from Barbani (2017), where the original expressions are reorganized for easier interpretation.

The issue here is that the analytic distribution is not consistent with random data that is generated as shown in the following figure.

Random synthetic data (blue shaded patch) and the corresponding analytic expression for the distribution of the sum of independent random gamma random variables (black solid line)

Below is the matlab script that I used. I think there is something wrong with my implementation of Moschopoulos' formulae.

%% Generate gamma random variables and their sum

nInstant = 1e5;
y = rand(1,nInstant);
gammaShapeVec = 20;
gammaMeanVec = rand(1,10);
nRV = numel(gammaShapeVec)*numel(gammaMeanVec);

[gammaShape, gammaMean] = ndgrid(gammaShapeVec, gammaMeanVec);
gammaShape = gammaShape(:);
gammaMean = gammaMean(:);
gammaScale = gammaMean./gammaShape;
gammaRate = gammaShape./gammaMean;
gammaVar = (gammaMean.^2)./gammaShape;

gammaRV = zeros(nRV, nInstant);
for ii=1:numel(gammaMean)
    xTmp = linspace(0,1e3*abs(gammaMean(ii)+3*sqrt(gammaVar(ii))),1e5);
    cdfGammaTmp = gammainc(gammaRate(ii)*xTmp, gammaShape(ii));
    [cdfGamma, idx] = unique(cdfGammaTmp);
    x = xTmp(idx);
    gammaRV(ii,:) = interp1(cdfGamma, x, y);
end

% Sum of the gamma random data
Y = nansum(gammaRV,1);


%% Analytic expression for the distribution of the sum of the gamma random variables
% (Moschopoulos, 1985)
yVec = linspace(max(0,mean(Y)-10*std(Y)),mean(Y)+10*std(Y),4e2);
yVec = yVec(:);
kVec = 1:1e4;

rho = sum(gammaShape);
beta1 = min(gammaScale);
lnC = nansum(gammaShape.*(log(beta1)-log(gammaScale)));
d0 = 1;

gVec = zeros(size(kVec));
for kk=1:length(kVec)
    gVec(kk) = 1/kk*nansum(gammaShape.*(1-beta1./gammaScale).^kk);
end

dVec = zeros(size(kVec));
for kk=1:length(kVec)
    dVec(kk) = 1/kk*nansum((1:kk).*gVec(1:kk).*[dVec(kk-1:-1:1) d0]);
end


% k>0
lnw = lnC + log(dVec);
lnTerm1 = log(yVec)*(rho+kVec-1);
lnTerm2 = -yVec./beta1;
lnTerm3 = (rho+kVec)*log(beta1);
lnTerm4 = gammaln(rho+kVec);

% k=0
lnw0 = lnC;
lnTerm1_0 = log(yVec)*(rho-1);
lnTerm2_0 = -yVec./beta1;
lnTerm3_0 = rho*log(beta1);
lnTerm4_0 = gammaln(rho);

% k>=0
lnwAll = [lnw0 lnw];    lnwAll = repmat(lnwAll,[length(yVec) 1]);
lnTerm1All = [lnTerm1_0 lnTerm1];
lnTerm2All = repmat(lnTerm2,[1 length(kVec)+1]);    
lnTerm3All = [lnTerm3_0 lnTerm3];   lnTerm3All = repmat(lnTerm3All,[length(yVec) 1]);
lnTerm4All = [lnTerm4_0 lnTerm4];   lnTerm4All = repmat(lnTerm4All,[length(yVec) 1]);


fyAll = exp(lnwAll + lnTerm1All + lnTerm2All - lnTerm3All - lnTerm4All);
fy = nansum(fyAll,2);

%% Plot the distribution and compare it with synthetic data
figure; hold on;
hh = histogram(Y(:),'edgecolor','none','normalization','pdf');
lh = plot(yVec,fy,'k-','LineWidth',1);