I am currently taking a course in probability theory and have come across the notion of the Gamma distribution, which describes the probability distribution of the sum $Z_n=X_1+X_2+\cdots+X_n$, where each $X_i$ is a random variable with the exponential distribution $f_X(x)=\lambda e^{-\lambda x}$ (i.e. $X_i\sim\Gamma(1,\lambda)$). The standard derivation of the probability density function of $\Gamma(n,\lambda)$ first obtains the cumulative distribution function as a sum of probabilities computed from the theory of the Poisson process (each term in the sum is the probability that there are $k$ events, $0\le k\le n-1$, in the interval $[0,x]$), and then differentiates the cumulative distribution function to get the probability density function.
However, I find this technical derivation not transparent enough, so I tried to think about it in a different way. I was only partially successful, and I would be glad if someone could help me complete the argument.
Intuitive explanation of the Gamma distribution
The "intuitive way" is by induction. The event $z_0\le Z_n\le z_0+\epsilon$ is composed of all the sub-events of the form $(Z_{n-1} = t)\cap (X_n = z_0-t)$, where $t$ is a dummy variable ranging over the interval $[0,z_0]$. Each such event has probability $(\frac{\lambda^{n-1}e^{-\lambda t} t^{n-1}}{\Gamma(n-1)}dt)\cdot (\lambda e^{-\lambda (z_0-t)} \, dt)= \frac{\lambda^n e^{-\lambda z_0} t^{n-1}}{\Gamma(n-1)}(dt)^2$. Therefore, summing all those probabilities as $t$ ranges over $[0,z_0]$ gives a constant times $\int_0^{z_0}t^{n-1} \, dt = \frac{z_0^n}{n}$, and hence the dependency of $\Gamma(n,\lambda)$ on the Gamma function $\Gamma(n)$ and on a power of $z_0$.
My problem is not with the induction step but with the base case (moving from $n=1$ to $n=2$): I can't understand why $\Gamma(2,\lambda) = \frac{\lambda^2e^{-\lambda x} x}{2}$ rather than $\Gamma(2,\lambda)=\lambda^2e^{-\lambda x} x$?
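To convince myself that the induction step at least is sound, I checked the convolution integral numerically, using the standard closed form $\frac{\lambda^k x^{k-1}e^{-\lambda x}}{\Gamma(k)}$ for the density (a minimal sketch; the values of `lam`, `n`, and `z0` below are arbitrary choices of mine):

```python
import math

# Sketch: numerically check that convolving the Gamma(n-1, lam) pdf with the
# exponential pdf reproduces the Gamma(n, lam) pdf at a point z0.
# (lam, n, z0 are illustrative parameter choices.)
lam, n, z0 = 1.5, 4, 2.0

def gamma_pdf(x, k, lam):
    """Closed-form Gamma(k, lam) density: lam^k x^(k-1) e^(-lam x) / Gamma(k)."""
    return lam**k * x**(k - 1) * math.exp(-lam * x) / math.gamma(k)

# Midpoint Riemann sum of the convolution integral
# int_0^{z0} f_{n-1}(t) * f_1(z0 - t) dt
m = 100_000
dt = z0 / m
conv = sum(gamma_pdf((i + 0.5) * dt, n - 1, lam)
           * lam * math.exp(-lam * (z0 - (i + 0.5) * dt)) * dt
           for i in range(m))

print(conv, gamma_pdf(z0, n, lam))  # the two values agree closely
```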
By the Leibniz rule for differentiating under the integral sign:
$$\begin{align} \dfrac{\mathrm d ~~}{\mathrm d x}\int_0^x\int_0^{x-t} g(s,t)\,\mathrm d s\,\mathrm d t &=\int_0^x g(x-t, t)\,\mathrm d t\end{align}$$
Now, the sum of a single exponentially distributed random variable is of course exponentially distributed:
$$\begin{aligned}\Gamma(x\mid 1,\lambda)&=\lambda\mathrm e^{-\lambda x}\mathbf 1_{0\leqslant x}\\[1ex] &=\dfrac{\lambda^1\mathrm e^{-\lambda x}\,x^0}{\Gamma(1)}\mathbf 1_{0\leqslant x}\end{aligned}$$
So, the pdf for the sum of two iid exponential random variables is:
$$\begin{align}\Gamma(x\mid 2,\lambda) &= \mathbf 1_{0\leqslant x}\, \dfrac{\mathrm d ~~}{\mathrm d x}\int_0^x\int_0^{x-t} \Gamma(s\mid 1,\lambda)\cdot\Gamma(t\mid 1,\lambda)\,\mathrm d s\,\mathrm d t\\[1ex]&=\mathbf 1_{0\leqslant x}\, \dfrac{\mathrm d ~~}{\mathrm d x}\int_0^x\int_0^{x-t} \lambda^2\mathrm e^{-\lambda (s+t)}\,\mathrm d s\,\mathrm d t\\[1ex] &= \mathbf 1_{0\leqslant x}\, \int_0^x \lambda^2\mathrm e^{-\lambda (x-t+t)}\,\mathrm d t\\[1ex]&=\lambda^2\mathrm e^{-\lambda x}\mathbf 1_{0\leqslant x}\, \int_0^x\mathrm d t\\[1ex] &= \lambda^2\mathrm e^{-\lambda x}\,x\,\mathbf 1_{0\leqslant x}\\[2ex] &=\dfrac{\lambda^2\mathrm e^{-\lambda x}\,x^{2-1}}{\Gamma(2)}\,\mathbf 1_{0\leqslant x} \end{align}$$
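As a quick sanity check on this $n=2$ case, here is a Monte Carlo sketch comparing the empirical CDF of $X_1+X_2$ against the integral of $\lambda^2 x\,\mathrm e^{-\lambda x}$, which is $1-\mathrm e^{-\lambda x}(1+\lambda x)$ (the choices of `lam`, `x0`, and the sample size are illustrative):

```python
import math
import random

# Sketch: Monte Carlo check that X1 + X2 (iid Exponential(lam)) has
# CDF 1 - e^(-lam x)(1 + lam x), i.e. the integral of lam^2 x e^(-lam x).
random.seed(0)
lam, x0, trials = 2.0, 1.0, 200_000

# Fraction of simulated sums falling at or below x0
hits = sum((random.expovariate(lam) + random.expovariate(lam)) <= x0
           for _ in range(trials))
empirical = hits / trials
analytic = 1 - math.exp(-lam * x0) * (1 + lam * x0)

print(empirical, analytic)  # agree to roughly two decimal places
```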
Likewise, if indeed $\Gamma(x\mid n,\lambda)=\dfrac{\lambda^{n}\mathrm e^{-\lambda x}\,x^{n-1}}{\Gamma(n)}\mathbf 1_{0\leqslant x}$ for some $n\geq 1$, then
$$\begin{align}\Gamma(x\mid n+1,\lambda) &= \mathbf 1_{0\leqslant x}\, \dfrac{\mathrm d ~~}{\mathrm d x}\int_0^x\int_0^{x-t} \Gamma(t\mid n,\lambda)\cdot \Gamma(s\mid 1,\lambda)\,\mathrm d s\,\mathrm d t\\[1ex] &= \mathbf 1_{0\leqslant x}\, \int_0^x \Gamma(t\mid n,\lambda)\cdot \Gamma(x-t\mid 1,\lambda)\,\mathrm d t\\[1ex] &= \mathbf 1_{0\leqslant x}\, \int_0^x \dfrac{\lambda^{n+1}\mathrm e^{-\lambda x}\,t^{n-1}}{\Gamma(n)}\,\mathrm d t\\[1ex]&=\dfrac{\lambda^{n+1}\mathrm e^{-\lambda x}}{\Gamma(n)}\mathbf 1_{0\leqslant x}\, \int_0^x t^{n-1}\,\mathrm d t\\[1ex] &= \dfrac{\lambda^{n+1}\mathrm e^{-\lambda x}\,x^n}{\Gamma(n+1)}\,\mathbf 1_{0\leqslant x} \end{align}$$
Therefore, by induction, for all $n\geq 1$: $\Gamma(x\mid n,\lambda)=\dfrac{\lambda^n\,\mathrm e^{-\lambda x}\,x^{n-1}}{\Gamma(n)}\mathbf 1_{0\leqslant x}$.
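Finally, the closed form can be sanity-checked numerically: it should integrate to $1$ and have mean $n/\lambda$ (the sum of $n$ exponentials each with mean $1/\lambda$). A minimal sketch; `n`, `lam`, and the truncation point are illustrative choices:

```python
import math

# Sketch: check that lam^n x^(n-1) e^(-lam x) / Gamma(n) integrates to 1
# with mean n/lam, by midpoint Riemann sums on a truncated domain.
lam, n = 0.5, 5

def pdf(x):
    return lam**n * x**(n - 1) * math.exp(-lam * x) / math.gamma(n)

upper, m = 80.0, 400_000   # truncate the tail; e^(-lam x) is negligible beyond
dx = upper / m
total = mean = 0.0
for i in range(m):
    x = (i + 0.5) * dx
    total += pdf(x) * dx
    mean += x * pdf(x) * dx

print(total, mean)  # total close to 1.0, mean close to n/lam = 10.0
```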