combining continuous probability distributions.

I'm trying to brush up on my continuous stats by working on a toy project, a small project planner.

I am modeling each task as a distribution (specifically a PERT distribution, but let's stay generic) to handle uncertainty in the duration of each task.

Now I'm wondering how I can "combine" two distributions in the two cases I have to handle if I want to build some sort of Gantt chart of a project (I consider tasks to be completely independent of one another):

  • a task blocked by another one: what is the distribution of the result? (I'm thinking a convolution could be the right tool here, but not sure at all)
  • two tasks run in parallel, so the result would be the probability of having both done by a given time. Here I am a lot less certain. I've read about joint probability distributions, but I'm not even sure that's what I'm after (and I'm not sure I understand the computation either)

Best answer

Let the random variable $T_1$ be the time to complete task 1, with cumulative distribution function (CDF) $F_1(t)$. Since you seem interested in continuous distributions, I'll assume that a probability density function (PDF) $f_1$ also exists. Let $T_2$, $F_2$, and $f_2$ be the analogous objects for task 2. If $T_1$ and $T_2$ may be dependent, we need the joint CDF and PDF, denoted $F_{12}$ and $f_{12}$. These are bivariate functions: integrating $f_{12}(t, s)$ over a region of 2D space gives the probability that the point $(T_1, T_2)$ lies in that region. Meanwhile, $F_{12}(t, s)$ is the probability that $T_1 \le t$ and $T_2 \le s$. Since durations are nonnegative, it follows that

$$ F_{12}(t, s) = \int_0^t \int_0^s f_{12}(u, v) \text{ d}v \text{ d}u. $$

If $T_1$ and $T_2$ are independent (perhaps a reasonable assumption in your case), then $f_{12}(t, s) = f_{1}(t)f_{2}(s)$ and $F_{12}(t, s) = F_{1}(t)F_{2}(s)$.
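
As a quick sanity check of this factorization, here is a worked example under an assumed model that is not part of the question: independent exponential durations with rates $\lambda_1$ and $\lambda_2$. Then $f_{12}(t, s) = \lambda_1 e^{-\lambda_1 t} \, \lambda_2 e^{-\lambda_2 s}$, and the double integral for $F_{12}$ separates into two one-dimensional integrals:

$$ F_{12}(t, s) = \int_0^t \lambda_1 e^{-\lambda_1 u} \text{ d}u \int_0^s \lambda_2 e^{-\lambda_2 v} \text{ d}v = (1 - e^{-\lambda_1 t})(1 - e^{-\lambda_2 s}) = F_1(t)F_2(s). $$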

  • If the tasks are done consecutively, then the completion time is $T_1 + T_2$, which has density given by

    $$ \int_0^t f_{12}(s, t-s) \text{ d}s. $$

    Intuitively, the integral above aggregates, over all times $s \in [0, t]$, the likelihood that task 1 takes $s$ units of time and task 2 takes the remaining $t-s$ units. Under independence, this integral is exactly the convolution of $f_1$ and $f_2$, $$ \int_0^t f_{1}(s)f_2(t-s) \text{ d}s. $$ The probability of being done by time $t$ is the integral of the density up to $t$,

    $$ \int_0^t \int_0^s f_{12}(r, s-r) \text{ d}r \text{ d}s. $$

  • The other case is actually easier. If tasks 1 and 2 can be parallelized then the time to completion is just the maximum of $T_1$ and $T_2$. The probability of being done by time $t$ is the probability that both $T_1$ and $T_2$ are less than $t$, given by $F_{12}(t, t)$, or with independence, $F_1(t)F_2(t)$.
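
Both cases can be checked numerically. What follows is only a minimal sketch, assuming independence and two made-up PERT tasks: the parameters (2, 4, 8) and (1, 3, 9), the grid step, and the helper names `pert_pdf`, `convolve`, and `cumulative` are illustrative choices, not anything from the question. It samples the densities on a grid starting at 0, approximates the convolution integral with a Riemann sum for the consecutive case, and multiplies the two CDFs for the parallel case.

```python
import math


def pert_pdf(t, low, mode, high):
    """Density at t of a PERT(low, mode, high) duration (a Beta rescaled to [low, high])."""
    if not low <= t <= high:
        return 0.0
    a = 1.0 + 4.0 * (mode - low) / (high - low)
    b = 1.0 + 4.0 * (high - mode) / (high - low)
    beta_fn = math.gamma(a) * math.gamma(b) / math.gamma(a + b)  # B(a, b)
    x = (t - low) / (high - low)
    return x ** (a - 1) * (1 - x) ** (b - 1) / (beta_fn * (high - low))


def convolve(f, g, dt):
    """Riemann-sum approximation of int_0^t f(s) g(t - s) ds on the grid t = k * dt."""
    return [dt * sum(f[j] * g[k - j] for j in range(k + 1)) for k in range(len(f))]


def cumulative(density, dt):
    """Running Riemann sum: turns a gridded density into a gridded CDF."""
    total, out = 0.0, []
    for value in density:
        total += value * dt
        out.append(total)
    return out


# Assumed example: two independent tasks with PERT-distributed durations.
dt = 0.05
grid = [i * dt for i in range(400)]  # covers [0, 20); the sum is at most 8 + 9 = 17
f1 = [pert_pdf(t, 2.0, 4.0, 8.0) for t in grid]
f2 = [pert_pdf(t, 1.0, 3.0, 9.0) for t in grid]

F1, F2 = cumulative(f1, dt), cumulative(f2, dt)

# Consecutive tasks: density of T1 + T2 is the convolution of f1 and f2.
f_seq = convolve(f1, f2, dt)
F_seq = cumulative(f_seq, dt)

# Parallel tasks: CDF of max(T1, T2) is F1(t) * F2(t).
F_par = [p * q for p, q in zip(F1, F2)]

k = round(10.0 / dt)  # grid index of t = 10
print(f"P(consecutive tasks done by t=10) ~ {F_seq[k]:.3f}")
print(f"P(parallel tasks done by t=10)    ~ {F_par[k]:.3f}")
```

The pure-Python convolution keeps the sketch dependency-free. If NumPy is available, `numpy.convolve(f1, f2)[:len(grid)] * dt` should compute the same Riemann sum much faster. For dependent tasks neither shortcut applies, and one would have to integrate the joint density $f_{12}$ directly.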