Poisson: Probability distribution of amount of fuel per day based on arrival rate of cars at gas station.

68 Views Asked by At

I am modeling a hydrogen refueling station with an average of 2.5 cars arriving per day. This can be modeled using a Poisson distribution.

Each car has the following (discrete) probability distribution for the amount of hydrogen that it refuels. This probability distribution is based on a histogram derived from historical data.

How can I create a probability distribution for the total amount of hydrogen tanked per day?

I have been thinking how I can "add" probability distributions but cannot find a sensible way to do this. I assume it has something to do with the convolution of a Poisson distribution with another distribution.

The purpose is to create a discrete probability distribution and use it as an input to the Stochastic Dual Dynamic Programming (SDDP) algorithm implemented in SDDP.jl in order to optimize an energy system. So, my desired outcome is something like this, but then of course with correct data (i.e. no uniform distribution).

2

There are 2 best solutions below

0
On BEST ANSWER

The resulting distribution is a discrete compound Poisson distribution.

The python code below can create the distribution. Be aware that this code is not specific to the Poisson process but can take any distribution for the amount of cars per day.

import pandas as pd
import numpy as np

def poisson_plus_discrete(n_cars_dist, kg_dist, kg_step=1, verbose=False):
    #lower bound of the amount of cars and kg must be zero!
    #discrete compound poisson distribution
    if sum(n_cars_dist) != 1:
        print('ERROR: probabilities do not add up to 1.')
    max_cars = len(n_cars_dist) - 1

    cols = np.arange(0, max_cars*kg_step*(len(kg_dist)-1) + kg_step/2, kg_step).tolist()
    condf = pd.DataFrame(columns=['prob']+cols)
    condf.index.name = 'n_cars' 
    kg_row_0 = padding([1.0], len(condf.columns) - 1)
    condf.loc[0] = np.insert(kg_row_0, 0, n_cars_dist[0])
    kg_row_1 = padding(kg_dist, len(condf.columns) - 1)
    condf.loc[1] = np.insert(kg_row_1, 0, n_cars_dist[1])
    probabilities = n_cars_dist[0] * kg_row_0 + n_cars_dist[1] * kg_row_1

    convolved = kg_dist
    for i in range(2, max_cars + 1):
        convolved = np.convolve(convolved, kg_dist) #new length is N + M - 1
        kg_row = padding(convolved, len(condf.columns) - 1)
        condf.loc[len(condf)] = np.insert(kg_row, 0, n_cars_dist[i])
        probabilities = probabilities + n_cars_dist[i] * kg_row

    if verbose:
        print(condf)
        print('\nprobabilities:', probabilities.tolist())
        print('kg of hydrogen:', cols)

    return probabilities

n_cars_dist = [0.3, 0.3, 0.4]
#n_cars_dist = [0.3, 0.3, 0.2, 0.1, 0.05, 0.05] # for the values [0,1,2]
kg_dist = [0.1, 0.6, 0.3] # for the values [0,1,2]
kg_step = 1
poisson_plus_discrete(n_cars_dist, kg_dist, kg_step, verbose=True)
2
On

What is described is a compound Poisson distribution.

It's straightforward to numerically generate a discrete probability distribution, by successive convolutions of the refueling distribution with itself and truncating the number of cars arriving per day at a relatively large number (sorry, it's in R, not Julia):

library(data.table)
set.seed(34836697)

# dummy refuel mass probabilities for 0.2:20 kg
m <- runif(50)
m <- m/sum(m)

lambda <- 2.5 # average cars/day

system.time({
  g200_0 <- 1:50 # tanked amount/car in units of 200 g
  # Truncate the number of cars/day. The probability of seeing more cars than
  # this is less than machine precision.
  maxcars <- qpois(.Machine$double.eps, lambda, 0)
  # initialize a list to hold data.tables of total daily tanked probabilities
  # for each of 0:maxcars number of cars in a day
  ldt <- vector("list", maxcars + 1L)
  # if no cars come the tanked amount is 0 with probability 1
  ldt[[1]] <- data.table(g200 = 0L, p = 1, cars1 = 1L)
  # get the distribution of tanked amount conditioned on the total number of
  # cars arriving in a day
  for (i in 2:(maxcars + 1L)) {
    ldt[[i]] <- ldt[[i - 1L]][
      ,.(p = c(outer(p, m)), g200 = c(outer(g200, g200_0, "+")))
    ][, .(p = sum(p), cars1 = i), g200]
  }
  
  d <- dpois(0:maxcars, lambda) # total daily car visit probabilities
  # get the marginal distribution of mass tanked
  dt <- setnames(
    setorder(rbindlist(ldt)[, .(p = sum(p*d[cars1])), g200], g200)[
      , g200 := g200/5 # convert to kg
    ], "g200", "kg"
  )
})
#>    user  system elapsed 
#>    0.07    0.00    0.08

dt contains the probabilities of daily tanked amounts in kg

sum(dt$p)
#> [1] 1

dt[]
#>          kg            p
#>    1:   0.0 8.208500e-02
#>    2:   0.2 3.700003e-03
#>    3:   0.4 8.753526e-04
#>    4:   0.6 8.869384e-03
#>    5:   0.8 3.139474e-03
#>   ---                   
#> 1197: 239.2 2.296300e-48
#> 1198: 239.4 5.492619e-49
#> 1199: 239.6 1.250848e-49
#> 1200: 239.8 1.860561e-50
#> 1201: 240.0 3.030906e-51