Simulating Insurance Risk Process


Though this type of problem has been asked before, my specific issue at hand is slightly different.

The insurance Risk Process is defined as:

$R_t = a + c(t) - \sum\limits^{N(t)}_{i=1} X_i$

  1. a is the initial capital (set at 10)
  2. c(t) is the total premium collected up to time t and is set to $c(t) = 5t$; that is, income is collected at a rate of 5 per unit of time
  3. N(t) is the claim-count process, a Poisson process with rate $\lambda = 10$
  4. the $X_i$, the claim severities, are iid exponential random variables with mean $\mu = 2$ and are independent of the claim-count process

So far, I have set up my claim process, computed my premiums per unit of time (since they are simply 5 multiplied by the unit of time), and attempted to compute the total claims $\sum\limits^{N(t)}_{i=1} X_i$ per each $N(t)$ that I generated.

Afterwards, I sum everything up to get my risk process $R_i$ for each ith simulation.

For reference, here is my code in Python:

import numpy as np
import scipy as sp
from scipy import stats
import matplotlib.pyplot as plt

T = 10
a = 10
mean = 2
lam = 10
simulations = 100
days = 365*3
n = list(range(days+1))

U = np.array(np.random.uniform(size=(simulations, days)))

x = (-1/lam)*np.log(U)
x1 = np.cumsum(x, axis=1)
x1 = np.insert(x1, 0, np.zeros(shape=(1,simulations)), axis=1)

plt.figure(figsize=(10,6))
plt.step(n, x1[0])
plt.step(n, x1[1])
plt.step(n, x1[2])
plt.ylabel("Number of Events")
plt.xlabel("Time (in days)")
plt.xticks(np.arange(0, days, step = 365))
plt.title("Cumulative Events Across 3 year Period")
plt.show()

Premieums_Collected = []
for t in range(days+1):
    Premieums_Collected.append(t*5)

E = list(np.random.exponential(scale=mean, size=(1, 2000)))  

Total_Claims = []
for i in x1:
    Claim_Value_List = []
    for j in np.nditer(i):
        Claim_Value_List.append(np.sum(E[0][:int(j)]))
    Total_Claims.append(Claim_Value_List)

Risk_Process =list(np.array(Premieums_Collected) - np.array(Total_Claims))
Risk_Process = a + np.array(Risk_Process)

plt.figure(figsize=(10,6))
plt.step(n, Total_Claims[0])
plt.step(n, Total_Claims[1])
plt.step(n, Total_Claims[2])
plt.ylabel("Total Claims")
plt.xlabel("Time (in days)")
plt.xticks(np.arange(0, days, step = 365))
plt.title("3 Sample Paths Of Cumulative Claims Over 3 Year Period")
plt.show()

plt.figure(figsize=(10,6))
plt.step(n, Risk_Process[0])
plt.step(n, Risk_Process[1])
plt.step(n, Risk_Process[2])
plt.ylabel("Risk Process Value")
plt.xlabel("Time (in days)")
plt.xticks(np.arange(0, days, step = 365))
plt.title("Three Risk Process Sample Paths For 3 Year Period")
plt.show()

This is the plot for 3 sample paths of the claim process $N(t)$: [Figure]

This is the plot for the 3 sample paths showing the cumulative claims $\sum X_i$: [Figure]

Here is the plot for 3 sample paths of the overall risk process $R_t$, which, I feel, is not correct: [Figure]

There is 1 answer below.

I have been trying to understand the insurance risk process model (see here for what I referred to, especially pages 14-15 and the figures therein), and I think that the parameters in your question, especially the relation between $c$, $\lambda$ and $\mu$, seem somewhat inconsistent with the original intention of this model. I am still confused about this, but let me try to answer the question regardless of my doubts.

Understanding the model

First of all, it seems crucial to understand what $$ R_t=a+c(t)-\sum_{j=1}^{N_t}X_j $$ really means in practice, as it has significant influence on the simulation of the $\sum_{j=1}^{N_t}X_j$ term.

In this model, the parameters and variables have the following meanings.

  • $R_t$ is called the risk of the insurance at time $t$, or mathematically a risk process. This value, identical to the current capital, measures how risky the insurance is at time $t$. If $R_t$ tends to increase, that is good news for the insurance company because its total capital tends to increase; if $R_t$ tends to decrease, that may be a warning sign because its total capital tends to decrease; if $R_t$ hits $0$ or becomes negative, that is definitely bad news because the capital has been exhausted and the company is losing money.
  • $a$ is called the initial capital, namely $R_0=a$.
  • $c(t)$ is called the premium at time $t$, meaning the total premium that the insurance company has collected from its customers by time $t$ in order to ensure its own survival. Survival here corresponds to $\mathbb{E}R_t\ge a$ for all $t>0$, i.e., the total capital should be, in expectation, no less than the initial capital. The determination of $c(t)$ is thus a crucial part of this model, which will be discussed in the next section.
  • $N_t$ is a counting process, telling how many claims customers have made by time $t$. It is assumed that $N_t$ is an independent increment process, i.e., any $N_t-N_s$ with $t>s\ge 0$ is independent of all $N_u$ with $0\le u\le s$. It is also assumed that $\mathbb{E}\left(N_t-N_s\right)=\lambda\left(t-s\right)$ for all $t>s\ge 0$, i.e., $\lambda$ is the expected number of new claims per unit of time.
  • $\left\{X_j\right\}_{j=1}^{\infty}$ is a sequence of independent and identically distributed random variables, each of which corresponds to a claim severity, i.e., how much money a customer asks the insurance company for reimbursement in a single claim. These are independent from the number of accumulated claims $N_t$, and time $t$ as well. Assume $\mathbb{E}X_j=\mu$.

How to determine $c(t)$?

With the above understanding of the model, a natural question is how to figure out a reasonable $c(t)$. That is, how to find a $c(t)$ such that $\mathbb{E}R_t\ge a$. This is a purely mathematical question, and we have \begin{align} \mathbb{E}R_t&=\mathbb{E}\left(a+c(t)-\sum_{j=1}^{N_t}X_j\right)\\ &=a+c(t)-\mathbb{E}\left(\sum_{j=1}^{N_t}X_j\right)\\ &=a+c(t)-\mathbb{E}\left[\mathbb{E}\left(\sum_{j=1}^{N_t}X_j\Bigg|N_t\right)\right]&&\text{(conditional expectation)}\\ &=a+c(t)-\mathbb{E}\left[\sum_{j=1}^{N_t}\mathbb{E}\left(X_j|N_t\right)\right]\\ &=a+c(t)-\mathbb{E}\left[\sum_{j=1}^{N_t}\mathbb{E}X_j\right]&&\text{(independence between $X_j$ and $N_t$)}\\ &=a+c(t)-\mathbb{E}\left[\sum_{j=1}^{N_t}\mu\right]\\ &=a+c(t)-\mathbb{E}\left(\mu N_t\right)\\ &=a+c(t)-\mu\mathbb{E}N_t\\ &=a+c(t)-\mu\lambda t. \end{align} Consequently, $$ \mathbb{E}R_t\ge a\iff c(t)-\lambda\mu t\ge 0. $$ Therefore, a conscientious insurance company would choose $$ c(t)=\lambda\mu t, $$ or, taking some business charges into consideration, $$ c(t)=\left(1+\theta\right)\lambda\mu t, $$ where $\theta>0$ is called the relative safety loading, i.e., a proportional markup on the expected claims.
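The identity $\mathbb{E}R_t=a+c(t)-\lambda\mu t$ can be checked numerically. The following is only a minimal sketch, assuming (as in the question) that $N_t$ is Poisson and the $X_j$ are exponential; the time `t = 3.0`, the seed and the number of paths are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, arbitrary choice

a, lam, mu = 10.0, 10.0, 2.0            # values taken from the question
t = 3.0                                 # arbitrary time point (assumption)
c_t = lam * mu * t                      # net premium c(t) = lambda * mu * t

n_paths = 20_000                        # number of Monte Carlo samples (assumption)

# Number of claims by time t in each sample: N_t ~ Poisson(lam * t)
counts = rng.poisson(lam * t, size=n_paths)

# Draw every claim severity at once, then add them up per sample
claims = rng.exponential(mu, size=counts.sum())
path_id = np.repeat(np.arange(n_paths), counts)
total_claims = np.bincount(path_id, weights=claims, minlength=n_paths)

R_t = a + c_t - total_claims
mean_R = R_t.mean()
expected = a + c_t - lam * mu * t       # equals a when c(t) = lambda * mu * t

print("sample mean of R_t:", mean_R)
print("a + c(t) - lambda*mu*t:", expected)
```

With the net premium $c(t)=\lambda\mu t$ the sample mean should land close to $a$, up to Monte Carlo error.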

Approach to numerical simulations: A naïve way

With the understanding from above, it is now clear how to perform numerical simulations for $R_t$. Let us first state a naïve way, straightforward but tedious, and then move to an implementation-friendly approach.

Let $0=t_0<t_1<t_2<\cdots<t_M=T$ be the equi-spaced sampling of times, with $t_j=jk$ and $k=T/M$ the unit time. For example, if $T=365$ days and if the unit time for claim is a single day, then $M=365$ and $k=1$.

At the initial moment $t_0$, we have $R_{t_0}=R_0=a$.

For each moment $t_j$ with $j\ge 1$, generate the number of new claims $N_{t_j}-N_{t_{j-1}}$, which, according to the assumption, has a mean of $\lambda\left(t_j-t_{j-1}\right)$. Say, $N_{t_j}-N_{t_{j-1}}=5$. Then generate $5$ independent claim severities $X_1$, $X_2$, ..., $X_5$. The amount $\sum_{i=1}^5X_i$ is deducted from $R_{t_{j-1}}$. We also have income from the premium, which is $\left(1+\theta\right)\lambda\mu\left(t_j-t_{j-1}\right)$. Therefore, $$ R_{t_j}=R_{t_{j-1}}+\left(1+\theta\right)\lambda\mu\left(t_j-t_{j-1}\right)-\sum_{i=1}^5X_i. $$
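The step-by-step recursion above could be sketched as follows. This is an illustration rather than a definitive implementation: the function name `simulate_naive` is hypothetical, and Poisson increments and exponential severities are assumed, as in the question.

```python
import numpy as np

def simulate_naive(a, lam, mu, theta, T, M, rng):
    """One path of R_t on the grid t_j = j*k, k = T/M, built step by step."""
    k = T / M
    premium_per_step = (1 + theta) * lam * mu * k
    R = np.empty(M + 1)
    R[0] = a                                      # R_{t_0} = a
    for j in range(1, M + 1):
        n_new = rng.poisson(lam * k)              # new claims in (t_{j-1}, t_j]
        claims = rng.exponential(mu, size=n_new)  # their severities
        R[j] = R[j - 1] + premium_per_step - claims.sum()
    return R

# Example: 36 months, daily steps (k = 1/30 month), no safety loading
rng = np.random.default_rng(1)
R = simulate_naive(a=10.0, lam=10.0, mu=2.0, theta=0.0, T=36.0, M=1080, rng=rng)
```

Each step draws its own random numbers, which is easy to read but slow in pure Python when many paths are needed.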

Approach to numerical simulations: An implementation-friendly approach

First of all, generate values for the $M$ random variables \begin{align} N_{t_j}^{\Delta}:=N_{t_j}-N_{t_{j-1}},&&j=1,2,\cdots,M. \end{align} These variables are independent and identically distributed, with a common mean of $\lambda\left(t_j-t_{j-1}\right)=\lambda k$. As such, there are $N_T=N_{t_M}=\sum_{j=1}^MN_{t_j}^{\Delta}$ claims by time $T$.

Secondly, generate values for the $N_T$ random variables \begin{align} X_i,&&i=1,2,\cdots,N_T. \end{align} These variables are also independent and identically distributed, with a common mean of $\mu$.

Thirdly, compute the value of each $R_{t_j}$ by using $$ R_{t_j}=a+c(t_j)-\sum_{i=1}^{N_{t_j}}X_i $$ or by using $$ R_{t_j}=R_{t_{j-1}}+c(t_j)-c(t_{j-1})-\sum_{i=N_{t_{j-1}}+1}^{N_{t_j}}X_i. $$
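The three steps above might be written with NumPy as below. This is a sketch under stated assumptions: Poisson increments, exponential severities, and a linear premium $c(t)=c_{\text{rate}}\,t$; the names `simulate_vectorized` and `c_rate` are hypothetical.

```python
import numpy as np

def simulate_vectorized(a, c_rate, lam, mu, T, M, rng):
    """One path of R_t on an equispaced grid, assuming c(t) = c_rate * t."""
    k = T / M
    t = np.linspace(0.0, T, M + 1)

    # Step 1: claim-count increments, then cumulative counts N_{t_j} (N_0 = 0)
    increments = rng.poisson(lam * k, size=M)
    N = np.concatenate(([0], np.cumsum(increments)))

    # Step 2: all N_T claim severities at once
    X = rng.exponential(mu, size=N[-1])

    # Step 3: partial[n] = X_1 + ... + X_n, so partial[N[j]] is the sum up to N_{t_j}
    partial = np.concatenate(([0.0], np.cumsum(X)))
    S = partial[N]
    R = a + c_rate * t - S
    return t, N, S, R

rng = np.random.default_rng(2)
# c_rate = 5 is the question's premium; lam * mu = 20 would be the net premium
t, N, S, R = simulate_vectorized(a=10.0, c_rate=5.0, lam=10.0, mu=2.0,
                                 T=36.0, M=1080, rng=rng)
```

Indexing the partial sums by $N_{t_j}$ ties each claim to the step in which it arrives, so every path draws its own severities and no Python-level loop over time steps is needed.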

Preliminary results and discussions

Now I will make use of the given data and provide respective results.

The question states that $\lambda=10$, which I interpret as "ten claims per month". So set the unit of time to "month". Suppose we simulate three consecutive years; then the total time is $T=36$ months. Suppose we simulate the claims day by day; then the time step is $k=1/30$ month.

The question also states that $c(t)=5t$, which is quite a strange condition. In fact, according to the calculation above, the company expects to pay out $\lambda\mu t=20t$ units of money by time $t$. Obviously, a premium rate of $5$ is far too small to cover this, and we may expect the company to go bankrupt almost surely, and very soon. Here is a set of $t$-$R_t$ simulation results.

[Figure: five simulated trajectories of $R_t$ against $t$ with $c(t)=5t$]

In the above figure, there are five trajectories for $R_t$. As can be seen, all trajectories hit the $x$-axis very soon, and then go negative. That is, all simulations tell a story of bankruptcy.
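As a rough check on how soon this should happen, the formula for $\mathbb{E}R_t$ derived earlier, applied with the question's parameters, gives \begin{align} \mathbb{E}R_t=a+5t-\lambda\mu t=10+5t-20t=10-15t, \end{align} which reaches $0$ at $t=10/15=2/3$ month, i.e., about $20$ days if a month is taken as $30$ days. This describes only the mean; individual trajectories may cross zero somewhat earlier or later.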

By contrast, if we give up the $c(t)=5t$ statement, and use $c(t)=\left(1+\theta\right)\lambda\mu t$ as mentioned above instead, the results appear to be more acceptable (here we set $\theta=0$, meaning no charge for insurance processing).

[Figure: five simulated trajectories of $R_t$ against $t$ with $c(t)=\lambda\mu t$, i.e., $\theta=0$]

In the above figure, all the other parameters are the same as before. As can be seen, only one trajectory (the green one) unfortunately hits the $x$-axis, meaning that the company goes bankrupt. The remaining trajectories all stay above the $x$-axis, with some breathtaking stretches of steady growth.

However, even this second figure confused me. How can a company with an initial capital of 10 units end up with a total capital of roughly $40$ (blue curve), $60$ (red curve) and even $80$ (purple curve) units?! I would assume it is because the parameters are artificial, so that the simulation results have no practical meaning. However, even with the parameters specified in the reference mentioned above (page 14), the results are still quite weird, even bizarre... The parameters there are $a=10$, $\lambda=4.81$ (monthly-based), $\mu=12.6795$ (the reference assumes log-normally distributed claims, but this parameter also applies to our exponentially distributed case), and $\theta=0.05$. In this case, the simulations also failed me, as follows.

[Figure: simulated trajectories of $R_t$ with $a=10$, $\lambda=4.81$, $\mu=12.6795$, $\theta=0.05$]

The overall trend of each trajectory above is, well, somewhat acceptable. But if we focus on the scale of the vertical axis (which corresponds to $R_t$), this is even more confusing than the previous case. In addition, these results differ significantly from what the reference provides (page 15).

I tried to understand the parameter $\mu$ carefully, since the reference did not mention its unit. This $\mu$ is an average amount of each claim, but I am not sure how it is evaluated in the $\mu=12.6795$ example. Interestingly, if I reset $\mu\to\mu/30$, the results appear to be really nice and consistent with those on page 15.

[Figure: simulated trajectories of $R_t$ after rescaling $\mu\to\mu/30$]

This $30$, interestingly, is exactly the rescaling factor of time (I took "month" as my time unit; for "day", there is a factor of $30$). Nevertheless, I have no idea how this factor arises here...

Back to the poster's original question. The accumulated number of claims is fine.

[Figure: accumulated number of claims $N_t$ over $36$ months]

With a monthly average of $\lambda=10$, the total number of claims after $36$ months should be around $360$. The figure above definitely validates this estimate.

Moreover, the total amount of claims is also well-behaved.

[Figure: accumulated claim amount $\sum_{j=1}^{N_t}X_j$ over $36$ months]

With an average amount of $\mu=2$ for each claim, it is straightforward that the total amount should be around $2\times 360=720$. The figure above provides consistent simulations with this estimate.
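The spread around this average can be estimated as well. Assuming, as in the question, that $N_t$ is a Poisson process and that the $X_j$ are exponential with mean $\mu$ (so that $\mathbb{E}X_j^2=2\mu^2$), the variance formula for a compound Poisson sum gives \begin{align} \operatorname{Var}\left(\sum_{j=1}^{N_T}X_j\right)&=\lambda T\,\mathbb{E}X_j^2=10\times 36\times 2\times 2^2=2880,\\ \sqrt{2880}&\approx 53.7. \end{align} So total claims of roughly $720\pm 54$ at $T=36$ are typical. Since the premium is deterministic, $R_T$ has the same standard deviation. With $\theta=0$, terminal values of $40$ to $80$ would therefore lie about $0.6$ to $1.3$ standard deviations from $a=10$, if the values quoted earlier are read off near the end of the horizon.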