I hope this is the right section for this kind of questions.
I am trying to simulate, with MATLAB, a diffusion model starting from a Random Walk. I am using a Random Walk with information increment X normally distributed ($\mu, \sigma$ ). I also have a boundary $\alpha $, and $\alpha > \mu$. The starting point is 0.
If I understood this right, this should be an approximation of the Wiener Process. As wikipedia says (http://en.wikipedia.org/wiki/Inverse_Gaussian_distribution), the first passage time for a fixed level $\alpha > 0 $by $X_t$ is distributed according to an inverse-gaussian: $ T_\alpha = \inf\{ 0 < t \mid X_t=\alpha \} \sim IG(\tfrac\alpha\nu, \tfrac {\alpha^2} {\sigma^2}).\, $
What I am trying to do is to simulate a Random Walk and to get the first passage time distribution, verifying that it is actually a Inv. Gaussian with those parameters.
This is the code I have done: http://pastebin.com/E1N58sJ4
Notice that the myHist function is commented, but I normally use it: in that function I fit the resulting distribution to an inverse gaussian. Then I compare the fitted parameters with the two parameters obtained by the formula showed above, $\mu= \alpha/v $ and $ \lambda= \alpha^2/\sigma^2 $
However, the two results are NOT THE SAME, and they differ consistently across simulation. For example, with the parametrers used in the code:
Simulation - mu:29.1771 s: 116.7757 Expected - mu:26.3158 s: 100
This difference is consistent across repetition of simulation.
Can anyone spot the mistake?
{I wrote this question in CrossValidated, but I deleted the post there and I put it here, because it seems more appropriate. One -not definitive- answer that I've got regarded the time step size of my function. Unfortunately, I don't know how to change that, so if you could provide some help with the implementation that would be much appreciated}
I did not spot any mistakes in the code. The discrepancy is probably due to low sampling rate used in simulating the Wiener process. Based on comments, you also need some background information about what is going on when simulating the Wiener process by discrete steps. In this answer, I first give some background information and then illustrate the error caused by simulating the boundary crossing time by too low sampling rate.
Background
I denote by $W_t$ the standard Wiener process (with no drift), whence \begin{equation} X_t = W_t + \nu\,t,~t\geq 0 \end{equation} is the process you want to simulate. So, it suffices to simulate $W_t$ and adding the drift is simple. The Wiener process (http://en.wikipedia.org/wiki/Wiener_Process) is a continuous-time stochastic process with $W_t=0$ and the increments are distributed as \begin{equation} W_t - W_s \sim N(0, t-s) \end{equation} furthermore, increments over separate intervals (say, $W_3-W_2$ and $W_2-W_1$) are independent. Note that $t-s$ here is the variance of the increment, not standard deviation. We cannot actually simulate a continuous realization from this process using a computer + random number generator, but must resort to discrete samples, e.g., at times $\Delta t, 2\Delta t, 3\Delta t$ etc. Based on the independent increments property, we can obtain these samples by the following algorithm
For Wiener process, this method is exact in the sense that the distribution of thse $W_{k\Delta t}$ will be correct (also the joint distribution). The values of $X_t$ at the same points are obtained by simply adding the drift.
The cause of the error described in the question
Your code uses the method I described previously, with $\Delta t = 1$, to simulate $X_t$ for $t=1,2,\ldots$. Then, you find the first $X_t$ that exceeds the boundary, and estimate the boundary-crossing time using linear interpolation between $X_{t-1},X_t$. This corresponds to approximating the process $X_t$ by linear interpolation for non-integer $t$ and finding where this interpolated process crosses the boundary. However, it turns out that this approximation overestimates $T_\alpha$ on average, because it misses cases where the continuous process $X_t$ crosses the boundary and returns below the boundary before next sampled value.
See the following figure for an illustration, I simulated $X_t$ with $\Delta t=0.01$ (the red curve). The blue curve is simulation with $\Delta t = 1$ (which I obtained by taking every 100th value of the red curve). The blue curve corresponds to what your code simulates. The vertical black lines describe the times when the simulated processes cross the boundary. The red curve crosses the boundary at $T_\alpha = 20.1$, but as it returns below the boundary before $t = 21$, the blue curve does not cross the boundary here but only much later at $\hat{T}_\alpha = 29.5$. Apparently, this is typical enough to cause the discrepancy described in your code. However, note that I 'cheated' and simulated several datasets before obtaining a discrepancy this big, it doesn't happen every time).
Implementation tip
Your code has a line amounting to $X_{k+1} = X_k + N(v,1)$. Change this to $X_{\Delta t(k+1)} = X_{\Delta t\,k} + N(\Delta t\,v,\Delta t)$ to achieve higher sampling rate. When finding the boundary crossing time, note that the times are now $\Delta t,2\Delta t,\ldots$ instead of $1,2,\ldots$.