Consider an M/G/1 queue with Poisson arrivals of rate $\lambda$. This comes from Cox and Miller's (1965) "The Theory of Stochastic Processes" (pp. 240-241) and also Cox and Isham's 1986 paper "The Virtual Waiting-Time and Related Processes" (http://www.jstor.org/stable/1427312).
My question is: what is the difference between (using the authors' notation) $p_0(t)$ and $p(0,t)$? In equilibrium the authors' equations give $p(0) = \lambda p_0$, but I don't follow the reasoning, as what follows may make clear.
In the 1965 book (the 1986 paper presents the differentials of the same equations), $X(t)$ is the "virtual waiting time" of a process and the book writes of "a discrete probability $p_0(t)$ that $X(t)=0$, i.e., that the system is empty, and a density $p(x,t)$ for $X(t)>0$".
The system consumes virtual waiting time at unit rate, i.e., if $X(t)\leq\Delta t$ and there are no arrivals in time $\Delta t$ then $X(t + \Delta t) = 0$.
The distribution function of $X(t)$ is then given by: $$F(x,t)=p_0(t)+\int_{0}^{x}p(z,t)dz$$
They then state: $$p_0(t+\Delta t)=p_0(t)(1-\lambda\Delta t) +p(0,t)\Delta t(1 - \lambda\Delta t) + o(\Delta t)$$
The first term of the RHS seems clear - the probability that the system is empty at $t$ multiplied by the probability there will be no arrivals in $\Delta t$, but the second is not clear to me at all.
I assume this term accounts for the probability of the system "emptying" during $\Delta t$, but I don't see how that works. Is anyone able to explain?
In other words, how does $p(0,t)\Delta t(1 - \lambda\Delta t)$ represent this draining? Presumably $(1 - \lambda\Delta t)$ again represents the possibility of zero arrivals in $\Delta t$, so how does $p(0, t)\Delta t$ represent the $X(t) \leq \Delta t$ situation?
If we take the equilibrium situation where $p_0(t) = p_0$ and $p(x, t) = p(x)$ then, subtracting $p_0(t)$ from both sides, dividing by $\Delta t$, letting $\Delta t \to 0$ and setting $p^{\prime}_0 = 0$, we get $p(0) = \lambda p_0$ - so, again, what does $p(0)$ represent?
First, $p_{0}(t)$ is the discrete part of the probability distribution, while $p(0,t)$ is the density of the continuous part evaluated at $x=0$ (strictly, its limit as $x$ decreases to $0$). Here, there's a nonzero probability that the waiting time is exactly $0$, but zero probability that it is exactly, say, 0.3 seconds. You'd have to ask instead what the probability is that the waiting time lies between, say, 0.3 and 0.4 seconds. The former is discrete and the latter continuous (usually represented by a probability density function, which you have to integrate to get a probability, a bit like you have to integrate velocity to get displacement). The probability distribution they're considering is the sum of the two. (People will sometimes think of the sum as a probability density with spikes/atoms/Dirac delta functions wherever a single value has nonzero probability. In terms of the cumulative distribution function, the discrete part shows up as jumps or steps. Here, you only have one, at $0$.)
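To make the atom-plus-density picture concrete, here is a minimal numerical sketch. It assumes the special case of exponential service times (an M/M/1 queue) in equilibrium, which is not the general M/G/1 setting of the question; the rates `lam` and `mu` and the function names are illustrative choices, not anything from the book. In that case the standard result is an atom $p_0 = 1-\rho$ at zero and a density $p(x) = \lambda(1-\rho)e^{-(\mu-\lambda)x}$ for $x>0$, with $\rho = \lambda/\mu$.

```python
import math

# Assumed example values (hypothetical): arrival rate and service rate, lam < mu.
lam, mu = 1.0, 2.0
rho = lam / mu

# Discrete part: probability that the waiting time is exactly 0.
p0 = 1.0 - rho


def density(x):
    """Continuous part p(x) for x > 0 in the assumed M/M/1 equilibrium case."""
    return lam * (1.0 - rho) * math.exp(-(mu - lam) * x)


def cdf(x):
    """F(x) = p0 + integral of p(z) from 0 to x, written in closed form."""
    if x < 0:
        return 0.0
    return p0 + rho * (1.0 - math.exp(-(mu - lam) * x))


# The atom: F jumps from 0 to p0 at x = 0.
jump_at_zero = cdf(0.0) - cdf(-1e-12)

# The density only gives probabilities over intervals, e.g. between 0.3 and 0.4.
prob_interval = cdf(0.4) - cdf(0.3)

print("P(X = 0)          =", jump_at_zero)
print("P(0.3 < X <= 0.4) =", prob_interval)
print("F(50)             =", cdf(50.0))
```

The jump of `cdf` at zero is the discrete probability, while any single point $x>0$ contributes nothing; only intervals pick up probability from `density`.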
This also explains the term you're wondering about. The two terms are the contributions of the two parts of the probability distribution. The first is as you described. It's also necessary to consider the contribution of the continuous part. Your intuition is reasonable: you can think of it as the probability that the waiting time is positive but very short, smaller than $\Delta t$. This is given by the integral $$\int_{0}^{\Delta t}p(z,t)dz$$ (integrate the density over all $z$ between $0$ and $\Delta t$). This can be approximated by taking a value of $p(x,t)$ near $x=0$ and multiplying by $\Delta t$ (just as, if you know the instantaneous velocity and assume it doesn't change much, you can multiply by the time interval as though the velocity were constant). Approximating the integrand by the constant value $p(0,t)$, we get $p(0,t)\Delta t$, with an error that is $o(\Delta t)$. As in the first term, this gets multiplied by $1-\lambda\Delta t$, the probability that no one arrived during $\Delta t$.
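As for what $p(0)$ represents in equilibrium, the following short derivation is a sketch that uses only the balance equation quoted in the question, under the assumption that $p(x,t)$ is continuous from the right at $x=0$. Subtracting $p_0(t)$ from both sides and absorbing the $\lambda(\Delta t)^2$ term into $o(\Delta t)$ gives $$p_0(t+\Delta t)-p_0(t) = -\lambda p_0(t)\Delta t + p(0,t)\Delta t + o(\Delta t)$$ Dividing by $\Delta t$ and letting $\Delta t \to 0$: $$p_0^{\prime}(t) = -\lambda p_0(t) + p(0,t)$$ In equilibrium $p_0^{\prime} = 0$, so $$p(0) = \lambda p_0$$ Read this way, $p(0)$ is the rate (probability per unit time) at which the system drains into the empty state, and $\lambda p_0$ is the rate at which an arrival takes it out of the empty state. In equilibrium the two flows balance.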