According to queueing theory, the average delay (total time in the system, waiting plus service) for an M/M/1 queue is $\frac{1}{\mu-\lambda}$, where $\mu$ is the average service rate and $\lambda$ is the average arrival rate.
Is there an intuitive explanation for what happens at full utilization, i.e. for an arrival rate equal to the service rate? Why does the expected delay become infinite? Or is that formula not applicable for $\mu = \lambda$?
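For what it's worth, the formula is easy to check numerically; below is a minimal Monte Carlo sketch in Python (the function name `mm1_mean_sojourn` is just my own choice) that simulates successive customers via the Lindley recursion. The estimates track $\frac{1}{\mu-\lambda}$ for $\lambda\lt\mu$ and blow up as $\lambda\to\mu$, which is what prompted the question.

```python
import random

def mm1_mean_sojourn(lam, mu, n=200_000, seed=0):
    """Estimate the mean time in system (wait + service) of an
    M/M/1 queue by simulating n successive customers with the
    Lindley recursion W_{k+1} = max(0, W_k + S_k - A_{k+1})."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n):
        service = rng.expovariate(mu)            # Exp(mu) service time
        total += wait + service                  # this customer's sojourn time
        gap = rng.expovariate(lam)               # Exp(lam) interarrival gap
        wait = max(0.0, wait + service - gap)    # waiting time of the next customer
    return total / n

mu = 1.0
for lam in (0.5, 0.8, 0.95):
    print(f"lambda={lam}: simulated {mm1_mean_sojourn(lam, mu):.2f}, "
          f"formula {1 / (mu - lam):.2f}")
```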
The average queueing delay for an M/M/1 queue is difficult to define when $\lambda=\mu$, since the queue length has no stationary distribution in this case. Recall that when $\lambda\lt\mu$, there exists a stationary distribution $\pi$ for the queue length, the distribution of the queue length converges to $\pi$ from every initial distribution, and $\pi$ also describes the ergodic averages of the queue length. None of this survives at $\lambda=\mu$: the chain is then null recurrent, so it still returns to the empty state, but the expected return time is infinite and the queue length fails to converge in distribution.
What is true, however, is that $\pi\to\infty$ when $\lambda\to\mu$, in the following sense: writing $\rho=\lambda/\mu$, the stationary distribution is geometric, $\pi_{\lambda,\mu}(k)=(1-\rho)\rho^k$, hence for every finite $n$, $\pi_{\lambda,\mu}([0,n])=1-\rho^{n+1}\to0$ as $\lambda\to\mu$ with $\lambda\lt\mu$. In this sense the typical queue length becomes large in this limit.
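To see the degeneracy at $\lambda=\mu$ concretely, here is a minimal simulation sketch (the helper `queue_length_at` is hypothetical): it runs the birth-death dynamics of the queue at criticality and shows the queue length growing with $t$ instead of settling into an equilibrium.

```python
import random

def queue_length_at(t_end, lam, mu, seed):
    """Return the M/M/1 queue length at time t_end, obtained by
    simulating the birth-death chain: arrivals at rate lam,
    departures at rate mu whenever the queue is nonempty."""
    rng = random.Random(seed)
    t, q = 0.0, 0
    while True:
        rate = lam + (mu if q > 0 else 0.0)   # total jump rate
        t += rng.expovariate(rate)            # time of the next event
        if t >= t_end:
            return q
        if rng.random() < lam / rate:
            q += 1                            # arrival
        else:
            q -= 1                            # departure

lam = mu = 1.0
for t_end in (100, 1_000, 10_000):
    avg = sum(queue_length_at(t_end, lam, mu, s) for s in range(200)) / 200
    print(f"t = {t_end:>6}: average queue length ~ {avg:.1f}")
```

At criticality the queue length behaves like a reflected random walk: it keeps returning to $0$, yet its typical size at time $t$ grows on the order of $\sqrt{t}$. This is the dynamic counterpart of the escape of mass of $\pi_{\lambda,\mu}$ described above, and it is why no finite average delay can be assigned at $\lambda=\mu$.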