My friend and I were discussing a simple problem. Consider an M/M/1 queue where there are at most $K$ places for waiting customers. Let the arrival rate be $\lambda$ and the departure rate $\mu$. We can easily calculate the stationary probability $p_i$ that there are $i$ customers in the system.
We want to calculate the probability that a customer is blocked. I said that, by the PASTA property, it is just $p_K$. But my friend has another idea. Let $$B:=\{\text{ customer arrives before service is finished } \}$$ The idea is then $$\tag{A1}\mathbb P(\text{ blocking })=\mathbb P( B\mid \text{ system is full })\mathbb P(\text{ system is full })$$ Since the probability that an arrival happens before service finishes is the probability that one exponential random variable is smaller than another, independent one, we get $$\tag{A2}\mathbb P(\text{ blocking })=\frac{\lambda}{\lambda+\mu}\cdot p_K$$ Clearly our answers are different. Checking a similar problem in the Erlang-B model suggests that my answer is correct, but the question remains:
Question: What probability is actually calculated in (A2) (clearly not the one we are interested in)? How can one use the idea in (A1) correctly so that it yields the right answer?
When the queue fills, the probability that a customer will be blocked before the job in service finishes is $\frac{\lambda}{\lambda+\mu}$, the conditional factor in A2. In fact, by memorylessness, the same is true at any point before the job finishes.
But this is not calculated from the point of view of an arriving customer. Several customers may be blocked before the job finishes. Your approach seems correct to me.
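To make "several customers may be blocked" concrete, here is a short worked calculation. The symbol $N$ is my own notation (not from the question) for the number of customers turned away while the system stays full, that is, until the job in service finishes. By memorylessness, each time the system is full the next event is an arrival with probability $\frac{\lambda}{\lambda+\mu}$ and a departure with probability $\frac{\mu}{\lambda+\mu}$, independently of the past, so $N$ is geometric:

$$\mathbb P(N=n)=\left(\frac{\lambda}{\lambda+\mu}\right)^{n}\frac{\mu}{\lambda+\mu},\quad n=0,1,2,\dots$$

$$\mathbb P(N\ge 1)=\frac{\lambda}{\lambda+\mu},\qquad \mathbb E[N]=\frac{\lambda}{\mu}$$

So the factor $\frac{\lambda}{\lambda+\mu}$ records only whether at least one customer is blocked during a full period, not how many arriving customers are blocked.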
I don't know the answer to the second part of your question.
EDIT
There are two different questions. One is from the point of view of the customer: "What is the probability that I will be turned away?" That is the question you answered, and, I think, the correct interpretation of blocking probability.
The other question is from the point of view of the server: "What is the probability that I will have to turn away at least one customer before the current job finishes?" That is the question answered by the factor $\frac{\lambda}{\lambda+\mu}$ in A2.
I don't see any real relation between the questions. For one thing, A2 only applies when the queue is full, and the customer wants to know the probability that he won't find it full.
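The two questions above can be checked numerically. Below is a minimal simulation sketch, not a definitive implementation. It assumes the system holds at most `capacity` customers in total, so that state `capacity` is the full state (matching the use of $p_K$ in the question). The function names and the notion of a "full period" (from the moment the system becomes full until the next departure) are my own, introduced for illustration. Only the order of events matters for both counts, so holding times are not sampled.

```python
import random


def stationary_full_probability(lam, mu, capacity):
    """p_K for the birth-death chain on states 0..capacity (state `capacity` is full)."""
    rho = lam / mu
    weights = [rho**i for i in range(capacity + 1)]
    return weights[-1] / sum(weights)


def simulate(lam, mu, capacity, n_events=1_000_000, seed=0):
    """Return (fraction of arrivals blocked,
               fraction of full periods with at least one blocked arrival).

    Assumes capacity >= 1 and that the run is long enough to see
    at least one full period.
    """
    rng = random.Random(seed)
    p_arrival = lam / (lam + mu)  # next event is an arrival when the server is busy
    n = 0                         # customers currently in the system
    arrivals = blocked = 0
    full_periods = full_periods_with_block = 0
    block_seen = False            # was anyone blocked in the current full period?

    for _ in range(n_events):
        # In the empty system the next event is always an arrival.
        is_arrival = n == 0 or rng.random() < p_arrival
        if is_arrival:
            arrivals += 1
            if n == capacity:
                blocked += 1      # customer's point of view: turned away
                block_seen = True
            else:
                n += 1
                if n == capacity:
                    block_seen = False  # a new full period starts
        else:
            if n == capacity:
                # The full period ends with this departure.
                full_periods += 1
                full_periods_with_block += block_seen
            n -= 1

    return blocked / arrivals, full_periods_with_block / full_periods


if __name__ == "__main__":
    lam, mu, capacity = 1.0, 2.0, 3
    customer_view, server_view = simulate(lam, mu, capacity)
    print("fraction of arrivals blocked:", customer_view)
    print("p_K:", stationary_full_probability(lam, mu, capacity))
    print("fraction of full periods with a blocked arrival:", server_view)
    print("lambda/(lambda+mu):", lam / (lam + mu))
```

If the reasoning above is right, the first fraction should be close to $p_K$ and the second close to $\frac{\lambda}{\lambda+\mu}$, which shows the two quantities answer different questions.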