Pugh's Skip List Algorithm (Substitution and Simplification Part)

61 Views Asked by At

While reading the Article Skip Lists: A Probabilistic Alternative to Balanced Trees, written by William Pugh, I am stuck on page 672, at the part:

Let C(k) = the expected cost (i.e., length) of a search path that climbs up k levels in an infinite list:
$C(0) = 0$
$C(k) = (1-p)$ (cost in situation b) + $p$ (cost in situation c)
By substituting and simplifying, we get:
$C(k) = (1-p)(1+C(k)) + p(1 + C(k-1))$
$C(k) = 1/p + C(k-1)$
$C(k) = k/p$

Note: He refers to Figure 6 on p.671.

In his world, those steps probably feel natural and trivial, so no further explanation seemed to be needed for Mr. Pugh, but I don't understand it at all. Why does (cost in situation b) get resolved to (1+(C(k)), for example, not only C(k)? The same for cost in situation c. Is it due to fact, that the level is 1 if the list is empty? (see p. 669):

Doing some calculation, I reached this point:

$C(k) = 1+C(k)-p-pC(k)+p+pC(k-1)$
$C(k) = 1+C(k)-pC(k)+pC(k-1)$ |:p
$C(k) = (1/p)+(C(k)/p)-C(k)+C(k-1)$

But how do I reach the term:

$C(k) = (1/p) + C(k-1)$

That would mean, that this part is 0, right? But why:

$(C(k)/p)-C(k)$

And even if I would have reached that part, how is the last term reached?

$C(k) = k/p$

I would be really thankful for a step-by-step explanation, this is a typical three-liner, only a real mathematician can produce.

EDIT:

Furthermore, I don't understand the conclusion of the probability that the maximum level of the list is greater than k:

$1-(1-p^{k})^{n}$
and that it is at most $np^{k}$

As far as I got it, the probability of level k is $p^{k}$. On the other hand, $1-p^{k}$ would mean everything but $k$ (levels below and above k). Saying $(1-p^{k})^{n}$ I would interpret as a probability above all values n. But if I subtract everything from 1, that would mean the probability for k above all values. But Pugh tries to calculate the probability for all values greater than k. I am confused.

1

There are 1 best solutions below

4
On BEST ANSWER

We have $$ C(k)=1+C(k)−p−pC(k)+p+p(k−1) $$

Subtracting $C(k)$ from both sides, we get $$ 0 = 1 - pC(k) + pC(k-1) $$

Adding $pC(k)$ to both sides, we get $$ pC(k) = 1 + pC(k-1)$$

Dividing through by $p$, we get the desired answer.

The $\frac{k}{p}$ comes from considering what happens when you add it all up. $C(0) = 0, C(1) = 1/p + C(0), C(2) = 1/p + C(1) = 1/p + 1/p + C(0), \dots$, and then unfurling it all, noting that this happens $k$ times.