I am trying to understand this paper about documents and sentences. At the end of page three, they say:
Let $g(w_i, w_j)$ be the distance between two events (1 if in the same sentence, 2 in neighboring, etc). Let $C_{dist}(w_i, w_j)$ be the distance-weighted frequency of two events occurring together, where $D$ is all documents.

I understand that part. I see how the cumulative distance of all events in all documents is generated by that equation.
But then they say:

I don't understand how they generate this because I don't understand the notation. I understand the concept of point-wise mutual information (PMI). But what does the numerator mean in statement 2? The probability of the distance between the two events? I don't understand what that means. Can someone explain statements 2 - 4?
Lines $(3)$ and $(4)$ provide the definition for the notation in line $(2)$ for $P_{dist}()$ and $P()$. The only function/notation left undefined in these lines is $C(w_i)$ in line $(3)$, but I'm pretty sure this is just the count of $w_i$ in the documents (i.e. the number of times $w_i$ occurs).
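Written out, and reconstructed here from the substitution that follows (so the exact form is my reading rather than a quotation from the paper), the three definitions are:
$$pmi(w_i, w_j) = \dfrac{P_{dist}(w_i, w_j)}{P(w_i) P(w_j)} \qquad (2)$$
$$P(w_i) = \dfrac{C(w_i)}{\sum_{k}{C(w_k)}} \qquad (3)$$
$$P_{dist}(w_i, w_j) = \dfrac{C_{dist}(w_i, w_j)}{\sum_{k}{\sum_{l}{C_{dist}(w_k, w_l)}}} \qquad (4)$$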
To explain what I think is going on here, I'll start by substituting those definitions from $(3)$ and $(4)$ into $(2)$ and then do some re-arranging:
\begin{eqnarray*} pmi(w_i, w_j) &=& \dfrac{C_{dist}(w_i, w_j)}{\sum_{k}{\sum_{l}{C_{dist}(w_k, w_l)}}} \Bigg/ \dfrac{C(w_i) C(w_j)}{\sum_{k}{C(w_k)} \sum_{l}{C(w_l)}} \\ &=& \dfrac{C_{dist}(w_i, w_j)}{C(w_i) C(w_j)} \Bigg/ \dfrac{\sum_{k}{\sum_{l}{C_{dist}(w_k, w_l)}}}{\sum_{k}{C(w_k)} \sum_{l}{C(w_l)}} \end{eqnarray*}
The numerator,
$$\dfrac{C_{dist}(w_i, w_j)}{C(w_i) C(w_j)}$$
is just the average "distance", as measured by the $C_{dist}$ function, per possible pairing of an occurrence of $w_i$ with an occurrence of $w_j$ (there are $C(w_i) C(w_j)$ such pairings). It makes sense to normalize this way because $C_{dist}(w_i, w_j)$ itself is a "frequency" rather than a distance: it increases simply by having more occurrences of words $w_i$ and $w_j$.
The denominator,
$$\dfrac{\sum_{k}{\sum_{l}{C_{dist}(w_k, w_l)}}}{\sum_{k}{C(w_k)} \sum_{l}{C(w_l)}}$$
is somewhat similar. It is the same "distance" per pairing, but averaged over all word pairs $(w_k, w_l)$ in the documents. Note that $\sum_{k}{C(w_k)} = \sum_{l}{C(w_l)}$ is the total number of words in the documents.
So the PMI itself, being the quotient of these two values, is a measure of the "distance" of words $w_i$ and $w_j$ compared to the overall average "distance".
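To make this concrete, here is a minimal Python sketch that builds the two count tables from toy documents and computes the PMI both ways: directly from the probabilities, and in the rearranged numerator/denominator form. Everything in it is illustrative: the function names, the toy documents, and especially the weighting function `inverse_distance` are my own assumptions, since the paper's actual weighting for $C_{dist}$ is not reproduced in this thread.

```python
from collections import Counter
from itertools import combinations


def inverse_distance(g):
    """Hypothetical weight for a pair at sentence distance g.

    This is a placeholder, NOT the paper's own weighting.
    """
    return 1.0 / g


def count_tables(docs, weight=inverse_distance):
    """Return (C, C_dist) for docs given as lists of sentences (lists of words)."""
    c = Counter()       # C(w): plain occurrence count
    c_dist = Counter()  # C_dist(w_i, w_j): distance-weighted pair count
    for doc in docs:
        # Tag every word with the index of the sentence it appears in.
        tokens = [(w, s) for s, sentence in enumerate(doc) for w in sentence]
        c.update(w for w, _ in tokens)
        for (wi, si), (wj, sj) in combinations(tokens, 2):
            g = abs(si - sj) + 1  # 1 = same sentence, 2 = neighboring, ...
            c_dist[(wi, wj)] += weight(g)
            c_dist[(wj, wi)] += weight(g)  # both orders, so the table is symmetric
    return c, c_dist


def pmi_direct(wi, wj, c, c_dist):
    """PMI as P_dist(wi, wj) / (P(wi) * P(wj))."""
    total = sum(c.values())
    p_dist = c_dist[(wi, wj)] / sum(c_dist.values())
    return p_dist / ((c[wi] / total) * (c[wj] / total))


def pmi_rearranged(wi, wj, c, c_dist):
    """PMI as (pair score per pairing) / (overall score per pairing)."""
    total = sum(c.values())
    numerator = c_dist[(wi, wj)] / (c[wi] * c[wj])
    denominator = sum(c_dist.values()) / (total * total)
    return numerator / denominator


# Two toy documents, each a list of sentences, each sentence a list of events.
docs = [
    [["arrest", "charge"], ["convict"]],
    [["arrest"], ["convict"]],
]
c, c_dist = count_tables(docs)

for pair in [("arrest", "charge"), ("arrest", "convict")]:
    print(pair, pmi_direct(*pair, c, c_dist), pmi_rearranged(*pair, c, c_dist))
```

The two functions agree up to floating-point rounding, which is just the algebra above. With this particular toy weighting, "arrest" and "charge" (same sentence) score higher than "arrest" and "convict" (always one sentence apart), and both are above 1, i.e. above the overall average.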