Assume we have a set of documents $D = \{d_0, d_1, ..., d_n\}$, and another set of independent artefacts $A = \{a_0, a_1, ..., a_{m-1}\}$ where the likelihood of an artefact $a_x$ appearing in a document is $p_x$.
If we have two documents $(d_a, d_b)$ that we know share a set of artefacts $\{a_x, a_y, ..., a_z\}$, what is the probability that this is just a coincidence (or, conversely, that copying happened)?
How about the more general question: the case where we have a tuple of documents $(d_a, d_b, ..., d_l)$ that we know share a set of artefacts $\{a_x, a_y, ..., a_z\}$?
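If two documents are independent, then each one contains all of the artefacts $a_x, a_y, ..., a_z$ with probability $p_{x} \cdot p_{y} \cdot \dots \cdot p_{z}$ (since the artefacts are independent), so both contain this common set with probability $\left( p_{x} \cdot p_{y} \cdot \dots \cdot p_{z} \right)^2$.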
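
To make the formula concrete, here is a minimal Python sketch. It assumes that artefacts are independent of each other and that documents are independent of each other, and it extends the two-document case to $k$ documents by raising the product to the power $k$ under those same assumptions. The function name `coincidence_probability` and its parameters are hypothetical, chosen only for illustration.

```python
from math import prod


def coincidence_probability(artefact_probs, num_docs=2):
    """Probability that `num_docs` independent documents all contain
    every artefact in the shared set.

    Assumes (hypothetically, as in the question) that artefacts occur
    independently of each other and that documents are independent.
    `artefact_probs` holds the p_x values of the shared artefacts.
    """
    if num_docs < 1:
        raise ValueError("num_docs must be at least 1")
    if any(not 0.0 <= p <= 1.0 for p in artefact_probs):
        raise ValueError("each probability must lie in [0, 1]")
    # One document contains all shared artefacts with probability prod(p);
    # num_docs independent documents all do so with probability prod(p) ** num_docs.
    return prod(artefact_probs) ** num_docs


# Two artefacts, each appearing in a document with probability 0.5.
print(coincidence_probability([0.5, 0.5]))              # 0.0625
print(coincidence_probability([0.5, 0.5], num_docs=3))  # 0.015625
```

Note that this value is only the chance of the overlap arising under full independence. Turning it into a probability that copying happened would additionally need a prior on copying, which the setup above does not specify.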