Measuring the similarity of 2 subsets of $\mathbb{N}$ with the same upper bound.

30 Views Asked by Bumbble Comm At 01 Apr 2026 - 11:32

The context is comparing 2 features in a DNA sequence (but the solution does not require any understanding of DNA, its features, or any kind of knowledge in biology).

For example, if a DNA sequence consists of 100 nucleotides (whatever a nucleotide might be), feature A might cover nucleotides [2-7, 17-32, 50-52, 54-57, 60-90] and feature B might cover [10-15, 49-56, 76-98]. Now, we can take each feature as simply a subset of [1-100], and forget that we are talking about DNA.
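To make the reduction to plain subsets concrete, here is a minimal sketch that expands the inclusive ranges from the example into Python sets. The helper name `expand` and the variable names are illustrative assumptions, not part of the question.

```python
def expand(intervals):
    """Turn inclusive (start, end) pairs into a set of integer positions."""
    return {p for start, end in intervals for p in range(start, end + 1)}

# The two features from the example, as subsets of {1, ..., 100}.
feature_a = expand([(2, 7), (17, 32), (50, 52), (54, 57), (60, 90)])
feature_b = expand([(10, 15), (49, 56), (76, 98)])

overlap = feature_a & feature_b
print(len(feature_a), len(feature_b), len(overlap))  # 60 37 21
```

So feature A covers 60 positions, feature B covers 37, and they share 21.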

Finally, by similarity we mean any measure that might show some correlation between the features, perhaps indicating some kind of causal relation. The measure should tend to 0 for random subsets and reach its maximum when the subsets are identical.
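One candidate that appears to satisfy both requirements is the phi coefficient, i.e. the Pearson correlation of the 0/1 indicator vectors of the two subsets over $\{1, \dots, n\}$. This is only a sketch of one possible measure, not something the question prescribes. It assumes that "random" means each position is included independently of the other feature, and the names `expand` and `phi` are hypothetical.

```python
import math

def expand(intervals):
    """Turn inclusive (start, end) pairs into a set of integer positions."""
    return {p for start, end in intervals for p in range(start, end + 1)}

def phi(a, b, n):
    """Pearson correlation of the indicator vectors of a and b over 1..n."""
    n11 = len(a & b)             # positions in both subsets
    n10 = len(a - b)             # positions only in a
    n01 = len(b - a)             # positions only in b
    n00 = n - n11 - n10 - n01    # positions in neither subset
    denom = math.sqrt(len(a) * (n - len(a)) * len(b) * (n - len(b)))
    if denom == 0:
        # An empty subset, or one covering all of 1..n, has zero variance.
        raise ValueError("phi is undefined for an empty or full subset")
    return (n11 * n00 - n10 * n01) / denom

feature_a = expand([(2, 7), (17, 32), (50, 52), (54, 57), (60, 90)])
feature_b = expand([(10, 15), (49, 56), (76, 98)])

print(phi(feature_a, feature_a, 100))  # 1.0 for identical subsets
print(phi(feature_a, feature_b, 100))
```

For the two example features the value comes out slightly negative, roughly -0.05, which is close to what independent subsets would give. Note that phi ranges over $[-1, 1]$, so strongly avoiding each other shows up as a negative value rather than as 0.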
