How to measure local density of A's in a sequence of A's and B's?

19 Views Asked by Bumbble Comm At 10 May 2026 - 2:07

Let me preface this by stating that I am not a mathmagician. I have a DNA sequencing problem, but it boils down to a math problem. But I don't have the math skills to adequately describe the issue, which makes it hard to google.

Let's assume we have a random DNA sequence of length N. We are interested in the A's, but there are also G's, C's, and T's. Let's simplify it to A or not-A, where not-A = B. A sequence of length N then has $2^N$ possible arrangements, so a sequence of 5 bases has 32 possibilities. I want a method of scoring those possible sequences according to the number of A's and their spacing in the sequence. I want this score to estimate how well that sequence will bind to a sequence of all T's.

The 32 Possible Sequences for a 5-mer
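
For reference, the full set of sequences can be generated rather than written out by hand. This is only a minimal sketch; the function name `all_sequences` is made up for illustration and is not from any library.

```python
from itertools import product

def all_sequences(n, alphabet="AB"):
    """Return every string of length n over the given alphabet (2^n strings for 'AB')."""
    return ["".join(p) for p in product(alphabet, repeat=n)]

seqs = all_sequences(5)
print(len(seqs))  # 2**5 sequences
print(seqs[:4])   # first few, in itertools.product order
```

The same call with a larger `n` gives the candidate set for longer oligos, though the list grows as $2^N$, so for long sequences it is probably better to score observed sequences only.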

The number of A's in the sequence is easy. More A's should give a higher score. But spacing is harder to quantify. A segment of consecutive A's should score higher than separated A's. A sequence of separated A's with small gaps should score higher than separated A's with large gaps.

Some Examples:

AAAAA - Highest possible score
BBBBB - Lowest possible score
ABBBB, BABBB, BBABB, BBBAB, and BBBBA should have the same score.
AAAAB should score higher than AAABA, which should score higher than AABAA, though all are 80% A.
BAAAB should score higher than ABABA, though both are 60% A.
BABAB should score higher than ABBBA, though both are 40% A.
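
To make the examples above concrete, here is a rough sketch of three candidate scores that could be computed side by side. None of these is an established binding model; the function names, the squared-run-length idea, and the `decay=0.5` value are all assumptions chosen only for illustration. `count_a` is the plain A count, `run_score` rewards long uninterrupted runs, and `pair_score` sums a weight over every pair of A's that shrinks as the two A's get farther apart.

```python
from itertools import groupby

def count_a(seq):
    """Number of A's; ignores spacing entirely."""
    return seq.count("A")

def run_score(seq):
    """Sum of squared lengths of consecutive A runs (assumed form, rewards long runs)."""
    return sum(len(list(group)) ** 2 for ch, group in groupby(seq) if ch == "A")

def pair_score(seq, decay=0.5):
    """Sum of decay**distance over all pairs of A's, each A paired with itself too.

    decay is a hypothetical tuning parameter in (0, 1): smaller values
    penalise gaps between A's more strongly.
    """
    pos = [i for i, ch in enumerate(seq) if ch == "A"]
    return sum(decay ** (q - p) for k, p in enumerate(pos) for q in pos[k:])

for s in ["AAAAA", "AAAAB", "AAABA", "AABAA", "BAAAB", "ABABA", "BABAB", "ABBBA", "BBBBB"]:
    print(s, count_a(s), run_score(s), pair_score(s))
```

With these definitions, `pair_score` respects every ordering listed above, and it gives a lone A the same score wherever it sits because it depends only on the distances between A's. `run_score` separates AAAAB, AAABA, and AABAA but ties BABAB with ABBBA, since it looks only at run lengths and not at gap sizes. Since a single metric is not required, it may be worth computing all three and varying `decay` to see what tracks the measured affinity best.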

I don't need a single metric, either. I can calculate multiple scores and see which correlates best with binding affinity.
