Change of state in a binary string?

60 Views Asked by At

I have been playing with the following idea, and I'm curious how one might go about addressing it in a rigorous manner (if its at all possible). I apologize in advance if this is too vague.

Imagine a binary string with an arbitrary length. The string has two phases or states, consisting of consecutive runs of either 0 or 1.

Example: 0000000|1111111

I'm interested in confidently calling the position in the string at which the "phase change" occurs. In an ideal scenario (above) this is simple. However, what if errors are randomly distributed throughout the length of the string?

Example: 0010000|1101111 or 1001000|1001111 etc...

2

There are 2 best solutions below

1
On BEST ANSWER

A possible model would be: $x_n \in \{0,1\}$ is the "clean" signal, which follows a Markov chain with known transition probabilities (equivalently: the distribution of the zero-one runs are geometric). The observed signal is $y_n = x_n + z_n$ where $z_n \in \{0,1\}$ is a white (independent) noise with known probabilities, and the sum is done modulo $2$. This is a HMM.

You are interested in, given $y_n$, find out the most probable $x_n$. This can be solved with Viterbi.

2
On

One intuitive way could be to find a divide position that minimizes (the number of digits that $1$ and is on the left of the divide position + the number of digits that are $0$ and is on the right of the divide position).

If we assume every digits are independent, and their distribution, we could formally formulae the likelihood function, and maximize the likelihood of it.

For example:

Assume each digit is independent, The $n$ digit is a r.v. $X_n$ with Bernoulli distribution of probability $p_n(\theta)$ to have outcome of $1$,

If we further assume $p_n(\theta) \sim logit (\theta, \sigma)$, where $\theta$ is the position you place your divide.

Then based on the observation, we could get likelihood function and maximize it to find $\theta$