say, i have got 3x'a', 5x'b', 2x'c',4x'd', as char collection.
2 strings are formed, each consists of all the chars given. eg. string A = 'aaabbbbbccdddd' B='abcdabcdabbbdd'
so both strings have length 14
now, we can randomly choose 3 consecutive chars from each string. say, I choose X ='aaa' from A, and Y = 'abc' from B.
define a function good(), which compares the two selected string segments. original flag = false. if the first char from X = the second char from Y then flag = true. if the second char from X = the third char from Y then flag = true. if the second char from X = the second char from Y then flag = true (yes it is not symmetric). good() returns 'good' only if the flag remains false after all the checks.
I know i can write a computer program to figure out the 'good' probability given A and B. But now I would like to find out the max and min of the good probability of different A and B from the char collection. Iterate all possible permutation would be physically infeasible. so I seek an approximate approach.
Any thoughts?
All three comparisons involve one of the two second characters. Only segments that differ at the second character can be good, and we want to minimize/maximize the chance that if they do, the second character also differs from the neighbouring characters on the other segment. That means we should try to minimize/maximize the number of repetitions. Also, the last character in A and the first character in B never get matched; you should put rare/frequent characters there to minimize/maximize the chances of the remaining ones differing from others. I suspect that any strings that obey these two constraints (minimal/maximal number of repetitions, most rare/most frequent characters in the two special slots) will have the same minimal/maximal chance of producing good pairs, or at least to a good approximation.