I'm solving a coding problem, and I wanted to mathematically prove that the answer exists. Essentially, we are given a string of lowercase letters, and we want to know if it can be rearranged such that no character repeats. Example: aab can be written aba, but aaab can't be rearranged in such a way.
Let $N$ be the length of the string, and let there be $n$ unique characters $\{c_1,\cdots, c_n\}$, and $f_m$ is the frequency of $c_m$ in the string. The solution exists if $\max\{f_m\}\leq \lceil N/2\rceil$. This part I'll take for granted -- I'm curious about a subproblem. Suppose I repeat each distinct character $k\leq \min\{f_m\}$ times and form a valid string. For instance, if the input string is aabbccdddd, and $k=1$, we would form a string abcd repeating each unique character once, and the remaining substring to rearrange would be abcddd.
Is it still guaranteed that the remaining $N-km$ characters (abcdd) can be rearranged into a string that has no repeating characters? (For instance, abdcd).
Let $f^*=\max{\{f_m\}}$. The new string length is $N-km$. I want to prove, given $f^*\leq \lceil N/2\rceil$, the inequality $f^*-k\leq\lceil N/2 -mk/2 \rceil$ holds. From the first inequality, I have
$$ f^*-k \leq \lceil N/2\rceil-k, $$
so the second inequality is not necessarily true, no? Intuitively, it should be true and proveable, but I can't do it, and I'm confused. What am I missing?
No - if some character is a large enough proportion of the whole, what's left can have more of that character than half its length. For example, applying this algorithm to
abcddd, we start withabcdbut thenddis left over.A summary of another algorithm which will work: Order the groups of letters from the most frequent to least frequent. Split this string in half, and interleave them in the odd and even positions in the final string. For example,
abcdddd$\to$dddd,abc$\to$dadbdcd.