I was trying to understand one of the solutions for the following string reduction problem (from hackerrank).
Given a string consisting of the letters a, b and c, we can perform the following operation:
Take any two adjacent distinct characters and replace them with the third character. Find the shortest string obtainable through applying this operation repeatedly.
For example, given the string aba we can reduce it to a 1 character string by replacing ab with c and ca with b.
I think the proof looks as follows:
First, a lemma: all the letters must be the same in the final answer. Assumption leading to the contradiction: assume there is a minimal length string where not all the letters are the same. Then there must be two adjacent characters that are not equal in that string, so we could combine those characters to form a shorter string, meaning this string is not minimal length.
Then, show that the there are three cases
Case 1 - final answer is length of string: all the letters are equal to start
In this case, no operations can be performed so the string stays the same length.
Case 2 - the final answer is 2
Assume that it is possible to have a final answer greater than 2 where not all the letters are the same to start. Using lemma 1, we know that the final answer must contain letters that are all the same. That means that right before the last reduction we performed, the string must have had the form of:
(using
cas the "letter that's the same," without loss of generality):where at least one of the
c...cis of length 1 or greater.Note that in both cases there is always a more or equally optimal choice than merging the a and the b together to form a c.
In the first case:
The resulting string will be two characters in length, either both a's, both b's or an a and a b which can be merged to form a c.
In the second sub-case we can apply the exact same logic.
So now we know the final answer must consist of 2 or fewer letters.
Case 3 - the final answer is 1 letter long
We also know from the problem description that the input string is always of length at least 1. Because the merging operation cannot produce a string shorter than length 1, we know that the final answer must have 1 or greater letters.
Computing case membership
So now how do we decide which case a given string belongs to?
Deciding if a string belongs to case 1 is easy. If it's all the same letter, it must be in case 1.
For case 2, consider the set of reachable strings from a given 2-letter string. We know from lemma 1 that this two letter string must consist of the same letter. WOLOG, let it be
aa. A reachable string is one that can be produced by expanding one of theainto eitherbcorcb, or performing the same operation for one of the resultingbs orcs. Note that expanding a letter will change the parity of the number of times that letter occurs, as well as the parity of the other letters. Because an expansion reduces the number of occurrences of a given letter by exactly 1, and increases the number of occurrences of other letters by exactly 1, the parity of all the letters changes. Notice also that all three letters have the same parity to start (they are all occurring an even number of times). Therefore, any reachable string from the two-letter starting string must contain letters with the same parity. So to test whether a given string will have a minimal string of two letters, we just need to check that the parities of all the letters in that string are equal.For case 3, WOLOG assume the string is
a. Use the same definition of reachable, but now notice that the parities of the letters are not equal.aoccurs an odd number of times, butbandcoccur an even number of times. Thus, when we perform an expansion, there is no way to make the parities of the three letters equal. So to test whether a given string will have a minimal string of one letter, we just need to check that not all the parities of all the letters in that string are equal.