Why is this string reduction solution correct?

220 Views Asked by At

I was trying to understand one of the solutions for the following string reduction problem (from hackerrank).

Given a string consisting of the letters a, b and c, we can perform the following operation:

Take any two adjacent distinct characters and replace them with the third character. Find the shortest string obtainable through applying this operation repeatedly.

For example, given the string aba we can reduce it to a 1 character string by replacing ab with c and ca with b.

1

There are 1 best solutions below

0
On

I think the proof looks as follows:

First, a lemma: all the letters must be the same in the final answer. Assumption leading to the contradiction: assume there is a minimal length string where not all the letters are the same. Then there must be two adjacent characters that are not equal in that string, so we could combine those characters to form a shorter string, meaning this string is not minimal length.

Then, show that the there are three cases

Case 1 - final answer is length of string: all the letters are equal to start

In this case, no operations can be performed so the string stays the same length.

Case 2 - the final answer is 2

Assume that it is possible to have a final answer greater than 2 where not all the letters are the same to start. Using lemma 1, we know that the final answer must contain letters that are all the same. That means that right before the last reduction we performed, the string must have had the form of:

(using c as the "letter that's the same," without loss of generality):

  1. \[c...cabc...c\] or
  2. \[c...cbac...c\]

where at least one of the c...c is of length 1 or greater.

Note that in both cases there is always a more or equally optimal choice than merging the a and the b together to form a c.

In the first case:

  1. If there is a c on the left side, merge the c and the a. Continue merging until there are no c's left, which will always yield either an a or a b.
  2. If there is a c on the right side, merge the c and the b. Continue merging until there are no c's left, which will always yield either an a or a b

The resulting string will be two characters in length, either both a's, both b's or an a and a b which can be merged to form a c.

In the second sub-case we can apply the exact same logic.

So now we know the final answer must consist of 2 or fewer letters.

Case 3 - the final answer is 1 letter long

We also know from the problem description that the input string is always of length at least 1. Because the merging operation cannot produce a string shorter than length 1, we know that the final answer must have 1 or greater letters.

Computing case membership

So now how do we decide which case a given string belongs to?

Deciding if a string belongs to case 1 is easy. If it's all the same letter, it must be in case 1.

For case 2, consider the set of reachable strings from a given 2-letter string. We know from lemma 1 that this two letter string must consist of the same letter. WOLOG, let it be aa. A reachable string is one that can be produced by expanding one of the a into either bc or cb, or performing the same operation for one of the resulting bs or cs. Note that expanding a letter will change the parity of the number of times that letter occurs, as well as the parity of the other letters. Because an expansion reduces the number of occurrences of a given letter by exactly 1, and increases the number of occurrences of other letters by exactly 1, the parity of all the letters changes. Notice also that all three letters have the same parity to start (they are all occurring an even number of times). Therefore, any reachable string from the two-letter starting string must contain letters with the same parity. So to test whether a given string will have a minimal string of two letters, we just need to check that the parities of all the letters in that string are equal.

For case 3, WOLOG assume the string is a. Use the same definition of reachable, but now notice that the parities of the letters are not equal. a occurs an odd number of times, but b and c occur an even number of times. Thus, when we perform an expansion, there is no way to make the parities of the three letters equal. So to test whether a given string will have a minimal string of one letter, we just need to check that not all the parities of all the letters in that string are equal.