How to calculate correlation between number of characters in a string and its length?


Compared to what I'm seeing here on the forum I am a total mathematical noob.

I'm solving an algorithmic problem which I know how to solve but there has to be a better and much simpler solution using mathematics.

So a string is considered valid if every distinct character in it occurs the same number of times, OR if it is such a string with ONE character removed.

For example:

aabbcc - valid - every character occurs 2 times

aaabbcc - invalid - number of occurrences of "a" is 3 and for the rest is 2

abcdefghhgfedcba - valid - all characters occur 2 times

abcdfghhgfedcba - valid - I removed one "e", so exactly one character is missing.

aabbccddeefghi - invalid - each character in the substring fghi occurs only once, so we would have to change it to either ffgghhii or to ffgghhii with one of the letters removed.
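
To make the rule concrete, here is a minimal sketch of the check as I read it from the examples above. The function name `is_valid` is just something I made up for illustration, and treating the empty string as valid is my own assumption. It counts every character, takes the highest count as the "full" count, and allows at most one character to be exactly one short of it.

```python
from collections import Counter


def is_valid(s: str) -> bool:
    """Return True if all characters occur equally often,
    or exactly one character is one occurrence short."""
    counts = Counter(s)
    if not counts:
        return True  # assumption: an empty string counts as valid

    full = max(counts.values())  # the count every character should have
    short_by_one = 0
    for count in counts.values():
        if count == full:
            continue
        if count == full - 1:
            short_by_one += 1  # this character is missing one occurrence
        else:
            return False  # missing two or more occurrences
    return short_by_one <= 1


for example in ["aabbcc", "aaabbcc", "abcdefghhgfedcba",
                "abcdfghhgfedcba", "aabbccddeefghi"]:
    print(example, "valid" if is_valid(example) else "invalid")
```

Under this reading aaabbcc is rejected because both "b" and "c" are one short of the highest count, which matches the example list.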

So the bottom line here is that the count of the character that is short cannot differ by more than one from the count shared by all the other characters: if all the other characters occur, let's say, 3 times, then the short character has to occur at least 2 times.

Now, there has to be a correlation I should be able to calculate between the length of the string and the average number of occurrences of a character in the string. So depending on the length and the average I should be able to calculate a deviation which would indicate if only one character has been removed or more.
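
One way this relation might be written down, as a sketch under my reading of the rule and not a definitive answer: let n be the length of the string, k the number of distinct characters and m the highest count of any character. A "complete" string would have length k * m, so k * m - n is the number of missing characters, and the string is valid when that is 0 or 1. In terms of the average n / k, this is the same as saying m - n / k is at most 1 / k. The name `is_valid_by_length` and the handling of the empty string are my own assumptions.

```python
from collections import Counter


def is_valid_by_length(s: str) -> bool:
    """Valid when at most one character is missing from a string
    in which all k distinct characters would occur m times."""
    counts = Counter(s)
    if not counts:
        return True  # assumption: an empty string counts as valid

    n = len(s)                 # actual length
    k = len(counts)            # number of distinct characters
    m = max(counts.values())   # highest occurrence count
    missing = k * m - n        # characters missing from the complete string
    return missing <= 1


print(is_valid_by_length("abcdfghhgfedcba"))  # k = 8, m = 2, n = 15, missing = 1
print(is_valid_by_length("aabbccddeefghi"))   # k = 9, m = 2, n = 14, missing = 4
```

Counting the characters is still needed to get k and m, so this does not avoid the counting step; it only replaces the per-character comparison with one subtraction.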

I'm sorry if this is very vague but I really don't know how to define the problem in mathematical terms.