How easy is it to break this encryption system a buddy of mine and I just discovered?

This is gonna take a little bit of background to explain, so here goes:

In base $11$, we need $11$ numerals to form numbers: $0, 1, 2, 3, 4, 5, 6, 7, 8, 9$, and, since there is no single symbol standing for $10$, we represent $10$ by the letter a. In base $16$, basically the same happens, but this time the "numerals" we have to work with are $0$ through $9$ and $a$ through $f$. You could do this all the way up to base $36$, where the numerals we would have to work with are $0$ through $9$ and the letters $a$ through $z$, where again $a$ stands for $10$ and $z$ for $35$. Well, then, what if we wanted to work with bases greater than $36$? I looked it up and there was no standard convention for this, so I made my own, which is the following: after $z$, our next numerals to work with will be a&, b&, ..., z&, where now a& will stand for $36$, b& for $37$ and so on, up to z& for $61$. This does the job up to base $62$, of course. Now, to work with arbitrarily large positive integer bases, we keep adding the symbol & to the end of the letters, so a&& would stand for $62$, b&& for $63$ and so on. We can obviously keep doing this indefinitely.
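A minimal Python sketch of this numeral convention may make it concrete. The helper names `symbol_for` and `value_of` are hypothetical, chosen only for illustration, and the sketch assumes lowercase letters only.

```python
import string

LETTERS = string.ascii_lowercase  # 'a' .. 'z'


def symbol_for(value: int) -> str:
    """Return the numeral for a non-negative digit value under the '&' convention."""
    if value < 10:
        return str(value)
    # Values 10..35 are plain letters, 36..61 get one '&', 62..87 get two, etc.
    block, offset = divmod(value - 10, 26)
    return LETTERS[offset] + "&" * block


def value_of(symbol: str) -> int:
    """Inverse of symbol_for: map a numeral such as 'p&' back to its digit value."""
    if symbol.isdigit():
        return int(symbol)
    # The first character is the letter; every remaining character is an '&'.
    return 10 + LETTERS.index(symbol[0]) + 26 * (len(symbol) - 1)


print(symbol_for(35), symbol_for(36), symbol_for(62))  # z a& a&&
```

Each block of $26$ values reuses the alphabet with one more &, which is why `divmod(value - 10, 26)` is enough to recover both the letter and the number of & marks.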

Under this convention, the word "john" could represent basically any number, depending on the base. For example, in base $35$, the word "john" would represent the number $844643$ in base $10$ (because j stands for $19$, o stands for $24$, h stands for $17$ and n stands for $23$, and $19 \cdot 35^3 + 24 \cdot 35^2 + 17 \cdot 35^1 + 23 \cdot 35^0 = 844643$). As another example, the number $844643$ in base $10$ can be represented under this convention in base $87$ as $\text{"1op&l&"}$, because $1 \cdot 87^3 + 24 \cdot 87^2 + 51 \cdot 87^1 + 47 \cdot 87^0 = 844643$, where again o stands for $24$, "p&" stands for $51$ and "l&" stands for $47$.

Therefore, you could define a function, let's call it encrypt, which takes three inputs (a word, an initial base and a final base) and returns the encrypted word: it reads the word as a number in the initial base and writes that number out in the final base. For example, $\text{encrypt("john", 35, 87)}$ would equal $\text{"1op&l&"}$. Applied word by word with an initial base of $35$ and a final base of $87$, the phrase "john smith has 5 bananas" would turn into $\text{"1op&l& d&&nt&&3 2h&&a&& 5 4ec&&9m&&i&&"}$. I have indeed defined such a function in Python, which is how I know what these outputs are.
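For readers who want to reproduce these outputs, here is one way such a function might look. It is only a sketch under the convention described above, not the original implementation: the helper names `to_int`, `from_int` and `encrypt_phrase` are assumptions, and so is the choice to split phrases on single spaces and to silently skip characters that are not digits, lowercase letters or &.

```python
import re
import string

LETTERS = string.ascii_lowercase
# One numeral is either a decimal digit or a letter followed by zero or more '&'.
TOKEN = re.compile(r"[0-9]|[a-z]&*")


def to_int(word: str, base: int) -> int:
    """Read `word` as a number written in `base` under the '&' convention."""
    total = 0
    for sym in TOKEN.findall(word):
        digit = int(sym) if sym.isdigit() else 10 + LETTERS.index(sym[0]) + 26 * (len(sym) - 1)
        if digit >= base:
            raise ValueError(f"numeral {sym!r} is not valid in base {base}")
        total = total * base + digit
    return total


def from_int(n: int, base: int) -> str:
    """Write the non-negative integer `n` in `base` under the '&' convention."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, d = divmod(n, base)
        out.append(str(d) if d < 10 else LETTERS[(d - 10) % 26] + "&" * ((d - 10) // 26))
    return "".join(reversed(out))


def encrypt(word: str, initial_base: int, final_base: int) -> str:
    return from_int(to_int(word, initial_base), final_base)


def encrypt_phrase(phrase: str, initial_base: int, final_base: int) -> str:
    # Assumption: words are separated by single spaces and converted independently.
    return " ".join(encrypt(w, initial_base, final_base) for w in phrase.split(" "))


print(encrypt("john", 35, 87))  # 1op&l&
print(encrypt_phrase("john smith has 5 bananas", 35, 87))
# 1op&l& d&&nt&&3 2h&&a&& 5 4ec&&9m&&i&&
```

Note that decryption is the same operation with the two bases swapped, e.g. `encrypt("1op&l&", 87, 35)` gives back `"john"`, so the pair of bases is the whole key.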

Then I thought: "what if all I had to work with was the encrypted phrase, how would I go about deciphering it?". One could obviously brute force through as many initial and final bases as possible, and that would eventually yield the original phrase. But brute force could also (conceivably, I'm not sure when that would actually happen) lead one to another reasonable phrase, such as "leopold ate 10 apples". And depending on how big the initial and final bases are and how long the encrypted text is, I assume this could take a very long time. So this would be a very crude way of breaking the encryption, which I want to exclude as a solution.
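To make the brute-force idea concrete, the search over base pairs could be sketched as follows. This is an illustration under stated assumptions, not a claim about the best attack: the function name `candidates`, the `max_base` cutoff, and the filter that keeps only plaintexts containing no & are all hypothetical choices.

```python
import re
import string

LETTERS = string.ascii_lowercase
TOKEN = re.compile(r"[0-9]|[a-z]&*")  # a digit, or a letter followed by '&' marks


def to_int(word, base):
    """Read `word` as a number in `base`; raise ValueError on an invalid numeral."""
    total = 0
    for sym in TOKEN.findall(word):
        digit = int(sym) if sym.isdigit() else 10 + LETTERS.index(sym[0]) + 26 * (len(sym) - 1)
        if digit >= base:
            raise ValueError(f"numeral {sym!r} is not valid in base {base}")
        total = total * base + digit
    return total


def from_int(n, base):
    """Write the non-negative integer `n` in `base` under the '&' convention."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, d = divmod(n, base)
        out.append(str(d) if d < 10 else LETTERS[(d - 10) % 26] + "&" * ((d - 10) // 26))
    return "".join(reversed(out))


def candidates(ciphertext, max_base=100):
    """Yield (initial_base, final_base, plaintext) for every base pair up to max_base."""
    words = ciphertext.split(" ")
    for final_base in range(2, max_base + 1):
        try:
            numbers = [to_int(w, final_base) for w in words]
        except ValueError:
            continue  # some numeral in the ciphertext is too large for this base
        for initial_base in range(2, max_base + 1):
            plain = " ".join(from_int(n, initial_base) for n in numbers)
            if "&" not in plain:  # assumed filter: the plaintext uses no '&' numerals
                yield initial_base, final_base, plain


hits = list(candidates("1op&l& d&&nt&&3 2h&&a&& 5 4ec&&9m&&i&&", max_base=90))
print((35, 87, "john smith has 5 bananas") in hits)  # True
```

One thing the sketch makes visible is that the largest numeral in the ciphertext already bounds the final base from below (here t&& is $81$, so the final base must be at least $82$), which shrinks the search space before any guessing starts.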

What other solutions to breaking this encryption are there? Since people always (with good reason) say never to use self-made encryption systems, I assume there is some way this would be trivially easy to break without resorting to brute force. Am I right?