How Breakable is this Cipher?

75 Views Asked by At

Working with a colleague of mine in undergrad, we created a cipher that we think is hard to break. Of course, the only real technique we know of is frequency analysis (which our cipher is immune.) The cipher works like this. (note, I'll be representing this as I'd do on paper, not with code.)

First, pick three inputs, a Raw Key, and Alphabet, and an Order for the Alphabet. I will use the Raw Key 'deduce', and the standard English alphabet ordered A-Z.

Now, lay out the alphabet in a 2 row by 13 column, first row A - M, second row N - Z. Now, circle each letter within your Raw Key. We will use this to generate a Refined Key by simply reading the elements in order of our alphabet. This new Refined Key should return CDEU. Essentially, you've alphabetized it and removed duplicate letters.

We will now create two sets, A and S. A is constructed by comparing the two sets, specifically their ordering. Compare the first element of your Refined Key to it's corresponding element in the Raw Key. The first element of the Refined Key is C, and C shows up in 'deduce' in position five. So, we say the first element in A is 5. Then, the second element in our Refined Key is D, and D is the first element of 'deduce', so the second element of A is 1. Continue until you exhaust the Refined Key. Note that if an element appears twice, we take the one we would read first (D was 1, not 3.)

S is constructed in an entirely different manner. This one is concerned about the spacing of each element. Find the first element of our Refined Key, and find the distance to the next element. The distance from C to D is 1, so the first element of S is 1. Then the distance from D to E is also 1, and so is the second element of S. Continue until you exhaust the Refined Key. Note that on the final element, you'll run out of an alphabet. Just wrap around the the start and continue as normal.

So, you should obtain that for the Raw Key 'deduce', you obtain A = {5,1,6,4} and S = {1,1,16,8}.

Now, to use these to create a ciphertext, I want to establish some more definitions.

We will define P to be our plaintext, and C to be our ciphertext. We will also define a function of sorts also named A and S that correspond to the sets of the same name. Furthermore, let x be equal to the product of all the elements in A, and y be equal to the product of all the elements in S.

Now, let P be the plaintext 'example'.

A(P) will be defined as shifting each element in P with respect to A. That is to say, we will shift the first element in P according to the first element in A, and likewise with the second, third, and fourth. However, A does not have a fifth element, so we wrap back to the first element and run through A again. Repeat this until you've reach the length of P.

So, step by step, The first element in P in this case is 'e', and the first element in A is '5'. Now, for the sake of making the cipher stronger, we count 'shifting by 1' as an identity. That is to say, we don't move. So, we count up from 1 to 5, starting at e, and moving to I. So, e has mapped to I. Now, for the second element, we start at 'x', and shift by 1. But recall, we defined that to be the identity, so x maps to X. Continue through the plaintext, and you should get 'IXFPTLJ'. This in NOT the ciphertext, but is the first step to getting there.

Now, hopefully that made sense, and if not I'd be happy to follow up with some more explanation. But for now, we press on. Now, thankfully S(P) works much the same way, just that it uses S instead of A. Alright, now some more claims and I can reveal the ciphertext!

Claim 1 : A(S(P)) = S(A(P)) Hopefully this can be accepted without proof for now. I am quite tired, and it is quite obvious since all we do is shift. What this claim really means is that shifting with A then S is the same as shifting with S then with A.

Claim 2 : A(P) and S(P) are invertible. This should also be clear. We can define A^-1(P)) as just using A but moving backward through the alphabet instead of forward, and likewise for S.

Now, for a definition I hope most can be behind

A(A(P)) = A^2(P)

and

S(S(P)) = S^2(P)

Now, finally, we define C to equal A^x(S^y(P))

So, as established before, frequency analysis shouldn't hold up with this cipher, since each letter is shifted by a different amount. My friend and I think the cipher is strong, but we aren't sure if it's unbreakable or if there is a flaw. My friend is a CS major, so he wrote some code to expedite the process. I can share the program if you like, but I do not believe he has implemented a decryption yet.

I'd also like to point out that the choice of using the English alphabet was arbitrary. In reality, you can use whatever symbols you'd like, and you could choose an order as well. Say, a good portion of the ASCII table perhaps...

Also, the Raw Key does not need to be simple either, that was just for clarity's sake.

Of course, there are improvements to be made, but I'd like to know if there's any technique that sticks out to anyone on how to break this cipher just from having the ciphertext and nothing else, and knowing that this was the method of encryption.

Again, if any of this needs clarification, I'd be happy to oblige.

-Popalofiti

1

There are 1 best solutions below

0
On

For long enough messages, this is susceptible to frequency analysis. All in all, you use the same rotational cypher for the 1st, 5th, 9th, 13th, etc. letter, similar for the 2nd, 6th, 10th, etc. The period depends on your keyword, but cannot be longer than 26. Hence under the usual assumption that the attacker knows your method, but not your keyword, a sufficiently long message can be decoded relatively easily by guessing (i.e., looping over) the period length and performing frequency analysis on all these equidistant subsequences.

For example, if F occurs suspiciously often among the 1st, 10th, 19th, etc. positions, followed in frequency by O and S, then it is very likely that the length of your raw key is 9 (or perhaps 3?) and that F, O, S stand for E, N, R; this immediately tells us all letter in 1st, 10th, etc. position.

In addition, it is quite likely that $x$ and $y$ are even, which means that you most likely shift by an even amount, thus often making use only of half of the available rotation.