I'm doing a password cracking challenge right now, and I know several restrictions of the password.
- The password is 8 characters long
- The first character is a lowercase letter, the second character is a digit
- The seventh and eighth characters are both lowercase letters
- The remaining characters (third, fourth, fifth, sixth) are all uppercase letters, EXCEPT for one, which is a lowercase letter, and another, which is a digit.
I'm considering taking two approaches to cracking this password: a masked brute-force approach and a hybrid rules/brute force approach.
The masked brute force approach would see me processing a maximum of $$26\times10\times62\times62\times62\times62\times26\times26 \approx \bf{2.597\times10^{12}},$$ candidates, which for a decent GPU processing at $8000 \text{ kH/s}$ would take around 90 hours.
This approach doesn't take advantage of the knowledge that the third, fourth, fifth, and sixth characters can only contain one digit and one lowercase letter, so I figured I could optimize this brute force approach by generating a much more restricted set of candidates and using rules to generate possibilities based on that.
I did this by generating a list of $26*10*26*26*26*26*26*26 \approx 8*10^{10}$ candidates, using lowercase letters for the first, seventh, and eighth characters, a digit for the second character, and uppercase letters for the third, fourth, fifth, and sixth characters. This list of candidates is then put through a list of 120 rules (40 possible permutations of a random digit in positions 3, 4, 5, or 6 times 3 possible ways to lowercase a singe remaining letter) which cover all combinations of one added digit + one lowercase character, giving me $8*10^{10} * 120 = 9.6*10^{12}$ total candidates.
My question is:
How is it possible that the first, brute-forced list, including candidates that could not possibly be correct (too many lowercase chars or digits), is smaller by almost four times than the tailored list that should exclusively contain candidates that fit the restrictions?
Am I unknowingly creating tons of duplicates, or did I make a mistake somewhere in my math that I missed?
I'll focus only on the third, fourth, fifth and sixth characters, as the rest is the same in both cases.
In your first approach, where you overestimate the number of passwords, you simply consider all options for all four characters, yielding $$(26+26+10)^4=62^4,$$ options. In your second approach, which is less clear, you start from $26^4$ options, suggesting $4$ characters and then consider some permutations: You consider $40$ options for the digit, and then $3$ options for the lowercase letter, yielding $$26^4\times40\times3.$$ But then you should have started with only $3$ characters, i.e. with $26^3$ instead of $26^4$.
A more structured approach would be to first choose the positions of the digit and the lowercase letter; there are $4\times3=12$ options. Then choose $2$ uppercase letters, $1$ lowercase letter and a digit, yielding $$12\times26^3\times10.$$