I'm making a greedy Hangman algorithm such that the letter with the highest frequency in a word list is recommended to be chosen for the next guessed letter. It is easy to calculate the frequency of a letter in the list: frequency = (# words with letter) / (total # words). If we calculate the ratio for each letter, then the letter with the highest frequency is the best choice. Explanation:
- If it is correct, then we don't lose a life and gain more insight into the final word.
- If it is wrong, then we eliminate the largest number of possible answers from our list.
This solution works great and is easy to implement in code, but I theorize that doing multiple passes (meaning look more than 1 letter into the future) may result in different recommendations or at least provide more insight into what the best guessing strategy is. Knowing the frequencies of ab, ac, ad, and so on seems like it would allow me to make a more educated guess than just the frequency of "a" could.
In a perfect world, I would calculate the frequency for each letter and then the frequency for 1 or 2 more letter guesses after. But this is expensive because it requires practically 26^n number of passes over the data set. So, I was thinking about only considering the top 3 results for each subsequent guess. See here:

Question
If I were to implement multiple passes, how would that affect the best guessing strategy? Specifically, what would the math look like for this? I'm lost here...