Hangman
2020/01/20
Rules of the Game
The game of Hangman involves one player thinking of a word or phrase, and another player guessing that word by guessing letters one at a time to reveal information. If the guesser guesses too many letters incorrectly, that player loses.
What is the most difficult Hangman word to guess?
Strategy 1 - Most Common Letter
In this article, I assume that the only valid words are those listed in a certain version of SOWPODS. This dictionary does not include proper nouns and does not include words longer than 15 letters.
A typical strategy involves two steps:
1. determining all words that match a given template, then
2. determining which letter appears in the largest number of those valid words.
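The two steps above can be sketched in Python. This is a minimal illustration, not the code from the repo: the function names, the `_` placeholder for hidden positions, and the tiny word list in the usage note are all assumptions. It also assumes a letter is counted once per word that contains it, not once per occurrence.

```python
from collections import Counter

def matches(word, template, guessed):
    """Return True if `word` fits `template` ('_' marks a hidden position)
    and is consistent with every letter in `guessed` (hits and misses)."""
    if len(word) != len(template):
        return False
    for w, t in zip(word, template):
        if t == "_":
            # A hidden position cannot hold a guessed letter: a hit would
            # already be revealed, and a miss is not in the word at all.
            if w in guessed:
                return False
        elif w != t:
            return False
    return True

def candidates(words, template, guessed):
    """Step 1: all words that are still valid for this template."""
    return [w for w in words if matches(w, template, guessed)]

def most_common_letters(valid, guessed):
    """Step 2: the unguessed letters found in the most valid words.
    All tied letters are returned, sorted alphabetically."""
    counts = Counter()
    for w in valid:
        counts.update(set(w) - guessed)  # count each letter once per word
    if not counts:
        return []
    best = max(counts.values())
    return sorted(letter for letter, c in counts.items() if c == best)
```

With the hypothetical list `["BAKING", "RAKING", "RATING", "WAVING", "BAKERY"]`, the template `_A_ING`, and the guessed letters A, I, N, G, the first four words remain valid, and K and R tie as the most common letters.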
If multiple letters tie in Step 2, then one of them is chosen with equal probability. The difficulty of a word under this strategy is the average number of misses made before the word is guessed. The most difficult words for each length, from 2 to 15 letters, are listed in the following table, one row per length:
BI 8.50 | DI 8.50 | GI 8.50 | HI 8.50 | KI 8.50 | LI 8.50 | MI 8.50 | PI 8.50 | QI 8.50 | SI 8.50 |
JAI 17.49 | KOI 17.17 | COZ 16.88 | EAU 16.36 | FOU 16.33 | VEG 16.09 | JEU 16.01 | JOE 15.88 | HEH 15.84 | ZEK 15.84 |
DIFF 14.50 | JIFF 14.50 | KOIS 14.42 | VAGI 14.30 | BABU 14.20 | BUZZ 14.00 | EAUS 14.00 | FOUS 14.00 | LACS 14.00 | MIME 14.00 |
HUZZY 14.85 | JIGGY 13.50 | WIGGY 13.50 | JUTTY 13.48 | VUTTY 13.48 | CUTTY 13.28 | HAJES 13.18 | FUZZY 13.09 | DUDDY 13.04 | BACHS 13.00 |
FOXING 13.06 | JIBING 13.00 | JOYING 12.96 | YUCKED 12.44 | YUKKED 12.11 | CACHED 12.10 | JAZZED 12.10 | HUGGED 12.06 | JUGGED 12.06 | MOMZER 12.00 |
WUDDING 12.33 | YUKKING 12.15 | CACHING 12.00 | JAZZING 12.00 | WAXWING 12.00 | COZYING 11.00 | HUTTING 11.00 | JUTTING 11.00 | HUZZIES 10.88 | HUGGERS 10.75 |
CUPPIEST 9.05 | BUZZIEST 9.02 | MUMMINGS 8.97 | CUTTLING 8.75 | MUFFLING 8.75 | CUFFLING 8.50 | FUZZLING 8.50 | JAZZIEST 8.50 | MUZZLING 8.50 | PUZZLING 8.50 |
JUDDERING 7.20 | FULLERING 6.71 | JAZZINESS 6.66 | CAMPINESS 6.38 | HAMMINESS 6.21 | HAPPINESS 6.21 | RUBBISHLY 6.17 | RUTTISHLY 6.17 | YUCKINESS 6.13 | UNPUZZLED 6.10 |
BUCKJUMPER 5.00 | JOKINESSES 5.00 | PICKPOCKET 5.00 | BACKSTALLS 4.83 | FLUFFINESS 4.75 | TITTUPPING 4.58 | RUBBISHING 4.57 | CURFUFFLED 4.50 | KURFUFFLED 4.50 | PACKSTAFFS 4.50 |
JAZZINESSES 6.66 | CAMPINESSES 6.38 | HAMMINESSES 6.21 | HAPPINESSES 6.21 | YUCKINESSES 6.13 | HUFFINESSES 6.07 | FUZZINESSES 5.85 | TUBBINESSES 5.61 | POTTINESSES 5.45 | PUFFINESSES 5.37 |
FLUFFINESSES 4.75 | KLUTZINESSES 4.25 | SULPHHYDRYLS 4.00 | UNTRUTHFULLY 4.00 | GRUBBINESSES 3.89 | CLUBBINESSES 3.79 | FRUMPINESSES 3.67 | CHATTINESSES 3.50 | QUAGGINESSES 3.50 | SMUDGINESSES 3.50 |
CATTISHNESSES 4.17 | BULLISHNESSES 4.00 | CURRISHNESSES 3.79 | RUTTISHNESSES 3.63 | CULTISHNESSES 3.33 | RAFFISHNESSES 3.17 | RAMMISHNESSES 3.17 | WAGGISHNESSES 3.17 | BLACKCURRANTS 3.00 | BRACHYDACTYLY 3.00 |
QUADRANGULARLY 3.00 | UNTRANSLATABLY 3.00 | WAGELESSNESSES 3.00 | WOODLESSNESSES 2.88 | HARMLESSNESSES 2.75 | HOPELESSNESSES 2.75 | CHROMATOGRAPHS 2.50 | CHROMATOGRAPHY 2.50 | CONGRATULATORS 2.50 | CONGRATULATORY 2.50 |
ULTRASTRUCTURAL 3.00 | ANTHROPOPHAGOUS 2.00 | ASTROPHOTOGRAPH 2.00 | AUTOCHTHONOUSLY 2.00 | BRACHYDACTYLISM 2.00 | BRACHYDACTYLOUS 2.00 | CHROMATOPHOROUS 2.00 | CHROMOXYLOGRAPH 2.00 | CRYSTALLOGRAPHY 2.00 | GNATHOSTOMATOUS 2.00 |
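The difficulty measure used in the table (average misses, with ties split evenly) can be computed exactly by recursing over every tied choice. The sketch below is one way to do that. It assumes the target is in the word list, and the names `reveal` and `expected_misses` are illustrative, not taken from the repo.

```python
from collections import Counter

def reveal(word, guessed):
    """Show guessed letters and hide everything else as '_'."""
    return "".join(c if c in guessed else "_" for c in word)

def expected_misses(target, words):
    """Exact average number of misses when the guesser always picks a
    most common letter and breaks ties uniformly at random.
    Assumes `target` is one of `words`."""
    cache = {}

    def solve(guessed):
        if guessed in cache:
            return cache[guessed]
        template = reveal(target, guessed)
        if "_" not in template:
            return 0.0  # word fully revealed, no further misses
        # A word is still valid if it would show the same template
        # after the same set of guesses.
        valid = [w for w in words if reveal(w, guessed) == template]
        counts = Counter()
        for w in valid:
            counts.update(set(w) - guessed)
        best = max(counts.values())
        tied = [letter for letter, c in counts.items() if c == best]
        total = 0.0
        for letter in tied:
            miss = 0 if letter in target else 1
            total += miss + solve(guessed | {letter})
        cache[guessed] = total / len(tied)
        return cache[guessed]

    return solve(frozenset())
```

On a real dictionary this recursion can be slow when many letters tie, so the word list would normally be filtered by length first.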
Typically, shorter words are more difficult than longer words, with the most difficult being JAI (or KOI for a common word). However, a human player may find longer words to be harder, because Step 1 is extremely difficult to calculate. I asked several players to guess two words:
- VISUALIZED—a 10-letter word with difficulty 0.
- BAKING—a 6-letter word with difficulty 11.58.
One player reached the state
for which only 2 valid words remained.[1] The player was unable to find either easily and ended with 12 misses. The difficulty of this word also came from the presence of a V and a Z, which players tend to overlook. The best score I saw was 7.
On the other hand, many players reached the state
_A_ING
and they were aware that many, many words fit this template.[2] Guessing the word became a game of chance, which a computer cannot overcome either. The best score I saw was 8.
Strategy 2 - All Valid Letters
An alternate strategy generalizes Step 2 of Strategy 1. Instead of guessing only the letter that is most common, the computer will guess any valid letter with probability proportional to the letter’s frequency in the remaining words.
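The weighted choice in Strategy 2 can be sketched as follows. This is an assumption-laden illustration: "frequency" is taken to mean the number of remaining words that contain the letter, and the function names are hypothetical.

```python
import random
from collections import Counter

def letter_weights(valid, guessed):
    """Number of remaining valid words containing each unguessed letter."""
    counts = Counter()
    for w in valid:
        counts.update(set(w) - guessed)
    return counts

def pick_weighted(valid, guessed, rng=random):
    """Pick any letter that occurs in a remaining word, with probability
    proportional to the number of remaining words containing it."""
    counts = letter_weights(valid, guessed)
    letters = sorted(counts)
    weights = [counts[letter] for letter in letters]
    return rng.choices(letters, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible. A letter that appears in no remaining word has weight zero and is never guessed.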
Some “easy” words became more difficult, because the computer now had a small chance to guess uncommon letters that missed. But “hard” words also became easier, because if the computer happened to guess an uncommon letter that was correct, the number of valid words would decrease dramatically. Compare the distribution of difficulties between Strategy 1 and Strategy 2 by word length.[3]
Strategy 2 plays more consistently, so it is harder to outplay, but it is worse overall. These are the most difficult words for Strategy 2, one row per word length:[4]
ZO 10.03 | QI 9.79 | KI 9.63 | LO 9.42 | PO 9.40 | GO 9.39 | LI 9.28 | AL 9.03 | DO 8.92 | OF 8.85 |
ZUZ 14.60 | QAT 13.50 | VAV 13.41 | ZAX 13.40 | ZZZ 12.90 | SAX 12.82 | KAK 12.70 | HOX 12.46 | FAG 12.24 | FAW 12.20 |
ZIZZ 14.23 | JAZZ 13.90 | JIZZ 13.49 | JILL 13.33 | ZILL 13.16 | MIZZ 12.93 | MOZZ 12.89 | QATS 12.76 | ZZZS 12.68 | MUZZ 12.42 |
ZILLS 12.50 | JILLS 12.36 | JAZZY 12.31 | JUJUS 12.19 | BUZZY 12.18 | VILLS 11.95 | WILLS 11.81 | ZIFFS 11.57 | FIZZY 11.34 | FAZED 10.96 |
JAZZED 12.22 | ZIZZED 11.96 | JAZZER 11.92 | JAGGED 11.49 | ZAGGED 11.36 | FAFFED 11.13 | FOXING 11.08 | JOGGED 10.94 | BUZZER 10.84 | FUZZED 10.76 |
JAZZING 11.68 | ZIZZING 11.27 | JAZZERS 10.66 | FAFFING 10.49 | BUZZING 10.48 | JAPPING 10.40 | ZAPPING 10.30 | JAZZIER 10.04 | BIZZIES 9.72 | JAGGING 9.61 |
ZIZZLING 9.43 | JAZZIEST 8.91 | CHIZZING 8.80 | WUZZLING 8.69 | QUIZZING 8.61 | WHIZZING 8.59 | WHIFFING 8.07 | BUZZINGS 8.05 | WHIPPING 7.79 | WAZZOCKS 7.66 |
JAZZINESS 7.71 | BEZZAZZES 7.09 | PAZZAZZES 7.01 | FUZZINESS 6.92 | KOOKINESS 6.83 | BOWWOWING 6.69 | MUZZINESS 6.69 | KAVAKAVAS 6.66 | JIBBERING 6.37 | QUIZZINGS 6.31 |
JOKINESSES 6.78 | FOXINESSES 6.43 | BOXINESSES 6.37 | FOZINESSES 6.29 | WAVINESSES 6.18 | HOKINESSES 6.13 | COXINESSES 5.84 | HAZINESSES 5.66 | PIPINESSES 5.43 | WIGWAGGING 5.07 |
JAZZINESSES 7.58 | FUZZINESSES 6.88 | KOOKINESSES 6.73 | MUZZINESSES 6.61 | HUFFINESSES 6.08 | WOOZINESSES 5.98 | PUFFINESSES 5.91 | BOOZINESSES 5.91 | BUGGINESSES 5.88 | BATTINESSES 5.32 |
FLUFFINESSES 5.17 | STUBBINESSES 4.94 | QUIZZINESSES 4.93 | STUFFINESSES 4.80 | SHABBINESSES 4.77 | SNAZZINESSES 4.56 | SCABBINESSES 4.41 | CHUMMINESSES 4.33 | JAGGEDNESSES 4.29 | FLAGGINESSES 4.09 |
FADDISHNESSES 4.50 | WAGGISHNESSES 4.33 | CADDISHNESSES 4.25 | FOPPISHNESSES 4.25 | RAFFISHNESSES 4.20 | DOGGISHNESSES 4.17 | JOBLESSNESSES 4.15 | JOYLESSNESSES 4.04 | BOOKISHNESSES 4.03 | HUFFISHNESSES 3.95 |
WAGELESSNESSES 4.03 | WOODLESSNESSES 3.71 | FACELESSNESSES 3.58 | BOOTLESSNESSES 3.54 | FOODLESSNESSES 3.52 | HOPELESSNESSES 3.50 | WORKLESSNESSES 3.49 | FECKLESSNESSES 3.39 | WORDLESSNESSES 3.20 | SHEEPISHNESSES 2.94 |
SENSELESSNESSES 2.95 | SHAMELESSNESSES 2.88 | SMOKELESSNESSES 2.84 | SHAPELESSNESSES 2.81 | STATELESSNESSES 2.64 | PULSELESSNESSES 2.56 | INEFFABLENESSES 2.52 | POSSESSEDNESSES 2.50 | INDIVIDUALIZING 2.48 | ENJOYABLENESSES 2.41 |
Strategy 2 does the worst when the word contains very uncommon letters such as J and Z, and like Strategy 1, it finds repeated letters difficult.
Strategy 3 - Most Knowledge
For another strategy, the focus is on the number of valid words remaining after every guess. To judge the value of guessing the letter A, first all currently valid words are assumed to be in play with equal probability. The guess of A is applied to each valid word, and all the valid words after each application are counted up. The guess that results in the fewest valid words is chosen. Again, if multiple letters tie, then one of them is chosen with equal probability.
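One way to read the scoring rule above: guessing a letter splits the valid words into groups that would reveal the same positions, and a word in a group of size n leaves n valid words behind. Summing over all valid words gives the sum of n² over the groups. The sketch below follows that reading. The function names are hypothetical, and restricting candidates to letters that occur in some valid word is an assumption, not something the article states.

```python
from collections import Counter

def remaining_after(valid, letter):
    """Sum, over every valid word taken as the answer, of how many valid
    words would still fit after guessing `letter`."""
    # Words that show `letter` in the same positions stay indistinguishable.
    groups = Counter(
        tuple(i for i, c in enumerate(w) if c == letter) for w in valid
    )
    return sum(n * n for n in groups.values())

def most_knowledge_letters(valid, guessed):
    """Unguessed letters that minimize the remaining-word total (ties kept)."""
    letters = sorted(set("".join(valid)) - guessed)
    if not letters:
        return []
    scores = {letter: remaining_after(valid, letter) for letter in letters}
    best = min(scores.values())
    return [letter for letter in letters if scores[letter] == best]
```

Note that the score ignores whether the guess would be a hit or a miss, which is the weakness the article points out.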
This strategy was too computationally expensive for me to calculate anything meaningful. It seems to perform the worst too, since it disregards whether a guess is likely to be right or wrong.[5]
Data
All the data and code used can be found on the GitHub repo.
[1] The other word is MISCLAIMED.
[2] 102 words match _A_ING, and the frequencies of remaining letters are somewhat evenly distributed: B-6, C-11, D-11, E-7, F-10, H-11, J-3, K-10, L-13, M-12, O-1, P-10, Q-0, R-23, S-12, T-15, U-0, V-8, W-19, X-5, Y-10, Z-6.
[3] While the difficulty values for Strategy 1 were computed exactly, the values for Strategy 2 were approximated. Computing the probabilities of every possible path became too time-consuming, so instead the computer played every word 100 times and averaged the results. In some instances, the difficulty could fluctuate by a few points.
[4] After computing difficulties using 100 trials, the top 50 words in each category had their difficulties recalculated using 1000 trials for more accurate rankings. As a result, these values may differ from the data file where only the results using 100 trials were recorded.
[5] Using a dictionary of ABC, BAC, DDD, DDE, Strategy 3 is on average 3.94× worse than Strategy 1 and 3.06× worse than Strategy 2.
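The trial-based approximation described in the notes (play each word a fixed number of times under Strategy 2 and average the misses) might look like the sketch below. It is an illustration only: the names `play_once` and `estimate_difficulty`, the `seed` parameter, and the word-count weighting are assumptions, and the target is assumed to be in the word list.

```python
import random
from collections import Counter

def reveal(word, guessed):
    """Show guessed letters and hide everything else as '_'."""
    return "".join(c if c in guessed else "_" for c in word)

def play_once(target, words, rng):
    """Play one game with frequency-weighted guesses; return the misses."""
    guessed = set()
    misses = 0
    valid = list(words)
    while True:
        template = reveal(target, guessed)
        if "_" not in template:
            return misses
        # Keep only words that would show the same template so far.
        valid = [w for w in valid if reveal(w, guessed) == template]
        counts = Counter()
        for w in valid:
            counts.update(set(w) - guessed)
        letters = sorted(counts)
        weights = [counts[letter] for letter in letters]
        letter = rng.choices(letters, weights=weights, k=1)[0]
        guessed.add(letter)
        if letter not in target:
            misses += 1

def estimate_difficulty(target, words, trials=100, seed=None):
    """Average misses over `trials` simulated games."""
    rng = random.Random(seed)
    return sum(play_once(target, words, rng) for _ in range(trials)) / trials
```

Raising `trials` from 100 to 1000 narrows the run-to-run fluctuation at ten times the cost.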