Wednesday, October 19, 2022

Dictionary Improvements

We recently made some big improvements to the Plates Across America® dictionary. Our game's dictionary is the critical starting point for the word puzzles that appear on the license plates.

Starting State

Before the latest refresh, our dictionary had 99,857 words. This generated 14,900 unique word puzzles with a little over 8 million possible solutions across them. These numbers sound relatively good, but as you will see below, we needed an even better game dictionary.

We formed the initial dictionary from multiple non-commercial sources since we had a very limited budget when we first started creating the game. These "free" dictionaries all had some quality problems, so we opted to use them all in a "voting" scheme where we (conservatively) included only words that appeared in more than one dictionary.
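As a rough illustration of the "voting" idea, here is a minimal sketch in Python. It is not our actual tooling: the function name `vote_words`, the `min_votes` threshold of 2, and the sample word lists are all hypothetical, chosen only to show how a word can be accepted once enough sources agree on it.

```python
from collections import Counter


def vote_words(sources, min_votes=2):
    """Return the words that appear in at least `min_votes` source word lists.

    `sources` is a list of word lists, one per dictionary.
    The threshold of 2 is an assumption made for this sketch.
    """
    counts = Counter()
    for words in sources:
        # Use a set so each source casts at most one vote per word.
        counts.update({w.strip() for w in words if w.strip()})
    return {word for word, votes in counts.items() if votes >= min_votes}


# Hypothetical sample data standing in for three free dictionaries.
source_a = ["apple", "banana", "cherry"]
source_b = ["apple", "cherry", "qwzx"]
source_c = ["apple", "banana", "date"]

accepted = vote_words([source_a, source_b, source_c])
print(sorted(accepted))  # ['apple', 'banana', 'cherry']
```

Raising `min_votes` makes the result more conservative: fewer bad words slip in, but more valid words are left out, which is the trade-off described in this post.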

In-Game Improvements

Dictionary Badge

Because we took a conservative approach, we knew the dictionary had gaps, so we added a feature to allow users to suggest new entries from within the game. If the game flagged your answer as not matching (because it was not in our dictionary), you could click a link to notify us and we would then review it. We built tools and a process to review player suggestions and add them to the dictionary. We even added a special "badge" (at right) to recognize the player's contributions.

As more players joined the game, our queue of suggested dictionary additions got busy. Reviewing these confirmed our suspicion that we had gaps. We realized that the gaps might be bigger than we expected. The dictionary issues were becoming our biggest problem to address.

Dictionary Improvements

Since we first built the dictionary, we have found a few other freely available dictionaries. The best one we found includes some hyphenated words and capitalized words (proper nouns), which are not valid answers according to our game rules. However, it also contains a large number of good words, so we decided to take the bad with the good. We'll eventually remove the bad words, but we felt that being more liberal about what goes into the dictionary was the better decision. Our first attempt was way too conservative.
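To show what flagging those entries might look like, here is a small Python sketch. It is only an illustration under stated assumptions: the helper names `is_valid_entry` and `split_entries` are hypothetical, and the rule used (letters only, all lowercase) is a simplification that covers the two problems mentioned above, hyphens and capital letters, and nothing else.

```python
def is_valid_entry(word):
    """Return True if the entry is letters only and all lowercase.

    This rejects hyphenated words ("well-known") and capitalized
    words ("Texas"). It is a simplified stand-in for the game rules.
    """
    return word.isalpha() and word.islower()


def split_entries(words):
    """Split a word list into (valid, rejected) lists, keeping order."""
    valid, rejected = [], []
    for word in words:
        if is_valid_entry(word):
            valid.append(word)
        else:
            rejected.append(word)
    return valid, rejected


# Hypothetical sample entries.
sample = ["plate", "well-known", "Texas", "travel"]
valid, rejected = split_entries(sample)
print(valid)     # ['plate', 'travel']
print(rejected)  # ['well-known', 'Texas']
```

Keeping the rejected entries in their own list, rather than discarding them, makes it possible to review them by hand later.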

Our resulting dictionary has 172,584 words. At first glance, it looks like we added 72,727 words (172,584 - 99,857). However, during the dictionary upgrade process, we detected 21,072 "bad" words in the original dictionary. Thus, the new dictionary source effectively added 93,799 new words (72,727 + 21,072), more than doubling the 78,785 valid words we started with. The new dictionary bumped up the number of possible word puzzles to nearly 16,000 and the total solution count across them to just shy of 15 million.
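The arithmetic behind these figures can be checked with a few lines of Python. The variable names are ours, and the calculation assumes that every one of the 21,072 "bad" words was dropped while all other original words were kept.

```python
old_total = 99_857     # words in the original dictionary
new_total = 172_584    # words in the refreshed dictionary
bad_removed = 21_072   # "bad" words detected in the original

# Apparent gain: simple difference between the two totals.
net_gain = new_total - old_total

# Valid words carried over from the original dictionary.
kept_from_old = old_total - bad_removed

# Words that are genuinely new in the refreshed dictionary.
new_words = new_total - kept_from_old

# Growth relative to the valid words we started with.
growth = new_total / kept_from_old

print(f"{net_gain:,}")       # 72,727
print(f"{kept_from_old:,}")  # 78,785
print(f"{new_words:,}")      # 93,799
print(f"{growth:.2f}")       # 2.19
```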


Conclusion

All in all, our latest efforts significantly transformed and improved our dictionary. For those who played our game and had valid words flagged as wrong answers, we wish we could go back in time to prevent that, but at least we have made some strides to prevent it in the future.

If you have never tried our game before, please try it out here:

https://platesacrossamerica.com

Happy Travels!





