Larger wordlists #17
Comments
Towards #17. Also modified the existing top250 list to remove non-alphabetic words (added an extra word in place of the one removed).
Word lists added:
- top500
- top1000
- top2500
- top5000
All of them are from the same source as the initial top250 list.
The source I was using had only 5000 words (for free). Added the 500 to 5000 word lists in 7c049c5, which is coming in v0.4.0.
I think the word list size is misleading: textgen.rs only looks at words between 2 and 8 characters long. In the 5000 word list, for example, 927 words don't meet these criteria (925 of the 927 are longer than 8 characters).
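To make that count reproducible, here is a minimal sketch of how one might tally the words a length filter would drop. It assumes the bounds are inclusive (2 to 8 characters) and that the list is newline-separated; `count_filtered` is a hypothetical helper written for illustration, not a function from textgen.rs, whose actual check may be written differently.

```rust
/// Counts how many words in a newline-separated list fall inside or
/// outside an assumed inclusive 2..=8 character length filter.
/// Returns (kept, too_short, too_long).
fn count_filtered(word_list: &str) -> (usize, usize, usize) {
    let mut kept = 0;
    let mut too_short = 0;
    let mut too_long = 0;

    // Skip blank lines so trailing newlines are not counted as words.
    for word in word_list.lines().map(str::trim).filter(|w| !w.is_empty()) {
        // Count characters rather than bytes so non-ASCII words are measured fairly.
        match word.chars().count() {
            0..=1 => too_short += 1,
            2..=8 => kept += 1,
            _ => too_long += 1,
        }
    }

    (kept, too_short, too_long)
}
```

Calling `count_filtered` on the contents of a word list file returns the number of usable words first, which is the figure that would better describe the effective list size.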
Good catch! The 2 to 8 chars filter was quite arbitrary.
What and why?
Currently, the only built-in word list is the top 250 words list. This is very limiting, as words often repeat within the same line and multiple times throughout a test.
It would be nice to have these word lists too:
How?
More info about the existing word list: https://docs.rs/toipe/latest/toipe/wordlists/constant.TOP_250.html
The word list needs to be added in this directory: https://github.com/Samyak2/toipe/tree/main/src/word_lists
and it needs to be listed here: https://github.com/Samyak2/toipe/blob/main/src/wordlists.rs
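As a rough illustration of the two steps above (add the list's contents, then register it), the sketch below shows one way a new list could sit next to the existing one. Only the name `TOP_250` comes from the linked docs; `TOP_500`, the `WordListChoice` enum and its methods are hypothetical, the inline strings are shortened placeholders, and the real wordlists.rs may be organised quite differently (for instance by embedding the files from the word list directory).

```rust
// Placeholder contents. In the real crate the lists live as files in the
// word list directory; these short strings are stand-ins for illustration only.
const TOP_250: &str = "the\nbe\nto";
const TOP_500: &str = "the\nbe\nto\nof\nand"; // hypothetical new list

/// Hypothetical registry of built-in lists. Adding a list means adding a
/// variant here and a matching arm in each method below.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WordListChoice {
    Top250,
    Top500,
}

impl WordListChoice {
    /// Returns the newline-separated contents of the chosen list.
    fn contents(self) -> &'static str {
        match self {
            WordListChoice::Top250 => TOP_250,
            WordListChoice::Top500 => TOP_500,
        }
    }

    /// Looks a list up by a user-facing name, returning None if unknown.
    fn from_name(name: &str) -> Option<Self> {
        match name {
            "top250" => Some(WordListChoice::Top250),
            "top500" => Some(WordListChoice::Top500),
            _ => None,
        }
    }
}
```

Keeping the lookup and the contents in `match` expressions means the compiler flags any variant that was added to the enum but not wired up in `contents`.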