A Rust library, usable from Python, for checking words against the Wikipedia word frequency corpus. The library is fast, memory efficient, and secure. Full (exact-word) lookups use a hash map. A suffix array is used to perform quick lookups of sub-patterns over the dictionary.
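To illustrate how the two data structures divide the work, here is a minimal pure-Python sketch. It is not the library's implementation: the corpus, the counts, and every `toy_*` name are made up for illustration, and the index stores suffix strings directly rather than offsets into a shared buffer as a real suffix array would. Summing the counts of matching words is also an assumption about what a partial lookup reports.

```python
from bisect import bisect_left

# Toy corpus with made-up counts; the real corpus is far larger.
TOY_FREQUENCIES = {"the": 1000, "internet": 40, "winter": 25, "interval": 10, "cat": 5}


def toy_full_frequency(word):
    """Exact lookup: a single hash-map probe, 0 for unknown words."""
    return TOY_FREQUENCIES.get(word, 0)


def build_suffix_index(words):
    """Sort every suffix of every word, remembering the word it came from."""
    entries = sorted((word[i:], word) for word in words for i in range(len(word)))
    suffixes = [suffix for suffix, _ in entries]
    owners = [owner for _, owner in entries]
    return suffixes, owners


SUFFIXES, OWNERS = build_suffix_index(TOY_FREQUENCIES)


def toy_matching_words(pattern):
    """Words containing `pattern`: binary-search to the first suffix that is
    >= pattern, then walk forward while suffixes still start with it."""
    matches = set()
    i = bisect_left(SUFFIXES, pattern)
    while i < len(SUFFIXES) and SUFFIXES[i].startswith(pattern):
        matches.add(OWNERS[i])
        i += 1
    return matches


def toy_partial_frequency(pattern):
    """ASSUMPTION: a partial lookup sums the counts of all matching words."""
    return sum(TOY_FREQUENCIES[word] for word in toy_matching_words(pattern))
```

Because all suffixes that start with a given pattern sit next to each other in sorted order, a substring query reduces to one binary search plus a short scan, instead of testing every word in the dictionary.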
pip3 install pywordfreq
import pywordfreq
# On first use, the library loads the dictionary into the engine.
# Note that the engine carries a significant amount
# of memory overhead.
# This function checks the frequency of the word "the" in the corpus
pywordfreq.full_frequency(
    word="the",
)
# This function checks the frequency of "inter" as a pattern
# occurring inside the words of the dictionary.
pywordfreq.partial_frequency(
    pattern="inter",
)
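A lookup like the ones above can be combined with ordinary Python to compare several words at once. The helper below is a hypothetical sketch, not part of pywordfreq: `rank_by_frequency` and its `lookup` parameter are names introduced here, and the code assumes only that the lookup returns a number for each word.

```python
def rank_by_frequency(words, lookup):
    """Return (word, frequency) pairs sorted from most to least frequent.

    `lookup` is any callable that maps a word to a number.
    """
    scored = [(word, lookup(word)) for word in words]
    # sorted() is stable, so words with equal frequency keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the library installed, one could try passing `lambda w: pywordfreq.full_frequency(word=w)` as `lookup`, on the assumption that `full_frequency` returns a numeric value; the return type is not stated in this README.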
Distributed under the MIT License. See LICENSE for more information.
Gal Ben David - [email protected]
Project Link: https://github.com/intsights/pywordfreq