- Update dictionaries; see issue #171 and discussion 172
- Update `en`, `es`, `fr`, and `it` to include country names; see issue #168
- Leveraged the dictionary files from levidromelist to attempt to clean up the `en`, `es`, `fr`, `pt`, `de`, and `nl` dictionaries; attempts to resolve issues #164, #155, #150, #140, #115, and #107; see issue #126
- Added `Italian` language support; see #167
- Remove relative imports in favor of absolute imports
- Add `Path` support for files (see the sketch below)
- Added `Dutch` language support; see #162
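For the `Path` support noted above, a minimal sketch of loading a local dictionary via `pathlib.Path`; the file name is hypothetical, and it assumes `local_dictionary` accepts a path-like object pointing at a JSON word-frequency file:

```python
from pathlib import Path

from spellchecker import SpellChecker

# hypothetical pre-built word-frequency file; any local dictionary works
dictionary_file = Path("my_dictionary.json")

# language=None skips the bundled dictionaries and loads only the local file
spell = SpellChecker(language=None, local_dictionary=dictionary_file)
```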
- Add `py.typed` to enable mypy support
- Backwards Compatibility Change: `spell.candidates` and `spell.correction` now return `None` if there are no valid corrections or candidates
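A minimal sketch of handling the new `None` return values; the misspelled token here is just an illustrative string that is unlikely to have close candidates:

```python
from spellchecker import SpellChecker

spell = SpellChecker()

word = "qwrtzxqq"  # illustrative token with no plausible candidates
best = spell.correction(word)
options = spell.candidates(word)

# both calls now return None instead of echoing the input or an empty set
if best is None and options is None:
    print(f"no correction or candidates for {word!r}")
```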
- Remove misspelled words from issue #120
- Update all default language dictionaries after updating the minimum frequency to 50 in `scripts/build_dictionary.py`
- Fix `float("nan")` issue; see #125
- Include Wikipedia's common typo list in the exclude listing; see #124
- Added `Arabic` language support; see #129
- Added a class method to get a listing of all supported languages
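A quick sketch of using that class method; the name `languages()` is an assumption based on this changelog entry and may differ:

```python
from spellchecker import SpellChecker

# list every bundled language dictionary without instantiating a checker;
# the method name `languages` is assumed here
print(sorted(SpellChecker.languages()))
```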
- Added type hinting
- Updated English dictionary to remove incorrect `cie` words; see #112
- Add ability to load multiple languages at once; see discussion
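A sketch of loading more than one language at once, assuming the `language` argument accepts a sequence of language codes:

```python
from spellchecker import SpellChecker

# assumption: passing a list merges the English and Spanish dictionaries
spell = SpellChecker(language=["en", "es"])

print(spell.known(["hello", "hola"]))
```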
- Fix default tokenizer to not enforce lower case; #99
- Deprecated `spell.word_probability` since the name makes it seem that it is building a true probability; use `spell.word_usage_frequency` instead (see the sketch below)
- Added Russian language dictionary; #91 Thanks @sviperm
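For the `word_probability` deprecation above, a minimal before/after sketch:

```python
from spellchecker import SpellChecker

spell = SpellChecker()

# preferred: relative usage frequency of the word within the loaded dictionary
frequency = spell.word_usage_frequency("the")

# deprecated: the same value under the old, misleading name
# frequency = spell.word_probability("the")
```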
- Include `__iter__` in both the `SpellChecker` and `WordFrequency` objects
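A short sketch of the new iteration support; it assumes iterating a `SpellChecker` yields the known words, the same as iterating its `WordFrequency`:

```python
from itertools import islice

from spellchecker import SpellChecker

spell = SpellChecker()

# assumption: iteration yields dictionary words; take a handful to inspect
print(list(islice(spell, 5)))
print(list(islice(spell.word_frequency, 5)))
```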
- Removed python 2.7 support
- Updated automated `scripts/build_dictionary.py` script to support adding missing words
- Updated `split_words()` to attempt to better handle punctuation; #84 (see the sketch below)
- Load pre-built dictionaries from relative location for use in `PyInstaller` and other executable tools; #64
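For the `split_words()` update above, a small sketch of tokenizing free text with punctuation before spell checking; the sample sentence is illustrative:

```python
from spellchecker import SpellChecker

spell = SpellChecker()

# punctuation is handled during tokenization rather than kept as part of a word
tokens = spell.split_words("Hello, world! Thiss sentance has misteaks.")
print(spell.unknown(tokens))
```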
- NOTE: Last planned support for Python 2.7
- All dictionaries updated using the `scripts/build_dictionary.py` script
- Remove `encode` from the call to `json.loads()`
- Reduce words in `__edit_distance_alt` to improve memory performance; thanks @blayzen-w
- Handle memory issues when trying to correct or find candidates for extremely long words
- Ensure input is encoded correctly; resolves #53
- Handle windows encoding issues; #48
- Deterministic order to corrections; #47
- Add tokenizer to the Spell object
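A sketch of supplying a custom tokenizer, assuming the constructor's `tokenizer` argument is any callable that maps a string to an iterable of tokens:

```python
import re

from spellchecker import SpellChecker

def simple_tokenizer(text):
    # keep alphabetic runs and apostrophes; everything else is a separator
    return re.findall(r"[A-Za-z']+", text)

spell = SpellChecker(tokenizer=simple_tokenizer)
print(spell.unknown(spell.split_words("don't mispell things!")))
```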
- Add support for local dictionaries to be case sensitive; see PR #44. Thanks @davido-brainlabs
- Better python 2.7 support for reading gzipped files
- Add support for a tokenizer for splitting words into tokens
- Add full python 2.7 support for foreign dictionaries
- Ensure all checks against the word frequency are lower case
- Slightly better performance on edit distance of 2
- Minor package fix for non-wheel deployments
- Ignore case for language identifiers
- Changed `words` function to `split_words` to differentiate it from the `word_frequency.words` function
- Added Portuguese dictionary: `pt`
- Add encoding argument to `gzip.open` and `open` for dictionary loading and exporting
- Use of slots for class objects
- Remove words based on threshold
- Add ability to iterate over words (keys) in the dictionary
- Add setting to reduce the edit distance check; see PR #17. Thanks @mrjamesriley
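A sketch of that edit-distance setting, assuming it is exposed as the `distance` constructor argument:

```python
from spellchecker import SpellChecker

# assumption: distance=1 limits candidate generation to single-edit variants,
# trading recall for speed on long words
spell = SpellChecker(distance=1)
print(spell.correction("teh"))
```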
- Added export functionality:
- json
- gzip
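A sketch of the export functionality, assuming a `SpellChecker.export()` method that writes the word-frequency data as JSON, optionally gzipped; the file name is hypothetical:

```python
from spellchecker import SpellChecker

spell = SpellChecker()

# write the loaded word-frequency data; gzip=True produces a compressed file
spell.export("word_frequency.json.gz", gzip=True)
```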
- Updated logic for loading dictionaries to be either `language` or `local_dictionary`
- Ability to easily remove words
- Ability to add a single word
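A sketch of the add/remove helpers, assuming they live on the checker's `word_frequency` object as `add()` and `remove_words()`; the example words are illustrative:

```python
from spellchecker import SpellChecker

spell = SpellChecker()

# add a single word so it is no longer flagged as misspelled
spell.word_frequency.add("spellchecking")

# remove words that should no longer be treated as valid
spell.word_frequency.remove_words(["teh"])
```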
- Improved (i.e. cleaned up) English dictionary
- Better handle punctuation and numbers as the word to check
- Add support for language dictionaries
- English, Spanish, French, and German
- Remove support for python 2; if it works, great!
- Move word frequency to its own class
- Add basic tests
- Readme documentation
- Initial release using code from Peter Norvig
- Initial release to pypi