Improvements in word splitting #61
Comments
The reason for this is that words are classified as either "camel case" or "snake case" with a simple set of heuristics. So in the case of Admittedly, this could be improved, but it would mean introducing some potential future corner cases and further processing (although I am open to the idea if you want to have a go at opening a PR to try to fix this).
Yep this is not currently catered for in the plugin. Again, not an impossible issue to solve, but it would introduce some corner cases of its own. A few approaches could be:
Agreed, that sounds annoying. Unfortunately, it's quite tough to cover these cases correctly without some human intervention, because this is essentially impossible to parse correctly without running it through an AST. I would suggest marking the line as ignored for the specific flake8 error code this plugin gives.
Just wondering if there is any reason why camel case words aren't always split when checking spelling.
In particular, if there are camel case words in a comment, they are flagged with SC100.
Also, if there is some terribly named variable like TestData_apple, then SC200 will flag 'TestData' as a spelling error, whereas I would have expected 'Test' and 'Data' to be checked separately by flake8-spellcheck.
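To make the expected behaviour concrete, here is a minimal sketch of splitting a mixed identifier on underscores first and then on camel case boundaries. The helper name `split_identifier` and the regular expression are hypothetical illustrations, not flake8-spellcheck's actual implementation.

```python
import re

# Hypothetical pattern, not taken from flake8-spellcheck.
# Alternatives, in order:
#   1. a run of capitals not followed by a lowercase letter (acronyms)
#   2. an optional capital followed by lowercase letters (ordinary words)
#   3. a run of digits
_CAMEL_PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_identifier(name):
    """Split on underscores, then split each chunk on camel case boundaries."""
    words = []
    for chunk in name.split("_"):
        words.extend(_CAMEL_PART.findall(chunk))
    return words


print(split_identifier("TestData_apple"))
```

With this approach each chunk is always split, rather than the whole identifier being classified as one style or the other. Acronym-heavy names remain a likely corner case.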
The other word splitting choice I am curious about is "words" that have digits in them. I have an electronics library with a fairly ridiculous whitelist containing things like 100n, 10m, 10n5, 1n, 1k, 200m, 250V, 250VAC, 33R4, 470R, 6V3, etc, etc which is a bit cumbersome.
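One possible way to avoid whitelisting such values would be to skip any token that contains a digit before spell checking. The sketch below assumes the tokens have already been extracted; `words_to_check` is a hypothetical helper, not an existing flake8-spellcheck function or option.

```python
import re

# Hypothetical filter: any token containing a digit is treated as a
# value or part number rather than a word.
_HAS_DIGIT = re.compile(r"[0-9]")


def words_to_check(tokens):
    """Return only the tokens that contain no digits."""
    return [token for token in tokens if not _HAS_DIGIT.search(token)]


print(words_to_check(["100n", "capacitor", "6V3", "250VAC", "rated"]))
```

The trade-off is that a genuinely misspelt word that happens to contain a digit would no longer be reported.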
Edit: Another odd word splitting thing.
If I have
pd.Timestamp('2017-06-01T12'))
then everything is fine. However, if I have
# pd.Timestamp('2017-06-01T12'))
then I get a misspelt word of '2017. I understand that single quotes are really apostrophes (I have argued within my team for double-quoted strings, with no luck), so splitting words on single quotes isn't possible, but it is a bit annoying having to put things like '2017 in my whitelist.
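As a rough illustration of a middle ground, quotes could be stripped only from the start and end of each token, leaving apostrophes inside a word alone. `strip_outer_quotes` is a hypothetical helper and does not reflect how the plugin tokenises comments.

```python
def strip_outer_quotes(token):
    """Remove quote characters from both ends of a token only.

    Apostrophes inside the token (as in "don't") are left untouched.
    """
    return token.strip("'\"")


print(strip_outer_quotes("'2017"))
print(strip_outer_quotes("don't"))
```

This would still mishandle plural possessives such as dogs', where the trailing apostrophe is meaningful, so it introduces a corner case of its own.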