Lemmatisation of "did" is no longer "do" in the context of n't #13098
Comments
hello @chrisjbryant
Your code should look like this: `nlp = spacy.load("en_core_web_sm") for tok in doc:
Output
I -PRON-
My bad, that was a typo in my example code. I have updated it now. The error is still present in spaCy.
Hi @chrisjbryant, can you set up a fresh virtual environment and run this script there? I can't reproduce this. If I run
import spacy
print(spacy.__version__)
nlp = spacy.load("en_core_web_sm")
sent = "I didn't go to the bank at the weekend."
for tok in nlp(sent):
    print(tok.text, tok.lemma_)
my output is:
I was previously using a conda env, but just tried with venv and same result. :/
And just for sanity checking, here is what happens when I remove the negation (i.e. it's correct).
Update Part 3: It seems the same thing happens with hadn't, so not sure what's going on.
Huh, after noticing I ran this with Python 3.9, I set up a clean venv with Python 3.10, and there I can reproduce this. We'll look into it.
I can confirm that this is a bug. A few of the attribute ruler patterns were switched from You can add the old rule back to the current
nlp.get_pipe("attribute_ruler").add(
    patterns=[[{"TAG": "VBD", "LOWER": "did"}]],
    attrs={"POS": "AUX", "LEMMA": "do"},
    index=0,
)
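For anyone applying the workaround, here is a minimal sketch of how it might be wired into a script and checked. The helper names `add_did_rule`, `lemmas` and `main` are hypothetical, and `main` assumes spaCy 3.7 with `en_core_web_sm` installed; only the rule itself is taken from the comment above.

```python
def add_did_rule(nlp):
    """Add the old "did" rule to the pipeline's attribute ruler."""
    ruler = nlp.get_pipe("attribute_ruler")
    ruler.add(
        patterns=[[{"TAG": "VBD", "LOWER": "did"}]],
        attrs={"POS": "AUX", "LEMMA": "do"},
        index=0,  # apply the attrs to the first token of each match
    )
    return nlp


def lemmas(nlp, text):
    """Return (token text, lemma) pairs for one sentence."""
    return [(tok.text, tok.lemma_) for tok in nlp(text)]


def main():
    # Imported lazily so the helpers above can be used without spaCy installed.
    import spacy

    nlp = add_did_rule(spacy.load("en_core_web_sm"))
    for text, lemma in lemmas(nlp, "I didn't go to the bank at the weekend."):
        print(text, lemma)
```

Calling `main()` prints one token and its lemma per line, so the lemma of "did" can be compared before and after the rule is added.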
Just to make sure, don't forget to check other cases too, since I now also recreated it with
Yeah, we'll review all the changes. (The underlying idea was to use
Thanks again for the report, we've updated this in the internal attribute ruler data and added a test to
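A rough sketch of what such a check over several cases could look like. This is not the test the maintainers added: the function name `find_lemma_mismatches` and the `CASES` list are hypothetical, and every sentence except the first is an illustrative stand-in for the variants mentioned in this thread.

```python
CASES = [
    # (sentence, token text, expected lemma)
    ("I didn't go to the bank at the weekend.", "did", "do"),
    # Illustrative sentences, not taken from the thread:
    ("I did go to the bank at the weekend.", "did", "do"),
    ("I hadn't been to the bank before.", "had", "have"),
]


def find_lemma_mismatches(nlp, cases=CASES):
    """Return (sentence, token, expected, actual) for every wrong lemma.

    `nlp` is any callable that maps a string to tokens exposing
    `.text` and `.lemma_`, such as a loaded spaCy pipeline.
    """
    mismatches = []
    for sentence, token_text, expected in cases:
        for tok in nlp(sentence):
            if tok.text == token_text and tok.lemma_ != expected:
                mismatches.append((sentence, tok.text, expected, tok.lemma_))
    return mismatches
```

With spaCy installed, `find_lemma_mismatches(spacy.load("en_core_web_sm"))` returns an empty list when every listed token gets its expected lemma, and one tuple per wrong lemma otherwise.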
This issue has been automatically closed because it was answered and there was no follow-up discussion.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I just updated from spacy 3.6 to 3.7 and "did" is no longer getting lemmatised as "do" when followed by n't.
Output:
How to reproduce the behaviour