Fix: ignore tokens should be set to -100, not -1 #19

Open
wants to merge 2 commits into
base: master

maxlchen

Issue #, if available:

Description of changes:
Corrects the value assigned to unmasked positions in the target tensor built by mask_tokens().
Currently, line 348 reads: labels[~masked_indices] = -1
However, a value of -1 breaks the loss function. From the BertForMaskedLM documentation: "Indices should be in [-100, 0, ..., config.vocab_size] ... Tokens with indices set to -100 are ignored (masked), the loss is only computed for the tokens with labels in [0, ..., config.vocab_size]"
This constraint appears to hold in every version of the HuggingFace API that has documentation for BertForMaskedLM.
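
To illustrate why the sentinel value matters, here is a minimal, dependency-free sketch of the ignore-index convention quoted above. The helpers `build_labels` and `masked_lm_loss` are hypothetical names written for this example; they are not code from this repository or from HuggingFace. The value -100 matches the default `ignore_index` of `torch.nn.CrossEntropyLoss`. The out-of-range check is an assumption about how a strict loss treats a label of -1.

```python
import math

IGNORE_INDEX = -100  # same value as the default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(input_ids, masked_indices, ignore_value=IGNORE_INDEX):
    """Keep the original token id at masked positions; mark all others as ignored."""
    return [tok if masked else ignore_value
            for tok, masked in zip(input_ids, masked_indices)]


def masked_lm_loss(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over positions whose label is not ignore_index."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # this position does not contribute to the loss
        if not 0 <= label < len(row):
            # Assumption: a strict loss rejects labels outside [0, vocab_size).
            raise IndexError(f"Target {label} is out of bounds.")
        # Negative log-softmax of the target, via the log-sum-exp trick.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[label]
        count += 1
    return total / count if count else float("nan")


input_ids = [5, 2, 7, 1]
masked_indices = [False, True, False, True]
logits = [[0.0] * 10 for _ in input_ids]  # uniform scores over a 10-token vocabulary

good = build_labels(input_ids, masked_indices)      # unmasked positions set to -100
bad = build_labels(input_ids, masked_indices, -1)   # unmasked positions set to -1

loss = masked_lm_loss(logits, good)  # averages over the two masked positions only
```

With `good`, only the two masked positions enter the average. With `bad`, the first unmasked position already raises, because -1 is neither the ignore value nor a valid vocabulary index.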

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

maxlchen added 2 commits May 25, 2022 15:12
Loss cannot be computed if labels[~masked_indices] = -1.
From BertForMaskedLM documentation: "Indices should be in [-100, 0, ..., config.vocab_size] ... Tokens with indices set to -100 are ignored (masked), the loss is only computed for the tokens with labels in [0, ..., config.vocab_size]"
Fix: ignore tokens in target tensor should be set to -100, not -1