[BUG] XLNET-CLM eval recall metric value does not match with custom np based recall metric value #719
Comments
If I use the dev branch, I get much higher CLM accuracy metrics (~2.5x higher) compared to MLM from
Is this bug already fixed in some T4R version? I am currently seeing similar discrepancies when evaluating NDCG and MRR metrics on my dataset. My question is: is it worth creating a reproducible example, or are you already working on it?
@SPP3000 Can you please provide more details about which model you are using and how you evaluate? Are you using our evaluation method
@SPP3000 Are you seeing the same issue with XLNet MLM? Did you test MLM?
Bug description
When we train an XLNet model with `CLM` masking, the model prints out its own evaluation metrics (ndcg@k, recall@k, etc.) from the `trainer.evaluate()` step. If we apply our own custom metric function written with NumPy, the metric values do not match, but they do match if we use `MLM` masking instead.
Steps/Code to reproduce bug
coming soon.
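The custom NumPy metric is not included in this report. As an illustration only, a recall@k of the kind described might look like the sketch below. The function name `recall_at_k`, the `(n_examples, n_items)` score layout, and the toy data are assumptions for this example, not code from the issue or from the library.

```python
import numpy as np


def recall_at_k(scores: np.ndarray, labels: np.ndarray, k: int = 10) -> float:
    """Hypothetical recall@k with one true item per row.

    scores: (n_examples, n_items) array of predicted item scores.
    labels: (n_examples,) array holding the true item index of each row.
    """
    # Indices of the k highest-scoring items per row; the order inside the
    # top-k set does not matter for recall, so argpartition is sufficient.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    # A row counts as a hit when its true item appears among its top-k items.
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data (made up for illustration): 3 examples, 4 candidate items.
scores = np.array([
    [0.10, 0.90, 0.30, 0.20],
    [0.80, 0.05, 0.10, 0.02],
    [0.40, 0.30, 0.20, 0.10],
])
labels = np.array([1, 2, 3])

print(recall_at_k(scores, labels, k=2))
```

A comparison against the values reported by `trainer.evaluate()` is only meaningful when both metrics receive the same predictions and labels.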
Expected behavior
Environment details
Additional context