Logits shift in loss computation #39

Open
shivamag125 opened this issue Jul 14, 2024 · 1 comment

Comments

@shivamag125

When computing the loss at L136, shouldn't the logits and targets be rolled (shifted by one position) to account for next-token prediction?

Similar to https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L1092

@shivamag125
Author

Edit: I see that you already took care of this when preparing the targets, so no extra shift is needed in the loss.
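For anyone who lands here with the same question, below is a minimal sketch of why the two conventions give the same loss. It is not this repository's code: `toy_logits` is a hypothetical stand-in for a causal model (its output at position t depends only on tokens up to t), and `cross_entropy` is a plain-Python mean negative log-likelihood. Style A shifts inside the loss, as the linked Hugging Face code does; style B shifts once while preparing inputs and targets.

```python
import math

def toy_logits(tokens, vocab_size=5):
    """Hypothetical causal 'model': logits at position t depend only on tokens[:t+1]."""
    out = []
    running = 0
    for tok in tokens:
        running += tok + 1  # summary of the prefix seen so far
        out.append([math.sin(running * (v + 1)) for v in range(vocab_size)])
    return out

def cross_entropy(logits, targets):
    """Mean negative log-likelihood of the target ids under softmax(logits)."""
    total = 0.0
    for row, tgt in zip(logits, targets):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[tgt]
    return total / len(targets)

tokens = [3, 1, 4, 1, 2]

# Style A: feed the full sequence, then shift inside the loss.
# Drop the last logit (nothing to predict after it) and the first label.
logits_a = toy_logits(tokens)
loss_a = cross_entropy(logits_a[:-1], tokens[1:])

# Style B: shift while preparing the data, so the loss needs no shift.
inputs, targets = tokens[:-1], tokens[1:]
loss_b = cross_entropy(toy_logits(inputs), targets)

print(loss_a, loss_b)
```

Because the toy model is causal, both styles score exactly the same (position, next token) pairs. Applying both shifts at once would be the bug: each position would then be trained to predict the token two steps ahead.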
