Fix params_with_grad in FSDP when the model has frozen parameters #1167

Open
wants to merge 1 commit into
base: ngoyal_changes_for_pp_fp8_akadian


@whbldhwj whbldhwj commented Mar 4, 2024

What does this PR do?

For frozen parameters, params_with_grad fails silently because the parameter's grad is None and it has no main_grad attribute.
This can lead to incorrect behavior in gradient clipping, which calls params_with_grad.

This PR fixes the issue by first checking whether the parameter has requires_grad set to True.
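
The check described above can be sketched roughly as follows. This is a minimal illustration, not the actual diff: `ParamStub` is a hypothetical stand-in for a real parameter object, the free function `params_with_grad` only mimics the property's filtering logic, and the assumption that `main_grad` is attached only to trainable parameters is inferred from the description.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ParamStub:
    """Hypothetical stand-in for a parameter; not a torch or fairscale type."""

    name: str
    requires_grad: bool = True
    grad: Optional[float] = None
    # main_grad is deliberately not a field: in this sketch it is only
    # attached to parameters that accumulate gradients (an assumption).


def params_with_grad(params: Iterable[ParamStub]) -> List[ParamStub]:
    """Return the parameters that currently hold a gradient."""
    selected = []
    for p in params:
        # Check requires_grad first, so frozen parameters are skipped
        # before main_grad (which they do not have) is ever touched.
        if not p.requires_grad:
            continue
        if p.grad is not None or getattr(p, "main_grad", None) is not None:
            selected.append(p)
    return selected


frozen = ParamStub("embed.weight", requires_grad=False)
trainable = ParamStub("proj.weight", grad=0.5)
accumulated = ParamStub("out.weight")
accumulated.main_grad = 0.25  # gradient kept in main_grad instead of grad
pending = ParamStub("head.weight")  # trainable, but no gradient yet

selected_names = [
    p.name for p in params_with_grad([frozen, trainable, accumulated, pending])
]
print(selected_names)  # ['proj.weight', 'out.weight']
```

With this ordering, a frozen parameter never reaches the main_grad lookup, so a gradient-clipping routine that iterates over the result only sees parameters that actually carry a gradient.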

Before submitting

  • Did you have fun?
    • Make sure you had fun coding 🙃
  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
    • N/A
  • Did you make sure to update the docs?
    • N/A
  • Did you write any new necessary tests?
    • N/A
  • Did you update the changelog? (if needed)
    • N/A

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

@facebook-github-bot added the CLA Signed label Mar 4, 2024
@whbldhwj requested a review from ngoyal2707 March 4, 2024 18:25