
Performance fine-tuning recipes for llama3 8b + 70b #11046

Merged: 10 commits into NVIDIA:main on Oct 29, 2024

Conversation

@vysarge (Collaborator) commented on Oct 25, 2024

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Collection: llm

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
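Since the template's usage snippet was left unfilled, here is a hedged sketch of how the recipes touched by this PR might be invoked. Only the module path (nemo/collections/llm/recipes/llama3_70b.py), finetune_recipe, the peft_scheme argument, and the attributes being printed appear in the review diff below; that the function runs with only this argument, and the NeMo-Run launch step, are assumptions.

```python
# Hedged usage sketch, not taken from this PR: only `finetune_recipe`,
# `peft_scheme`, and the printed attributes are visible in the review diff.
from nemo.collections.llm.recipes import llama3_70b

# Build a LoRA fine-tuning recipe (the 'lora' branch is shown in the diff below).
recipe = llama3_70b.finetune_recipe(peft_scheme="lora")

# Inspect the performance-related knobs discussed in the review threads.
print(recipe.trainer.strategy.tensor_model_parallel_size)
print(recipe.model.config.seq_length, recipe.data.seq_length)

# To launch, hand the recipe to NeMo-Run, e.g.
# `import nemo_run as run; run.run(recipe, executor=run.LocalExecutor())`
# (the executor choice here is illustrative, not prescribed by the PR).
```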

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you have read and followed the Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g., Numba, Pynini, Apex, etc.)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

vysarge and others added 3 commits October 25, 2024 15:43
@vysarge marked this pull request as ready for review on October 28, 2024, 19:15.
@cuichenx (Collaborator) left a comment:

Just a few small questions

Review thread on nemo/collections/llm/recipes/llama3_70b.py (outdated, resolved):
@@ -273,7 +285,82 @@ def finetune_recipe(
        recipe.optim.config.lr = 5e-6
    elif peft_scheme.lower() == 'lora':
        recipe.peft = run.Config(LoRA)
        recipe.peft.dim = 8
Collaborator:

have you verified that dim=8 would not compromise accuracy?

Collaborator Author (@vysarge):

Rank 8 follows the default from torchtune here; in tests with 8b and 70b loss looks to decrease smoothly across 50-100 train steps.
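If the rank-8 default is a concern for a particular task, the attribute shown in the diff above can be overridden on the returned recipe before launching. A minimal sketch: `finetune_recipe`, `peft_scheme`, and `recipe.peft.dim` come from the diff, while the value 16 is purely illustrative.

```python
from nemo.collections.llm.recipes import llama3_70b

recipe = llama3_70b.finetune_recipe(peft_scheme="lora")
recipe.peft.dim = 16  # raise the LoRA rank above the rank-8 default; 16 is illustrative
```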

Diff context (tail of the recipe docstring, then the line under discussion):
    Use this method with caution and only when you need maximum performance.
    It may not be suitable for all hardware configurations or use cases.
    """
    recipe.trainer.strategy.tensor_model_parallel_size = 1
Collaborator:

just checking, TP1 works for full finetuning as well?

Collaborator Author (@vysarge):

With BF16 grad it should. Let me double-check.

Collaborator Author (@vysarge):

Yep, both fit.
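The squashed commit list further down includes a "bf16 grad" change, which is what this answer relies on. The sketch below illustrates the kind of setting being discussed, assuming Megatron-Core's DistributedDataParallelConfig with its grad_reduce_in_fp32 field and a llama3_8b recipe module that mirrors the 70b file shown above; only tensor_model_parallel_size = 1 is taken verbatim from the diff, the rest is not the PR's actual code.

```python
# Hedged sketch: keep gradient buckets in BF16 so full finetuning fits at TP=1.
# The DDP field names follow Megatron-Core; the module name and the meaning of
# peft_scheme=None (full finetuning) are assumptions, not copied from this PR.
import nemo_run as run
from megatron.core.distributed import DistributedDataParallelConfig
from nemo.collections.llm.recipes import llama3_8b

recipe = llama3_8b.finetune_recipe(peft_scheme=None)
recipe.trainer.strategy.tensor_model_parallel_size = 1
recipe.trainer.strategy.ddp = run.Config(
    DistributedDataParallelConfig,
    grad_reduce_in_fp32=False,  # reduce gradients in BF16 instead of FP32
)
```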


# Sequence length settings in the model and dataset must agree
recipe.model.config.seq_length = seq_length
recipe.data.seq_length = seq_length
Collaborator:

Without packed sequence, we usually use seq_length=2048 (because most samples in SQuAD are well shorter than 2048 tokens).

Collaborator Author (@vysarge):

Just altered this to use 4K for packed and 2K for unpacked by default.
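A hedged reconstruction of the default just described, written as a hypothetical helper rather than the PR's actual code: the 4096/2048 values and the two assignments come from this thread and the diff above, while the packed_sequence flag name and the helper itself are assumptions.

```python
def apply_seq_length_default(recipe, seq_length=None, packed_sequence=False):
    """Hypothetical helper mirroring the behavior described above."""
    if seq_length is None:
        # 4K by default when sequences are packed, 2K otherwise (most SQuAD samples are short).
        seq_length = 4096 if packed_sequence else 2048
    # Sequence length settings in the model and dataset must agree.
    recipe.model.config.seq_length = seq_length
    recipe.data.seq_length = seq_length
    return recipe
```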

@vysarge requested a review from cuichenx on October 28, 2024, 22:18.
@cuichenx previously approved these changes on Oct 28, 2024.
@cuichenx (Collaborator) left a comment:

LGTM

@JimmyZhang12 merged commit ce9f1dd into NVIDIA:main on Oct 29, 2024.
152 of 154 checks passed
ko3n1g pushed a commit that referenced this pull request Oct 29, 2024
* llama3 finetuning perf recipes progress capture
* Small syntax fix
* syntax
* Apply isort and black reformatting
* Correct ddp setting
* Fix hasattr check
* bf16 grad
* Update configs for 8b + 70b
* Set wgrad_deferral_limit

---------

Signed-off-by: Valerie Sarge <[email protected]>
Signed-off-by: vysarge <[email protected]>
Co-authored-by: vysarge <[email protected]>
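One of the squashed commits above is "Set wgrad_deferral_limit". For context, here is a hedged sketch of what such a setting looks like, assuming the field lives on the model's TransformerConfig-derived config as in Megatron-Core; the values and the pairing with defer_embedding_wgrad_compute are illustrative, not the ones this PR sets.

```python
from nemo.collections.llm.recipes import llama3_70b

recipe = llama3_70b.finetune_recipe(peft_scheme="lora")
# Defer the embedding weight-gradient GEMM so it can overlap with communication;
# field names follow Megatron-Core's TransformerConfig, values are illustrative.
recipe.model.config.defer_embedding_wgrad_compute = True
recipe.model.config.wgrad_deferral_limit = 22
```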
ko3n1g added a commit that referenced this pull request Oct 29, 2024
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 5, 2024
lilyw97 pushed a commit to lilyw97/NeMo that referenced this pull request Nov 13, 2024
HuiyingLi pushed a commit to HuiyingLi/NeMo that referenced this pull request Nov 15, 2024