Contrastive Reranker/Reward model #9171

arendu · 2024-05-11T18:22:20Z

What does this PR do?

Enables training of a contrastive reranker model based on the GPT architecture and PEFT. The reranker is used to rerank the results of an embedding model.
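
The PR description does not spell out the loss, so the following is only a minimal sketch of one common contrastive formulation: a softmax cross-entropy over the scores a reranker assigns to one positive and several negative passages for the same query. The function names, the `temperature` parameter, and the convention that the positive sits at index 0 are assumptions for illustration, not the NeMo implementation.

```python
import math


def contrastive_rerank_loss(scores, positive_index=0, temperature=1.0):
    """Softmax cross-entropy over one query's candidate scores.

    `scores` holds one scalar per candidate passage; the candidate at
    `positive_index` is the relevant one (an assumed convention).
    """
    if not scores:
        raise ValueError("scores must contain at least one candidate")
    scaled = [s / temperature for s in scores]
    # Log-sum-exp with the max subtracted for numerical stability.
    top = max(scaled)
    log_z = top + math.log(sum(math.exp(s - top) for s in scaled))
    # Negative log-probability of the positive candidate.
    return log_z - scaled[positive_index]


def rerank(candidates, scores):
    """Return candidates ordered from highest to lowest score."""
    paired = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [candidate for candidate, _ in paired]
```

At inference time only the scoring and sorting step is needed: the embedding model retrieves candidates, the reranker scores each query-passage pair, and `rerank` reorders them.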

Collection: [NLP]

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

Make sure you have read and followed the Contributor guidelines.
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install (e.g., Numba, Pynini, Apex)?
- Reviewer: Does the PR have correct import guards for all optional libraries?
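
For reviewers unfamiliar with the term, an import guard can be sketched roughly as follows. The module name `some_optional_lib_that_may_be_missing`, its `fast_sum` function, and the helper below are hypothetical and stand in for whichever optional dependency a PR touches.

```python
# Hypothetical optional dependency; the module name is for illustration only.
try:
    import some_optional_lib_that_may_be_missing

    HAVE_OPTIONAL_LIB = True
except (ImportError, ModuleNotFoundError):
    HAVE_OPTIONAL_LIB = False


def fast_or_fallback_sum(values):
    """Use the optional library when it is installed, else plain Python."""
    if HAVE_OPTIONAL_LIB:
        # `fast_sum` is a made-up function on the hypothetical module.
        return some_optional_lib_that_may_be_missing.fast_sum(values)
    return sum(values)
```

The point of the guard is that importing the package never fails just because an optional component is absent; only the code path that needs it is affected.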

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

Related to # (issue)

Signed-off-by: arendu <[email protected]>

nemo/collections/nlp/models/information_retrieval/megatron_gpt_reranker_model.py

examples/nlp/information_retrieval/megatron_gpt_reranker_finetuning.py

examples/nlp/information_retrieval/megatron_gpt_reranker_generate.py

nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py

nemo/collections/nlp/modules/common/megatron/adapters/mcore_mixins.py

nemo/collections/nlp/modules/common/megatron/adapters/parallel_adapters.py

github-actions · 2024-06-16T01:51:38Z

This PR is stale because it has been open for 14 days with no activity. Remove the stale label, comment, or update the PR, or it will be closed in 7 days.

github-actions · 2024-06-24T01:49:28Z

This PR was closed because it has been inactive for 7 days since being marked as stale.

cuichenx

LGTM. It looks like pipeline parallelism will be verified and supported in a subsequent PR.

* wip contrastive reranker Signed-off-by: arendu <[email protected]>
* wip Signed-off-by: arendu <[email protected]>
* wip Signed-off-by: arendu <[email protected]>
* working reranker training and validation Signed-off-by: arendu <[email protected]>
* default peft for reranker Signed-off-by: arendu <[email protected]>
* validation time update Signed-off-by: arendu <[email protected]>
* reranker test Signed-off-by: arendu <[email protected]>
* reranker inference Signed-off-by: arendu <[email protected]>
* reranker inference Signed-off-by: arendu <[email protected]>
* Apply isort and black reformatting Signed-off-by: arendu <[email protected]>
* updates Signed-off-by: arendu <[email protected]>
* Apply isort and black reformatting Signed-off-by: arendu <[email protected]>
* updates Signed-off-by: arendu <[email protected]>
* Apply isort and black reformatting Signed-off-by: arendu <[email protected]>
* also can support rlhf style reward model loss Signed-off-by: arendu <[email protected]>
* Apply isort and black reformatting Signed-off-by: arendu <[email protected]>
* Apply isort and black reformatting Signed-off-by: arendu <[email protected]>
* typo in cicd Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: Adi Renduchintala <[email protected]>
Co-authored-by: arendu <[email protected]>
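
One of the commits above mentions support for an "rlhf style reward model loss". The commit message gives no formula, so the sketch below assumes the widely used pairwise form, `-log(sigmoid(r_chosen - r_rejected))`; the function name and argument names are illustrative and not taken from the PR.

```python
import math


def pairwise_reward_loss(chosen_score, rejected_score):
    """Pairwise loss -log(sigmoid(chosen - rejected)), computed stably.

    The loss is small when the chosen response outscores the rejected
    one and grows roughly linearly when the ordering is reversed.
    """
    margin = chosen_score - rejected_score
    # -log(sigmoid(m)) equals softplus(-m) = log(1 + exp(-m)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite to avoid overflow in exp(-margin).
    return -margin + math.log1p(math.exp(margin))
```

Unlike the contrastive form, which normalizes over a group of candidates, this loss compares exactly two scored items at a time.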

arendu added 5 commits May 8, 2024 23:47

wip contrastive reranker

ff88d62

Signed-off-by: arendu <[email protected]>

wip

6a2b005

Signed-off-by: arendu <[email protected]>

wip

f818c3c

Signed-off-by: arendu <[email protected]>

working reranker training and validation

52203b9

Signed-off-by: arendu <[email protected]>

default peft for reranker

4ac7642

Signed-off-by: arendu <[email protected]>

arendu requested review from cuichenx and gabrielspmoreira May 11, 2024 18:22

github-actions bot added the NLP label May 11, 2024

github-advanced-security bot found potential problems May 11, 2024

arendu added 2 commits May 11, 2024 18:37

validation time update

b67d0e7

Signed-off-by: arendu <[email protected]>

reranker test

7ecddf6

Signed-off-by: arendu <[email protected]>

github-actions bot added the CI label May 11, 2024

arendu added the Run CICD label May 11, 2024

arendu added 2 commits May 12, 2024 05:15

reranker inference

cebd9d8

Signed-off-by: arendu <[email protected]>

reranker inference

d1c44a8

Signed-off-by: arendu <[email protected]>

arendu added Run CICD and removed Run CICD labels May 12, 2024

Merge branch 'main' into adithyare/embedding_reranking

6b3a56f

arendu added Run CICD and removed Run CICD labels May 12, 2024

Apply isort and black reformatting

13a6f31

Signed-off-by: arendu <[email protected]>

cuichenx reviewed May 13, 2024

updates

05b4d17

Signed-off-by: arendu <[email protected]>

arendu requested a review from cuichenx May 13, 2024 20:57

Apply isort and black reformatting

2bf75c7

Signed-off-by: arendu <[email protected]>

arendu added Run CICD and removed Run CICD labels May 13, 2024

github-advanced-security bot found potential problems May 13, 2024

nemo/collections/nlp/modules/common/megatron/adapters/mcore_mixins.py (fixed)

arendu and others added 2 commits May 14, 2024 16:48

updates

4c81ec2

Signed-off-by: arendu <[email protected]>

Apply isort and black reformatting

af79ef8

Signed-off-by: arendu <[email protected]>

Apply isort and black reformatting

c1e43d8

Signed-off-by: arendu <[email protected]>

github-actions bot added the stale label Jun 16, 2024

github-actions bot closed this Jun 24, 2024

arendu reopened this Jul 9, 2024

arendu and others added 2 commits July 9, 2024 12:39

Merge branch 'main' into adithyare/embedding_reranking

badebe8

Signed-off-by: Adi Renduchintala <[email protected]>

Apply isort and black reformatting

173b93b

Signed-off-by: arendu <[email protected]>

arendu added Run CICD and removed Run CICD labels Jul 9, 2024

Merge branch 'main' into adithyare/embedding_reranking

97b8dfd

arendu added Run CICD and removed Run CICD labels Jul 9, 2024

arendu added 2 commits July 10, 2024 01:42

typo in cicd

db3934d

Signed-off-by: arendu <[email protected]>

Merge branch 'main' into adithyare/embedding_reranking

3df4c64

arendu added Run CICD and removed Run CICD stale labels Jul 10, 2024

arendu requested review from cuichenx and suiyoubi July 10, 2024 13:54

cuichenx approved these changes Jul 10, 2024

arendu merged commit 74e32c8 into main Jul 10, 2024
204 checks passed

arendu deleted the adithyare/embedding_reranking branch July 10, 2024 16:26
