forked from NVIDIA/NeMo
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Context Parallel SFT Support for dataset in THD format (NVIDIA#10688)
* Add context parallel support for packed dataset in THD format
* add changes with debug print
* remove debug print

Signed-off-by: Lifu Zhang <[email protected]>

* Apply isort and black reformatting

Signed-off-by: tomlifu <[email protected]>

* fix cu_seqlens and cu_seqlens_padded

Signed-off-by: Lifu Zhang <[email protected]>

* cu_seqlens and cu_seqlens_padded fix

Signed-off-by: Lifu Zhang <[email protected]>

* more fix on cu_seqlens and cu_seqlens_padded

Signed-off-by: Lifu Zhang <[email protected]>

* Apply isort and black reformatting

Signed-off-by: tomlifu <[email protected]>

* addressing Xiaowei's review

Signed-off-by: Lifu Zhang <[email protected]>

* addressing more review comments

Signed-off-by: Lifu Zhang <[email protected]>

* fix for the case where cp=1

Signed-off-by: Lifu Zhang <[email protected]>

* Apply isort and black reformatting

Signed-off-by: tomlifu <[email protected]>

* more fix to address Xiaowei's comment

Signed-off-by: Lifu Zhang <[email protected]>

* fix the loss_mask for THD formatted data

Signed-off-by: Lifu Zhang <[email protected]>

* Apply isort and black reformatting

Signed-off-by: tomlifu <[email protected]>

* fixed eos_idx[0][0] out of bounds issue

Signed-off-by: Lifu Zhang <[email protected]>

* fixed CP=1 case

Signed-off-by: Lifu Zhang <[email protected]>

* fixed thd_get_partitioned_indices assertion issue when pp=1

Signed-off-by: Lifu Zhang <[email protected]>

* Apply isort and black reformatting

Signed-off-by: tomlifu <[email protected]>

* fixed data packing issue

Signed-off-by: root <[email protected]>

* fixed an issue where cp>1 has different loss curves

Signed-off-by: Lifu Zhang <[email protected]>

* remove redundant check for cu_seqlens

Signed-off-by: Lifu Zhang <[email protected]>

* fixed NeMo CI failure issue due to old TE version in CI

Signed-off-by: Lifu Zhang <[email protected]>

* Apply isort and black reformatting

Signed-off-by: tomlifu <[email protected]>

---------

Signed-off-by: Lifu Zhang <[email protected]>
Signed-off-by: tomlifu <[email protected]>
Signed-off-by: root <[email protected]>
Signed-off-by: tomlifu <[email protected]>
Co-authored-by: tomlifu <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Xiaowei Ren <[email protected]>
1 parent a6b08a6, commit 48349e4
Showing 4 changed files with 150 additions and 28 deletions.