Changes to enable CUDA graph for LLM (#8955)
* Changes to enable CUDA graph for LLM (#8751)

* Use next instead of get_batch
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* CUDA graph changes
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Change to enable CG with weight caching
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Revert "Use next instead of get_batch"
  This reverts commit 0021bb4.
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Copy jbaczek/mcore_parallel_state_api_change branch leaving out changes to nemo/export/quantize/quantizer.py
  Signed-off-by: Jan Baczek <[email protected]>
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Revert "Copy jbaczek/mcore_parallel_state_api_change branch leaving out changes to nemo/export/quantize/quantizer.py"
  This reverts commit b4f736e.
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Remove skip_weight_update argument
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Bug fix + cleanup
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Cleanup
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Use new TE API for FP8 Param transpose
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Change config param cuda_graph to enable_cuda_graph
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Enable TE RNGStatesTracker through config
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Change te_rng_tracker to use_te_rng_tracker
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* FP8 weight transpose handled inside TE
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Cleanup
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Revert "Revert "Copy jbaczek/mcore_parallel_state_api_change branch leaving out changes to nemo/export/quantize/quantizer.py""
  This reverts commit e318624.
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Fix merge conflicts
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Fix merge conflicts
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

* Fix merge conflicts
  Signed-off-by: Vasudevan Rengasamy <[email protected]>

---------

Signed-off-by: Vasudevan Rengasamy <[email protected]>
Signed-off-by: Jan Baczek <[email protected]>
Co-authored-by: Jaemin Choi <[email protected]>
Co-authored-by: Jan Baczek <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Apply isort and black reformatting
  Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: Vasudevan Rengasamy <[email protected]>
Signed-off-by: Jan Baczek <[email protected]>
Signed-off-by: ericharper <[email protected]>
Co-authored-by: vasunvidia <[email protected]>
Co-authored-by: Jaemin Choi <[email protected]>
Co-authored-by: Jan Baczek <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: ericharper <[email protected]>
Signed-off-by: Jan Lasek <[email protected]>
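
The message records two config renames: cuda_graph became enable_cuda_graph, and te_rng_tracker became use_te_rng_tracker. A config written against an intermediate revision of this change might still carry the old keys. The following is a minimal sketch of remapping them, assuming the config is available as a plain Python dict. The helper name migrate_cuda_graph_keys and the other_option key are hypothetical and are not part of NeMo; where these flags actually live in the model config is not shown in this commit message.

```python
# Hypothetical helper, not part of NeMo. It maps the earlier key names
# mentioned in the commit message to the names the commit settles on.
RENAMED_KEYS = {
    "cuda_graph": "enable_cuda_graph",
    "te_rng_tracker": "use_te_rng_tracker",
}


def migrate_cuda_graph_keys(cfg: dict) -> dict:
    """Return a copy of cfg with old key names replaced by the new ones."""
    migrated = dict(cfg)  # shallow copy, the caller's dict is left untouched
    for old, new in RENAMED_KEYS.items():
        if old in migrated:
            value = migrated.pop(old)
            # If the new-style key is already set, it takes precedence.
            migrated.setdefault(new, value)
    return migrated


if __name__ == "__main__":
    # "other_option" is a placeholder for any unrelated setting.
    old_cfg = {"cuda_graph": True, "te_rng_tracker": True, "other_option": 1}
    print(migrate_cuda_graph_keys(old_cfg))
```

Letting an already present new-style key win is a design choice made for this sketch only; the commit message does not say how conflicting keys should be resolved.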
Showing 4 changed files with 75 additions and 25 deletions.