PTQ example for NeMo 2.0 (#10642)
* initial commit

Signed-off-by: Piotr Kaminski <[email protected]>

* create Quantizer for NeMo 2.0

Signed-off-by: Piotr Kaminski <[email protected]>

* refactor

Signed-off-by: Piotr Kaminski <[email protected]>

* Call quantize on an unwrapped mcore model

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* Add tests, adjust unwrapping

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* fix export

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: artbataev <[email protected]>

* Fix output_path argument for HF import

Signed-off-by: Piotr Kamiński <[email protected]>

* fix fabric ckpt loading

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* code review suggestions

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* remove unused import

Signed-off-by: Piotr Kaminski <[email protected]>

* use cnn dataset in github ci

Signed-off-by: Piotr Kaminski <[email protected]>

* applied code review

Signed-off-by: Piotr Kaminski <[email protected]>

* code review changes

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* simplify interface for data iterator

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* (partial) PP fix

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

---------

Signed-off-by: Piotr Kaminski <[email protected]>
Signed-off-by: Laplasjan107 <[email protected]>
Signed-off-by: Piotr Kamiński <[email protected]>
Signed-off-by: artbataev <[email protected]>
Co-authored-by: Piotr Kaminski <[email protected]>
Co-authored-by: Laplasjan107 <[email protected]>
Co-authored-by: artbataev <[email protected]>
4 people authored and yaoyu-33 committed Oct 25, 2024
1 parent 7a4b544 commit 55aa6f9
Showing 10 changed files with 586 additions and 3 deletions.
16 changes: 16 additions & 0 deletions .github/workflows/cicd-main.yml
@@ -4308,6 +4308,21 @@ jobs:
      SCRIPT: |
        bash tests/collections/llm/bitexact/mixtral/run.sh
  L2_NeMo_2_PTQ_Llama2_FP8:
    needs: [cicd-test-container-setup]
    uses: ./.github/workflows/_test_template.yml
    if: contains(fromJSON(needs.cicd-test-container-setup.outputs.test_to_run), 'L2_NeMo_2_PTQ_Llama2_FP8') || needs.cicd-test-container-setup.outputs.all == 'true'
    with:
      RUNNER: self-hosted-azure
      SCRIPT: |
        python tests/collections/llm/test_hf_import.py --hf_model /home/TestData/nlp/megatron_llama/llama-ci-hf --output_path /tmp/nemo2_ckpt
        python scripts/llm/ptq.py -nc /tmp/nemo2_ckpt -algo fp8 -out /tmp/nemo2_ptq_engine
      AFTER_SCRIPT: |
        rm -rf /tmp/nemo2_ckpt
        rm -rf /tmp/nemo2_ptq_engine
  Nemo_CICD_Test:
    needs:
      - pre-flight
@@ -4455,6 +4470,7 @@ jobs:
      - L2_Speech_Transcription_Canary_Transcribe_Audio_Dir
      - L2_Megatron_GPT_Reranker
      - L2_NeMo_2_NeMo_Mcore_Mixtral_bitexact
      - L2_NeMo_2_PTQ_Llama2_FP8
    if: always()
    runs-on: ubuntu-latest
    steps:
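
Note on the `if:` gate of the new job: it runs when the job name appears in the JSON list emitted by `cicd-test-container-setup`, or when that job's `all` output is the string `'true'`. A minimal Python sketch of the same decision, assuming the outputs arrive as plain strings; the function name `should_run` is hypothetical and not part of the workflow:

```python
import json


def should_run(job_name: str, test_to_run: str, run_all: str) -> bool:
    """Mirror `contains(fromJSON(test_to_run), job_name) || all == 'true'`.

    test_to_run: JSON-encoded list of job names (a string output).
    run_all:     the `all` output, compared against the literal 'true'.
    """
    selected = json.loads(test_to_run)
    # Lower-casing approximates GitHub's case-insensitive string matching;
    # treat this detail as an assumption of the sketch.
    wanted = job_name.lower()
    in_list = any(str(name).lower() == wanted for name in selected)
    return in_list or run_all.lower() == "true"


if __name__ == "__main__":
    print(should_run("L2_NeMo_2_PTQ_Llama2_FP8", '["L2_NeMo_2_PTQ_Llama2_FP8"]', "false"))
```

The job is also appended to the `needs` list of `Nemo_CICD_Test` in the second hunk, so the aggregate job waits on it.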
3 changes: 3 additions & 0 deletions nemo/collections/llm/__init__.py
@@ -73,6 +73,7 @@
    MistralConfig7B,
    MistralModel,
    MistralNeMoConfig12B,
    MixtralConfig,
    MixtralConfig8x3B,
    MixtralConfig8x7B,
    MixtralConfig8x22B,
@@ -104,6 +105,7 @@
    gpt_data_step,
    gpt_forward_step,
)
from nemo.collections.llm.quantization import Quantizer, get_calib_data_iter
from nemo.collections.llm.t5.model import T5Config, T5Model, t5_data_step, t5_forward_step

__all__ = [
@@ -120,6 +122,7 @@
    "MistralConfig7B",
    "MistralNeMoConfig12B",
    "MistralModel",
    "MixtralConfig",
    "MixtralConfig8x3B",
    "MixtralConfig8x7B",
    "MixtralConfig8x22B",
2 changes: 2 additions & 0 deletions nemo/collections/llm/gpt/model/__init__.py
@@ -56,6 +56,7 @@
)
from nemo.collections.llm.gpt.model.mistral import MistralConfig7B, MistralModel, MistralNeMoConfig12B
from nemo.collections.llm.gpt.model.mixtral import (
    MixtralConfig,
    MixtralConfig8x3B,
    MixtralConfig8x7B,
    MixtralConfig8x22B,
@@ -105,6 +106,7 @@
    "MixtralConfig8x3B",
    "MixtralConfig8x7B",
    "MixtralConfig8x22B",
    "MixtralConfig",
    "MixtralModel",
    "Starcoder2Config",
    "Starcoder2Model",
25 changes: 25 additions & 0 deletions nemo/collections/llm/quantization/__init__.py
@@ -0,0 +1,25 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from .quantizer import ExportConfig, QuantizationConfig, Quantizer, create_data_iterator_getter, get_calib_data_iter
from .utils import load_with_modelopt_layer_spec

__all__ = [
    "Quantizer",
    "QuantizationConfig",
    "ExportConfig",
    "get_calib_data_iter",
    "load_with_modelopt_layer_spec",
    "create_data_iterator_getter",
]
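
For local reproduction, the CI job in this commit chains two commands: an HF-to-NeMo-2.0 import, then `scripts/llm/ptq.py` with `-algo fp8`. A rough sketch that assembles the same argument lists, using only the flags visible in the workflow hunk; the helper name `build_ptq_commands` and its parameters are hypothetical, and the paths are whatever the caller supplies:

```python
import shlex
from typing import List


def build_ptq_commands(hf_model: str, nemo_ckpt: str, engine_dir: str, algo: str = "fp8") -> List[List[str]]:
    """Return the two argv lists used by the L2_NeMo_2_PTQ_Llama2_FP8 job."""
    # Step 1: convert the Hugging Face checkpoint to a NeMo 2.0 checkpoint.
    import_cmd = [
        "python", "tests/collections/llm/test_hf_import.py",
        "--hf_model", hf_model,
        "--output_path", nemo_ckpt,
    ]
    # Step 2: run post-training quantization on that checkpoint.
    ptq_cmd = [
        "python", "scripts/llm/ptq.py",
        "-nc", nemo_ckpt,
        "-algo", algo,
        "-out", engine_dir,
    ]
    return [import_cmd, ptq_cmd]


if __name__ == "__main__":
    commands = build_ptq_commands(
        "/home/TestData/nlp/megatron_llama/llama-ci-hf",
        "/tmp/nemo2_ckpt",
        "/tmp/nemo2_ptq_engine",
    )
    for cmd in commands:
        # Print rather than execute; pass each list to subprocess.run to run it.
        print(shlex.join(cmd))
```

The sketch only prints the commands. The job's `AFTER_SCRIPT` removes both output directories, so a local run would need the same cleanup.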