Release 2.0.0rc1 #9786

ko3n1g · 2024-07-18T09:23:11Z

🚀 PR to release NeMo 2.0.0rc1.

📝 Please remember the following to-dos before merging:

Fill in the Highlights comment
Review the Detailed Changelogs comment

🚨 Please do not delete the headings of the task commits. The post-merge automation requires them.

🙏 Please merge this PR only if the CI workflow has completed successfully.

* Nemotron ONNX export fixed Signed-off-by: Boris Fomitchev <[email protected]> * Cleanup Signed-off-by: Boris Fomitchev <[email protected]> * Addressing code review comments Signed-off-by: Boris Fomitchev <[email protected]> --------- Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]>

* Docker cleanup

Signed-off-by: huvunvidia <[email protected]>

* add slurm files to .gitignore * add differentiable decode to SDXL VAE * Optionally return predicted noise during the single step sampling process * also change `get_gamma` as a new function to use inside other functions which may interact with sampling (e.g. draft+) * debugging sdunet converter script * Added SD/SDXL conversion script from HF to NeMo * added 'from_nemo' config for VAE * tmp commit, please make changes (oci is super slow, cannot even run vim) * new inference yaml works * add logging to autoencoder * !(dont squash) Added enabling support for LinearWrapper for SDLoRA * added samples_per_batch and fsdp arguments to SDXL inference * added extra optionally wrapper to FSDP * remove unncessary comments * remove unnecessary comments * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Rohit Jena <[email protected]> Co-authored-by: Yu Yao <[email protected]> Co-authored-by: yaoyu-33 <[email protected]>

* add NemoQueryLLMPyTorch class for triton query of in-framework models * nemo_export.py changes to better support in-framework models * separate out in-framework version of triton deploy script * add generate() function to MegatronLLMDeployable to allow for direct use in export tests * use NemoQueryLLMPyTorch in deploy tests * add warning message for when MegatronLLMDeployable overrides transformer_engine * remove enable_streaming argument from deploy_inframework_triton.py since MegatronLLMDeployable does not support streaming add query_inframework.py since original query.py does not work with in-framework deployments * Apply isort and black reformatting Signed-off-by: jukim-nv <[email protected]> * skip trtllm support check if in_framework testing * remove unused imports * run_existing_checkpoints was passing wrong prompts argument for in-framework mode * fix unused import in query_inframework.py --------- Signed-off-by: jukim-nv <[email protected]> Co-authored-by: jukim-nv <[email protected]> Co-authored-by: Onur Yilmaz <[email protected]>

* Use FP8 in GPT TP2 test Signed-off-by: Jan Baczek <[email protected]> * Add hydra options to use TE, TP overlap and FP8 Signed-off-by: Jan Baczek <[email protected]> * Override presence checks in hydra Signed-off-by: Jan Baczek <[email protected]> * WIP: Add debug code Signed-off-by: Jan Baczek <[email protected]> * Apply isort and black reformatting Signed-off-by: jbaczek <[email protected]> * Add more debug code Signed-off-by: Jan Baczek <[email protected]> * Apply isort and black reformatting Signed-off-by: jbaczek <[email protected]> * Add more debug code Signed-off-by: Jan Baczek <[email protected]> * Apply isort and black reformatting Signed-off-by: jbaczek <[email protected]> * Remove debug code and change underlying transformer layer to TE Signed-off-by: Jan Baczek <[email protected]> * Override hydra error Signed-off-by: Jan Baczek <[email protected]> * Remove tp overlap from the test Signed-off-by: Jan Baczek <[email protected]> * Change runner for fp8 tests Signed-off-by: Jan Baczek <[email protected]> * fix Signed-off-by: Jan Baczek <[email protected]> * Add tp overlap test Signed-off-by: Jan Baczek <[email protected]> * Remove TP overlap from tests. It is unsupported in docker environment Signed-off-by: Jan Baczek <[email protected]> * Adjust GPT PP2 test to use FP8. Change optimizer in TP2 test Signed-off-by: Jan Baczek <[email protected]> * Remove env overrides form GPT PP2 test Signed-off-by: Jan Baczek <[email protected]> --------- Signed-off-by: Jan Baczek <[email protected]> Signed-off-by: jbaczek <[email protected]> Co-authored-by: jbaczek <[email protected]> Co-authored-by: Pablo Garay <[email protected]>

…variety of tensors (#9641) * enables default data step in megatron parallel to operate on a wider variety of tensors coming out of the dataloader * handles the case where a batch is empty * Apply isort and black reformatting Signed-off-by: jomitchellnv <[email protected]> * Allows the default data step to operate on more types than just dictionaries Signed-off-by: Jonathan Mitchell <[email protected]> --------- Signed-off-by: jomitchellnv <[email protected]> Signed-off-by: Jonathan Mitchell <[email protected]> Co-authored-by: jomitchellnv <[email protected]> Co-authored-by: Marc Romeyn <[email protected]>

…a wider …" (#9666)

* wip contrastive reranker Signed-off-by: arendu <[email protected]> * wip Signed-off-by: arendu <[email protected]> * wip Signed-off-by: arendu <[email protected]> * working reranker training and validation Signed-off-by: arendu <[email protected]> * default peft for reranker Signed-off-by: arendu <[email protected]> * validation time update Signed-off-by: arendu <[email protected]> * reranker test Signed-off-by: arendu <[email protected]> * reranker inference Signed-off-by: arendu <[email protected]> * reranker inference Signed-off-by: arendu <[email protected]> * Apply isort and black reformatting Signed-off-by: arendu <[email protected]> * updates Signed-off-by: arendu <[email protected]> * Apply isort and black reformatting Signed-off-by: arendu <[email protected]> * updates Signed-off-by: arendu <[email protected]> * Apply isort and black reformatting Signed-off-by: arendu <[email protected]> * also can support rlhf style reward model loss Signed-off-by: arendu <[email protected]> * Apply isort and black reformatting Signed-off-by: arendu <[email protected]> * Apply isort and black reformatting Signed-off-by: arendu <[email protected]> * typo in cicd Signed-off-by: arendu <[email protected]> --------- Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Signed-off-by: Adi Renduchintala <[email protected]> Co-authored-by: arendu <[email protected]>

* unpin transformers Signed-off-by: dimapihtar <[email protected]> * guard deprecated imports Signed-off-by: dimapihtar <[email protected]> * Apply isort and black reformatting Signed-off-by: dimapihtar <[email protected]> * fix import guards Signed-off-by: dimapihtar <[email protected]> * fix import guards Signed-off-by: dimapihtar <[email protected]> * Apply isort and black reformatting Signed-off-by: dimapihtar <[email protected]> * try fixing Signed-off-by: Chen Cui <[email protected]> * disable HF tests Signed-off-by: Dmytro Pykhtar <[email protected]> * try fixing Signed-off-by: Chen Cui <[email protected]> * hard code model lists Signed-off-by: Chen Cui <[email protected]> * Apply isort and black reformatting Signed-off-by: cuichenx <[email protected]> * hard code model lists Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: dimapihtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: cuichenx <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: cuichenx <[email protected]>

* Added CPU offloading docs Signed-off-by: Selvaraj Anandaraj <[email protected]> * Tech writer review Signed-off-by: Selvaraj Anandaraj <[email protected]> --------- Signed-off-by: Selvaraj Anandaraj <[email protected]> Co-authored-by: Selvaraj Anandaraj <[email protected]> Co-authored-by: Yu Yao <[email protected]>

* Update llama-3 PEFT notebook to download model from NGC Signed-off-by: Shashank Verma <[email protected]> * Fix broken link in llama-3 PEFT tutorial README Signed-off-by: Shashank Verma <[email protected]> * Fix broken code block in llama 3 PEFT tutorial README Signed-off-by: Shashank Verma <[email protected]> * Copy-edits to Llama-3 8B PEFT tutorial README Signed-off-by: Shashank Verma <[email protected]> * Fix broken link Signed-off-by: Shashank Verma <[email protected]> * Minor formatting fixes Signed-off-by: Shashank Verma <[email protected]> --------- Signed-off-by: Shashank Verma <[email protected]>

Signed-off-by: ashors1 <[email protected]> Co-authored-by: Anna Shors <[email protected]> Co-authored-by: Marc Romeyn <[email protected]> Co-authored-by: ashors1 <[email protected]>

* add lita Signed-off-by: Slyne Deng <[email protected]> * Apply isort and black reformatting Signed-off-by: Slyne <[email protected]> * add part of the tutorial and fix format Signed-off-by: slyne deng <[email protected]> * add tutorial Signed-off-by: slyne deng <[email protected]> * fix Tutorial ckpt conversion Signed-off-by: slyne deng <[email protected]> * Apply isort and black reformatting Signed-off-by: Slyne <[email protected]> * update cicd Signed-off-by: Slyne Deng <[email protected]> * add to CIICD test Signed-off-by: Slyne Deng <[email protected]> * changes based on review comments Signed-off-by: Slyne Deng <[email protected]> * fix bot warning Signed-off-by: Slyne Deng <[email protected]> * update cicd main Signed-off-by: Slyne Deng <[email protected]> * fix cicd ckpt conversion Signed-off-by: Slyne Deng <[email protected]> --------- Signed-off-by: Slyne Deng <[email protected]> Signed-off-by: Slyne <[email protected]> Signed-off-by: slyne deng <[email protected]> Co-authored-by: Slyne Deng <[email protected]> Co-authored-by: Slyne <[email protected]> Co-authored-by: Yu Yao <[email protected]>

* Parametrize FPS group * Apply isort and black reformatting * Change default to False * Add logic to new ckptIO * Turn on parallel save by default --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: mikolajblaz <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]>

* huvu/mcore_t5 first commit from local * removing DEBUGGING prints * cleaning megatron_lm_encoder_decoder_model.py code * cleaning code * adding Github action test * only run mcore T5 test * only run mcore T5 test * only run mcore T5 test * only run mcore T5 test * reset .github/workflows/cicd-main.yml * reset .github/workflows/cicd-main.yml * adding condition self.mcore_t5 when running self.build_transformer_config() * refractor megatron_lm_encoder_decoder_model.py to not use self.model * only run T5-related tests * remove all self.model * reset cicd file * reset cicd file * updating codes remove duplicate if/else; adding mcore/transformer_engine to config file * adjust +model.mcore_t5=True * fix training for non-mcore, bf16, O2 * reset cicd-main.yml --------- Co-authored-by: Huy Vu2 <[email protected]>

Signed-off-by: Oliver Koenig <[email protected]>

Signed-off-by: Pablo Garay <[email protected]>

* adding mamba support * fix import mixins * rm convert jamba * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * more cleanups * use GPT text gen * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * fixing gbs in TP convetor * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * add reqs * add tutorial * minor fix to tutorial * moving finetuning files Signed-off-by: arendu <[email protected]> * moving finetuning files Signed-off-by: arendu <[email protected]> * address comments * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * address comments * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * address comments * add mamba dependancies * add mcore tag * modify dockerfile ci * modify dockerfile ci * fix TP>1 to TP1 * add inference, update based on latest mcore commits * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * minor fix * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * minor fix * Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]> * bug fix, tutorial update --------- Signed-off-by: JRD971000 <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: Ali Taghibakhshi <[email protected]> Co-authored-by: JRD971000 <[email protected]> Co-authored-by: arendu <[email protected]>

Signed-off-by: Ryan <[email protected]>

* commit to eval/sft/peft * update MCORE_COMMIT * address Chen's comments, updating retro unit test * Apply isort and black reformatting Signed-off-by: huvunvidia <[email protected]> --------- Signed-off-by: huvunvidia <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: huvunvidia <[email protected]>

* Allow non-strict load * Point to non-strict load MCore branch * Avoid module level StrictHandling * Use MCore fork * Update to MCore fix * Restore backward compatibility * Update flag defaults * Update MCore tag * Update PyT Dist interface * Update to latest core_r0.8.0 --------- Signed-off-by: Mikołaj Błaż <[email protected]> Co-authored-by: mikolajblaz <[email protected]>

Signed-off-by: Oliver Koenig <[email protected]>

* fix legacy ds padding bug Signed-off-by: dimapihtar <[email protected]> * Apply isort and black reformatting Signed-off-by: dimapihtar <[email protected]> * avoid code repetition Signed-off-by: dimapihtar <[email protected]> * fix typo Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: dimapihtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: dimapihtar <[email protected]>

…variety of tensors - second try (#9671) * enables default data step in megatron parallel to operate on a wider variety of tensors coming out of the dataloader Signed-off-by: Jonathan Mitchell <[email protected]> * handles the case where a batch is empty Signed-off-by: Jonathan Mitchell <[email protected]> * Apply isort and black reformatting Signed-off-by: jomitchellnv <[email protected]> Signed-off-by: Jonathan Mitchell <[email protected]> * Allows the default data step to operate on more types than just dictionaries Signed-off-by: Jonathan Mitchell <[email protected]> * Apply isort and black reformatting Signed-off-by: jomitchellnv <[email protected]> --------- Signed-off-by: Jonathan Mitchell <[email protected]> Signed-off-by: jomitchellnv <[email protected]> Co-authored-by: jomitchellnv <[email protected]> Co-authored-by: John St. John <[email protected]>
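For readers unfamiliar with the change described above, the idea of a data step that accepts more than dictionaries and tolerates empty batches can be sketched roughly as follows. This is a minimal illustration and not the NeMo implementation: the function name `map_batch` and its signature are hypothetical, and the leaf transform `fn` stands in for whatever the real data step does to each tensor (for example, moving it to a device).

```python
from typing import Any, Callable


def map_batch(batch: Any, fn: Callable[[Any], Any]) -> Any:
    """Apply fn to every leaf of a nested batch (hypothetical helper).

    Handles dicts, lists, tuples, and namedtuples, and passes
    None or empty containers through unchanged.
    """
    if batch is None:
        # Nothing to process, e.g. the dataloader yielded no data.
        return None
    if isinstance(batch, dict):
        return {key: map_batch(value, fn) for key, value in batch.items()}
    if isinstance(batch, tuple) and hasattr(batch, "_fields"):
        # namedtuple: rebuild using positional fields.
        return type(batch)(*(map_batch(value, fn) for value in batch))
    if isinstance(batch, (list, tuple)):
        # Empty lists and tuples fall through here and stay empty.
        return type(batch)(map_batch(value, fn) for value in batch)
    # Anything else is treated as a leaf.
    return fn(batch)


# Usage with plain numbers standing in for tensors.
doubled = map_batch({"tokens": [1, 2], "labels": (3,)}, lambda x: x * 2)
print(doubled)
```

In a real training loop the leaf function would typically check that the leaf is a tensor before transforming it; that check is omitted here to keep the sketch dependency-free.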

* Fix when optimizers are setup for PEFT * Apply isort and black reformatting * Init DDP inside PEFT * Apply isort and black reformatting * Some fixes, loss seems to become nan with peft for some reason * Apply isort and black reformatting * Loss goes down on fp32 * Apply isort and black reformatting * Simplifying FNMixin * Apply isort and black reformatting * Fix bug with new checkpoint-io * Apply isort and black reformatting * Fix failing test: test_peft_on_train_epoch_start_with_adapter * Apply isort and black reformatting --------- Signed-off-by: marcromeyn <[email protected]> Signed-off-by: ashors1 <[email protected]> Co-authored-by: Marc Romeyn <[email protected]> Co-authored-by: marcromeyn <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: ashors1 <[email protected]>

* refactor: README * refactor: Use new README in `setup.py` Signed-off-by: Oliver Koenig <[email protected]>

* Remove mask if use fusion mask Signed-off-by: Cheng-Ping Hsieh <[email protected]> * Apply isort and black reformatting Signed-off-by: hsiehjackson <[email protected]> --------- Signed-off-by: Cheng-Ping Hsieh <[email protected]> Signed-off-by: hsiehjackson <[email protected]> Co-authored-by: hsiehjackson <[email protected]>

* nemo ux mixtral 8x22b config Signed-off-by: Alexandros Koumparoulis <[email protected]> * add mixtral 8x22b recipe Signed-off-by: Alexandros Koumparoulis <[email protected]> * add note Signed-off-by: Alexandros Koumparoulis <[email protected]> * fix type hint Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * fix type hint Signed-off-by: Alexandros Koumparoulis <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>

) * Fix logging of consumed samples in MegatronDataSampler Signed-off-by: Hemil Desai <[email protected]> * Apply isort and black reformatting Signed-off-by: hemildesai <[email protected]> * Remove unused import Signed-off-by: Hemil Desai <[email protected]> --------- Signed-off-by: Hemil Desai <[email protected]> Signed-off-by: hemildesai <[email protected]> Co-authored-by: hemildesai <[email protected]>

* update default PTL logging directories * fix logger versions * fix failed test * add better documentation for 'update_logger_directory' --------- Signed-off-by: ashors1 <[email protected]> Co-authored-by: Anna Shors <[email protected]>

* wrap task config save in a try/except * move fiddle import --------- Signed-off-by: ashors1 <[email protected]> Co-authored-by: Anna Shors <[email protected]>

* Use trtllm-build command directly for quantized checkpoints and remove dependency on modelopt for this Signed-off-by: Jan Lasek <[email protected]> * Fix error messages Signed-off-by: Jan Lasek <[email protected]> * Move setting max_seq_len level up Signed-off-by: Jan Lasek <[email protected]> * Apply isort and black reformatting Signed-off-by: janekl <[email protected]> * Bump ModelOpt version Signed-off-by: Jan Lasek <[email protected]> --------- Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: janekl <[email protected]> Co-authored-by: janekl <[email protected]>

* Fix transcription move_to_device Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * Fix Canary's transcribe after introducing dataclass for mini-batch representation Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]>

…e // head… (#9994) * Use kv_channels to enable cases where head_dim != hidden_size // head_num Signed-off-by: Alexandros Koumparoulis <[email protected]> * Add head_dim to exporter Signed-off-by: Alexandros Koumparoulis <[email protected]> * Drop default values for kv_channels Signed-off-by: Alexandros Koumparoulis <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]>
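The `kv_channels` change above concerns models whose per-head dimension is not simply the hidden size divided by the number of heads. A minimal sketch of that resolution logic is shown below. It assumes the common convention that an explicit `kv_channels` value overrides the derived one; the helper names `resolve_head_dim` and `query_projection_width` are hypothetical and are not taken from NeMo or Megatron Core.

```python
from typing import Optional


def resolve_head_dim(
    hidden_size: int,
    num_attention_heads: int,
    kv_channels: Optional[int] = None,
) -> int:
    """Return the per-head dimension (hypothetical helper).

    An explicit kv_channels wins; otherwise fall back to
    hidden_size // num_attention_heads.
    """
    if kv_channels is not None:
        return kv_channels
    if hidden_size % num_attention_heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return hidden_size // num_attention_heads


def query_projection_width(
    hidden_size: int,
    num_attention_heads: int,
    kv_channels: Optional[int] = None,
) -> int:
    """Width of the query projection output: heads * head_dim."""
    return num_attention_heads * resolve_head_dim(
        hidden_size, num_attention_heads, kv_channels
    )


# Illustrative numbers only: with hidden_size=3072 and 16 heads the derived
# head_dim would be 192, but an explicit kv_channels=256 gives a query
# projection that is wider than the hidden size.
print(resolve_head_dim(3072, 16))
print(resolve_head_dim(3072, 16, kv_channels=256))
print(query_projection_width(3072, 16, kv_channels=256))
```

Code that hard-codes `hidden_size // num_heads` (for example, when sizing LoRA adapters or exporting weights) would compute the wrong shapes for such models, which is presumably why the exporter also needs the head dimension.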

* [lhotse] Support for NeMo tarred manifests with offset field Signed-off-by: Piotr Żelasko <[email protected]> * typo fix Signed-off-by: Piotr Żelasko <[email protected]> * fix basename Signed-off-by: Piotr Żelasko <[email protected]> * relieve heavy CPU memory usage for super-long tarred recordings Signed-off-by: Piotr Żelasko <[email protected]> * Tests and fixes Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]>

Signed-off-by: Alexandros Koumparoulis <[email protected]>

…#9929) Signed-off-by: paul-gibbons <[email protected]> Signed-off-by: Yu Yao <[email protected]> Co-authored-by: Paul Gibbons <[email protected]> Co-authored-by: Yu Yao <[email protected]>

Signed-off-by: Sangkug Lym <[email protected]>

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* Make MegatronStrategy.parallelism return ParallelismConfig Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> * Make ParallelismConfig a dataclass Signed-off-by: Alexandros Koumparoulis <[email protected]> * Add note on import cycle Signed-off-by: Alexandros Koumparoulis <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>

Signed-off-by: ashors1 <[email protected]>

* update structure Signed-off-by: yaoyu-33 <[email protected]> * update structure Signed-off-by: yaoyu-33 <[email protected]> * add image Signed-off-by: yaoyu-33 <[email protected]> * address comments Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]>

ko3n1g · 2024-08-07T23:12:09Z

Highlights

Training

Features and Model architectures

PEFT: QLoRA support, LoRA/QLoRA for Mixture-of-Experts (MoE) dense layer
State Space Models & Hybrid Architecture support (Mamba2 and NV-Mamba2-hybrid)
Support Nemotron, Minitron, Gemma2, Qwen, RAG

Multimodal

NeVA: Add SOTA LLM backbone support (Mixtral/LLaMA3) and suite of model parallelism support (PP/EP)
Support Language Instructed Temporal-Localization Assistant (LITA) on top of video NeVA

ASR

SpeechLM and SALM

Adapters for Canary Customization

PyTorch allocator in PyTorch 2.2 improves training speed by up to 30% for all ASR models

CUDA Graphs for Transducer Inference

Replaced webdataset with Lhotse, giving up to 2x speedup

Transcription Improvements - Speedup and QoL Changes

ASR Prompt Formatter for multimodal Canary

* Fix torch version for tts asr import check test Signed-off-by: Dong Hyuk Chang <[email protected]> * Ignore torch requirement Signed-off-by: Dong Hyuk Chang <[email protected]> * Update base image used for import check Signed-off-by: Dong Hyuk Chang <[email protected]> --------- Signed-off-by: Dong Hyuk Chang <[email protected]> Co-authored-by: Dong Hyuk Chang <[email protected]>

* rm torch version check Signed-off-by: Farhad Ramezanghorbani <[email protected]> * bump min torch version Signed-off-by: Farhad Ramezanghorbani <[email protected]> * rm version Signed-off-by: Farhad Ramezanghorbani <[email protected]> --------- Signed-off-by: Farhad Ramezanghorbani <[email protected]> Co-authored-by: Marc Romeyn <[email protected]>

* Moe doc fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * JG fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]>

* comment docs Signed-off-by: eharper <[email protected]> * fix link Signed-off-by: eharper <[email protected]> * comment Signed-off-by: eharper <[email protected]> * fix noindex syntax Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]>

* ci: Token permission to cancel Workflow run Signed-off-by: Oliver Koenig <[email protected]> * ci: Use template Signed-off-by: Oliver Koenig <[email protected]> * ci: Combine cleanup and main job Signed-off-by: Oliver Koenig <[email protected]> --------- Signed-off-by: Oliver Koenig <[email protected]>

Signed-off-by: Oliver Koenig <[email protected]>

borisfom and others added 30 commits July 8, 2024 14:50

support lora when kv_channel != hidden_size / num_heads (#9636)

62459cc

[Nemo CICD] Docker temp files auto-cleanup (#9642)

55ee9f4

* Docker cleanup

Update Dockerfile.ci (#9651)

b97da9c

Signed-off-by: huvunvidia <[email protected]>

Revert "enables default data step in megatron parallel to operate on …

355d3c5

…a wider …" (#9666)

fix pipeline parallel dtype bug (#9637) (#9661)

4e5174b

Signed-off-by: ashors1 <[email protected]> Co-authored-by: Anna Shors <[email protected]> Co-authored-by: Marc Romeyn <[email protected]> Co-authored-by: ashors1 <[email protected]>

chore: Version bump NeMo (#9631)

3cf5a1d

Signed-off-by: Oliver Koenig <[email protected]>

add a bit more for timeout (#9702)

693c55f

Signed-off-by: Pablo Garay <[email protected]>

NeMo performance feature documentation (#9482)

6f91dcc

[TTS] Add fullband mel codec checkpoints (#9704)

472ff9f

Signed-off-by: Ryan <[email protected]>

refactor: Uniform BRANCH for notebooks (#9710)

599b60f

Signed-off-by: Oliver Koenig <[email protected]>

refactor: README (#9712)

02ff85b

* refactor: README * refactor: Use new README in `setup.py` Signed-off-by: Oliver Koenig <[email protected]>

akoumpa and others added 4 commits August 6, 2024 11:52

[NeMo-UX] Wrap task config save in a try/except (#9956) (#9984)

7b7d02f

* wrap task config save in a try/except * move fiddle import --------- Signed-off-by: ashors1 <[email protected]> Co-authored-by: Anna Shors <[email protected]>

pablo-garay temporarily deployed to main August 7, 2024 00:19 — with GitHub Actions Inactive

janekl and others added 11 commits August 7, 2024 11:43

remove assertation for models with unknown chat template (#10042)

695fadc

Signed-off-by: Alexandros Koumparoulis <[email protected]>

add mixtral neva tutorial + update tutorials + update configs (#9926) (…

7cae5c4

…#9929) Signed-off-by: paul-gibbons <[email protected]> Signed-off-by: Yu Yao <[email protected]> Co-authored-by: Paul Gibbons <[email protected]> Co-authored-by: Yu Yao <[email protected]>

add the mcore interface for optim arg; average_in_collective (#10010)

6cf59fa

Signed-off-by: Sangkug Lym <[email protected]>

mixtral recipe (#9975)

633c373

Signed-off-by: Alexandros Koumparoulis <[email protected]>

log learning rate before optimizer step (#10063)

4ee9148

Signed-off-by: ashors1 <[email protected]>

pablo-garay temporarily deployed to main August 8, 2024 00:19 — with GitHub Actions Inactive

pablo-garay temporarily deployed to main August 9, 2024 00:19 — with GitHub Actions Inactive

farhadrgh and others added 2 commits August 9, 2024 11:44

Moe doc fixes (#10077)

86715c1

* Moe doc fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * JG fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]>

pablo-garay temporarily deployed to main August 10, 2024 00:19 — with GitHub Actions Inactive

pablo-garay temporarily deployed to main August 11, 2024 00:21 — with GitHub Actions Inactive

pablo-garay temporarily deployed to main August 12, 2024 00:20 — with GitHub Actions Inactive

ericharper and others added 3 commits August 12, 2024 00:28

ci: Proper cleanup (#10114)

d6cfdc0

Signed-off-by: Oliver Koenig <[email protected]>

pablo-garay temporarily deployed to main August 13, 2024 00:20 — with GitHub Actions Inactive

ko3n1g closed this Aug 13, 2024
