NVIDIA Neural Modules 1.21.0
Highlights
Models
NeMo ASR
- Multi-lookahead cache-aware streaming (see the usage sketch below)
- Speech enhancement tutorial #6492
- Online code switching dataset #6579
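A minimal usage sketch for the multi-lookahead streaming highlight above. The checkpoint name follows NGC naming conventions and the lookahead-selection helper is assumed from the cache-aware streaming tutorial; verify both against your installed NeMo version.

```python
import nemo.collections.asr as nemo_asr

# Multi-lookahead cache-aware FastConformer (checkpoint name assumed from NGC conventions).
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_fastconformer_hybrid_large_streaming_multi"
)
# Pick one of the lookahead sizes the model was trained with (call assumed from the
# cache-aware streaming tutorial; [left, right] attention context in encoder frames).
asr_model.encoder.set_default_att_context_size([70, 13])
print(asr_model.transcribe(paths2audio_files=["sample.wav"])[0])
```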
NeMo TTS
- AudioCodec: Training recipe for EnCodec #6852
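The EnCodec recipe (#6852) trains an AudioCodecModel; a rough round-trip sketch, assuming `encode`/`decode` helpers on the model and an illustrative checkpoint path:

```python
import torch
from nemo.collections.tts.models import AudioCodecModel

# Restore a trained codec (path illustrative); encode/decode helper names are
# assumed from the AudioCodecModel API in this release.
codec = AudioCodecModel.restore_from("audio_codec.nemo").eval()

audio = torch.randn(1, 16000)               # 1 s of 16 kHz audio, batch of 1
audio_len = torch.tensor([audio.shape[1]])
with torch.no_grad():
    tokens, tokens_len = codec.encode(audio=audio, audio_len=audio_len)
    reconstructed, _ = codec.decode(tokens=tokens, tokens_len=tokens_len)
```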
NeMo Framework
NeMo Core
- Update to PTL 2.0 #6433
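Most PTL 2.0 changes in this release reduce to the same Trainer argument migration (`gpus` -> `devices`, `strategy`/`accelerator` -> "auto") that recurs throughout the changelogs below; a minimal sketch:

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    devices=1,           # replaces the removed `gpus` argument
    accelerator="auto",  # replaces explicit "dp"/None values
    strategy="auto",     # replaces `strategy=None`
    max_epochs=1,
)
```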
NeMo Tools
- Forced aligner tutorial #7210
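The NeMo Forced Aligner is driven by a script with Hydra-style overrides; a hedged invocation sketch, with argument names taken from the NFA README and illustrative paths:

```python
import subprocess

# Run NFA on a manifest; override names follow the NFA README, paths are illustrative.
subprocess.run(
    [
        "python", "tools/nemo_forced_aligner/align.py",
        "pretrained_name=stt_en_fastconformer_hybrid_large_pc",
        "manifest_filepath=data/manifest.json",
        "output_dir=nfa_output",
    ],
    check=True,
)
```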
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.08
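A quick sanity check once inside the container, assuming the 23.08 image ships this 1.21.0 release:

```python
# Verify the bundled NeMo version matches the release notes.
import nemo

print(nemo.__version__)
```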
ASR
Changelog
- Fix require_grad typos by @kit1980 :: PR: #6930
- rnnt_greedy_decoding.py: typos? auto-repressively -> auto-regressively by @vadimkantorov :: PR: #6989
- Adding tutorial for confidence ensembles by @Kipok :: PR: #6932
- Add support for Numba FP16 RNNT Loss by @titu1994 :: PR: #6991
- fix install_beamsearch_decoders by @karpnv :: PR: #7011
- rnnt and char utils by @karpnv :: PR: #6971
- ASR Confidence update and tutorial by @GNroy :: PR: #6810
- st standalone model by @AlexGrinch :: PR: #6969
- Fix typo in ASR-TTS tutorial by @artbataev :: PR: #7049
- Update Frame-VAD doc and fix onnx export by @stevehuang52 :: PR: #7076
- Fast Conformer global token fix by @sam1373 :: PR: #7085
- Added script to extract ASR CTC and RNNT models from ASR hybrid models by @trias702 :: PR: #7092
- Fix absolute path in path join call by @kingjan1999 :: PR: #7099
- NeMo ASR Demo by @lleaver :: PR: #7110
- Fix plot function in vad_utils.py by @stevehuang52 :: PR: #7113
- Fixed small bug with NoisePerturbationWithNormalization by @trias702 :: PR: #7118
- Merge release r1.20.0 to main by @ericharper :: PR: #7167
- minor fix for conformer subsampling docstring. by @XuesongYang :: PR: #7195
- [ASR] Fix GPU memory leak in transcribe_speech.py by @rlangman :: PR: #7249
- Adding Multilingual, Code-Switched, and Hybrid ASR models by @KunalDhawan :: PR: #7250
- fix partial transcribe by @stevehuang52 :: PR: #7284
- Conv1d subsampling by @burchim :: PR: #7294
- add bf16 inference support and fix seq_len stft issue by @nithinraok :: PR: #7338
- Add finetuning scripts by @nithinraok :: PR: #7263
- Move parameter: trainer -> exp_manager (for PTL 2.0) by @artbataev :: PR: #7339
- Fix typos by @omahs :: PR: #7361
- Fix wrong calling of librosa.get_duration() in notebook by @RobinDong :: PR: #7376
- RNN-T confidence and alignment bugfix (#7381) by @GNroy :: PR: #7459
- update branch by @nithinraok :: PR: #7488
- Replace strategy = None with strategy = auto for notebooks by @athitten :: PR: #7521
- Fix PTL2.0 related ASR bugs in r1.21.0: Val metrics logging, None dataloader issue by @KunalDhawan :: PR: #7531
- gpus -> devices by @nithinraok :: PR: #7542
- [BugFix] Add missing quotes for auto strategy in tutorial notebooks by @athitten :: PR: #7541
- Append output of val_step to self.validation_step_outputs in EncMaskDecAudioToAudioModel by @athitten :: PR: #7543
- fix validation_step_outputs initialization for multi-dataloader by @KunalDhawan :: PR: #7546
- Append val/test output to instance variable in EncDecSpeakerLabelModel by @athitten :: PR: #7562
- update strategy by @nithinraok :: PR: #7577
- Typo fixes by @Kipok :: PR: #7591
- Fix metrics for SE tutorial by @anteju :: PR: #7604
- fix ssl models ptl monitor val through logging by @nithinraok :: PR: #7608
- Fix py3.11 dataclasses issue by @titu1994 :: PR: #7582
- bugfix: trainer.gpus, trainer.strategy, trainer.accelerator by @XuesongYang :: PR: #7621
- Safeguard nemo_text_processing installation on ARM (#7485) by @blisc :: PR: #7619
- [ASR] Fix type error in jasper by @rlangman :: PR: #7636
- Fix vad & speech command tutorial - onnx by @fayejf :: PR: #7671
- Replace strategy='dp'/None with 'auto' by @athitten :: PR: #7681
- Fix multi rank finetune for ASR by @titu1994 :: PR: #7684
- fix ptl_bugs in slu_models.py by @jzi040941 :: PR: #7689
- Add NLPDDPStrategyNotebook and change trainer gpus to devices by @athitten :: PR: #7741
- Updated installation of ctc-decoders by @vsl9 :: PR: #7746
- Fix bug wrt change decoding strategy for bpe models by @titu1994 :: PR: #7762
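Several entries above touch decoding configuration (e.g. #7762, #7011); a sketch of swapping the decoding strategy on a public BPE CTC checkpoint, with config keys per NeMo 1.x decoding configs:

```python
from omegaconf import open_dict
import nemo.collections.asr as nemo_asr

# The code path touched by PR #7762; checkpoint is a public NGC model.
model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("stt_en_conformer_ctc_small")
decoding_cfg = model.cfg.decoding
with open_dict(decoding_cfg):
    decoding_cfg.strategy = "greedy"  # "beam" needs install_beamsearch_decoders (PR #7011)
model.change_decoding_strategy(decoding_cfg)
```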
TTS
Changelog
- [TTS] Add cosine distance option to TTS aligner by @rlangman :: PR: #6806
- [TTS] Add tutorial for TTS data prep scripts by @rlangman :: PR: #6922
- update TTS readme by @XuesongYang :: PR: #7088
- [TTS] Create EnCodec training recipe by @rlangman :: PR: #6852
- [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. by @XuesongYang :: PR: #6893
- [TTS] Add output audio format to preprocessing by @rlangman :: PR: #6889
- [TTS] Remove nested TTS configs by @rlangman :: PR: #7154
- [TTS] Fix TTS recipes with PTL 2.0 by @rlangman :: PR: #7188
- [TTS] Add license to ported EnCodec code by @rlangman :: PR: #7197
- [Fix] Discriminator update in AudioCodecModel by @anteju :: PR: #7209
- Adapter ipa Tutorial and config update by @styagi130 :: PR: #7260
- [TTS] Audio codec fixes by @rlangman :: PR: #7266
- [TTS] minor fix typos and input_types by @XuesongYang :: PR: #7272
- specify explicitly to set pretrained model paths by @styagi130 :: PR: #7305
- [TTS] Update AudioCodec API by @anteju :: PR: #7310
- [TTS] Add additional config to preprocess_text and compute_feature_stats by @rlangman :: PR: #7321
- [TTS] Change audio codec token type to TokenIndex by @rlangman :: PR: #7356
- fixed trainer.strategy=auto from None. by @XuesongYang :: PR: #7369
- [TTS] Added a callback for logging initial data by @anteju :: PR: #7384
- [TTS] bugfix: trainer.accelerator=auto from None. by @XuesongYang :: PR: #7492
- bugfix: specify trainer.strategy=auto when devices=1 by @XuesongYang :: PR: #7509
- Fix dimensionality in get_dist function by @redoctopus :: PR: #7506
- Fix TTS FastPitch tutorial by @hsiehjackson :: PR: #7494
- [TTS] remove curly braces from in jupyter notebook cell. by @XuesongYang :: PR: #7554
- [TTS] fixed trainer's accelerator and strategy. by @XuesongYang :: PR: #7569
- Change hifigan finetune strategy to ddp_find_unused_parameters_true by @hsiehjackson :: PR: #7579
- Fix validation in G2PModel and ThutmoseTaggerModel by @athitten :: PR: #7597
- [TTS] Fix FastPitch data prep tutorial by @rlangman :: PR: #7602
- [TTS] Add dataset to path of logged artifacts by @rlangman :: PR: #7651
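For context on the models these entries touch, a standard two-stage TTS inference sketch using public NGC checkpoints (not code from this release's diffs):

```python
import soundfile as sf
from nemo.collections.tts.models import FastPitchModel, HifiGanModel

# Spectrogram generator + vocoder; checkpoint names are public NGC models.
spec_gen = FastPitchModel.from_pretrained("tts_en_fastpitch").eval()
vocoder = HifiGanModel.from_pretrained("tts_en_hifigan").eval()

tokens = spec_gen.parse("Hello from NeMo.")
spectrogram = spec_gen.generate_spectrogram(tokens=tokens)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)
sf.write("hello.wav", audio.detach().cpu().numpy()[0], samplerate=22050)
```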
NLP / NMT
Changelog
- Minor MPT-7B fixes and creation script update by @trias702 :: PR: #6982
- remove hard coded input and output fields by @arendu :: PR: #7008
- RoPE length extrapolation with interpolation (config sketch below) by @MaximumEntropy :: PR: #7005
- add async + distopt to sft by @MaximumEntropy :: PR: #7018
- ptuning inference table bug fix by @arendu :: PR: #7015
- Fix missing import for GPT SFT by @MaximumEntropy :: PR: #7026
- Add end_strings to SamplingParams by @markelsanz14 :: PR: #6986
- Fix race condition for downloading cache when executing with multi-node by @findkim :: PR: #7016
- added back the retro documents. by @yidong72 :: PR: #7033
- remove pos emb from state dict for old models by @ekmb :: PR: #7068
- memmap worker arg by @arendu :: PR: #7062
- Disable distopt contiguous param buffer by default by @timmoon10 :: PR: #7095
- [Fix] load_state_dict in nlp_model.py by @stevehuang52 :: PR: #7086
- Fix tokenizer file caching where torch.distributed may not be initialized yet by @findkim :: PR: #7061
- freeze base mode on init during peft by @arendu :: PR: #7152
- Include the scripts for preprocessing OASST and unit tests for chat sft datasets by @yidong72 :: PR: #7112
- T5 metrics fix by @jubick1337 :: PR: #7037
- megatron gpt training fix by @anmolgupt :: PR: #7199
- Fix T5 using FA by @hsiehjackson :: PR: #7196
- fix-causal-fa-infer by @hsiehjackson :: PR: #7200
- Fix gpt trainer test by @hsiehjackson :: PR: #6915
- Load ub_cfg from hydra config by @jbaczek :: PR: #7003
- Fixes for lightning 2.0 upgrade by @athitten :: PR: #7176
- Fix which was off by one batch by @odelalleau :: PR: #7212
- Start using ModelParallelConfig from Megatron Core by @ericharper :: PR: #6885
- deprecation warning by @arendu :: PR: #7193
- Fix attention mask inference by @hsiehjackson :: PR: #7213
- Use GPTModel from mcore by @ericharper :: PR: #7093
- Add bf16-mixed and 16-mixed in module.py by @athitten :: PR: #7227
- Refactor LLM pretraining examples by @maanug-nv :: PR: #7159
- Add only trainable parameters to optimizer group in PEFT by @guyueh1 :: PR: #7230
- Dummy class for ModelParallelConfig by @ericharper :: PR: #7254
- [TN][Docs] update language coverage matrix and refs by @mgrafu :: PR: #7247
- tied weights for adapters by @arendu :: PR: #6928
- Fix skip generation by @hsiehjackson :: PR: #7270
- Hidden transforms model parallel config + CI with Perceiver by @michalivne :: PR: #7241
- Fix restore sequence parallel by @hsiehjackson :: PR: #7273
- fix ptuning and lora model_parallel_config by @blahBlahhhJ :: PR: #7287
- Fix adapters and ptuning for amp O2 by @guyueh1 :: PR: #7285
- remove additional line in peft state dict by @blahBlahhhJ :: PR: #7293
- loss mask aware final layer application by @arendu :: PR: #7275
- Adding server option to peft eval by @Davood-M :: PR: #7292
- migrated class CSVFieldsMemmapDataset from BioNeMo by @dorotat-nv :: PR: #7314
- remove old prompt table for storing cached ptuning representations by @arendu :: PR: #7295
- Bugfix and optimization in by @odelalleau :: PR: #7267
- Set a default value when getting by @yaox12 :: PR: #7115
- Distributed checkpointing with mcore GPT by @ericharper :: PR: #7116
- Fix activation checkpoint by @hsiehjackson :: PR: #7334
- Replace prefetch with val iterator check in megatron models by @athitten :: PR: #7318
- Fixing indentation bug in indexed_dataset memory deallocation by @michalivne :: PR: #7352
- NeMo MCore llama2 support + MCore PEFT adapters by @blahBlahhhJ :: PR: #7299
- Hiddens modules documentation by @michalivne :: PR: #7303
- Support for flash attention 2.0 by @MaximumEntropy :: PR: #7063
- multiple fields can form a context by @arendu :: PR: #7147
- adding bias_dropout_add_fusion option for BERT by @clumsy :: PR: #7332
- enable selective unfreeze by @arendu :: PR: #7326
- Upgrade pytorch container to 23.08 by @ericharper :: PR: #7353
- enable fp32 optimizer for output_layer in mcore by @lhb8125 :: PR: #7355
- Revert comment by @ericharper :: PR: #7368
- fix pipeline parallel inference by @blahBlahhhJ :: PR: #7367
- fix for peft tied weights by @arendu :: PR: #7372
- add O2 option in gpt eval by @blahBlahhhJ :: PR: #7358
- Move model precision copy by @maanug-nv :: PR: #7336
- Fix PEFT checkpoint loading by @blahBlahhhJ :: PR: #7388
- Use distributed optimizer support for multiple dtypes by @timmoon10 :: PR: #7359
- [PATCH] PEFT import mcore by @blahBlahhhJ :: PR: #7393
- Use cfg attribute in bert by @maanug-nv :: PR: #7394
- Add support for bias conversion in Swiglu models by @titu1994 :: PR: #7386
- Update save_to and restore_from for dist checkpointing by @ericharper :: PR: #7343
- fix forward for with mcore=false by @JimmyZhang12 :: PR: #7403
- Fix logging to remove 's/it' from progress bar in Megatron models and add train_step_timing by @athitten :: PR: #7374
- Set Activation Checkpointing Defaults by @aklife97 :: PR: #7404
- Make loss mask default to false by @ericharper :: PR: #7407
- Add dummy userbuffer config files by @erhoo82 :: PR: #7408
- Add missing ubconf files by @aklife97 :: PR: #7412
- Update ptl training ckpt conversion script to work with dist ckpt by @ericharper :: PR: #7416
- Add strategy as ddp_find_unused_parameters_true for glue_benchmark.py by @athitten :: PR: #7454
- fix bug when loading dist ckpt in peft by @lhb8125 :: PR: #7479
- Fix CustomProgressBar for resume by @athitten :: PR: #7427
- Append val output to self.validation_step_outputs in GLUEModel by @athitten :: PR: #7530
- Cherry pick Fix sft dataset truncation (#7464) to r1.21.0 by @ericharper :: PR: #7550
- Avoid duplicated dist checkpoint save by @mikolajblaz :: PR: #7555
- layernorm1p fix by @dimapihtar :: PR: #7523
- r1.21: SFT model parallel fix for dist ckpt by @aklife97 :: PR: #7520
- PEFT needs mp config propagated for dist ckpt by @ericharper :: PR: #7589
- Fix ptuning crash for llama 2 ckpt by @yuanzhedong :: PR: #7594
- PEFT eval fix by @cuichenx :: PR: #7626
- Propagate mp config for continue training by @ericharper :: PR: #7637
- Add ddp_find_unused_parameters=True and change accelerator to auto by @athitten :: PR: #7623
- Add find_unused_parameters_true for text_classiftn and punctuation_capitalization by @athitten :: PR: #7649
- conversion issue fix by @dimapihtar :: PR: #7648
- Fix a nlp nb onnx by @fayejf :: PR: #7703
- Add activations_checkpoint related args for model cfg in lora.ipynb by @athitten :: PR: #7752
- Change accelerator to 'auto' in nlp_checkpoint_port.py by @athitten :: PR: #7747
- Add reconfigure microbatch calculator before inference and update GBS, MBS for inference by @athitten :: PR: #7763
- Create PrecisionPlugin for megatron_ckpt_to_nemo.py trainer by @athitten :: PR: #7767
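As referenced at PR #7005 above, RoPE interpolation is exposed through the model config; a sketch of the relevant overrides, with the `seq_len_interpolation_factor` key name assumed from the PR description:

```python
from omegaconf import OmegaConf

# Hydra-style overrides for RoPE position interpolation (key name assumed; verify
# against your Megatron GPT config).
overrides = OmegaConf.create({
    "model": {
        "position_embedding_type": "rope",
        "max_position_embeddings": 8192,
        "seq_len_interpolation_factor": 2,  # run at 2x the pretraining context
    }
})
print(OmegaConf.to_yaml(overrides))
```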
Export
Changelog
- Added bool types to neural_types export by @tbartley94 :: PR: #7032
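Export goes through the Exportable mixin, which emits the neural-type metadata extended in #7032; a minimal ONNX export sketch with an illustrative public checkpoint:

```python
import nemo.collections.asr as nemo_asr

# Export a small public model to ONNX via the Exportable mixin.
model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("stt_en_citrinet_256")
model.export("citrinet.onnx")
```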
General Improvements
Changelog
- Add migration guide for lightning 2.0 upgrade by @athitten :: PR: #7360
- add support for max_total_length=4096 for 43b by @Zhilin123 :: PR: #6763
- Change Jenkins timeout by @ericharper :: PR: #6997
- Update SDP docs page with a new documentation link by @Kipok :: PR: #7029
- Fixed tutorial's name by @vsl9 :: PR: #7047
- Revert Fix import guard checks by @titu1994 :: PR: #7125
- Fix import guard checks by @titu1994 :: PR: #7126
- fix evaluator.py for various exceptions by ast by @stevehuang52 :: PR: #7150
- NFA bugfix: remove any empty segments by @erastorgueva-nv :: PR: #7155
- NFA subtitle file config - specify colors and vertical alignment by @erastorgueva-nv :: PR: #7160
- add paths to labeler. by @XuesongYang :: PR: #7087
- [Bugfix] Fix a bug in filtering checkpoints by @yaox12 :: PR: #6851
- Update README.rst by @fayejf :: PR: #7175
- Make NFA subtitles stay until end of video by @erastorgueva-nv :: PR: #7189
- Uncomment removal of exp_dir in JenkinsFile by @athitten :: PR: #7198
- NFA: replace ellipses in text with 3 periods by @erastorgueva-nv :: PR: #7208
- NFA tutorial notebook by @erastorgueva-nv :: PR: #7210
- NFA docs: update READMEs and links, add docs page by @erastorgueva-nv :: PR: #7219
- Make image centering in NFA README actually work by @erastorgueva-nv :: PR: #7220
- Add mcore installation to Dockerfile by @ericharper :: PR: #7237
- Checkpoint averaging for model parallel by @Kipok :: PR: #7252
- Upgrade hydra and omegaconf by @athitten :: PR: #7243
- Update numba support in docker by @titu1994 :: PR: #7271
- remove deprecated scripts from ci by @arendu :: PR: #7239
- Logging model checkpoints as artifacts in MlFlow by @AlirezaMorsali :: PR: #7258
- Adithyare/peft metric calculation by @arendu :: PR: #7304
- Resume checkpoint priority by @maanug-nv :: PR: #7335
- lora merge fix for O2 names by @arendu :: PR: #7325
- Llama load buffers in checkpoint by @blahBlahhhJ :: PR: #7357
- pin numba=0.57.1 to fix reinstall.sh error by @XuesongYang :: PR: #7366
- Update to core 23.08 branch ToT by @aklife97 :: PR: #7371
- Upper bounding ptl by @ericharper :: PR: #7370
- minor fix for llama ckpt conversion script by @blahBlahhhJ :: PR: #7387
- Update Core Commit by @aklife97 :: PR: #7402
- Fix resume from checkpoint in exp_manager (sketch below) by @athitten :: PR: #7424
- add sleep by @gshennvm :: PR: #7498
- Fix exp manager check for sleep by @titu1994 :: PR: #7503
- unpin setuptools by @fayejf :: PR: #7534
- Update FFMPEG version to fix issue with torchaudio by @titu1994 :: PR: #7551
- fix typos in nfa and speech enhancement tutorials by @erastorgueva-nv :: PR: #7580
- best ckpt fix by @dimapihtar :: PR: #7564
- add build os key by @nithinraok :: PR: #7596
- Fix issues with Dockerfile by @titu1994 :: PR: #7650
- Change confidence parameters in the test by @Kipok :: PR: #7680
- bugfix: pin nemo-text-processing to fix Chinese normalizer error. by @XuesongYang :: PR: #7627
- Remove PUBLICATIONS.md, point to github.io NeMo page instead by @erastorgueva-nv :: PR: #7694
- Pin mcore to 0.3 by @ericharper :: PR: #7751
- fix hybrid eval by @karpnv :: PR: #7759
- Update Apex install command in Dockerfile by @ericharper :: PR: #7794
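As noted at PR #7424 above, checkpoint resume is handled by exp_manager; a sketch with keys per NeMo's ExpManagerConfig and an illustrative experiment directory:

```python
import pytorch_lightning as pl
from omegaconf import OmegaConf
from nemo.utils.exp_manager import exp_manager

trainer = pl.Trainer(devices=1, accelerator="auto", strategy="auto", max_epochs=1)
exp_manager(trainer, OmegaConf.create({
    "exp_dir": "experiments",
    "resume_if_exists": True,             # reuse the latest checkpoint when present
    "resume_ignore_no_checkpoint": True,  # first run: proceed without a checkpoint
}))
```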