NVIDIA Neural Modules 1.21.0
Highlights
Models
NeMo ASR
- Multi-lookahead cache-aware streaming (see the usage sketch below)
- Speech enhancement tutorial #6492
- Online code switching dataset #6579
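A minimal usage sketch for the multi-lookahead streaming highlight above. The checkpoint name follows NGC naming conventions and the lookahead-selection helper is assumed from the cache-aware streaming tutorial; verify both against your installed NeMo version.

```python
import nemo.collections.asr as nemo_asr

# Multi-lookahead cache-aware FastConformer (checkpoint name assumed from NGC conventions).
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_fastconformer_hybrid_large_streaming_multi"
)
# Pick one of the lookahead sizes the model was trained with (call assumed from the
# cache-aware streaming tutorial; [left, right] attention context in encoder frames).
asr_model.encoder.set_default_att_context_size([70, 13])
print(asr_model.transcribe(paths2audio_files=["sample.wav"])[0])
```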
NeMo TTS
- AudioCodec: Training recipe for EnCodec #6852
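The EnCodec recipe (#6852) trains an AudioCodecModel; a rough round-trip sketch, assuming `encode`/`decode` helpers on the model and an illustrative checkpoint path:

```python
import torch
from nemo.collections.tts.models import AudioCodecModel

# Restore a trained codec (path illustrative); encode/decode helper names are
# assumed from the AudioCodecModel API in this release.
codec = AudioCodecModel.restore_from("audio_codec.nemo").eval()

audio = torch.randn(1, 16000)               # 1 s of 16 kHz audio, batch of 1
audio_len = torch.tensor([audio.shape[1]])
with torch.no_grad():
    tokens, tokens_len = codec.encode(audio=audio, audio_len=audio_len)
    reconstructed, _ = codec.decode(tokens=tokens, tokens_len=tokens_len)
```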
NeMo Framework
NeMo Core
- Update to PTL 2.0 #6433
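Most PTL 2.0 changes in this release reduce to the same Trainer argument migration (`gpus` -> `devices`, `strategy`/`accelerator` -> "auto") that recurs throughout the changelogs below; a minimal sketch:

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    devices=1,           # replaces the removed `gpus` argument
    accelerator="auto",  # replaces explicit "dp"/None values
    strategy="auto",     # replaces `strategy=None`
    max_epochs=1,
)
```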
NeMo Tools
- Forced aligner tutorial #7210
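The NeMo Forced Aligner is driven by a script with Hydra-style overrides; a hedged invocation sketch, with argument names taken from the NFA README and illustrative paths:

```python
import subprocess

# Run NFA on a manifest; override names follow the NFA README, paths are illustrative.
subprocess.run(
    [
        "python", "tools/nemo_forced_aligner/align.py",
        "pretrained_name=stt_en_fastconformer_hybrid_large_pc",
        "manifest_filepath=data/manifest.json",
        "output_dir=nfa_output",
    ],
    check=True,
)
```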
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.08
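A quick sanity check once inside the container, assuming the 23.08 image ships this 1.21.0 release:

```python
# Verify the bundled NeMo version matches the release notes.
import nemo

print(nemo.__version__)
```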
ASR
Changelog
- Fix require_grad typos by @kit1980 :: PR: #6930
- rnnt_greedy_decoding.py: typos? auto-repressively -> auto-regressively by @vadimkantorov :: PR: #6989
- Adding tutorial for confidence ensembles by @Kipok :: PR: #6932
- Add support for Numba FP16 RNNT Loss by @titu1994 :: PR: #6991
- fix install_beamsearch_decoders by @karpnv :: PR: #7011
- rnnt and char utils by @karpnv :: PR: #6971
- ASR Confidence update and tutorial by @GNroy :: PR: #6810
- st standalone model by @AlexGrinch :: PR: #6969
- Fix typo in ASR-TTS tutorial by @artbataev :: PR: #7049
- Update Frame-VAD doc and fix onnx export by @stevehuang52 :: PR: #7076
- Fast Conformer global token fix by @sam1373 :: PR: #7085
- Added script to extract ASR CTC and RNNT models from ASR hybrid models by @trias702 :: PR: #7092
- Fix absolute path in path join call by @kingjan1999 :: PR: #7099
- NeMo ASR Demo by @lleaver :: PR: #7110
- Fix plot function in vad_utils.py by @stevehuang52 :: PR: #7113
- Fixed small bug with NoisePerturbationWithNormalization by @trias702 :: PR: #7118
- Merge release r1.20.0 to main by @ericharper :: PR: #7167
- minor fix for conformer subsampling docstring. by @XuesongYang :: PR: #7195
- [ASR] Fix GPU memory leak in transcribe_speech.py by @rlangman :: PR: #7249
- Adding Multilingual, Code-Switched, and Hybrid ASR models by @KunalDhawan :: PR: #7250
- fix partial transcribe by @stevehuang52 :: PR: #7284
- Conv1d subsampling by @burchim :: PR: #7294
- add bf16 inference support and fix seq_len stft issue by @nithinraok :: PR: #7338
- Add finetuning scripts by @nithinraok :: PR: #7263
- Move parameter: trainer -> exp_manager (for PTL 2.0) by @artbataev :: PR: #7339
- Fix typos by @omahs :: PR: #7361
- Fix wrong calling of librosa.get_duration() in notebook by @RobinDong :: PR: #7376
- RNN-T confidence and alignment bugfix (#7381) by @GNroy :: PR: #7459
- update branch by @nithinraok :: PR: #7488
- Replace strategy = None with strategy = auto for notebooks by @athitten :: PR: #7521
- Fix PTL2.0 related ASR bugs in r1.21.0: Val metrics logging, None dataloader issue by @KunalDhawan :: PR: #7531
- gpus -> devices by @nithinraok :: PR: #7542
- [BugFix] Add missing quotes for auto strategy in tutorial notebooks by @athitten :: PR: #7541
- Append output of val_step to self.validation_step_outputs in EncMaskDecAudioToAudioModel by @athitten :: PR: #7543
- fix validation_step_outputs initialization for multi-dataloader by @KunalDhawan :: PR: #7546
- Append val/test output to instance variable in EncDecSpeakerLabelModel by @athitten :: PR: #7562
- update strategy by @nithinraok :: PR: #7577
- Typo fixes by @Kipok :: PR: #7591
- Fix metrics for SE tutorial by @anteju :: PR: #7604
- fix ssl models ptl monitor val through logging by @nithinraok :: PR: #7608
- Fix py3.11 dataclasses issue by @titu1994 :: PR: #7582
- bugfix: trainer.gpus, trainer.strategy, trainer.accelerator by @XuesongYang :: PR: #7621
- Safeguard nemo_text_processing installation on ARM (#7485) by @blisc :: PR: #7619
- [ASR] Fix type error in jasper by @rlangman :: PR: #7636
- Fix vad & speech command tutorial - onnx by @fayejf :: PR: #7671
- Replace strategy='dp'/None with 'auto' by @athitten :: PR: #7681
- Fix multi rank finetune for ASR by @titu1994 :: PR: #7684
- fix ptl_bugs in slu_models.py by @jzi040941 :: PR: #7689
- Add NLPDDPStrategyNotebook and change trainer gpus to devices by @athitten :: PR: #7741
- Updated installation of ctc-decoders by @vsl9 :: PR: #7746
- Fix bug wrt change decoding strategy for bpe models by @titu1994 :: PR: #7762
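Several entries above touch decoding configuration (e.g. #7762, #7011); a sketch of swapping the decoding strategy on a public BPE CTC checkpoint, with config keys per NeMo 1.x decoding configs:

```python
from omegaconf import open_dict
import nemo.collections.asr as nemo_asr

# The code path touched by PR #7762; checkpoint is a public NGC model.
model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("stt_en_conformer_ctc_small")
decoding_cfg = model.cfg.decoding
with open_dict(decoding_cfg):
    decoding_cfg.strategy = "greedy"  # "beam" needs install_beamsearch_decoders (PR #7011)
model.change_decoding_strategy(decoding_cfg)
```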
TTS
Changelog
- [TTS] Add cosine distance option to TTS aligner by @rlangman :: PR: #6806
- [TTS] Add tutorial for TTS data prep scripts by @rlangman :: PR: #6922
- update TTS readme by @XuesongYang :: PR: #7088
- [TTS] Create EnCodec training recipe by @rlangman :: PR: #6852
- [TTS][ZH] add Chinese TTS recipes based on IPA symbol sets. by @XuesongYang :: PR: #6893
- [TTS] Add output audio format to preprocessing by @rlangman :: PR: #6889
- [TTS] Remove nested TTS configs by @rlangman :: PR: #7154
- [TTS] Fix TTS recipes with PTL 2.0 by @rlangman :: PR: #7188
- [TTS] Add license to ported EnCodec code by @rlangman :: PR: #7197
- [Fix] Discriminator update in AudioCodecModel by @anteju :: PR: #7209
- Adapter ipa Tutorial and config update by @styagi130 :: PR: #7260
- [TTS] Audio codec fixes by @rlangman :: PR: #7266
- [TTS] minor fix typos and input_types by @XuesongYang :: PR: #7272
- specify explicitly to set pretrained model paths by @styagi130 :: PR: #7305
- [TTS] Update AudioCodec API by @anteju :: PR: #7310
- [TTS] Add additional config to preprocess_text and compute_feature_stats by @rlangman :: PR: #7321
- [TTS] Change audio codec token type to TokenIndex by @rlangman :: PR: #7356
- fixed trainer.strategy=auto from None. by @XuesongYang :: PR: #7369
- [TTS] Added a callback for logging initial data by @anteju :: PR: #7384
- [TTS] bugfix: trainer.accelerator=auto from None. by @XuesongYang :: PR: #7492
- bugfix: specify trainer.strategy=auto when devices=1 by @XuesongYang :: PR: #7509
- Fix dimensionality in get_dist function by @redoctopus :: PR: #7506
- Fix TTS FastPitch tutorial by @hsiehjackson :: PR: #7494
- [TTS] remove curly braces from in jupyter notebook cell. by @XuesongYang :: PR: #7554
- [TTS] fixed trainer's accelerator and strategy. by @XuesongYang :: PR: #7569
- Change hifigan finetune strategy to ddp_find_unused_parameters_true by @hsiehjackson :: PR: #7579
- Fix validation in G2PModel and ThutmoseTaggerModel by @athitten :: PR: #7597
- [TTS] Fix FastPitch data prep tutorial by @rlangman :: PR: #7602
- [TTS] Add dataset to path of logged artifacts by @rlangman :: PR: #7651
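For context on the models these entries touch, a standard two-stage TTS inference sketch using public NGC checkpoints (not code from this release's diffs):

```python
import soundfile as sf
from nemo.collections.tts.models import FastPitchModel, HifiGanModel

# Spectrogram generator + vocoder; checkpoint names are public NGC models.
spec_gen = FastPitchModel.from_pretrained("tts_en_fastpitch").eval()
vocoder = HifiGanModel.from_pretrained("tts_en_hifigan").eval()

tokens = spec_gen.parse("Hello from NeMo.")
spectrogram = spec_gen.generate_spectrogram(tokens=tokens)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)
sf.write("hello.wav", audio.detach().cpu().numpy()[0], samplerate=22050)
```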
NLP / NMT
Changelog
- Minor MPT-7B fixes and creation script update by @trias702 :: PR: #6982
- remove hard coded input and output fields by @arendu :: PR: #7008
- RoPE length extrapolation with interpolation (config sketch below) by @MaximumEntropy :: PR: #7005
- add async + distopt to sft by @MaximumEntropy :: PR: #7018
- ptuning inference table bug fix by @arendu :: PR: #7015
- Fix missing import for GPT SFT by @MaximumEntropy :: PR: #7026
- Add end_strings to SamplingParams by @markelsanz14 :: PR: #6986
- Fix race condition for downloading cache when executing with multi-node by @findkim :: PR: #7016
- added back the retro documents. by @yidong72 :: PR: #7033
- remove pos emb from state dict for old models by @ekmb :: PR: #7068
- memmap worker arg by @arendu :: PR: #7062
- Disable distopt contiguous param buffer by default by @timmoon10 :: PR: #7095
- [Fix] load_state_dict in nlp_model.py by @stevehuang52 :: PR: #7086
- Fix tokenizer file caching where torch.distributed may not be initialized yet by @findkim :: PR: #7061
- freeze base mode on init during peft by @arendu :: PR: #7152
- Include the scripts for preprocessing OASST and unit tests for chat sft datasets by @yidong72 :: PR: #7112
- T5 metrics fix by @jubick1337 :: PR: #7037
- megatron gpt training fix by @anmolgupt :: PR: #7199
- Fix T5 using FA by @hsiehjackson :: PR: #7196
- fix-causal-fa-infer by @hsiehjackson :: PR: #7200
- Fix gpt trainer test by @hsiehjackson :: PR: #6915
- Load ub_cfg from hydra config by @jbaczek :: PR: #7003
- Fixes for lightning 2.0 upgrade by @athitten :: PR: #7176
- Fix which was off by one batch by @odelalleau :: PR: #7212
- Start using ModelParallelConfig from Megatron Core by @ericharper :: PR: #6885
- deprecation warning by @arendu :: PR: #7193
- Fix attention mask inference by @hsiehjackson :: PR: #7213
- Use GPTModel from mcore by @ericharper :: PR: #7093
- Add bf16-mixed and 16-mixed in module.py by @athitten :: PR: #7227
- Refactor LLM pretraining examples by @maanug-nv :: PR: #7159
- Add only trainable parameters to optimizer group in PEFT by @guyueh1 :: PR: #7230
- Dummy class for ModelParallelConfig by @ericharper :: PR: #7254
- [TN][Docs] update language coverage matrix and refs by @mgrafu :: PR: #7247
- tied weights for adapters by @arendu :: PR: #6928
- Fix skip generation by @hsiehjackson :: PR: #7270
- Hidden transforms model parallel config + CI with Perceiver by @michalivne :: PR: #7241
- Fix restore sequence parallel by @hsiehjackson :: PR: #7273
- fix ptuning and lora model_parallel_config by @blahBlahhhJ :: PR: #7287
- Fix adapters and ptuning for amp O2 by @guyueh1 :: PR: #7285
- remove additional line in peft state dict by @blahBlahhhJ :: PR: #7293
- loss mask aware final layer application by @arendu :: PR: #7275
- Adding server option to peft eval by @Davood-M :: PR: #7292
- migrated class CSVFieldsMemmapDataset from BioNeMo by @dorotat-nv :: PR: #7314
- remove old prompt table for storing cached ptuning representations by @arendu :: PR: #7295
- Bugfix and optimization in by @odelalleau :: PR: #7267
- Set a default value when getting by @yaox12 :: PR: #7115
- Distributed checkpointing with mcore GPT by @ericharper :: PR: #7116
- Fix activation checkpoint by @hsiehjackson :: PR: #7334
- Replace prefetch with val iterator check in megatron models by @athitten :: PR: #7318
- Fixing indentation bug in indexed_dataset memory deallocation by @michalivne :: PR: #7352
- NeMo MCore llama2 support + MCore PEFT adapters by @blahBlahhhJ :: PR: #7299
- Hiddens modules documentation by @michalivne :: PR: #7303
- Support for flash attention 2.0 by @MaximumEntropy :: PR: #7063
- multiple fields can form a context by @arendu :: PR: #7147
- adding bias_dropout_add_fusion option for BERT by @clumsy :: PR: #7332
- enable selective unfreeze by @arendu :: PR: #7326
- Upgrade pytorch container to 23.08 by @ericharper :: PR: #7353
- enable fp32 optimizer for output_layer in mcore by @lhb8125 :: PR: #7355
- Revert comment by @ericharper :: PR: #7368
- fix pipeline parallel inference by @blahBlahhhJ :: PR: #7367
- fix for peft tied weights by @arendu :: PR: #7372
- add O2 option in gpt eval by @blahBlahhhJ :: PR: #7358
- Move model precision copy by @maanug-nv :: PR: #7336
- Fix PEFT checkpoint loading by @blahBlahhhJ :: PR: #7388
- Use distributed optimizer support for multiple dtypes by @timmoon10 :: PR: #7359
- [PATCH] PEFT import mcore by @blahBlahhhJ :: PR: #7393
- Use cfg attribute in bert by @maanug-nv :: PR: #7394
- Add support for bias conversion in Swiglu models by @titu1994 :: PR: #7386
- Update save_to and restore_from for dist checkpointing by @ericharper :: PR: #7343
- fix forward for with mcore=false by @JimmyZhang12 :: PR: #7403
- Fix logging to remove 's/it' from progress bar in Megatron models and add train_step_timing by @athitten :: PR: #7374
- Set Activation Checkpointing Defaults by @aklife97 :: PR: #7404
- Make loss mask default to false by @ericharper :: PR: #7407
- Add dummy userbuffer config files by @erhoo82 :: PR: #7408
- Add missing ubconf files by @aklife97 :: PR: #7412
- Update ptl training ckpt conversion script to work with dist ckpt by @ericharper :: PR: #7416
- Add strategy as ddp_find_unused_parameters_true for glue_benchmark.py by @athitten :: PR: #7454
- fix bug when loading dist ckpt in peft by @lhb8125 :: PR: #7479
- Fix CustomProgressBar for resume by @athitten :: PR: #7427
- Append val output to self.validation_step_outputs in GLUEModel by @athitten :: PR: #7530
- Cherry pick Fix sft dataset truncation (#7464) to r1.21.0 by @ericharper :: PR: #7550
- Avoid duplicated dist checkpoint save by @mikolajblaz :: PR: #7555
- layernorm1p fix by @dimapihtar :: PR: #7523
- r1.21: SFT model parallel fix for dist ckpt by @aklife97 :: PR: #7520
- PEFT needs mp config propagated for dist ckpt by @ericharper :: PR: #7589
- Fix ptuning crash for llama 2 ckpt by @yuanzhedong :: PR: #7594
- PEFT eval fix by @cuichenx :: PR: #7626
- Propagate mp config for continue training by @ericharper :: PR: #7637
- Add ddp_find_unused_parameters=True and change accelerator to auto by @athitten :: PR: #7623
- Add find_unused_parameters_true for text_classiftn and punctuation_capitalization by @athitten :: PR: #7649
- conversion issue fix by @dimapihtar :: PR: #7648
- Fix a nlp nb onnx by @fayejf :: PR: #7703
- Add activations_checkpoint related args for model cfg in lora.ipynb by @athitten :: PR: #7752
- Change accelerator to 'auto' in nlp_checkpoint_port.py by @athitten :: PR: #7747
- Add reconfigure microbatch calculator before inference and update GBS, MBS for inference by @athitten :: PR: #7763
- Create PrecisionPlugin for megatron_ckpt_to_nemo.py trainer by @athitten :: PR: #7767
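As referenced at PR #7005 above, RoPE interpolation is exposed through the model config; a sketch of the relevant overrides, with the `seq_len_interpolation_factor` key name assumed from the PR description:

```python
from omegaconf import OmegaConf

# Hydra-style overrides for RoPE position interpolation (key name assumed; verify
# against your Megatron GPT config).
overrides = OmegaConf.create({
    "model": {
        "position_embedding_type": "rope",
        "max_position_embeddings": 8192,
        "seq_len_interpolation_factor": 2,  # run at 2x the pretraining context
    }
})
print(OmegaConf.to_yaml(overrides))
```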
Export
Changelog
- Added bool types to neural_types export by @tbartley94 :: PR: #7032
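Export goes through the Exportable mixin, which emits the neural-type metadata extended in #7032; a minimal ONNX export sketch with an illustrative public checkpoint:

```python
import nemo.collections.asr as nemo_asr

# Export a small public model to ONNX via the Exportable mixin.
model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("stt_en_citrinet_256")
model.export("citrinet.onnx")
```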
General Improvements
Changelog
- Add migration guide for lightning 2.0 upgrade by @athitten :: PR: #7360
- add support for max_total_length=4096 for 43b by @Zhilin123 :: PR: #6763
- Change Jenkins timeout by @ericharper :: PR: #6997
- Update SDP docs page with a new documentation link by @Kipok :: PR: #7029
- Fixed tutorial's name by @vsl9 :: PR: #7047
- Revert Fix import guard checks by @titu1994 :: PR: #7125
- Fix import guard checks by @titu1994 :: PR: #7126
- fix evaluator.py for various exceptions by ast by @stevehuang52 :: PR: #7150
- NFA bugfix: remove any empty segments by @erastorgueva-nv :: PR: #7155
- NFA subtitle file config - specify colors and vertical alignment by @erastorgueva-nv :: PR: #7160
- add paths to labeler. by @XuesongYang :: PR: #7087
- [Bugfix] Fix a bug in filtering checkpoints by @yaox12 :: PR: #6851
- Update README.rst by @fayejf :: PR: #7175
- Make NFA subtitles stay until end of video by @erastorgueva-nv :: PR: #7189
- Uncomment removal of exp_dir in JenkinsFile by @athitten :: PR: #7198
- NFA: replace ellipses in text with 3 periods by @erastorgueva-nv :: PR: #7208
- NFA tutorial notebook by @erastorgueva-nv :: PR: #7210
- NFA docs: update READMEs and links, add docs page by @erastorgueva-nv :: PR: #7219
- Make image centering in NFA README actually work by @erastorgueva-nv :: PR: #7220
- Add mcore installation to Dockerfile by @ericharper :: PR: #7237
- Checkpoint averaging for model parallel by @Kipok :: PR: #7252
- Upgrade hydra and omegaconf by @athitten :: PR: #7243
- Update numba support in docker by @titu1994 :: PR: #7271
- remove deprecated scripts from ci by @arendu :: PR: #7239
- Logging model checkpoints as artifacts in MlFlow by @AlirezaMorsali :: PR: #7258
- Adithyare/peft metric calculation by @arendu :: PR: #7304
- Resume checkpoint priority by @maanug-nv :: PR: #7335
- lora merge fix for O2 names by @arendu :: PR: #7325
- Llama load buffers in checkpoint by @blahBlahhhJ :: PR: #7357
- pin numba=0.57.1 to fix reinstall.sh error by @XuesongYang :: PR: #7366
- Update to core 23.08 branch ToT by @aklife97 :: PR: #7371
- Upper bounding ptl by @ericharper :: PR: #7370
- minor fix for llama ckpt conversion script by @blahBlahhhJ :: PR: #7387
- Update Core Commit by @aklife97 :: PR: #7402
- Fix resume from checkpoint in exp_manager (sketch below) by @athitten :: PR: #7424
- add sleep by @gshennvm :: PR: #7498
- Fix exp manager check for sleep by @titu1994 :: PR: #7503
- unpin setuptools by @fayejf :: PR: #7534
- Update FFMPEG version to fix issue with torchaudio by @titu1994 :: PR: #7551
- fix typos in nfa and speech enhancement tutorials by @erastorgueva-nv :: PR: #7580
- best ckpt fix by @dimapihtar :: PR: #7564
- add build os key by @nithinraok :: PR: #7596
- Fix issues with Dockerfile by @titu1994 :: PR: #7650
- Change confidence parameters in the test by @Kipok :: PR: #7680
- bugfix: pin nemo-text-processing to fix Chinese normalizer error. by @XuesongYang :: PR: #7627
- Remove PUBLICATIONS.md, point to github.io NeMo page instead by @erastorgueva-nv :: PR: #7694
- Pin mcore to 0.3 by @ericharper :: PR: #7751
- fix hybrid eval by @karpnv :: PR: #7759
- Update Apex install command in Dockerfile by @ericharper :: PR: #7794
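As noted at PR #7424 above, checkpoint resume is handled by exp_manager; a sketch with keys per NeMo's ExpManagerConfig and an illustrative experiment directory:

```python
import pytorch_lightning as pl
from omegaconf import OmegaConf
from nemo.utils.exp_manager import exp_manager

trainer = pl.Trainer(devices=1, accelerator="auto", strategy="auto", max_epochs=1)
exp_manager(trainer, OmegaConf.create({
    "exp_dir": "experiments",
    "resume_if_exists": True,             # reuse the latest checkpoint when present
    "resume_ignore_no_checkpoint": True,  # first run: proceed without a checkpoint
}))
```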