NeVA media_type fix #9255

paul-gibbons · 2024-05-20T17:12:23Z

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

Add specific line by line info of high level changes in this PR.

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

Make sure you have read and followed the Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install (e.g., Numba, Pynini, Apex)?
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

Related to # (issue)

* update package info Signed-off-by: ericharper <[email protected]>
* fix the mpt chatbot (#6957) Signed-off-by: Yi Dong <[email protected]>
* Remove `compute_on_step` from metrics (#6979)
* Remove `compute_on_step` from metrics Signed-off-by: smajumdar <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Remove confusing log message Signed-off-by: smajumdar <[email protected]>
* Update tests Signed-off-by: smajumdar <[email protected]>
--------- Signed-off-by: smajumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Hybrid conformer export (#6983)
* Implemented generic kv-pair setting of export_config from args Signed-off-by: Boris Fomitchev <[email protected]>
* Hybrid conformer export Signed-off-by: Boris Fomitchev <[email protected]>
* Hybrid decoder export Signed-off-by: Boris Fomitchev <[email protected]>
* Cleanup Signed-off-by: Boris Fomitchev <[email protected]>
* Changed from **kwargs Signed-off-by: Boris Fomitchev <[email protected]>
* Docstring Signed-off-by: Boris Fomitchev <[email protected]>
* Docs added Signed-off-by: Boris Fomitchev <[email protected]>
* Stringify args Signed-off-by: Boris Fomitchev <[email protected]>
* Added docs for ASR export configs Signed-off-by: Boris Fomitchev <[email protected]>
* lowercase ctc Signed-off-by: Boris Fomitchev <[email protected]>
--------- Signed-off-by: Boris Fomitchev <[email protected]>
* Cache handling without input tensors mutation (#6980)
* Cache handling without input tensors mutation Signed-off-by: Boris Fomitchev <[email protected]>
* Cleanup Signed-off-by: Boris Fomitchev <[email protected]>
* Cleanup#2 Signed-off-by: Boris Fomitchev <[email protected]>
* Cleanup#3 Signed-off-by: Boris Fomitchev <[email protected]>
--------- Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]>
* fixes for spellmapper (#6994)
Signed-off-by: Alexandra Antonova <[email protected]> * Fixing an issue with confidence ensembles (#6987) * Bug fix for the confidence ensembles Signed-off-by: Igor Gitman <[email protected]> * Relax constraints for the test Signed-off-by: Igor Gitman <[email protected]> --------- Signed-off-by: Igor Gitman <[email protected]> * [TTS] Append pretrained FastPitch & SpectrogamEnhancer pair to available models (#7012) * [TTS] fastpitch: add english libritts model with asr stft parameters (25 ms 10 ms) Signed-off-by: Roman Korostik <[email protected]> * [TTS] enhancer: add pretrained model intended for asr finetuning Signed-off-by: Roman Korostik <[email protected]> --------- Signed-off-by: Roman Korostik <[email protected]> * Add ASR with TTS Tutorial. Fix enhancer usage. (#6955) * Add ASR with TTS Tutorial * Fix enhancer usage Signed-off-by: Vladimir Bataev <[email protected]> * install_bs (#7019) Signed-off-by: Nikolay Karpov <[email protected]> * fix tab text gen (#7022) Signed-off-by: Yi Dong <[email protected]> * TE bug fix (#7027) Signed-off-by: Dmytro Pykhtar <[email protected]> * Add support for Numba FP16 RNNT Loss (#6991) (#7038) * Force working space memory to always be in fp32 Signed-off-by: smajumdar <[email protected]> * Add support for fp16 testing in Numba Signed-off-by: smajumdar <[email protected]> * Add support for fp16 testing in Numba Signed-off-by: smajumdar <[email protected]> * Add support for fp16 testing in Numba Signed-off-by: smajumdar <[email protected]> * Fix cost calculation by upcasting to fp32 Signed-off-by: smajumdar <[email protected]> * Fix cost calculation by upcasting to fp32 Signed-off-by: smajumdar <[email protected]> * Add support to check if numba fp16 is available Signed-off-by: smajumdar <[email protected]> * add RNN-T loss implemented by PyTorch and test code (#5312) * Fix the bugs in cache-aware streaming Conformer (#5032) Signed-off-by: Vahid <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * IA3 support 
for GPT and T5 (#4909) * init commit for ia3 adater training in GPT Signed-off-by: arendu <[email protected]> * ia3 adater training in GPT, models and adapter classes Signed-off-by: arendu <[email protected]> * reshape to operate even on non-contiguous tensors Signed-off-by: arendu <[email protected]> * configs Signed-off-by: arendu <[email protected]> * fixed none init Signed-off-by: arendu <[email protected]> * adding adapter and ia3 support for T5 based models Signed-off-by: arendu <[email protected]> * style fix Signed-off-by: arendu <[email protected]> * config update and t5 model adapter and ia3 Signed-off-by: arendu <[email protected]> * removed unused imports Signed-off-by: arendu <[email protected]> * predict step for inference Signed-off-by: arendu <[email protected]> * style fix Signed-off-by: arendu <[email protected]> * style fix Signed-off-by: arendu <[email protected]> * adapter inference for t5 Signed-off-by: arendu <[email protected]> * style fix Signed-off-by: arendu <[email protected]> * fixed bug micro and global batch size in eval Signed-off-by: arendu <[email protected]> * minor edit Signed-off-by: arendu <[email protected]> * agressive truncation if in test examples if no truncation field is given Signed-off-by: arendu <[email protected]> * corrected for language_model_path name changes in main Signed-off-by: arendu <[email protected]> * removed unused import Signed-off-by: arendu <[email protected]> * name change for language_model_path Signed-off-by: arendu <[email protected]> * include inter_attention to IA3 Signed-off-by: arendu <[email protected]> * minor fix in confg Signed-off-by: arendu <[email protected]> * minor fixes Signed-off-by: arendu <[email protected]> * removed unused flag Signed-off-by: arendu <[email protected]> * addressing PR comments Signed-off-by: arendu <[email protected]> * address PR comments Signed-off-by: arendu <[email protected]> * minor fix Signed-off-by: arendu <[email protected]> * [pre-commit.ci] auto fixes 
from pre-commit.com hooks for more information, see https://pre-commit.ci * style fix Signed-off-by: arendu <[email protected]> * CI test Signed-off-by: arendu <[email protected]> * minor fix in jenkinsfile Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * Bug fix - Limit val batches set to 1.0 (#5023) * Bug fix Signed-off-by: shanmugamr1992 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Adressed sandeep's comments * Fixing limit val batches support in bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixing limit val batches support in bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: shanmugamr1992 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [bug_fix] kv_channels is used when available (#5066) * fix bug s.t kv_channels is used when available Signed-off-by: arendu <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * P&C Docs (#5068) (#5069) Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Add spe_split_by_unicode_script arg (#5072) * Add spe_split_by_unicode_script arg Signed-off-by: 
Anas <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Anas <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * probabilites -> probabilities (#5078) (#5079) Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * increase PR and Issue sweep quantity and active close PRs. (#5073) * increase PR and Issue sweep quantity and active close PRs. Signed-off-by: Xuesong Yang <[email protected]> * update with stricter rules, 30 days to be stale and 7 days to be closed for both Issues and PRs. Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [TTS] added missing German phoneme tokenizer. 
(#5070) (#5074) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * rename to match prompt leanring (#5076) Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Missing fixes from r1.11.0 to T5 finetuning eval (#5054) (#5061) * Fixes to seq2seq eval Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * Notebook bug fixes (#5084) (#5085) * Notebook bug fixes Signed-off-by: Virginia Adams <[email protected]> * Turned nemo install back on Signed-off-by: Virginia Adams <[email protected]> * reverted notebook Signed-off-by: Virginia Adams <[email protected]> * Updated one line in entity linking nb Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * update strategy in notebook from ddp_fork to dp (#5088) (#5089) Co-authored-by: Zhilin Wang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix bug in Squeezeformer Conv block (#5011) (#5024) * Fix bug in Squeezeformer Conv block Signed-off-by: smajumdar <[email protected]> * Fix kernel context Signed-off-by: smajumdar <[email protected]> * Fix access mixin Signed-off-by: 
smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * fixed megatron lm conversion bug (PTL related) (#5038) (#5063) Signed-off-by: David Mosallanezhad <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Co-authored-by: David <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix Unhashable type list for Numba Cuda spec augment kernel (#5093) (#5094) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix numba (#5098) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Make it possible to specify output_filename in normalize_with_audio.py (#5092) Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Greedy decoding confidence for CTC and RNNT (#4931) * rnnt confidence draft Signed-off-by: Aleksandr Laptev <[email protected]> * word confidence Signed-off-by: Aleksandr Laptev <[email protected]> * advanced entropies added Signed-off-by: Aleksandr Laptev <[email protected]> * refactoring Signed-off-by: Aleksandr Laptev <[email protected]> * oops forgot a file Signed-off-by: Aleksandr Laptev <[email protected]> * metrics and benchmarking script added Signed-off-by: Aleksandr Laptev <[email protected]> * style fix Signed-off-by: Aleksandr Laptev <[email protected]> * texterrors installation 
added Signed-off-by: Aleksandr Laptev <[email protected]> * lgtm and bug fix Signed-off-by: Aleksandr Laptev <[email protected]> * fix comments Signed-off-by: Aleksandr Laptev <[email protected]> * fix typos Signed-off-by: Aleksandr Laptev <[email protected]> * add missing import after rebase Signed-off-by: Aleksandr Laptev <[email protected]> Signed-off-by: Aleksandr Laptev <[email protected]> Co-authored-by: Aleksandr Laptev <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [Add] SLURP models and examples (#4668) * add model, util and loss Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * refactor Signed-off-by: stevehuang52 <[email protected]> * refactor annd update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update and refactor Signed-off-by: stevehuang52 <[email protected]> * update and refactor Signed-off-by: stevehuang52 <[email protected]> * update and refactor Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * update 
available models Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> * refactor data processing Signed-off-by: stevehuang52 <[email protected]> * fix typo Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * refactor and update Signed-off-by: stevehuang52 <[email protected]> * update doc Signed-off-by: stevehuang52 <[email protected]> * move transformer to asr.modules Signed-off-by: stevehuang52 <[email protected]> * move transformer to asr.modules Signed-off-by: stevehuang52 <[email protected]> * get rid of jsonlines Signed-off-by: stevehuang52 <[email protected]> * refactor Signed-off-by: stevehuang52 <[email protected]> * revert changes to nlp Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: He Huang (Steve) <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * only optimize params that are part of the adapter modules (#5086) Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Pipeline Parallel T5 Prompt Learning (#4956) * Added pre process flag checks and pipeline parallel in fwd Signed-off-by: Virginia Adams <[email protected]> * Added rank check for pipeline parallel Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * T5 prompt learning works! 
Signed-off-by: Virginia Adams <[email protected]> * IA3 passing CI Signed-off-by: Virginia Adams <[email protected]> * Fixed typo Signed-off-by: Virginia Adams <[email protected]> * removed optimizer setup so Adi's change will not conflict Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Adi Renduchintala <[email protected]> Co-authored-by: Adi Renduchintala <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * [TTS] remove phonemizer.py (#5090) remove phonemizer.py and convert code block to markdown in the tutorial. Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * T5 Decoding with PP > 2 fix (#5091) (#5103) * set sequence lenghts in the pipeline properly Signed-off-by: MaximumEntropy <[email protected]> * Fix Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [TTS] fixed wrong val loss for epoch 0 and inconsistent metrics names (#5087) (#5102) * fixed hifigan configs as well * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * Fix and refactor consumed samples save/restore for Megatron models. 
(#5077) * Fixes and refactor Signed-off-by: MaximumEntropy <[email protected]> * Fix Signed-off-by: MaximumEntropy <[email protected]> * Remove unused imports Signed-off-by: MaximumEntropy <[email protected]> * Empty Signed-off-by: MaximumEntropy <[email protected]> * Fix Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * RIR corpus generator tool (#4927) Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Multiprocessing fix (#5106) (#5107) Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [Bug fix] PC lexical + audio (#5109) (#5110) * training running Signed-off-by: ekmb <[email protected]> * revert Signed-off-by: ekmb <[email protected]> * revert Signed-off-by: ekmb <[email protected]> Signed-off-by: ekmb <[email protected]> Signed-off-by: ekmb <[email protected]> Co-authored-by: Evelina <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [Fix] schedulers with no max_steps param (#4564) * fix schedulers Signed-off-by: stevehuang52 <[email protected]> * update to use python inspect module Signed-off-by: stevehuang52 <[email protected]> * update Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * T5 prompt learning fixes missing from r.11.0 merge (#5075) (#5101) * Fix special tokens Signed-off-by: MaximumEntropy <[email protected]> * Fix Signed-off-by: MaximumEntropy <[email protected]> * Empty Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: David <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> 
Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: David <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [TTS] Add NeMo TTS Primer Tutorial (#4933) * [TTS] Add NeMo TTS Primer Tutorial Signed-off-by: Ryan <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Add Squeezeformer CTC model checkpoints on Librispeech (#5121) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * adding loss normalization options to rnnt joint (#4829) * adding normalization options to rnnt joint loss * moving the param to joint * moving loss normalization to rnnt loss config * style * cleaning up * fixing sum reduction in joint Signed-off-by: Dima Rekesh <[email protected]> * moving reduction into RNNT loss class * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactoring * typos Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * Asr concat dataloader (#5108) * forced precision * typo * initial commit Signed-off-by: Dima Rekesh <[email protected]> * typos and bugs Signed-off-by: Dima Rekesh <[email protected]> * reverting conformer encoder Signed-off-by: Dima Rekesh <[email protected]> * additional checks Signed-off-by: Dima Rekesh <[email protected]> * adding support to CTC models as well * reverting conformer_encoder Signed-off-by: Dima Rekesh <[email protected]> * typo Signed-off-by: Dima Rekesh <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactoring Signed-off-by: Dima Rekesh <[email 
protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactoring Signed-off-by: Dima Rekesh <[email protected]> * merging Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * fix blossom ci unittests Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * bugfix: pybtex.database.InvalidNameString: Too many commas in author field. (#5112) (#5115) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Uppdate container version to 22.09 (#5105) * update container version Signed-off-by: ericharper <[email protected]> * pin click Signed-off-by: ericharper <[email protected]> * pin click 8.0.2 Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Remove unsupported arguments from MegatronNMT (#5065) * Fixes Signed-off-by: MaximumEntropy <[email protected]> * Fixes Signed-off-by: MaximumEntropy <[email protected]> * Style Signed-off-by: MaximumEntropy <[email protected]> * Fix Signed-off-by: MaximumEntropy <[email protected]> * More fixes Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * pp2 support for T5 IA3 learning and T5 Adapters learning (#5116) * enabling pp2 Signed-off-by: arendu <[email protected]> * optimizer update Signed-off-by: arendu <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * T5 pp>1 support for adapters and ia3 Signed-off-by: arendu 
<[email protected]> * fix bug with missing adapter_tuning Signed-off-by: arendu <[email protected]> * inference error fixed, pp=2 Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * T5 Prompt Learning Fixes for Pipeline Parallel (#5120) * Initial fixes Signed-off-by: MaximumEntropy <[email protected]> * Added back validation acc Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Put num workers back Signed-off-by: Virginia Adams <[email protected]> * added relative encoding if statament Signed-off-by: Virginia Adams <[email protected]> * Added back val loss only validation Signed-off-by: Virginia Adams <[email protected]> * Revert "Added back val loss only validation" This reverts commit 86d8f4806fe30335c40c3716ce18259939df500f. 
* Removed val acc for PP > 1 Signed-off-by: Virginia Adams <[email protected]> * Removed enc_seq_len if statement Signed-off-by: Virginia Adams <[email protected]> * Added back validation acc calc Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: Virginia Adams <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Virginia Adams <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * add doc info (#4721) Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [TTS] Add SpanishCharsTokenizer (#5135) * [TTS] Add SpanishCharsTokenizer Signed-off-by: Ryan <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Update megatron interface to dialogue (#4936) * fix style formatting Signed-off-by: Zhilin Wang <[email protected]> * update template to include description of intent Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * changes based on requests in review Signed-off-by: Zhilin Wang <[email protected]> * add compatibility with assistant dataset Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * remove dialogue_state_tracking Signed-off-by: Zhilin Wang <[email protected]> * update huggingface utils for dialogue Signed-off-by: Zhilin Wang <[email protected]> * rename dialogue_state_tracking_hybrid to dialogue_state_tracking_sgdqa Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * fix style 
Signed-off-by: Zhilin Wang <[email protected]> * style fix nemo/collections/nlp/models/dialogue_state_tracking_sgdqa/__init__.py Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile for SGDGEN Signed-off-by: Zhilin Wang <[email protected]> * fix typo Signed-off-by: Zhilin Wang <[email protected]> * add docstrings for assistant data processsor Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins for SGDGEN local checkpoint Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * use local vocab file for Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * patch for Jenkins CI using local file Signed-off-by: Zhilin Wang <[email protected]> * add slot filling prediction and metrics Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * refactor metrics code out of Dialogue GPT Model Signed-off-by: Zhilin Wang <[email protected]> * integrate backward compatible support for IntentSlotClassificationModel (bert model) Signed-off-by: Zhilin Wang <[email protected]> * save prediction file for IntentSlotClassification Signed-off-by: Zhilin Wang <[email protected]> * update dialogue gpt model training for megatron gpt Signed-off-by: Zhilin Wang <[email protected]> * remove batch generate for HF GPT2, which causes lower performance Signed-off-by: Zhilin Wang <[email protected]> * add few shot capability to dialogue gpt model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile and remove unused import Signed-off-by: Zhilin Wang <[email protected]> 
* update code description and clarity Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate compatibility with ZeroShotIntentModel Signed-off-by: Zhilin Wang <[email protected]> * rename folder to dialogue due to increased scope and further refactor for clarity Signed-off-by: Zhilin Wang <[email protected]> * added dialogue GPT for sequence generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * add CI test for DialogueGPTGenerationModel Signed-off-by: Zhilin Wang <[email protected]> * integrate DialogueS2SGenerationModel for generation task (e.g. answer extender) Signed-off-by: Zhilin Wang <[email protected]> * modify huggingface utils to support HF t5/BART models Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update bleu metric Signed-off-by: Zhilin Wang <[email protected]> * fix bleu metric style Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * debug bleu metric Signed-off-by: Zhilin Wang <[email protected]> * update based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 2 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * update 3 based on PR #3893 Signed-off-by: Zhilin Wang <[email protected]> * integrate sgd generation based on user user utterance and system slot-values to generate system utterance Signed-off-by: Zhilin Wang <[email protected]> * add validation model saving capabilities Signed-off-by: Zhilin Wang 
<[email protected]> * cleaned up code for SGD Based Answer extender Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue Generation CI Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * update Jenkinsfile Signed-off-by: Zhilin Wang <[email protected]> * fix Jenkins CI issue" Signed-off-by: Zhilin Wang <[email protected]> * add support for design dataset Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary imports Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support megatron for dialogue_s2s_generation_model Signed-off-by: Zhilin Wang <[email protected]> * reduce loaded samples in MSMarcoDataProcessor to 64 when cfg.model.dataset.debug_mode=True Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update CI Signed-off-by: Zhilin Wang <[email protected]> * update checkpoint and predictions filename to include epoch number Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * integrate HF BART MNLI into zero shot intent model Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Nearest Neighbour Model Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Jenkins Signed-off-by: Zhilin Wang <[email protected]> * refactor Dialogue SGD Data Processor to make interface for models cleaner Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * update Dialogue S2S Generation model for DialogueSGDDataProcessor interface Signed-off-by: Zhilin Wang <[email protected]> * update jenkins 
Signed-off-by: Zhilin Wang <[email protected]> * update jenkins Signed-off-by: Zhilin Wang <[email protected]> * support sgd and drive thru datasets by zero shot model and nearest neighbour model Signed-off-by: Zhilin Wang <[email protected]> * add prediction saving code to nearest neighbour and zero shot intent models Signed-off-by: Zhilin Wang <[email protected]> * fix typo in sgd data processor Signed-off-by: Zhilin Wang <[email protected]> * integrate Dialogue Mellon QA Data Processor Signed-off-by: Zhilin Wang <[email protected]> * update mellon qa Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py to remove outdated info Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * update dialogue_config.yaml Signed-off-by: Zhilin Wang <[email protected]> * add dialogue docs Signed-off-by: Zhilin Wang <[email protected]> * address review comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix for cfg Signed-off-by: Zhilin Wang <[email protected]> * make dependency on apex optional Signed-off-by: Zhilin Wang <[email protected]> * change NLPDDPluggin calling logic to make it possible to run without apex Signed-off-by: Zhilin Wang <[email protected]> * add first draft of tutorial Signed-off-by: Zhilin Wang <[email protected]> * reduce ms marco size by removing lines without wellFormedAnswers Signed-off-by: Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update colab tutorial link in dialogue docs Signed-off-by: 
Zhilin Wang <[email protected]> * include unit test and some refactor to facilitate unit test Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * address pr issues Signed-off-by: Zhilin Wang <[email protected]> * remove typos in dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * support larger files for question answering Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * remove unnecessary artifacts to reduce memory use Signed-off-by: Zhilin Wang <[email protected]> * put 0 tensor to device Signed-off-by: Zhilin Wang <[email protected]> * update link within dialogue tutorial Signed-off-by: Zhilin Wang <[email protected]> * restore previously delete files Signed-off-by: Zhilin Wang <[email protected]> * update error handling when loss = nan Signed-off-by: Zhilin Wang <[email protected]> * update nan handling Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss func Signed-off-by: Zhilin Wang <[email protected]> * update spanning loss Signed-off-by: Zhilin Wang <[email protected]> * fix type error raised in qa_dataset.py Signed-off-by: Zhilin Wang <[email protected]> * add error checking message Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * revert back to float32 Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: 
Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update exp logging Signed-off-by: Zhilin Wang <[email protected]> * update error msgs Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * update loading of large file from pickle to json Signed-off-by: Zhilin Wang <[email protected]> * limit number of negative samples Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * revert post processing Signed-off-by: Zhilin Wang <[email protected]> * remove unused methods and style fix Signed-off-by: Zhilin Wang <[email protected]> * add more documentation Signed-off-by: Zhilin Wang <[email protected]> * remove unused imports Signed-off-by: Zhilin Wang <[email protected]> * changes base on PR review Signed-off-by: Zhilin Wang <[email protected]> * set wandb logger falseby default Signed-off-by: Zhilin Wang <[email protected]> * update interface with megatron gpt prompt learning Signed-off-by: Zhilin Wang <[email protected]> * update inline documentation Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update prompt_ids Signed-off-by: Zhilin Wang <[email protected]> * update error msg Signed-off-by: Zhilin Wang <[email protected]> * update config Signed-off-by: Zhilin Wang <[email protected]> * update config Signed-off-by: Zhilin Wang <[email protected]> * set inference = False for dialgue prompt learning during trainng Signed-off-by: Zhilin Wang <[email protected]> * set inference = False for dialgue prompt learning during trainng Signed-off-by: Zhilin Wang <[email protected]> * remove unused code Signed-off-by: Zhilin Wang <[email protected]> * update config yaml Signed-off-by: 
Zhilin Wang <[email protected]> * fix bug for megatron gpt prompt learning Signed-off-by: Zhilin Wang <[email protected]> * remove unused import Signed-off-by: Zhilin Wang <[email protected]> * address comments in PR Signed-off-by: Zhilin Wang <[email protected]> * address comments in PR Signed-off-by: Zhilin Wang <[email protected]> * address typo Signed-off-by: Zhilin Wang <[email protected]> * add megatron t5 inference Signed-off-by: Zhilin Wang <[email protected]> * fix bug due to bert tokenizer not being space-aware Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * update IntentSlotModel onnx export test Signed-off-by: Zhilin Wang <[email protected]> * update style Signed-off-by: Zhilin Wang <[email protected]> * update exportable Signed-off-by: Zhilin Wang <[email protected]> * address PR comments Signed-off-by: Zhilin Wang <[email protected]> * replace functools.cache_property with functools.lru_cache to maintain python 3.7 compatibility Signed-off-by: Zhilin Wang <[email protected]> * improve speed of rank_candidates and support for p tuning Signed-off-by: Zhilin Wang <[email protected]> * update dialogue.py Signed-off-by: Zhilin Wang <[email protected]> * fix megatron prompt learning saving bug Signed-off-by: Zhilin Wang <[email protected]> * update generate_candidate method Signed-off-by: Zhilin Wang <[email protected]> * remove repeated init text ids and invert attention masks Signed-off-by: Zhilin Wang <[email protected]> * update typo Signed-off-by: Zhilin Wang <[email protected]> * custom collate fn to remove excess padding in batch Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * style fix Signed-off-by: Zhilin Wang <[email protected]> * update complete method to mitigate issue when max seq len is low Signed-off-by: 
Zhilin Wang <[email protected]> * address pr comments Signed-off-by: Zhilin Wang <[email protected]> * update generation interface Signed-off-by: Zhilin Wang <[email protected]> Signed-off-by: Zhilin Wang <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Added save inference ready .nemo file with every checkpoint (#5055) * Added save inference ready .nemo file with every checkpoint Signed-off-by: Virginia Adams <[email protected]> * Python style fix Signed-off-by: Virginia Adams <[email protected]> * addressed Adi's comment Signed-off-by: Virginia Adams <[email protected]> * Added ptuning check in model checkpoint saving Signed-off-by: Virginia Adams <[email protected]> * Changed save_nemo_on_valdaition default to False Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Changes global batch size of adapter CI Signed-off-by: Virginia Adams <[email protected]> * Changed num workers to 0 Signed-off-by: Virginia Adams <[email protected]> * added first stage of pipeline check Signed-off-by: Virginia Adams <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Virginia Adams <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Hainan Xu <[email protected]> * Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training (#5118) * Remove ; from jupyter notebook cells Signed-off-by: Igor Gitman <[email protected]> * Fix typos in documentation/code Signed-off-by: Igor 
Gitman <[email protected]> * Fix output message to have 'or equal' Signed-off-by: Igor Gitman <[email protected]> * Link formatting fixes Signed-off-by: Igor Gitman <[email protected]> * Add error if max_utts is used in tarred datasets Signed-off-by: Igor Gitman <[email protected]> * Remove max_utts parameter from tarred datasets Signed-off-by: Igor Gitman <[email protected]> * Fix max_utts removal in tests Signed-off-by: Igor Gitman <[email protected]> * Fix typo if -> is Signed-off-by: Igor Gitman <[email protected]> Signed-off-by: Igor Gitman <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Merge r1.12.0 main (#5139) * update branch Signed-off-by: ericharper <[email protected]> * Add cherry-pick action (#4958) * add cherry-pick action Signed-off-by: ericharper <[email protected]> * Pin Transformers version to fix CI (#4955) * Pin transformers version in CI to prevent offline tokenizer loading error Signed-off-by: SeanNaren <[email protected]> * Drop version Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Enable offline Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: Sean Naren <[email protected]> * upper bound transformers Signed-off-by: ericharper <[email protected]> * remove duplicate transformers requirement Signed-off-by: ericharper <[email protected]> * Release SOTA Lang ID model (#5080) * add pretrained lang id model ambernet Signed-off-by: fayejf <[email protected]> * update doc and style fix Signed-off-by: fayejf <[email protected]> Signed-off-by: fayejf <[email protected]> * update branch and package info Signed-off-by: ericharper <[email protected]> * remove upper bounds on lightning and transformers Signed-off-by: ericharper <[email 
protected]> * remove transformers offline from ci Signed-off-by: ericharper <[email protected]> * upper bound transformers Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: fayejf <[email protected]> Co-authored-by: Sean Naren <[email protected]> Co-authored-by: fayejf <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Added ASR model comparison to SDE (#5043) SDE: Added ASR model comparison tool to SDE transcribe speech: Added support for many predictions in one file, as well as custom field names Signed-off-by: George Zelenfroynd <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * fix nmt eval sampler (#5154) Signed-off-by: Abhinav Khattar <[email protected]> Signed-off-by: Abhinav Khattar <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix Global init steps (#5143) * move global step to base Signed-off-by: Yi Dong <[email protected]> * fix fused softmax Signed-off-by: Yi Dong <[email protected]> * add the missing file Signed-off-by: Yi Dong <[email protected]> * update the fused kernel Signed-off-by: Yi Dong <[email protected]> * fix import error Signed-off-by: Yi Dong <[email protected]> * fix import again Signed-off-by: Yi Dong <[email protected]> Signed-off-by: Yi Dong <[email protected]> Signed-off-by: Yi Dong <[email protected]> Co-authored-by: Yi Dong <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * [TTS] bug fix - sample rate was being ignored in vocoder dataset (#4518) * bug fix - sample rate was being ignored in vocoder dataset when not loading mel * handled n segments for a different sampling rate than original sampling rate * Added case for n_segments 0, warning for n_segments greater than file length Signed-off-by: Paarth Neekhara <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Jocelyn 
<[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Add EMA support to NeMo (#4764) * Added Base files Signed-off-by: SeanNaren <[email protected]> * Some refactors, swap to using MNIST Lnet Signed-off-by: SeanNaren <[email protected]> * Add a few more tests, allow the callback to be set via the exp manager Signed-off-by: SeanNaren <[email protected]> * Actually run validation for testing Signed-off-by: SeanNaren <[email protected]> * Run isort Signed-off-by: SeanNaren <[email protected]> * Add test for saving state/fix saving state Signed-off-by: SeanNaren <[email protected]> * Use dummy model Signed-off-by: SeanNaren <[email protected]> * Fix test Signed-off-by: SeanNaren <[email protected]> * Add copyright Signed-off-by: SeanNaren <[email protected]> * Support saving separate EMA weight module Signed-off-by: SeanNaren <[email protected]> * Add standalone functionality/logging Signed-off-by: SeanNaren <[email protected]> * Expose more parameters Signed-off-by: SeanNaren <[email protected]> * Modify to allow option to replace validation Signed-off-by: SeanNaren <[email protected]> * Add jenkins test, formatting Signed-off-by: SeanNaren <[email protected]> * Pin Transformers version to fix CI (#4955) * Pin transformers version in CI to prevent offline tokenizer loading error Signed-off-by: SeanNaren <[email protected]> * Drop version Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Enable offline Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Add cherry-pick action (#4958) (#4961) * add cherry-pick action Signed-off-by: ericharper <[email protected]> * Pin Transformers version to fix CI (#4955) * Pin transformers version in CI to prevent offline tokenizer loading error Signed-off-by: SeanNaren <[email protected]> * Drop version Signed-off-by: SeanNaren <[email 
protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Disable offline temporarily Signed-off-by: SeanNaren <[email protected]> * Enable offline Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: Sean Naren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Sean Naren <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Fix changelog builder (#4962) (#4963) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: SeanNaren <[email protected]> * fix cherry pick workflow (#4964) (#4965) Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: SeanNaren <[email protected]> * reorder model check (#4959) (#4967) Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: SeanNaren <[email protected]> * check for active conda environment (#4970) (#4971) Signed-off-by: SeanNaren <[email protected]> * [TTS] fix broken tutorial for MixerTTS. (#4949) (#4976) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Checkpoint averaging class fix (#4946) * 1. Added args.class_path to provide it externally. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. 
Signed-off-by: Micha Livne <[email protected]> Signed-off-by: Micha Livne <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Add ability to give seperate datasets for test, train and validation (#4798) * Add ability to give seperate datasets for test, train and validation * Addressed Sandeeps comments * Addressed Sandeeps comments * Add ability to give seperate datasets for test, train and validation * Add ability to give seperate datasets for test, train and validation * Addressed review comments * Bug fix for common dataset utils * Add CI tests Signed-off-by: shanmugamr1992 <[email protected]> * Reformat code Signed-off-by: shanmugamr1992 <[email protected]> * Bug fix Signed-off-by: shanmugamr1992 <[email protected]> * Bug fix * Bug Fix * Bug Fix * Update Jenkinsfile * Addressed comments * Addressed Eriks comments. * Addressed Sandeep * Update Jenkinsfile * Update Jenkinsfile * Update dataset_utils.py * Update Jenkinsfile * Update Jenkinsfile * Use GPT CI config Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: shanmugamr1992 <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: MaximumEntropy <[email protected]> Signed-off-by: SeanNaren <[email protected]> * fix label models restoring issue from wrighted cross entropy (#4968) (#4975) Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Add simple pre-commit file (#4983) * Add simple pre-commit file Signed-off-by: SeanNaren <[email protected]> * Exclude docs folder Signed-off-by: SeanNaren <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: SeanNaren <[email protected]> * Revert "[pre-commit.ci] auto fixes from pre-commit.com hooks" This reverts commit 
053bd5ba579537a5f311b431871c21f3381b43eb. Signed-off-by: SeanNaren <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: SeanNaren <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: SeanNaren <[email protected]> * Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execution environment (#4951) Signed-off-by: Jin Li <[email protected]> Signed-off-by: Jin Li <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Adding speaker embedding conditioning in fastpitch (#4986) Signed-off-by: subhankar-ghosh <[email protected]> Signed-off-by: subhankar-ghosh <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Fix ASR issues (#4984) (#4991) * Fix ASR issues Signed-off-by: smajumdar <[email protected]> * Revert fix Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: SeanNaren <[email protected]> * Fix current tests Signed-off-by: SeanNaren <[email protected]> * More test coverage Signed-off-by: SeanNaren <[email protected]> * Address reviews Signed-off-by: SeanNaren <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address review Signed-off-by: SeanNaren <[email protected]> * Drop bf16 test Signed-off-by: SeanNaren <[email protected]> * Address review Signed-off-by: SeanNaren <[email protected]> * remove print Signed-off-by: SeanNaren <[email protected]> * Add bf16 Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: 
Xuesong Yang <[email protected]> Signed-off-by: Micha Livne <[email protected]> Signed-off-by: shanmugamr1992 <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Jin Li <[email protected]> Signed-off-by: subhankar-ghosh <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: shanmugamr1992 <[email protected]> Co-authored-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: liji-nv <[email protected]> Co-authored-by: Subhankar Ghosh <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix BF16 test (#5162) Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Fix errors in speaker diarization nemo docs (#5153) * fix docs and docstrings for MSDD Signed-off-by: Taejin Park <[email protected]> * fix nemo docs errors Signed-off-by: Taejin Park <[email protected]> * reflected review comments Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Hainan Xu <[email protected]> * Add interleaved pipeline schedule to GPT (#5025) * add virtual pipeline size to config Signed-off-by: ericharper <[email protected]> * convert model to list of modules Signed-off-by: ericharper <[email protected]> * convert model to list of modules Signed-off-by: ericharper <[email protected]> * convert model to list of modules Signed-off-by: ericharper <[email protected]> * update for list of modules Signed-off-by: ericharper <[email protected]> * add virtual to init Signed-off-by: ericharper <[email protected]> * 
update first last stage embedding all reduce Signed-off-by: ericharper <[email protected]> * update sequence parallel all reduce for virtual models Signed-off-by: ericharper <[email protected]> * runs but we get an error Signed-off-by: ericharper <[email protected]> * set virtual rank 0 after looping Signed-off-by: ericharper <[email protected]> * account for virtual when determinining first and last pipeline stages Signed-off-by: ericharper <[email protected]> * checkpointing for virtual models in progress Signed-off-by: ericharper <[email protected]> * add checkpoint hooks Signed-off-by: ericharper <[email protected]> * working on validation when resuming Signed-off-by: ericharper <[email protected]> * skip sanity val steps by default in config Signed-off-by: ericharper <[email protected]> * remove comment Signed-off-by: ericharper <[email protected]> * log number of params Signed-off-by: ericharper <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * style Signed-off-by: ericharper <[email protected]> * check if self.model is a list Signed-off-by: ericharper <[email protected]> * make virtual pipeline default size None on init Signed-off-by: ericharper <[email protected]> * make virtual pipeline default to None in config Signed-off-by: ericharper <[email protected]> * remove ensure_divisibility call Signed-off-by: ericharper <[email protected]> * fix lgtm alerts Signed-off-by: ericharper <[email protected]> * remove num_sanity_val_steps from config Signed-off-by: ericharper <complex451@gmai…

Signed-off-by: eharper <[email protected]>

* change seed to dataset init * Apply isort and black reformatting Signed-off-by: suiyoubi <[email protected]> --------- Signed-off-by: suiyoubi <[email protected]> Co-authored-by: suiyoubi <[email protected]>

Signed-off-by: Gao Deng <[email protected]>

* Add trtllm checkpoint
* Change model config
* fix no query_group
* Using build API
* Change export to new API
* Update generate API
* Fix runtime config
* Fix for llama
* Fix for ptuning
* Fix TP issue
* Change TP rank for building weight dict
* Add lora config
* add prompt embedding table config
* Fix PP issue
* PP layers fix
* Fix no prompt task ids
* Add bos for Gemma
* Add multi block mode
* Embedding and layernorm for PP
* MPI multiprocess support for multinode
* Only output text on first rank
* Change to ModelRunnerCpp
* Add falcon
* Add rotary_pct default value
* Falcon fix
* Add MOE config
* Fix MOE weight dict
* Clean code
* Add rotary_base
* Fix MOE config
* Fix falcon new architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix Gemma 7B
* Add rotary_scaling
* Apply isort and black reformatting Signed-off-by: oyilmaz-nvidia <[email protected]>

---------

Signed-off-by: oyilmaz-nvidia <[email protected]>
Co-authored-by: abharwani <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Onur Yilmaz <[email protected]>
Co-authored-by: oyilmaz-nvidia <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* fix fuser issue with dynamo
* optimized 4k seq len
* optim 8k
* add checkpointing
* add ckpt arg
* fix minor bug
* minor fix
* more optimized chkpting
* Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]>
* addressing comments
* Apply isort and black reformatting Signed-off-by: JRD971000 <[email protected]>

---------

Signed-off-by: JRD971000 <[email protected]>
Co-authored-by: Ali Taghibakhshi <[email protected]>
Co-authored-by: JRD971000 <[email protected]>

…ded + fixes in estimation script (NVIDIA#9157) * Bucketing duration bins: less optimal but instant init when not provided + fixes in estimation script Signed-off-by: Piotr Żelasko <[email protected]> * Fix CPU mem hungriness Signed-off-by: Piotr Żelasko <[email protected]> * Make estimate duration bins work for every kind of manifest Signed-off-by: Piotr Żelasko <[email protected]> * Support more type of inputs Signed-off-by: Piotr Żelasko <[email protected]> * fixes Signed-off-by: Piotr Żelasko <[email protected]> * msg Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * fix Signed-off-by: Piotr Żelasko <[email protected]> * Apply isort and black reformatting Signed-off-by: pablo-garay <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: pablo-garay <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: pablo-garay <[email protected]>

…VIDIA#9197) * Enable CUDA graphs only for transcription. Sync streams before capture. --------- Signed-off-by: Vladimir Bataev <[email protected]>

* move tts fixtures Signed-off-by: Jason <[email protected]> * Apply isort and black reformatting Signed-off-by: blisc <[email protected]> --------- Signed-off-by: Jason <[email protected]> Signed-off-by: blisc <[email protected]> Co-authored-by: blisc <[email protected]>

* enable matryoshka embedding learning Signed-off-by: arendu <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Apply isort and black reformatting Signed-off-by: arendu <[email protected]> --------- Signed-off-by: arendu <[email protected]> Signed-off-by: arendu <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: arendu <[email protected]>

* Add guards to SD imports Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Prevent duplicated checkpoints Signed-off-by: Mikołaj Błaż <[email protected]> * Introduce DistributedCheckpointIO Signed-off-by: Mikołaj Błaż <[email protected]> * Fix DistCkptIO usage Signed-off-by: Mikołaj Błaż <[email protected]> * Use NeMo logger Signed-off-by: Mikołaj Błaż <[email protected]> * [DCIO] Fix save_to dist ckpt path Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add versioning to save_to Signed-off-by: Mikołaj Błaż <[email protected]> * Add versioning logic to all .nemo files Signed-off-by: Mikołaj Błaż <[email protected]> * Add versioning test Signed-off-by: Mikołaj Błaż <[email protected]> * Add dist-ckpt test Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mikołaj Błaż <[email protected]> * Rename existing ckpts instead of using different name Signed-off-by: Mikołaj Błaż <[email protected]> * Add comment Signed-off-by: Mikołaj Błaż <[email protected]> * Use dist ckpt flag in all methods Signed-off-by: Mikołaj Błaż <[email protected]> * Improve error msg Signed-off-by: Mikołaj Błaż <[email protected]> * Add dist ckpt unit tests Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix load_checkpoint Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mikołaj Błaż <[email protected]> * Fix auto-issues Signed-off-by: Mikołaj Błaż <[email protected]> * Fix ckpt_dir var Signed-off-by: Mikołaj Błaż <[email protected]> * Restore skipping behavior The fix from prevent-duplicated-checkpoints is required to skip the checkpoints Signed-off-by: Mikołaj Błaż <[email protected]> * Fix steps on single-GPU machine Signed-off-by: 
Mikołaj Błaż <[email protected]> * Run dist-ckpt test on GPU Signed-off-by: Mikołaj Błaż <[email protected]> * Add docs Signed-off-by: Mikołaj Błaż <[email protected]> * Apply black Signed-off-by: Mikołaj Błaż <[email protected]> * Prevent saving last for non-equal val intervals Signed-off-by: Mikołaj Błaż <[email protected]> * Move checkpoint on rank 0 Signed-off-by: Mikołaj Błaż <[email protected]> * Fix num steps in tests Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mikołaj Błaż <[email protected]> * Add async ckpt implementation Signed-off-by: Mikołaj Błaż <[email protected]> * Abstract AsyncFinalizableCheckpointIO away Signed-off-by: Mikołaj Błaż <[email protected]> * Change async_save flag location Signed-off-by: Mikołaj Błaż <[email protected]> * Add debug info Signed-off-by: Mikołaj Błaż <[email protected]> * Apply formatting Signed-off-by: Mikołaj Błaż <[email protected]> * Handle multiple async saves Signed-off-by: Mikołaj Błaż <[email protected]> * Apply formatting Signed-off-by: Mikołaj Błaż <[email protected]> * Move finalization calls to a callback Signed-off-by: Mikołaj Błaż <[email protected]> * Avoid deadlock in teardown Signed-off-by: Mikołaj Błaż <[email protected]> * Adjust to MCore implementation Signed-off-by: Mikołaj Błaż <[email protected]> * Add notes and copyrights Signed-off-by: Mikołaj Błaż <[email protected]> * Apply formatting Signed-off-by: Mikołaj Błaż <[email protected]> * Fix async_request attribute Signed-off-by: Mikołaj Błaż <[email protected]> * Add MCore import guards Signed-off-by: Mikołaj Błaż <[email protected]> * Add async test Signed-off-by: Mikołaj Błaż <[email protected]> * Fix finalize_fn arg Signed-off-by: Mikołaj Błaż <[email protected]> * Add docs Signed-off-by: Mikołaj Błaż <[email protected]> * Remove checkpoints from accurate steps Signed-off-by: Mikołaj Błaż <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix MCore class usage Signed-off-by: Mikołaj Błaż <[email protected]> * Update docs Signed-off-by: Mikołaj Błaż <[email protected]> * Fix logger usage Signed-off-by: Mikołaj Błaż <[email protected]> * Fix rebase Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix code scan issues Signed-off-by: Mikołaj Błaż <[email protected]> * Remove unsused import Signed-off-by: Mikołaj Błaż <[email protected]> * Use dist-ckpt for Bert Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix load checkpoint return val Signed-off-by: Mikołaj Błaż <[email protected]> * Use dist-ckpt based on sharded_state_dict Signed-off-by: Mikołaj Błaż <[email protected]> * Add async logging Signed-off-by: Mikołaj Błaż <[email protected]> * Remove deprecated argument Signed-off-by: Mikołaj Błaż <[email protected]> * Use correct checkpoint_io Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bad merge Signed-off-by: Mikołaj Błaż <[email protected]> * Improve debug msg Signed-off-by: Mikołaj Błaż <[email protected]> * Run async test on GPU Signed-off-by: Mikołaj Błaż <[email protected]> * Fix async ckpt unit test Signed-off-by: Mikołaj Błaż <[email protected]> * Apply isort and black reformatting Signed-off-by: mikolajblaz <[email protected]> * Clarify async logs Signed-off-by: Mikołaj Błaż <[email protected]> * Add schema print Signed-off-by: Mikołaj Błaż <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: mikolajblaz <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix incorrect if logic Signed-off-by: Mikołaj Błaż <[email protected]> * Apply isort and black reformatting Signed-off-by: mikolajblaz <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: mikolajblaz <[email protected]>

…IA#9178) * Update PTQ to use nvidia-modelopt Signed-off-by: Jan Lasek <[email protected]> * Restore PTQ tests Signed-off-by: Jan Lasek <[email protected]> * Update docs Signed-off-by: Jan Lasek <[email protected]> * Comment on apply_rope_fusion Signed-off-by: Jan Lasek <[email protected]> * Support for calibration PP > 1 Signed-off-by: Jan Lasek <[email protected]> * Apply isort and black reformatting Signed-off-by: janekl <[email protected]> * Fix cicd-main.yml indent Signed-off-by: Jan Lasek <[email protected]> * Set data/tensor parallel groups Signed-off-by: Jan Lasek <[email protected]> * Install only torch dependecies Signed-off-by: Jan Lasek <[email protected]> * Follow up on recent modelopt changes Signed-off-by: Jan Lasek <[email protected]> * Model support matrix Signed-off-by: Jan Lasek <[email protected]> * Apply isort and black reformatting Signed-off-by: janekl <[email protected]> * Rename PTQ script as it should be model-agnostic Signed-off-by: Jan Lasek <[email protected]> * Remove unused import Signed-off-by: Jan Lasek <[email protected]> * Update setup instructions Signed-off-by: Jan Lasek <[email protected]> --------- Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: janekl <[email protected]> Co-authored-by: janekl <[email protected]>

* GPU-based vectorized SpecAug Signed-off-by: Piotr Żelasko <[email protected]> * Wider dtypes for specaug mask bounds computation Signed-off-by: Piotr Żelasko <[email protected]> * fast spec augmentation v2 * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed randint code, added comments * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixed padding coverage bug, fixed long casting bug, fixed comments Signed-off-by: Alessandro Morari <[email protected]> * fixed bug due to using freq_axis with length Signed-off-by: Alessandro Morari <[email protected]> * Added tests for vectorized spectrogram augmentation Signed-off-by: Alessandro Morari <[email protected]> * Apply isort and black reformatting Signed-off-by: pzelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Alessandro Morari <[email protected]> Signed-off-by: pzelasko <[email protected]> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: pzelasko <[email protected]>

* Remove config aligner - no longer needed after TRT-LLM 0.9 update Signed-off-by: Jan Lasek <[email protected]> * Change default export precision to bf16 (more frequent) Signed-off-by: Jan Lasek <[email protected]> * Specify gpt_attention_plugin Signed-off-by: Jan Lasek <[email protected]> --------- Signed-off-by: Jan Lasek <[email protected]>

Removed best-practices.rst file Signed-off-by: jgerh <[email protected]> Co-authored-by: Eric Harper <[email protected]>

* build: Add `Dockerfile.ci` Signed-off-by: Oliver Koenig <[email protected]> * ci: Build, push, and test ci image Signed-off-by: Oliver Koenig <[email protected]> * chore: Disable cache dir for NeMo reinstall Signed-off-by: Oliver Koenig <[email protected]> * revert: Modify `reinstall.sh` Signed-off-by: Oliver Koenig <[email protected]> * fix: install modelopt[torch] instead of ammo Signed-off-by: Oliver Koenig <[email protected]> * deduplicate requirements Signed-off-by: Oliver Koenig <[email protected]> * make mcore/datasets Signed-off-by: Oliver Koenig <[email protected]> --------- Signed-off-by: Oliver Koenig <[email protected]>

* Add save option to the test script Signed-off-by: Onur Yilmaz <[email protected]> * Apply isort and black reformatting Signed-off-by: oyilmaz-nvidia <[email protected]> --------- Signed-off-by: Onur Yilmaz <[email protected]> Signed-off-by: oyilmaz-nvidia <[email protected]> Co-authored-by: oyilmaz-nvidia <[email protected]>

* rename paths2audiofiles to audio * update transcribe to audio --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao <[email protected]>

* make ckpt loading backward compatible * Apply isort and black reformatting Signed-off-by: suiyoubi <[email protected]> * if not using dist optimizer, the states are stored in 'optimizer' * Apply isort and black reformatting Signed-off-by: suiyoubi <[email protected]> * code refactor * Apply isort and black reformatting Signed-off-by: suiyoubi <[email protected]> * typo --------- Signed-off-by: suiyoubi <[email protected]> Co-authored-by: suiyoubi <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]>

* support qwen1.5(qwen2) Signed-off-by: Agoniii <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused import Signed-off-by: Cathy <[email protected]> * Apply isort and black reformatting Signed-off-by: pablo-garay <[email protected]> --------- Signed-off-by: Agoniii <[email protected]> Signed-off-by: Cathy <[email protected]> Signed-off-by: pablo-garay <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: pablo-garay <[email protected]>

* Implement PyT Dist load with MCore Signed-off-by: Mikołaj Błaż <[email protected]> * Use plain PyT Dist utils Signed-off-by: Mikołaj Błaż <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Implement TarPath compatible version Signed-off-by: Mikołaj Błaż <[email protected]> * Apply black Signed-off-by: Mikołaj Błaż <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* ci: Multi-tenancy for tests and garbage collection Signed-off-by: Oliver Koenig <[email protected]> * add remaining testcases Signed-off-by: Oliver Koenig <[email protected]> --------- Signed-off-by: Oliver Koenig <[email protected]>

* quick fix Signed-off-by: Travis Bartley <[email protected]> * Update nemo/collections/common/data/lhotse/nemo_adapters.py Signed-off-by: Piotr Żelasko <[email protected]> * adding warning flag for non-sharded data. Signed-off-by: Travis Bartley <[email protected]> * Apply isort and black reformatting Signed-off-by: tbartley94 <[email protected]> --------- Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: tbartley94 <[email protected]> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: tbartley94 <[email protected]>

Signed-off-by: andrusenkoau <[email protected]> Co-authored-by: Andrei Andrusenko <[email protected]>

* Support dataloader as input to `audio` for transcription Signed-off-by: smajumdar <[email protected]> * Apply isort and black reformatting Signed-off-by: titu1994 <[email protected]> * Support dataloader as input to `audio` for transcription Signed-off-by: smajumdar <[email protected]> * Update transcribe signatures Signed-off-by: smajumdar <[email protected]> * Apply isort and black reformatting Signed-off-by: titu1994 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: titu1994 <[email protected]>

* add various docs fixes Signed-off-by: Elena Rastorgueva <[email protected]> * make conf.py changes clearer Signed-off-by: Elena Rastorgueva <[email protected]> * fix Duplicate explicit target name error for links Signed-off-by: Elena Rastorgueva <[email protected]> * more fixes, mainly citations Signed-off-by: Elena Rastorgueva <[email protected]> * fix some code formatting Signed-off-by: Elena Rastorgueva <[email protected]> * update hf space iframe link Signed-off-by: Elena Rastorgueva <[email protected]> * fix new ERRORs Signed-off-by: Elena Rastorgueva <[email protected]> * Update docs Signed-off-by: yaoyu-33 <[email protected]> * Add MQA and GQA Signed-off-by: yaoyu-33 <[email protected]> * Fix small issues Signed-off-by: yaoyu-33 <[email protected]> * Add parallelisms Signed-off-by: yaoyu-33 <[email protected]> * Add seq packing in NeMo dev doc Signed-off-by: yaoyu-33 <[email protected]> * fix few issues Signed-off-by: yaoyu-33 <[email protected]> * fix table Signed-off-by: yaoyu-33 <[email protected]> * fix table Signed-off-by: yaoyu-33 <[email protected]> * fix table Signed-off-by: yaoyu-33 <[email protected]> * fix table Signed-off-by: yaoyu-33 <[email protected]> * add EP Signed-off-by: yaoyu-33 <[email protected]> * squeeze in neva updates Signed-off-by: yaoyu-33 <[email protected]> * rename Megatron-Core to Megatron Core Signed-off-by: yaoyu-33 <[email protected]> * address comments Signed-off-by: yaoyu-33 <[email protected]> * Fix typo Signed-off-by: yaoyu-33 <[email protected]> * Update index Signed-off-by: yaoyu-33 <[email protected]> * fix Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]>

* Update examples Signed-off-by: yaoyu-33 <[email protected]> * Update index Signed-off-by: yaoyu-33 <[email protected]> * Update index Signed-off-by: yaoyu-33 <[email protected]> * update Signed-off-by: yaoyu-33 <[email protected]> * update Signed-off-by: yaoyu-33 <[email protected]> * fix Signed-off-by: yaoyu-33 <[email protected]> * update Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]>

…#9223) Signed-off-by: Alexandros Koumparoulis <[email protected]>

* revert rope fusion defaults Signed-off-by: Chen Cui <[email protected]> * Apply isort and black reformatting Signed-off-by: cuichenx <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> Signed-off-by: cuichenx <[email protected]> Co-authored-by: cuichenx <[email protected]>

Signed-off-by: He Huang (Steve) <[email protected]>

Signed-off-by: paul-gibbons <[email protected]> text gen defaults Signed-off-by: paul-gibbons <[email protected]>

stevehuang52 and others added 30 commits May 10, 2024 19:24

ASR_dev_run_Speech_To_Text_HF_Finetuning optional as flaky (NVIDIA#9180)

820a285

update (NVIDIA#9181)

a0e9ee3

Signed-off-by: eharper <[email protected]>

Change FIM Dataset Random Seed Init (NVIDIA#9165)

7a23bfa

* change seed to dataset init * Apply isort and black reformatting Signed-off-by: suiyoubi <[email protected]> --------- Signed-off-by: suiyoubi <[email protected]> Co-authored-by: suiyoubi <[email protected]>

increase time limit for Speech_Checkpoints_tests (NVIDIA#9186)

43686ec

fix ep rank (NVIDIA#9161)

467d94b

Signed-off-by: Gao Deng <[email protected]>

Enable CUDA graphs by default only for transcription (NVIDIA#9196) (N…

acbd4e0

…VIDIA#9197) * Enable CUDA graphs only for transcription. Sync streams before capture. --------- Signed-off-by: Vladimir Bataev <[email protected]>

run_cicd_for_release_branches_also (NVIDIA#9213)

964ea3c

Update index.rst (NVIDIA#9080)

b489fba

Removed best-practices.rst file Signed-off-by: jgerh <[email protected]> Co-authored-by: Eric Harper <[email protected]>

rename paths2audiofiles to audio (NVIDIA#9209) (NVIDIA#9220)

73edac4

* rename paths2audiofiles to audio * update transcribe to audio --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao <[email protected]>

fix graphviz installation for local run (NVIDIA#9233) (NVIDIA#9234)

7f3e535

Signed-off-by: andrusenkoau <[email protected]> Co-authored-by: Andrei Andrusenko <[email protected]>

yaoyu-33 and others added 6 commits May 17, 2024 10:47

use get with fallback when reading checkpoint_callback_params (NVIDIA…

0744016

…#9223) Signed-off-by: Alexandros Koumparoulis <[email protected]>

Update Online_Offline_Microphone_VAD_Demo.ipynb (NVIDIA#9251)

1d576e4

Signed-off-by: He Huang (Steve) <[email protected]>

neva media_type fix

d11324e

Signed-off-by: paul-gibbons <[email protected]> text gen defaults Signed-off-by: paul-gibbons <[email protected]>

github-actions bot added the core, TTS, ASR, NLP, CI, common, and Multi Modal labels May 20, 2024

paul-gibbons closed this May 20, 2024

paul-gibbons deleted the neva-fix branch May 20, 2024 17:17

