Add fp8 support for SD/Update notebook paths #8489
Merged
Conversation
Signed-off-by: Mingyuan Ma <[email protected]>
jenkins |
jenkins |
ericharper approved these changes on Feb 25, 2024
LGTM. Thanks!
github-actions bot pushed a commit that referenced this pull request on Feb 25, 2024
* Add fp8 support for SD/Update notebook paths
  Signed-off-by: Mingyuan Ma <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
---------
Signed-off-by: Mingyuan Ma <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <[email protected]>
huvunvidia added a commit that referenced this pull request on Apr 16, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges 
conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * fix whitespace Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> 
Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy 
<[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
pablo-garay added a commit that referenced this pull request Apr 17, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges 
conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * adding RETRO tests to cicd-main.yml action tests * update ipa_cmudict-0.7b_nv23.01.txt * remove quotes for model.data for legacy RETRO action tests --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park 
<[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
xingyaoww pushed a commit to xingyaoww/NeMo that referenced this pull request Apr 23, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * fix whitespace Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> 
Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email 
protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
xingyaoww pushed a commit to xingyaoww/NeMo that referenced this pull request on Apr 23, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * adding RETRO tests to cicd-main.yml action tests * update ipa_cmudict-0.7b_nv23.01.txt * remove quotes for model.data for legacy RETRO action tests --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 
<[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
huvunvidia added a commit that referenced this pull request Apr 23, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * runnable for inference * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * cleaning inference code * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and 
black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * update Jenkins and _legacy.py * update new RETRO jenkinstest to run faster * fixing errors from GitHub Advanced Security / CodeQL * fixing errors from GitHub Advanced Security / CodeQL * update manually branch to huvu/mcore_retro * remove DEBUGGING markers * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy paste scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update codes to fix Github warnings; adding cicd-main.yml action tests * cleaning code, addressing Shanmugam's comments * saving before pulling from main * cleaning code * adding deprecations note * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> 
Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> 
Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: root <[email protected]>
ericharper added a commit that referenced this pull request Apr 26, 2024
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges 
conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * huvu/mcore_retro_docs first commit * update with main * update RETRO docs * fix scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update docs * update docs * udpate RETRO docs * update with Jennifer's comments --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email 
protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> 
Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
alxzhang-amazon pushed a commit to alxzhang-amazon/NeMo that referenced this pull request on Apr 26, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * fix whitespace Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> 
Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email 
protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
alxzhang-amazon pushed a commit to alxzhang-amazon/NeMo that referenced this pull request Apr 26, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * adding RETRO tests to cicd-main.yml action tests * update ipa_cmudict-0.7b_nv23.01.txt * remove quotes for model.data for legacy RETRO action tests --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 
<[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
alxzhang-amazon pushed a commit to alxzhang-amazon/NeMo that referenced this pull request on Apr 26, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * runnable for inference * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * cleaning inference code * fix to by pass CI test bf16 problem (following this PR 
https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * update Jenkins and _legacy.py * update new RETRO jenkinstest to run faster * fixing errors from GitHub Advanced Security / CodeQL * fixing errors from GitHub Advanced Security / CodeQL * update manually branch to huvu/mcore_retro * remove DEBUGGING markers * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy paste scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update codes to fix Github warnings; adding cicd-main.yml action tests * cleaning code, addressing Shanmugam's comments * saving before pulling from main * cleaning code * adding deprecations note * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor 
<[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> 
Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: root <[email protected]>
alxzhang-amazon pushed a commit to alxzhang-amazon/NeMo that referenced this pull request Apr 26, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * huvu/mcore_retro_docs first commit * update with main * update RETRO docs * fix scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update docs * update docs * udpate RETRO docs * update with Jennifer's comments --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> 
Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: 
Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
galv pushed a commit to galv/NeMo that referenced this pull request on Apr 29, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * runnable for inference * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * cleaning inference code * fix to by pass CI test bf16 problem (following this PR 
https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * update Jenkins and _legacy.py * update new RETRO jenkinstest to run faster * fixing errors from GitHub Advanced Security / CodeQL * fixing errors from GitHub Advanced Security / CodeQL * update manually branch to huvu/mcore_retro * remove DEBUGGING markers * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy paste scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update codes to fix Github warnings; adding cicd-main.yml action tests * cleaning code, addressing Shanmugam's comments * saving before pulling from main * cleaning code * adding deprecations note * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor 
<[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> 
Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: root <[email protected]>
Victor49152 added a commit that referenced this pull request on May 1, 2024
suiyoubi
pushed a commit
that referenced
this pull request
May 2, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges 
conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * fix whitespace Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> 
Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy 
<[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Signed-off-by: Ao Tang <[email protected]>
suiyoubi pushed a commit that referenced this pull request May 2, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges 
conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * adding RETRO tests to cicd-main.yml action tests * update ipa_cmudict-0.7b_nv23.01.txt * remove quotes for model.data for legacy RETRO action tests --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park 
<[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Signed-off-by: Ao Tang <[email protected]>
suiyoubi
pushed a commit
that referenced
this pull request
May 2, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * runnable for inference * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * cleaning inference code * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and 
black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * update Jenkins and _legacy.py * update new RETRO jenkinstest to run faster * fixing errors from GitHub Advanced Security / CodeQL * fixing errors from GitHub Advanced Security / CodeQL * update manually branch to huvu/mcore_retro * remove DEBUGGING markers * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy paste scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update codes to fix Github warnings; adding cicd-main.yml action tests * cleaning code, addressing Shanmugam's comments * saving before pulling from main * cleaning code * adding deprecations note * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> 
Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> 
Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: root <[email protected]> Signed-off-by: Ao Tang <[email protected]>
suiyoubi pushed a commit that referenced this pull request on May 2, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring text_generation_utils code 
and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard 
Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: 
dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges 
conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * huvu/mcore_retro_docs first commit * update with main * update RETRO docs * fix scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update docs * update docs * udpate RETRO docs * update with Jennifer's comments --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email 
protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> 
Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Signed-off-by: Ao Tang <[email protected]>
suiyoubi pushed a commit that referenced this pull request on May 2, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena Rastorgueva <[email 
protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- Signed-off-by: Chen Cui <[email protected]> * 
Attention encoder-decoder models for multiple speech-to-text tasks (#8242) (#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema of remaining ASR models (#8260) * Initial draft of multi task beam 
decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script for speech_multitask (#8270) * bug fix and adding launch script for 
speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit d10726d) Co-authored-by: Piotr Żelasko <[email protected]> * Multimodal r1.23.0 bug fix (#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: 
yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. (#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> 
Co-authored-by: Piotr Żelasko <[email protected]> * Change default (#8371) Signed-off-by: smajumdar <[email protected]> * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (#8283) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * Add Finetuning tutorial with HF Datasets (#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (#8378) * [tutorial] fixed missing RIR scripts file. (#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar 
<[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (#8298) * [tutorial] fixed missing RIR scripts file. 
(#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana 
<[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. 
(#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (#8478) (#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (#8520) * Revert FP8 integration Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] 
auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (#8532) Signed-off-by: Mingyuan Ma <[email protected]> * Add back fp8 support * SD-FP8: fix the bug of normalization location Signed-off-by: Mingyuan Ma <[email protected]> * map potential FP8 ckpt to FP16 Signed-off-by: Mingyuan Ma <[email protected]> * Add TE fp8 training Signed-off-by: Mingyuan Ma <[email protected]> * Only overwrite unet precision when self.megatron_amp_O2 is true Signed-off-by: Mingyuan Ma <[email protected]> * New structure is now compatible with old ckpts Signed-off-by: Mingyuan Ma <[email protected]> * Add support on mapping old unet checkpoint to new structure and FP8 structure Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Sync with main branch Signed-off-by: Mingyuan Ma <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin 
Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Mengdi Wang <[email protected]> Signed-off-by: Ao Tang <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit 86efc4e) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * fix whitespace Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> 
Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email 
protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit 86efc4e) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * update Jenkinstest for new RETRO to run faster * fix isort * adding RETRO tests to cicd-main.yml action tests * update ipa_cmudict-0.7b_nv23.01.txt * remove quotes for model.data for legacy RETRO action tests --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> 
Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 
<[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit 86efc4e) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * runnable for inference * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * cleaning inference code * fix to by pass CI test bf16 problem (following this PR 
https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * update Jenkins and _legacy.py * update new RETRO jenkinstest to run faster * fixing errors from GitHub Advanced Security / CodeQL * fixing errors from GitHub Advanced Security / CodeQL * update manually branch to huvu/mcore_retro * remove DEBUGGING markers * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy paste scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update codes to fix Github warnings; adding cicd-main.yml action tests * cleaning code, addressing Shanmugam's comments * saving before pulling from main * cleaning code * adding deprecations note * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor 
<[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> 
Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: root <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit 86efc4e) Co-authored-by: Piotr Żelasko <[email protected]> * add code for calling mcore_retro in NeMo * add code for calling mcore_retro in NeMo * runnable, training curve match retro mcore and nemo * working on retro inference * working on megatron_retro_eval.py and megatron_retro_inference.yaml * refactoring 
text_generation_utils code and retro inference relevant files * clean PR * resolving quick hacks (reading number of train/valid samples from workdir, discrepancy in total samples and samples with neighbors retrieved, tokenizers) * clean repository * revert changes to inference/eval code to original in main * clean code * runable training code, with already implemented eval code * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * revert to original eval code files * revert to original eval code files 2 * revert to original eval code files 3 * revert to original eval code files 4 * clean code * clean code * update my code to support changes from lastest main * commit before rebase r1.23.0 * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 
<[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * copy paste files from r1.23.0 * clean PR * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. 
(NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * revert changes for tts and asr * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * implement retro's own fwd_bwd_step() and validation_step() to not have argument first_val_step, which the MLM commit doesn't support * adding megatron compile_helpers(), in future can be fixed with correct MLM commit * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: 
Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email 
protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * addressing Eric's reviews * adding existing implementation RETRO files * adding existing implementation RETRO files * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email 
protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. (NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * before update branch with latest r1.23.0 * update to run with MLM ae2817b3dde4efb1515061a5311d01d8f85bd99c (runnable training and saving checkpoint) * remove compile_helpers * reverse changes from main branch to r1.23.0 * adding *_legacy files * update MLM commit in Jenkinsfile to latest * debugging Jenkinstest: test different mcore import in retro_dataset * update Jenkinsfile edit megatron_retro_mutransfer_pretrain_legacy.py * removing all mcore RETRO to pass the Jenkinstest * fixing import legacy problem for tests/collections/nlp/test_indexed_retrieval_dataset.py * update Jenkinsfile file to use TE v0.7 * update NeMo to work with latest mcore RETRO (solving TE problems) * update TE commit Jenkinsfile to be the same with r1.23.0's Jenkinsfile * update commit for MLM * jenkinstest debugging * temporary fix RETRO's __init__ for jenkinstest * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * edit splits_string in jenkinsfile to correct format; put RETRO test in front to test faster * add model.data.dataloader_type=cyclic to jenkinsfile * update code to work with latest megatron-lm main 81dab6067 * update M-LM commit in Jenkinsfile to latest main M-LM 81dab6067 * fix to by pass CI test bf16 problem (following this PR https://github.com/NVIDIA/NeMo/pull/8481/files) * isort and black * adjusting model.micro_batch_size to 1 * fix BRANCH = 'r1.23.0' * replace tutorials dir from 
main branch to huvu/mcore_retro * fix minor merges conflict * update Jenkinsfile * runnable with a temporary fix from Jacek (unfound -unfinished problem) * runnable with a temporary fix from Jacek (unfound -unfinished problem) * modified nlp_overrides.py back to original * fix checkpoint from Jacek Bieniusiewicz * config Jenkinsfile test * set RETRO Jenkins MBS to 1 * black fix * isort fix * update TE commit * update to latest Jenkinsfile with latest container and commits * remove new RETRO jenkinstest * merge latest main * put RETRO Jenkinstest to the right place * update code for megatron_retro_pretraining_legacy.py * untrack ipa_cmudict-0.7b_nv23.01.txt * untrack ipa_cmudict-0.7b_nv23.01.txt * set config in megatron_retro_pretraining_legacy.py to megatron_retro_config_legacy * update new RETRO jenkinstest to run faster * merging latest main, and edit Jenkinstest * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * huvu/mcore_retro_docs first commit * update with main * update RETRO docs * fix scripts/tts_dataset_files/ipa_cmudict-0.7b_nv23.01.txt * update docs * update docs * udpate RETRO docs * update with Jennifer's comments --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> 
Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Huy Vu2 <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: 
Huiying <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Ming <[email protected]> Co-authored-by: Huy Vu2 <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request on Jun 25, 2024
* update branch Signed-off-by: eharper <[email protected]> * Add dist ckpt support for regular optimizers (NVIDIA#7749) * Add dist ckpt support for regular optimizers Signed-off-by: Mikołaj Błaż <[email protected]> * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * fix imports Signed-off-by: dimapihtar <[email protected]> * imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ci imports fix Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert asr notebook Signed-off-by: dimapihtar <[email protected]> * revert asr notebook Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Pin lhotse=1.19.2 in r1.23.0 (NVIDIA#8303) Signed-off-by: Piotr Żelasko <[email protected]> * Cache Aware Streaming tutorial notebook (NVIDIA#8296) * add notebook Signed-off-by: Elena Rastorgueva <[email protected]> * rename old notebook to Buffered_Streaming Signed-off-by: Elena Rastorgueva <[email protected]> * call setup_streaming_params in set_default_att_context_size method Signed-off-by: Elena Rastorgueva <[email protected]> * update links in docs Signed-off-by: Elena Rastorgueva <[email protected]> * update links to tutorials in docs Signed-off-by: Elena Rastorgueva <[email protected]> * remove hard-coding Signed-off-by: Elena Rastorgueva <[email protected]> * rename var Signed-off-by: Elena 
Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * fix path location and branch (NVIDIA#8304) * fix path location and branch Signed-off-by: Nithin Rao Koluguri <nithinraok> * change to a floating point number Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Somshubra Majumdar <[email protected]> * add deallocate pipeline output optimization (NVIDIA#8279) * add deallocate pipeline output optimization Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix memory leak caused by context parallelism hanging references by omegaconf (NVIDIA#8299) * save cp_size to self Signed-off-by: Jimmy Zhang <[email protected]> * use parallel_state instead of self Signed-off-by: Jimmy Zhang <[email protected]> --------- Signed-off-by: Jimmy Zhang <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * remove assertion (NVIDIA#8302) Signed-off-by: dimapihtar <[email protected]> * Update PEFT Doc (NVIDIA#8262) * update peft doc Signed-off-by: Chen Cui <[email protected]> * remove old prompt learning doc and notebook Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * fix table Signed-off-by: Chen Cui <[email protected]> * Merge branch 'r1.23.0' into chcui/update_peft_doc Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> * revert accidental changes Signed-off-by: Chen Cui <[email protected]> --------- 
Signed-off-by: Chen Cui <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks (NVIDIA#8242) (NVIDIA#8324) * Rebasing canary changes at current main Signed-off-by: Piotr Żelasko <[email protected]> * Move the changes from asr transformer to nlp transformer as originally intended Signed-off-by: Piotr Żelasko <[email protected]> * update eval to strip spaces before punctuations Signed-off-by: stevehuang52 <[email protected]> * update pc strip Signed-off-by: stevehuang52 <[email protected]> * [canary] Refactor: `PromptedAudioToTextLhotseDataset` and `EncDecMultiTaskModel` (NVIDIA#8247) * Create a separate CanaryDataset and use it inside `transformer_bpe_models.py`. Ditches `token_sequence_format`. Signed-off-by: Piotr Żelasko <[email protected]> * [canary] Refactor: move changes in transformer_bpe_models.py to Canar… (NVIDIA#8252) * [canary] Refactor: move changes in transformer_bpe_models.py to CanaryModel Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryModel` to `EncDecMultiTaskModel` and remove inheritance from `EncDecTransfModelBPE`; add a separate config for this model Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Rename `CanaryDataset` to `PromptedAudioToTextLhotseDataset`; add `prompt_format_fn` argument; clean-up the `_canary_prompt_format` function a bit Signed-off-by: Piotr Żelasko <[email protected]> * Move tokenization into `prompt_format_fn`, fix usage, add docs Signed-off-by: Piotr Żelasko <[email protected]> * Backward-compatible utterance validation Signed-off-by: Piotr Żelasko <[email protected]> * Improve type annotations Signed-off-by: Piotr Żelasko <[email protected]> * config and prompt_fn registration changes from review Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * fix transcribe config Signed-off-by: stevehuang52 <[email protected]> * Refactor Canary to follow schema 
of remaining ASR models (NVIDIA#8260) * Initial draft of multi task beam decoding strategy Signed-off-by: smajumdar <[email protected]> * Stabilize inference Signed-off-by: smajumdar <[email protected]> * Update AED Multi Task model to mostly conform to Archetype-Type format. Update config Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add change decoding strategy Signed-off-by: smajumdar <[email protected]> * Remove redundant imports Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Cleanup Signed-off-by: smajumdar <[email protected]> * Cleanup Signed-off-by: smajumdar <[email protected]> * remove asr transformer dependency on nlp Signed-off-by: stevehuang52 <[email protected]> * clean up Signed-off-by: stevehuang52 <[email protected]> * copy token_classifier from nlp to asr Signed-off-by: stevehuang52 <[email protected]> * Address comments Signed-off-by: smajumdar <[email protected]> * Add typing to beam decoding Signed-off-by: smajumdar <[email protected]> * Make prompt format configurable Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop asr dependency on nlp Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: stevehuang52 <[email protected]> * fix transcribe, update asr evaluator Signed-off-by: stevehuang52 <[email protected]> * Extend the docs for the canary prompt_fn Signed-off-by: Piotr Żelasko <[email protected]> * Incorporate changes from Nithin's code review Signed-off-by: Piotr Żelasko <[email protected]> * training bug fix and adding launch script 
for speech_multitask (NVIDIA#8270) * bug fix and adding launch script for speech_multitask Signed-off-by: Krishna Puvvada <[email protected]> * update launch script example in speech_to_text_aed.py Signed-off-by: Krishna Puvvada <[email protected]> --------- Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Fix: drop_last must be true in validation/test otherwise the training will hang Signed-off-by: Piotr Żelasko <[email protected]> * revert to current transcribe API Signed-off-by: stevehuang52 <[email protected]> * revert changes to NLP, update docs Signed-off-by: stevehuang52 <[email protected]> * update eval utils Signed-off-by: stevehuang52 <[email protected]> * update docs Signed-off-by: stevehuang52 <[email protected]> * Remove DALI; rename compute_audio_loss to compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * set default use_model_transcribe=False Signed-off-by: stevehuang52 <[email protected]> * change os.path.dirname to pathlib Signed-off-by: stevehuang52 <[email protected]> * [canary] Test for CanaryTokenizer + refactoring (NVIDIA#8285) * Test for CanaryTokenizer Signed-off-by: Piotr Żelasko <[email protected]> * Attempt at refactor... 
Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> * Update config for AED models (NVIDIA#8294) Signed-off-by: smajumdar <[email protected]> * set default calculate_wer=False in transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 1 Co-authored-by: Nithin Rao <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> * Apply suggestions from code review, part 2 Signed-off-by: Piotr Żelasko <[email protected]> * Document compute_loss Signed-off-by: Piotr Żelasko <[email protected]> * update transcribe_speech.py Signed-off-by: stevehuang52 <[email protected]> * add docstring Signed-off-by: stevehuang52 <[email protected]> * Attention encoder-decoder models for multiple speech-to-text tasks Signed-off-by: Piotr Żelasko <[email protected]> --------- Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Co-authored-by: stevehuang52 <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Nithin Rao <[email protected]> (cherry picked from commit 86efc4e) Co-authored-by: Piotr Żelasko <[email protected]> * Multimodal r1.23.0 bug fix (NVIDIA#8315) * Rename quick-gelu Signed-off-by: yaoyu-33 <[email protected]> * ddpm config guard Signed-off-by: yaoyu-33 <[email protected]> * Fix ddpm edit api Signed-off-by: yaoyu-33 <[email protected]> * Fix insert_image_token cfg issue 
Signed-off-by: yaoyu-33 <[email protected]> * neva updates Signed-off-by: yaoyu-33 <[email protected]> * reformat Signed-off-by: yaoyu-33 <[email protected]> * Add back jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix jenkins Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix bugs Signed-off-by: yaoyu-33 <[email protected]> * Update default neva template Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixes for MoE parameter passing & use of AutoTokenizer/Model for mistral. (NVIDIA#8272) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Keep max_seqlen and cu_seqlens_argmin for later micro-batches when PP>1 (NVIDIA#8334) Signed-off-by: Sangkug Lym <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Remove asr webapp (NVIDIA#8347) Signed-off-by: smajumdar <[email protected]> * remove _target_ at model level in aed config (NVIDIA#8351) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> * Add change_vocabulary and save_tokenizers() support to Multitask ASR models (NVIDIA#8357) * Add change_vocabulary and save_tokenizers() support Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update nemo/collections/asr/models/aed_multitask_models.py Co-authored-by: Piotr Żelasko <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> --------- Signed-off-by: smajumdar <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> * Change default (NVIDIA#8371) Signed-off-by: smajumdar <[email protected]> * bug fix in fast-conformer-aed.yaml and adding jenkins test for speech_to_text_aed model (NVIDIA#8368) Signed-off-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Enable megatron core loggers for GPT pretraining (NVIDIA#8354) * Logging changes tested for gpt_pretraining Signed-off-by: Aishwarya Bhandare <[email protected]> * Additional args Signed-off-by: Aishwarya Bhandare <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * mcore ds fix (NVIDIA#8283) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update apex & TE commits Signed-off-by: dimapihtar <[email protected]> * revert apex installation Signed-off-by: dimapihtar <[email protected]> * turn off the fusion for jenkins Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: 
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * Add Finetuning tutorial with HF Datasets (NVIDIA#8356) * Add Finetuning tutorial with HF Datasets Signed-off-by: Nithin Rao Koluguri <nithinraok> * update on Som comments Signed-off-by: Nithin Rao Koluguri <nithinraok> --------- Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * release updates (NVIDIA#8378) * [tutorial] fixed missing RIR scripts file. (NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * mcore ds fix Signed-off-by: Dmytro Pykhtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update mcore Signed-off-by: dimapihtar <[email protected]> * revert asr files Signed-off-by: dimapihtar <[email protected]> * add comments Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for mcore mock dataset Signed-off-by: dimapihtar <[email protected]> * update mcore version Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update gpt cfg Signed-off-by: dimapihtar <[email protected]> * update mcore commit Signed-off-by: dimapihtar <[email protected]> * fix Bert unit tests Signed-off-by: dimapihtar <[email protected]> * update bert tests Signed-off-by: dimapihtar <[email protected]> * fix bert mcore test Signed-off-by: dimapihtar <[email protected]> * fix gpt jenkins tests Signed-off-by: dimapihtar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add support for dict data input type Signed-off-by: dimapihtar <[email protected]> * 
add mock ds test Signed-off-by: dimapihtar <[email protected]> * add test for dict data input type Signed-off-by: dimapihtar <[email protected]> * mcore ds fix Signed-off-by: dimapihtar <[email protected]> * data input fix Signed-off-by: dimapihtar <[email protected]> --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Pablo Garay <[email protected]> * MCore dataset compatibility for tokenizers (NVIDIA#8390) * Add unique_identifiers for all tokenizers and eod for SentencePieceTokenizer Signed-off-by: Valerie Sarge <[email protected]> * Add generalized token aliases to TokenizerSpec to conform with MegatronTokenizer's interface. Remove now-redundant individual fixes from AutoTokenizer and SentencePieceTokenizer. Signed-off-by: Valerie Sarge <[email protected]> --------- Signed-off-by: Valerie Sarge <[email protected]> Co-authored-by: Pablo Garay <[email protected]> * Mcore customization doc (NVIDIA#8298) * [tutorial] fixed missing RIR scripts file. 
(NVIDIA#8257) Signed-off-by: Xuesong Yang <[email protected]> * add values to en tts dict (NVIDIA#7879) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * Add Bert HF checkpoint converter (NVIDIA#8088) * Add Bert HF checkpoint converter Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Reformat Signed-off-by: yaoyu-33 <[email protected]> * Add BERT ONNX export * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add NeMo BERT to HF BERT script * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Clean code Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update argument names Signed-off-by: yaoyu-33 <[email protected]> * Update build_transformer_config in Bert Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> * initial placeholder Signed-off-by: Huiying Li <[email protected]> * add to intro/index.rst Signed-off-by: Huiying Li <[email protected]> * initial content update Signed-off-by: Huiying Li <[email protected]> * add diff images Signed-off-by: Huiying Li <[email protected]> size Signed-off-by: Huiying Li <[email protected]> * minor fixes * minor language change Signed-off-by: Chen Cui <[email protected]> * clean changes --------- Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> 
Co-authored-by: Mariana <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: Chen Cui <[email protected]> * wer fix (NVIDIA#8404) Signed-off-by: Travis Bartley <[email protected]> * updated link to pubmed (NVIDIA#8402) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * Update NFA video download link (NVIDIA#8406) * update nfa nasa video link Signed-off-by: Elena Rastorgueva <[email protected]> * update link in markdown Signed-off-by: Elena Rastorgueva <[email protected]> --------- Signed-off-by: Elena Rastorgueva <[email protected]> * revert changes (NVIDIA#8410) Signed-off-by: Chen Cui <[email protected]> * Fix dreambooth data sampler issue (NVIDIA#8400) * Turn on drop last Signed-off-by: yaoyu-33 <[email protected]> * Some neva fixes Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fixed errors in the CTM gen functions (NVIDIA#8416) Signed-off-by: Taejin Park <[email protected]> * add ensemble decoding fix (NVIDIA#8427) Signed-off-by: Nithin Rao Koluguri <nithinraok> Co-authored-by: Nithin Rao Koluguri <nithinraok> * SDE bugfix log (NVIDIA#8430) Signed-off-by: George <[email protected]> * mcore customization doc minor fix (NVIDIA#8421) Signed-off-by: Huiying Li <[email protected]> * NeMo-Mistral to HF converter bugfix. 
(NVIDIA#8353) Signed-off-by: Alexandros Koumparoulis <[email protected]> * Fixing mcore bert for TP, PP and SP (NVIDIA#8336) * Fixing mcore bert for TP, PP and SP * Fixing mcore bert for TP, PP and SP * Fixing mcore version * Fixing mcore version * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update Jenkinsfile Signed-off-by: Shanmugam Ramasamy <[email protected]> --------- Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Add settings to suppress bf16 compile errors in CI on V100 (NVIDIA#8481) * Add settings to suppress bf16 compile errors in CI on V100 Signed-off-by: Abhishree <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Abhishree <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * MoE parameter passing (NVIDIA#8255) * MoE parameter passing Signed-off-by: Alexandros Koumparoulis <[email protected]> * Pass EP/MoE params in consumer scripts. 
Signed-off-by: Alexandros Koumparoulis <[email protected]> * PR fixes Signed-off-by: Alexandros Koumparoulis <[email protected]> * Use latest commit of mcore-0.5 Signed-off-by: Alexandros Koumparoulis <[email protected]> * CI fix Signed-off-by: Alexandros Koumparoulis <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update k2 version (NVIDIA#8478) (NVIDIA#8492) Signed-off-by: Vladimir Bataev <[email protected]> * Add fp8 support for SD/Update notebook paths (NVIDIA#8489) * Add fp8 support for SD/Update notebook paths Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> * pin to 0.5.0 (NVIDIA#8465) Signed-off-by: eharper <[email protected]> * Update NeMo Multimodal Requirements (NVIDIA#8515) * Update requirements_multimodal.txt Signed-off-by: yaoyu-33 <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: yaoyu-33 <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]> * Add dep notice for notebooks (NVIDIA#8522) * add dep notice Signed-off-by: eharper <[email protected]> * revert Signed-off-by: eharper <[email protected]> --------- Signed-off-by: eharper <[email protected]> * Revert FP8 integration (NVIDIA#8520) * Revert FP8 integration Signed-off-by: 
Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update data prep notebook (NVIDIA#8532) Signed-off-by: Mingyuan Ma <[email protected]> * Add back fp8 support * SD-FP8: fix the bug of normalization location Signed-off-by: Mingyuan Ma <[email protected]> * map potential FP8 ckpt to FP16 Signed-off-by: Mingyuan Ma <[email protected]> * Add TE fp8 training Signed-off-by: Mingyuan Ma <[email protected]> * Only overwrite unet precision when self.megatron_amp_O2 is true Signed-off-by: Mingyuan Ma <[email protected]> * New structure is now compatible with old ckpts Signed-off-by: Mingyuan Ma <[email protected]> * Add support on mapping old unet checkpoint to new structure and FP8 structure Signed-off-by: Mingyuan Ma <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Sync with main branch Signed-off-by: Mingyuan Ma <[email protected]> --------- Signed-off-by: eharper <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: dimapihtar <[email protected]> Signed-off-by: Piotr Żelasko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Nithin Rao Koluguri <nithinraok> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: Sangkug Lym <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Krishna Puvvada <[email protected]> Signed-off-by: Somshubra Majumdar <[email protected]> Signed-off-by: Aishwarya Bhandare <[email protected]> Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> 
Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Dmytro Pykhtar <[email protected]> Signed-off-by: Valerie Sarge <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Huiying Li <[email protected]> Signed-off-by: Travis Bartley <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: George <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Mingyuan Ma <[email protected]> Co-authored-by: eharper <[email protected]> Co-authored-by: mikolajblaz <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: dimapihtar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Piotr Żelasko <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Sangkug Lym <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: Krishna Puvvada <[email protected]> Co-authored-by: ashbhandare <[email protected]> Co-authored-by: Aishwarya Bhandare <[email protected]> Co-authored-by: Mariana <[email protected]> Co-authored-by: Dmytro Pykhtar <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Valerie Sarge <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: Huiying Li <[email protected]> Co-authored-by: 
tbartley94 <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: George <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: Mengdi Wang <[email protected]>
What does this PR do?
Add FP8 support for SD training.
Collection: [multimodal/text_to_image/stable_diffusion]
Changelog
Usage
Jenkins CI
To run Jenkins, a NeMo user with write access must comment "jenkins" on the PR.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information