Release 2.0.0rc1 #9786

Closed
wants to merge 173 commits into from
173 commits
9b91df5
Nemotron export - fixing megatron_export.py (#9625)
borisfom Jul 8, 2024
62459cc
support lora when kv_channel != hidden_size / num_heads (#9636)
suiyoubi Jul 8, 2024
55ee9f4
[Nemo CICD] Docker temp files auto-cleanup (#9642)
pablo-garay Jul 9, 2024
b97da9c
Update Dockerfile.ci (#9651)
huvunvidia Jul 9, 2024
1c73e1b
SDXL improvements (and support for Draft+) [DRAFT PR] (#9543)
rohitrango Jul 9, 2024
8898b76
Triton deployment improvements for in-framework models (#9600)
jukim-nv Jul 9, 2024
2ee8646
Use FP8 in GPT TP2 test (#9451)
jbaczek Jul 9, 2024
f5d5221
enables default data step in megatron parallel to operate on a wider …
jomitchellnv Jul 10, 2024
355d3c5
Revert "enables default data step in megatron parallel to operate on …
marcromeyn Jul 10, 2024
74e32c8
Contrastive Reranker/Reward model (#9171)
arendu Jul 10, 2024
b4821e1
unpin transformers version (#9606)
dimapihtar Jul 10, 2024
14d42dc
Added CPU offloading docs (#9479)
sanandaraj5597 Jul 10, 2024
3ab0a2a
Update llama-3 PEFT notebook to download model from NGC (#9667)
shashank3959 Jul 10, 2024
4e5174b
fix pipeline parallel dtype bug (#9637) (#9661)
github-actions[bot] Jul 10, 2024
900ca0b
LITA integration (#9578)
Slyne Jul 11, 2024
08cc515
Parametrize FPS group (#9648) (#9669)
github-actions[bot] Jul 11, 2024
d4f1d3c
Huvu/mcore t5 (#9677)
huvunvidia Jul 11, 2024
3cf5a1d
chore: Version bump NeMo (#9631)
ko3n1g Jul 11, 2024
693c55f
add a bit more for timeout (#9702)
pablo-garay Jul 11, 2024
3482dc1
Alit/mamba (#9696)
JRD971000 Jul 11, 2024
6f91dcc
NeMo performance feature documentation (#9482)
erhoo82 Jul 11, 2024
472ff9f
[TTS] Add fullband mel codec checkpoints (#9704)
rlangman Jul 11, 2024
3e2bb21
Adding support for mcore T5 Eval - SFT - PEFT (#9679)
huvunvidia Jul 12, 2024
081a163
Allows non-strict load with distributed checkpoints (#9613) (#9715)
github-actions[bot] Jul 12, 2024
599b60f
refactor: Uniform BRANCH for notebooks (#9710)
ko3n1g Jul 12, 2024
56f024d
fix legacy ds padding bug (#9716)
dimapihtar Jul 15, 2024
f477ed1
enables default data step in megatron parallel to operate on a wider …
jomitchellnv Jul 15, 2024
8e9ef94
[NeMo-UX] Fix when optimizers are setup for PEFT (#9619) (#9647)
github-actions[bot] Jul 15, 2024
02ff85b
refactor: README (#9712)
ko3n1g Jul 15, 2024
27b5c47
Remove mask if use fusion mask (#9723)
hsiehjackson Jul 15, 2024
b166a8f
[NeMo-UX] Fix imports so local configuration of runs works again (#96…
github-actions[bot] Jul 15, 2024
a9dbe37
add contianer (#9731)
JRD971000 Jul 15, 2024
c21e011
update pretrained model text (#9724) (#9745)
github-actions[bot] Jul 15, 2024
8128122
[Nemo-UX] Including all trainable-params in a PEFT-checkpoint (#9650)…
github-actions[bot] Jul 15, 2024
f2e3232
[NeMo-UX] Make TE and Apex dependencies optional (#9732)
ashors1 Jul 15, 2024
34bfe1b
[NeMo-UX] Minor bug fix when TE/Apex not installed (#9749)
ashors1 Jul 16, 2024
2d87359
make 'load_directly_on_device' configurable (#9657) (#9674)
github-actions[bot] Jul 16, 2024
b39a487
TorchAudio installation workaround for incorrect `PYTORCH_VERSION` en…
github-actions[bot] Jul 16, 2024
c279c13
Create __init__.py (#9755)
stevehuang52 Jul 16, 2024
ca94e03
Canary Adapters tutorial (#9670)
titu1994 Jul 16, 2024
4696e64
match nemo 1's default behavior for drop_last and pad_samples_to_glob…
github-actions[bot] Jul 17, 2024
af4f0ed
ci: Bump MCore tag (#9744)
ko3n1g Jul 17, 2024
f65fea2
Fix the serialization of partial functions in nemo 2.0 (#9668)
sararb Jul 17, 2024
ae62a4d
ci: Add PAT to create-pullrequest action (#9769)
ko3n1g Jul 17, 2024
649ad1f
Speeds up copying of necessary artifact files with SaveRestoreConnect…
terrykong Jul 17, 2024
8c04749
ci: Remove ko3n1g from reviewers (#9773)
ko3n1g Jul 17, 2024
f4fa399
bump mcore commit in Dockerfile (#9766)
ashors1 Jul 17, 2024
9c138a4
Yuya/add checkpoints section (#9329)
yaoyu-33 Jul 17, 2024
f01cbe2
Release automation (#9687)
ko3n1g Jul 18, 2024
8949dc8
Rename speech dockerfile appropriately (#9778)
pablo-garay Jul 18, 2024
234ac8b
Add option to convert PyTriton response to OpenAI format (#9726)
athitten Jul 18, 2024
8035dd0
ci: Fix changelog-config (#9788)
ko3n1g Jul 18, 2024
e244cfd
Support configurable extra fields for LazyNeMoTarredIterator (#9548)
pzelasko Jul 19, 2024
5546190
upper bound huggingface-hub version to 0.24.0 (exc.) (#9799)
akoumpa Jul 19, 2024
ab8988e
Add Lita, Vila and Vita TRTLLM export (#9734)
xuanzic Jul 19, 2024
2f9bcae
Fix null/None truncation field with extra generating tokens (#9379)
hsiehjackson Jul 19, 2024
d6e0c15
vLLM 0.5.1 update (#9779) (#9798)
github-actions[bot] Jul 19, 2024
6f9f731
minor fix tutorial (#9813)
JRD971000 Jul 19, 2024
425d5dd
Adds Tiktoken tokenizer for Nemotron-Mistral 12B (#9797)
ertkonuk Jul 22, 2024
903e2aa
typos and branch name update to r2.0.0rc1 (#9847)
github-actions[bot] Jul 23, 2024
7e16edb
[Audio] Remove torchaudio from spectrogram transforms (#9802)
anteju Jul 23, 2024
00fe96f
Add "offline" data cache generation support (#9576)
dimapihtar Jul 23, 2024
f83f974
Fix few issues and docs for neva and clip in r2.0.0rc1 (#9681) (#9808)
github-actions[bot] Jul 23, 2024
d28c1b2
fix lita bugs (#9810) (#9828)
github-actions[bot] Jul 23, 2024
9c06389
Add SpeechLLM docs (#9780)
stevehuang52 Jul 23, 2024
b7ca4a3
Add Llama 3.1 LoRA PEFT and NIM Deployment tutorial (#9844)
shashank3959 Jul 23, 2024
1711334
[Audio] Metric with Squim objective and MOS (#9751)
anteju Jul 23, 2024
3885287
add dummy vision and text transformer config (assumed mcore to be fal…
github-actions[bot] Jul 24, 2024
53d7a91
Jpg2p jun18 (#9538)
BuyuanCui Jul 24, 2024
10b5442
Query TransformerConfig attributes when copying btw configs (#9832)
akoumpa Jul 24, 2024
153efc6
NeMo MoE docs (#9579)
akoumpa Jul 24, 2024
5726d49
Fix hf hub for 0.24+ (#9806) (#9857)
github-actions[bot] Jul 24, 2024
a022765
Fix RNNT alignments test (#9770) (#9862)
artbataev Jul 24, 2024
cd0d2c2
Update arch check for SD (#9783)
minitu Jul 24, 2024
11bbe0b
Revert "Jpg2p jun18 (#9538)" (#9874)
pablo-garay Jul 24, 2024
e400b6d
Revert and further changes for CICD bugfix (#9875)
pablo-garay Jul 24, 2024
0ec398e
Change decord to guard import (#9865)
meatybobby Jul 25, 2024
8a08319
By default trust remote code from HF Datasets (#9886) (#9887)
github-actions[bot] Jul 25, 2024
28a2e52
Docs: add "Nemo Fundamentals" page (#9835) (#9889)
github-actions[bot] Jul 25, 2024
7eac53c
Set default Torch version if YY.MM format is not met (#9776)
thomasdhc Jul 25, 2024
5491642
fix arg name (#9848)
erhoo82 Jul 25, 2024
bd185cb
Added defer wgrad support with mcore optim (#9896)
sanandaraj5597 Jul 25, 2024
fe16259
tutorial fixes (#9907)
JRD971000 Jul 26, 2024
74c2caf
[TTS][Vietnamese] Add VietnameseCharsTokenizer (#9665)
huutuongtu Jul 26, 2024
c81f7cf
Integrate TRT-LLM v0.11 (#9705)
oyilmaz-nvidia Jul 26, 2024
fc0e4ab
add code owner (#9917)
pablo-garay Jul 26, 2024
67aee7f
Fix Docker build. Make Dockerfile consistent with CI (#9784) (#9915)
github-actions[bot] Jul 26, 2024
eaa8fa9
Fix missing parallelisms (#9725) (#9758)
github-actions[bot] Jul 26, 2024
b484116
Remove assert guard preventing PEFT + VP being used together (#9833)
vysarge Jul 26, 2024
940bdb3
minor fix tutorial (#9927)
JRD971000 Jul 26, 2024
7506f98
New mcore transformer block spec (#9035)
github-actions[bot] Jul 26, 2024
4152edf
[NeMo-UX] log val loss (#9814) (#9831)
github-actions[bot] Jul 26, 2024
b505940
[NeMo-UX] Fix some dataloading bugs (#9807) (#9850)
github-actions[bot] Jul 26, 2024
7a9b2ef
[NeMo UX] Support generating datasets using different train/valid/tes…
github-actions[bot] Jul 26, 2024
491b577
[TTS][ja-JP] g2p and tokenizer. (#9879)
XuesongYang Jul 26, 2024
515fcd9
comment out flaky QAT test (#9935)
pablo-garay Jul 26, 2024
6fe71f7
update branch name for script (#9936) (#9937)
github-actions[bot] Jul 26, 2024
ac706e9
updte branch (#9942) (#9943)
github-actions[bot] Jul 27, 2024
bc6d534
Fix for scoring when Canary outputs a lot of endoftext tokens. (#9901)
pzelasko Jul 27, 2024
9dbdf57
Fixing canary prompts for back compatibility (#9836)
tbartley94 Jul 27, 2024
72f630d
Editing mcore T5 initialization interface to compatible with latest M…
huvunvidia Jul 28, 2024
558cf2e
Adding distributed training functionality in bert embedding model and…
adityavavre Jul 29, 2024
0a9eb95
Draft: Add LoRA test with sequence parallelism (#9433)
michal2409 Jul 29, 2024
e201b00
[NeMo-UX] Set async_save from strategy rather than ModelCheckpoint (#…
github-actions[bot] Jul 30, 2024
bd17e77
[NeMo-UX] Adding recipes (#9720) (#9851)
github-actions[bot] Jul 30, 2024
86bfac2
fix a minor bug with async checkpointing where a checkpoint would get…
github-actions[bot] Jul 30, 2024
7dd9378
Akoumparouli/mixtral fixes for r2.0.0rc1 (#9911) (#9933)
github-actions[bot] Jul 30, 2024
68d4dba
fix mem (#9957)
gshennvm Jul 30, 2024
ad4dbdd
Run a sample query for a quantized model conditionally (#9717)
janekl Jul 30, 2024
c29d91a
Adds a Knob for OnlineSampling by introducing 'global_sample_mapping'…
conver334 Jul 30, 2024
0716675
ci: Refactor tests to template (#9950)
ko3n1g Jul 30, 2024
1a8c9b6
mistral-2407 checkpoint converter (#9953)
akoumpa Jul 30, 2024
9f9a01a
Updated Latest News and added Blog section (#9898)
jgerh Jul 31, 2024
6e5c87c
[NeMo-UX] Fix some serialization bugs (#9868) (#9928)
github-actions[bot] Jul 31, 2024
c63fd16
make progress bar easier to parse (#9877) (#9888)
github-actions[bot] Jul 31, 2024
f3fd44c
Resiliency features update (#9714) (#9979)
github-actions[bot] Jul 31, 2024
4d1d5bf
hypen -> hyphen (#9976)
akoumpa Jul 31, 2024
e5b0fef
Fix for `train.controlnet.controlnet_v1_5_1node_100steps` (#9678)
rohitrango Jul 31, 2024
876c851
Fix Canary not stripping prompt from reference + more test coverage (…
pzelasko Aug 1, 2024
adcc72b
add bert conversion and cicd (#9966)
JRD971000 Aug 1, 2024
5f30ede
Rename sdk references to NeMo Run (#9872) (#9925)
github-actions[bot] Aug 1, 2024
92f5f53
[NeMo-UX] Fixes to make PreemptionCallback work (#9830) (#9908)
github-actions[bot] Aug 1, 2024
a7fbf6b
Support hf tokenizer in packed seq preparation script (#9974)
cuichenx Aug 1, 2024
38517a9
[NeMo-UX] Use single instance of loss reductions in GPTModel (#9801) …
github-actions[bot] Aug 1, 2024
d5507f0
Fix sdxl inference and add compatible changes for launcher support (#…
Victor49152 Aug 1, 2024
c277b9a
Add Mcore microbatches calculator supports (#9968)
BoxiangW Aug 1, 2024
fdf07a9
Allow users to pass HF via local-path: model_cls.import_ckpt("hf:///p…
akoumpa Aug 2, 2024
0b0e5bc
clean up (#10024)
stevehuang52 Aug 2, 2024
df07e7f
Update precision arg (#9859)
eagle705 Aug 2, 2024
0176224
Fix bug with distopt buckets when virtual pipelines are enabled (#9408)
timmoon10 Aug 2, 2024
7cf703c
TRT-LLM checkpoint in safetensors (#10011)
oyilmaz-nvidia Aug 2, 2024
406bdb0
[NeMo-UX] Add more NeMo Logger tests (#9795) (#9931)
github-actions[bot] Aug 2, 2024
58ec3e6
[NeMo-UX] Wait for async checkpoint calls to complete in preemption c…
hemildesai Aug 2, 2024
b3aa26a
[🤠]: Howdy folks, let's bump `Dockerfile.ci` to 0b4c4cf ! (#10017)
ko3n1g Aug 2, 2024
dcb9832
Re-tarring script (#10004)
pzelasko Aug 3, 2024
75a734a
Revert back updated test (#9756)
pablo-garay Aug 3, 2024
d899549
log TFLOPs (#9932)
malay-nagda Aug 4, 2024
7876e03
[NeMo-UX] Add missing docstrings and update some defaults (#9895) (#9…
github-actions[bot] Aug 5, 2024
df1ea12
[NeMo-UX] Add distributed checkpointing unit tests (#9922)
github-actions[bot] Aug 5, 2024
7cfca95
Make Mistral HF to NeMo checkpoint converter iterable to allow for lo…
akoumpa Aug 5, 2024
53f361a
Schroedinger bridge model for audio processing (#9880)
anteju Aug 5, 2024
6acab7a
Update continual pre-training argument override strategy (#10027)
yaoyu-33 Aug 5, 2024
88b43e7
remove prepare_youmakeup.py (#10028)
Slyne Aug 5, 2024
5d2e0d4
pin dask version (#9910) (#9918)
github-actions[bot] Aug 6, 2024
8bd2a71
r2.0.0rc1 fix for dist checkpoint loading (#9854) (#9924)
github-actions[bot] Aug 6, 2024
7f16668
fix vision clip (#9842)
Slyne Aug 6, 2024
8880c37
[🤠]: Howdy folks, let's bump `Dockerfile.ci` to 2fd6e2b ! (#10045)
ko3n1g Aug 6, 2024
8c70d7b
Update to vLLM 0.5.3.post1, LLAMA3.1 support (#9873)
apanteleev Aug 6, 2024
874a1ea
Add moe_router_pre_softmax=True to checkpoint converters & nemo-ux (#…
akoumpa Aug 6, 2024
1bc5a87
Add support for overlapped gradient and parameter synchronization for…
michal2409 Aug 6, 2024
d4d41ed
Nemotron Conversion script (#10031)
suiyoubi Aug 6, 2024
c98be2f
Change default interval to step and add lr in prog_bar in nemo.lightn…
hemildesai Aug 6, 2024
71ab9d7
nemo ux mixtral 8x22b config (#9977)
akoumpa Aug 6, 2024
23697b9
[NeMo-UX] Fix logging of consumed samples in MegatronDataSampler (#10…
hemildesai Aug 6, 2024
cb72f02
[NeMo-UX] Update default PTL logging `save_dir` (#9954) (#9985)
github-actions[bot] Aug 6, 2024
7b7d02f
[NeMo-UX] Wrap task config save in a try/except (#9956) (#9984)
github-actions[bot] Aug 6, 2024
c6412af
Use directly trtllm-build command for quantized checkpoints (#9982)
janekl Aug 7, 2024
d0efff0
Fix Canary's transcribe(): move to device and prompt feeding (#10054)
pzelasko Aug 7, 2024
8dbe1da
nemo-ux: Use kv_channels to enable cases where head_dim != hidden_siz…
akoumpa Aug 7, 2024
e5e648d
[lhotse] Support for NeMo tarred manifests with offset field (#10035)
pzelasko Aug 7, 2024
695fadc
remove assertation for models with unknown chat template (#10042)
akoumpa Aug 7, 2024
7cae5c4
add mixtral neva tutorial + update tutorials + update configs (#9926)…
github-actions[bot] Aug 7, 2024
6cf59fa
add the mcore interface for optim arg; average_in_collective (#10010)
erhoo82 Aug 7, 2024
633c373
mixtral recipe (#9975)
akoumpa Aug 7, 2024
b9ecf00
Make MegatronStrategy.parallelism return ParallelismConfig (#10012)
akoumpa Aug 7, 2024
4ee9148
log learning rate before optimizer step (#10063)
ashors1 Aug 7, 2024
e879330
Update dev doc for features (#10049)
yaoyu-33 Aug 7, 2024
3ba23bd
Update base image for tts asr import check test (#10072)
thomasdhc Aug 8, 2024
58606fe
Drop PyTorch 2.1 version check from fabric strategies (#10079)
farhadrgh Aug 9, 2024
86715c1
Moe doc fixes (#10077)
akoumpa Aug 9, 2024
de29d19
Comment docs (#10109)
ericharper Aug 12, 2024
4b4f763
ci: Token permission to cancel Workflow run (#10095)
ko3n1g Aug 12, 2024
d6cfdc0
ci: Proper cleanup (#10114)
ko3n1g Aug 12, 2024
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,2 @@
.github/ @pablo-garay @ko3n1g
Dockerfile.ci @pablo-garay @ko3n1g
24 changes: 11 additions & 13 deletions .github/workflows/_test_template.yml
@@ -39,18 +39,13 @@ jobs:
outputs:
conclusion: ${{ steps.main.conclusion }}
log: ${{ steps.main.outputs.log }}
container:
image: nemoci.azurecr.io/nemo_container_${{ github.run_id }}
options:
--device=/dev/nvidia0
--gpus all
--shm-size=8g
--env TRANSFORMERS_OFFLINE=0
--env HYDRA_FULL_ERROR=1
--volume /mnt/datadrive/TestData:/home/TestData
permissions:
actions: write # Required for cancelling workflows
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Docker system cleanup
run: |
docker system prune -a --filter "until=48h" --force

- id: main
name: Run main script
timeout-minutes: ${{ inputs.TIMEOUT }}
@@ -59,7 +54,7 @@ jobs:
(
set -e

${{ inputs.SCRIPT }}
docker run --rm --device=/dev/nvidia0 --gpus all --shm-size=8g --env TRANSFORMERS_OFFLINE=0 --env HYDRA_FULL_ERROR=1 --volume /mnt/datadrive/TestData:/home/TestData nemoci.azurecr.io/nemo_container_${{ github.run_id }} bash -c '${{ inputs.SCRIPT }}'
) 2> >(tee err.log)

EXIT_CODE=$?
@@ -70,6 +65,9 @@ jobs:

- uses: "NVIDIA/NeMo/.github/actions/cancel-workflow@main"
if: failure() && inputs.IS_OPTIONAL == false

- name: after_script
if: always() && inputs.AFTER_SCRIPT != ':'
run: ${{ inputs.AFTER_SCRIPT }}
run: |
docker run --rm --device=/dev/nvidia0 --gpus all --shm-size=8g --env TRANSFORMERS_OFFLINE=0 --env HYDRA_FULL_ERROR=1 --volume /mnt/datadrive/TestData:/home/TestData nemoci.azurecr.io/nemo_container_${{ github.run_id }} bash -c '${{ inputs.AFTER_SCRIPT }}'
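
A note on the `bash -c '${{ inputs.SCRIPT }}'` form above: GitHub Actions pastes the value of an expression into the script text before the shell parses it, so a script that itself contains a single quote ends the quoted string early. The sketch below imitates that textual substitution with `eval`. `run_naive` and `run_safe` are hypothetical helper names that do not appear in the workflow, and plain `sh` stands in for the `bash` inside the CI container.

```shell
# Hypothetical helpers illustrating the quoting hazard; not part of the workflow.

# Imitates the template: the script text is pasted between single quotes
# and only then parsed by the shell.
run_naive() {
  eval "sh -c '$1'"
}

# Passes the script as one already-quoted argument, so inner quotes survive.
run_safe() {
  sh -c "$1"
}

script="echo 'hello world'"
run_naive "$script"   # the inner quotes split the command string
run_safe "$script"    # the script runs as written
```

With this input the naive form prints only `hello`, because `world` ends up as `$0` of the inner shell, while the safe form prints `hello world`. One common way to get the safe behaviour in a workflow is to hand the script over through an environment variable instead of inline expansion; whether that fits this template is a call for the maintainers.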

1,705 changes: 922 additions & 783 deletions .github/workflows/cicd-main.yml

Large diffs are not rendered by default.

20 changes: 10 additions & 10 deletions .github/workflows/config/changelog-config.json
@@ -1,47 +1,47 @@
{
"categories": [
{
"title": "## ASR \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "## ASR\n\n<details><summary>Changelog</summary>",
"labels": ["asr"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## TTS \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## TTS\n\n<details><summary>Changelog</summary>",
"labels": ["tts"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## NLP / NMT \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## NLP / NMT\n\n<details><summary>Changelog</summary>",
"labels": ["nlp", "nmt", "megatron"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## Text Normalization / Inverse Text Normalization \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## Text Normalization / Inverse Text Normalization\n\n<details><summary>Changelog</summary>",
"labels": ["tn", "itn"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## NeMo Tools \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## NeMo Tools\n\n<details><summary>Changelog</summary>",
"labels": ["tools"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## Export \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## Export\n\n<details><summary>Changelog</summary>",
"labels": ["export"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## Documentation \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## Documentation\n\n<details><summary>Changelog</summary>",
"labels": ["docs"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## Bugfixes \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## Bugfixes\n\n<details><summary>Changelog</summary>",
"labels": ["bug"],
"exclude_labels": ["cherry-pick"]
},
{
"title": "## Cherrypick \n\n<details><summary>Changelog</summary>\n\n</details>\n\n",
"title": "</details>\n\n## Cherrypick\n\n<details><summary>Changelog</summary>",
"labels": ["cherry-pick"],
"exclude_labels": ["cherry-pick"]
}
@@ -50,7 +50,7 @@
"ignore"
],
"sort": "ASC",
"template": "\n${{CHANGELOG}}\nUncategorized:\n${{UNCATEGORIZED}}\n\n",
"template": "\n${{CHANGELOG}}</details>\n\n## Uncategorized:\n\n<details><summary>Changelog</summary>\n\n${{UNCATEGORIZED}}\n</details>\n",
"pr_template": "- ${{TITLE}} by @${{AUTHOR}} :: PR: #${{NUMBER}}",
"empty_template": "${{OWNER}}\n${{REPO}}\n${{FROM_TAG}}\n${{TO_TAG}}",
"label_extractor": [
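
The new titles rely on each category closing the `<details>` block opened by the one before it, with the template closing the last category and wrapping the uncategorized list. The sketch below checks that the tags balance for two categories. It assumes the changelog builder emits each title followed directly by its entries, which this diff does not confirm; `render_changelog` and the placeholder entries are made up for illustration.

```shell
# Hypothetical rendering of two categories plus the template tail.
render_changelog() {
  # Category 1 title: opens a <details> block and leaves it open.
  printf '%s\n' '## ASR' '' '<details><summary>Changelog</summary>' ''
  printf '%s\n' '- placeholder ASR entry'
  # Category 2 title: closes the previous block before opening its own.
  printf '%s\n' '</details>' '' '## TTS' '' '<details><summary>Changelog</summary>' ''
  printf '%s\n' '- placeholder TTS entry'
  # Template tail: closes the last category, then wraps the uncategorized list.
  printf '%s\n' '</details>' '' '## Uncategorized:' '' '<details><summary>Changelog</summary>' ''
  printf '%s\n' '- placeholder uncategorized entry' '</details>'
}

# Each tag sits on its own line, so counting matching lines counts tags.
opened=$(render_changelog | grep -c '<details>')
closed=$(render_changelog | grep -c '</details>')
echo "opened=$opened closed=$closed"
```

Under that assumption the counts match. If the builder skips the title of a category with no entries, the open and close tags still pair up, since every title except the first contributes one close and one open.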
6 changes: 3 additions & 3 deletions .github/workflows/import-test.yml
@@ -12,7 +12,7 @@ jobs:
test-asr-imports:
runs-on: ubuntu-latest
container:
image: pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
image: pytorch/pytorch:2.4.0-cuda11.8-cudnn9-runtime
steps:
- name: Checkout repo
uses: actions/checkout@v2
@@ -43,7 +43,7 @@ jobs:
test-tts-imports:
runs-on: ubuntu-latest
container:
image: pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
image: pytorch/pytorch:2.4.0-cuda11.8-cudnn9-runtime
steps:
- name: Checkout repo
uses: actions/checkout@v2
@@ -70,4 +70,4 @@ jobs:
# Run import checks
python tests/core_ptl/check_imports.py --domain "tts"
# Uninstall NeMo
pip uninstall -y nemo_toolkit
pip uninstall -y nemo_toolkit
59 changes: 59 additions & 0 deletions .github/workflows/mcore-tag-bump-bot.yml
@@ -0,0 +1,59 @@
# Regularly updates the CI container
name: MCore Tag Bump Bot
on:
workflow_dispatch:
schedule:
- cron: 0 0 * * *

jobs:
main:
runs-on: ubuntu-latest
environment: main
steps:
- name: Checkout NVIDIA/Megatron-LM
uses: actions/checkout@v4
with:
repository: NVIDIA/Megatron-LM
ref: main
path: ${{ github.run_id }}

- name: Get latest mcore commit
id: ref
run: |
cd ${{ github.run_id }}
sha=$(git rev-parse origin/main)
echo "sha=${sha}" >> "$GITHUB_OUTPUT"
echo "short_sha=${sha:0:7}" >> "$GITHUB_OUTPUT"
echo "date=$(date +%F)" >> "$GITHUB_OUTPUT"

- name: Checkout ${{ github.repository }}
uses: actions/checkout@v4
with:
path: ${{ github.run_id }}
token: ${{ secrets.PAT }}

- name: Bump MCORE_TAG
run: |
cd ${{ github.run_id }}
sed -i 's/^ARG MCORE_TAG=.*$/ARG MCORE_TAG=${{ steps.ref.outputs.sha }}/' Dockerfile.ci

- name: Create Bump PR
uses: peter-evans/create-pull-request@v6
id: create-pull-request
with:
path: ${{ github.run_id }}
branch: bump-ci-container-${{ steps.ref.outputs.date }}
base: main
title: 'Bump `Dockerfile.ci` (${{ steps.ref.outputs.date }})'
token: ${{ secrets.PAT }}
body: |
🚀 PR to Bump `Dockerfile.ci`.

📝 Please remember the following to-do's before merge:
- [ ] Verify the presubmit CI

🙏 Please merge this PR only if the CI workflow completed successfully.
commit-message: "[🤠]: Howdy folks, let's bump `Dockerfile.ci` to ${{ steps.ref.outputs.short_sha }} !"
signoff: true
reviewers: 'pablo-garay'
labels: 'Run CICD'
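
The core of the bump step is a single `sed` substitution on the `ARG MCORE_TAG=` line, plus a 7-character short SHA for the commit message. A minimal sketch of both follows. `bump_mcore_tag`, the throwaway Dockerfile contents, and the SHA are placeholders invented for illustration; the sketch writes through a temporary file and uses `cut` instead of the workflow's `sed -i` and `${sha:0:7}` so that it also runs under a plain POSIX `sh`.

```shell
# Hypothetical helper; the workflow itself runs `sed -i` directly on Dockerfile.ci.
bump_mcore_tag() {
  dockerfile="$1"
  new_ref="$2"
  # Rewrite only the line that starts with "ARG MCORE_TAG=".
  sed "s/^ARG MCORE_TAG=.*\$/ARG MCORE_TAG=${new_ref}/" "$dockerfile" > "$dockerfile.tmp" \
    && mv "$dockerfile.tmp" "$dockerfile"
}

# Throwaway Dockerfile with made-up contents for the demonstration.
workdir=$(mktemp -d)
printf '%s\n' 'FROM example/base' 'ARG MCORE_TAG=oldref' 'RUN echo build' > "$workdir/Dockerfile.ci"

sha=0123456789abcdef0123456789abcdef01234567   # placeholder, not a real commit
short_sha=$(printf '%s' "$sha" | cut -c1-7)

bump_mcore_tag "$workdir/Dockerfile.ci" "$sha"
grep '^ARG MCORE_TAG=' "$workdir/Dockerfile.ci"
echo "short_sha=$short_sha"
rm -rf "$workdir"
```

Because `/` is the `sed` delimiter here, a ref that contains a slash would break the expression. A full commit SHA, as the bot uses, avoids that.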
192 changes: 192 additions & 0 deletions .github/workflows/release-freeze.yml
@@ -0,0 +1,192 @@
name: "NeMo Code freeze"

on:
workflow_dispatch:
inputs:
next_version:
description: 'MAJOR.MINOR.PATCH[rcN] (Example: 2.0.0rc1, or 2.1.0)'
required: true
type: string
mcore_version:
description: 'Version of MCore to use (must be a valid git ref)'
required: true
type: string
jobs:
create-release-branch:
runs-on: ubuntu-latest
if: contains(fromJSON('["ko3n1g"]'), github.actor)
environment:
name: main
outputs:
version: ${{ steps.release-branch.outputs.version }}
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
path: ${{ github.run_id }}
fetch-depth: 0
fetch-tags: true
ref: main

- name: Get Previous tag
id: previous-tag
# git for-each-ref --sort=-creatordate --format '%(refname)' refs/tags ==> refs/tags/vX.Y.Z in descending order of date
# awk 'FNR == 2 {print substr($1, 11, length($1))}' ==> Selects the 2nd tag from the list, then strips the refs/tags/ prefix
# echo "tag-name=$TAG" >> "$GITHUB_OUTPUT" ==> Exposes the clean tag vX.Y.Z as steps.previous-tag.outputs.tag-name
run: |
TAG=$(git for-each-ref --sort=-creatordate --format '%(refname)' refs/tags | awk 'FNR == 2 {print substr($1, 11, length($1))}')
echo "tag-name=$TAG" >> "$GITHUB_OUTPUT"

- name: Get release branch ref
id: release-branch
run: |
cd ${{ github.run_id }}

VERSION=$(python -c 'import nemo; print(nemo.__version__)')
echo "Release version r$VERSION" > version
echo "version=$VERSION" >> "$GITHUB_OUTPUT"

- name: Pin branch name in Notebooks
run: |
cd ${{ github.run_id }}
find tutorials -type f -name "*.ipynb" -exec sed -i "s/BRANCH = 'main'/BRANCH = 'r${{ steps.release-branch.outputs.version }}'/g" {} +

- name: Pin MCore in Dockerfile
run: |
cd ${{ github.run_id }}
sed -i 's/^ARG MCORE_TAG=.*$/ARG MCORE_TAG=${{ inputs.mcore_version }}/' Dockerfile.ci

- name: Build Changelog
id: build-changelog
uses: mikepenz/[email protected]
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
# Configuration file is setup with filters for domains
# owner:repo must point to current repo
# fromTag: Auto resolved from historical tag order (previous tag compared to current tag)
# toTag: Current tag reference
configuration: ".github/workflows/config/changelog-config.json"
owner: ${{ github.repository_owner }}
repo: ${{ github.event.repository.name }}
ignorePreReleases: "false"
failOnError: "false"
fromTag: ${{ steps.previous-tag.outputs.tag-name }}
toTag: main

- name: Append Changelog
run: |
echo "${{ steps.build-changelog.outputs.changelog }}"

- name: Create Release PR
uses: peter-evans/create-pull-request@v6
id: create-pull-request
with:
path: ${{ github.run_id }}
branch: r${{ steps.release-branch.outputs.version }}
title: 'Release `${{ steps.release-branch.outputs.version }}`'
body: |
🚀 PR to release NeMo `${{ steps.release-branch.outputs.version }}`.

📝 Please remember the following to-do's before merge:
- [ ] Fill-in the comment `Highlights`
- [ ] Review the comment `Detailed Changelogs`

🚨 Please also keep in mind to _not_ delete the headings of the task commits. They are required by the post-merge automation.

🙏 Please merge this PR only if the CI workflow completed successfully.

commit-message: "[🤠]: Howdy folks, let's release NeMo `${{ steps.release-branch.outputs.version }}` !"
signoff: true
assignees: okoenig
labels: 'Run CICD'

- name: Add Summary comment
uses: peter-evans/create-or-update-comment@v4
with:
issue-number: ${{ steps.create-pull-request.outputs.pull-request-number }}
body: |
# Highlights
_<here-goes-the-summary...>_

- name: Add Changelog comment
uses: peter-evans/create-or-update-comment@v4
with:
issue-number: ${{ steps.create-pull-request.outputs.pull-request-number }}
body: |
# Detailed Changelogs
${{ steps.build-changelog.outputs.changelog }}

bump-next-version:
runs-on: ubuntu-latest
needs: [create-release-branch]
environment:
name: main
env:
VERSION_FILE: nemo/package_info.py
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
path: ${{ github.run_id }}
fetch-depth: 0
fetch-tags: true
ref: main
token: ${{ secrets.PAT }}

- name: Bump version
id: bump-version
run: |
cd ${{ github.run_id }}
FULL_VERSION_NUM=${{ inputs.next_version }}
VERSION=${FULL_VERSION_NUM%%rc*}
MAJOR=$(echo "$VERSION" | cut -d. -f1)
MINOR=$(echo "$VERSION" | cut -d. -f2)
PATCH=$(echo "$VERSION" | cut -d. -f3)
PRE_RELEASE=${FULL_VERSION_NUM#$VERSION}

sed -i 's/^MAJOR\s*=\s*[0-9]\+/MAJOR = '$MAJOR'/' $VERSION_FILE
sed -i 's/^MINOR\s*=\s*[0-9]\+/MINOR = '$MINOR'/' $VERSION_FILE
sed -i 's/^PATCH\s*=\s*[0-9]\+/PATCH = '$PATCH'/' $VERSION_FILE
sed -i 's/^PRE_RELEASE\s*=\s*'.*'/PRE_RELEASE = '\'$PRE_RELEASE\''/' $VERSION_FILE

cat $VERSION_FILE
PRE_RELEASE=$(echo $PRE_RELEASE | tr -d "'")
echo "version=$MAJOR.$MINOR.$PATCH$PRE_RELEASE" >> "$GITHUB_OUTPUT"

- name: Create Version Bump PR
uses: peter-evans/create-pull-request@v6
id: create-pull-request
with:
path: ${{ github.run_id }}
branch: bot/chore/version-bump-${{ inputs.next_version }}
title: 'Version bump to `${{ inputs.next_version }}`'
body: |
🚀 Version bump NeMo toolkit to `${{ inputs.next_version }}`

commit-message: "[🤠]: Howdy folks, let's bump NeMo `${{ inputs.next_version }}` !"
signoff: true
assignees: okoenig
labels: 'Run CICD'

notify:
runs-on: ubuntu-latest
needs: [create-release-branch, bump-next-version]
environment:
name: main
steps:
- name: Main
run: |
MESSAGE='{
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "Releasebot 🤖: NeMo Toolkit has been frozen 🎉 to branch `r${{ needs.create-release-branch.outputs.version }}`"
}
}
]
}'

curl -X POST -H "Content-type: application/json" --data "$MESSAGE" ${{ secrets.SLACK_RELEASE_ENDPOINT }}
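
Two pieces of shell logic in this workflow are easy to misread: the parameter expansions that split `next_version` into its numeric part and pre-release suffix, and the `awk` call that picks the previous tag. The sketch below reproduces both in isolation. `split_version` and `second_newest_tag` are hypothetical helper names, and the tag names fed to the second helper are made up.

```shell
# Hypothetical helper mirroring the parameter expansions in the "Bump version" step.
split_version() {
  full="$1"
  version=${full%%rc*}             # drop everything from the first "rc" onward
  major=$(echo "$version" | cut -d. -f1)
  minor=$(echo "$version" | cut -d. -f2)
  patch=$(echo "$version" | cut -d. -f3)
  pre_release=${full#"$version"}   # what remains after the numeric part, possibly empty
  echo "$major $minor $patch [$pre_release]"
}

# Hypothetical helper mirroring the awk call in the "Get Previous tag" step:
# take the second line and drop the 10-character "refs/tags/" prefix.
second_newest_tag() {
  awk 'FNR == 2 {print substr($1, 11, length($1))}'
}

split_version 2.0.0rc1
split_version 2.1.0
printf '%s\n' 'refs/tags/v9.9.9' 'refs/tags/v9.9.8' | second_newest_tag
```

For `2.0.0rc1` the helper reports `2 0 0 [rc1]`, and for `2.1.0` the suffix is empty. The tag helper returns the second line's tag name, so it depends on the input already being sorted newest first, as `--sort=-creatordate` does in the workflow.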