Yash/dev llava next #10749

Closed
wants to merge 99 commits into from

Commits on Oct 15, 2024

  1. locate weights path within MegatronCheckpointIO

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 15, 2024
    dcb38f3
  2. small refactor

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 15, 2024
    6010377
  3. remove another instance of ckpt_to_weights_subdir

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 15, 2024
    2417023

Commits on Oct 16, 2024

  1. move ckpt_to_weights_subdir

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    eed4bad
  2. Apply isort and black reformatting

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    52c0ad3
  3. Apply isort and black reformatting

    Signed-off-by: artbataev <[email protected]>
    artbataev committed Oct 16, 2024
    e5dbd61
  4. add weights path in save_checkpoint

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    45df47d
  5. fix circular import

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    c49e2a6
  6. Apply isort and black reformatting

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    d3ffd5d
  7. handle saving in ckpt_to_weights_subdir

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    ea49e20
  8. fix minor typo

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    c4c3fd5
  9. bug fixes

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 16, 2024
    3ae933e

Commits on Oct 17, 2024

  1. fix undefined variable

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 17, 2024
    f1fbec5
  2. move function

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 17, 2024
    8161076
  3. Apply isort and black reformatting

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 17, 2024
    994719e
  4. fix adapter meta file path

    Signed-off-by: Chen Cui <[email protected]>
    cuichenx committed Oct 17, 2024
    ea51ab2
  5. Apply isort and black reformatting

    Signed-off-by: cuichenx <[email protected]>
    cuichenx committed Oct 17, 2024
    871ac85
  6. f5889ca
  7. df2c4b1

Commits on Oct 18, 2024

  1. fix mixtral test

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 18, 2024
    5aec05b
  2. fix mixtral test

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 18, 2024
    2df54e3
  3. use function for weights subdir

    Signed-off-by: Chen Cui <[email protected]>
    cuichenx committed Oct 18, 2024
    440a244
  4. address comments

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 18, 2024
    b2883a1
  5. move asserts

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 18, 2024
    26a8d8d

Commits on Oct 21, 2024

  1. fix undefined vars

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 21, 2024
    ac1779b
  2. bug fix

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 21, 2024
    f380df7

Commits on Oct 22, 2024

  1. fix mixtral test

    Signed-off-by: ashors1 <[email protected]>
    ashors1 committed Oct 22, 2024
    e15cafa

Commits on Oct 24, 2024

  1. Integrating mcore export (#10238)

    * Integrating mcore export
    
    * Integrating mcore export
    
    * Apply isort and black reformatting
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    
    * Move trt imports in nemo.collections.llm inside respective functions (#10234)
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Add tests for LazyNeMoIterator and fix case with metadata_only=True and offsets in manifest (#10198)
    
    * Add tests for LazyNeMoIterator and fix case with manifest_only=True and offsets in manifest
    
    Signed-off-by: Piotr Żelasko <[email protected]>
    
    * Address code review
    
    Signed-off-by: Piotr Żelasko <[email protected]>
    
    * fix tests
    
    Signed-off-by: Piotr Żelasko <[email protected]>
    
    * fix tests
    
    Signed-off-by: Piotr Żelasko <[email protected]>
    
    ---------
    
    Signed-off-by: Piotr Żelasko <[email protected]>
    
    * [NeMo-UX] Fix a serialization bug that prevents users from moving checkpoints (#9939)
    
    * perform serialization using relative paths to allow users to move checkpoints after they're saved
    
    Signed-off-by: ashors1 <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: ashors1 <[email protected]>
    
    * remove unused import
    
    Signed-off-by: ashors1 <[email protected]>
    
    * fix artifact load
    
    Signed-off-by: ashors1 <[email protected]>
    
    * fix path artifact
    
    Signed-off-by: ashors1 <[email protected]>
    
    * remove unused import
    
    Signed-off-by: ashors1 <[email protected]>
    
    ---------
    
    Signed-off-by: ashors1 <[email protected]>
    Signed-off-by: ashors1 <[email protected]>
    Co-authored-by: ashors1 <[email protected]>
    
    * Add MemoryProfileCallback (#10166)
    
    * Add MemoryProfileCallback
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: ShriyaPalsamudram <[email protected]>
    
    * Remove reference cycles, save snapshot on specific ranks
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Remove unnecessary imports
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: ShriyaPalsamudram <[email protected]>
    
    * Update docstring
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    ---------
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    Signed-off-by: ShriyaPalsamudram <[email protected]>
    Signed-off-by: Shriya Rishab <[email protected]>
    Co-authored-by: ShriyaPalsamudram <[email protected]>
    
    * Lower bound transformers to support nemotron (#10240)
    
    Signed-off-by: Dong Hyuk Chang <[email protected]>
    Co-authored-by: Dong Hyuk Chang <[email protected]>
    
    * [Audio] SSL Pretraining framework for flow-matching model for audio processing (#10052)
    
    Flow matching generative model with SSL pretraining framework
    
    Signed-off-by: Pin-Jui Ku <[email protected]>
    Co-authored-by: Kuray107 <[email protected]>
    
    * Revert torchrun fix for model import (#10251)
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * [NeMo-UX] Move nemotron imports inline (#10255)
    
    * Move nemotron transformers + tokenizer imports inline to reduce number of required deps
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: marcromeyn <[email protected]>
    
    ---------
    
    Signed-off-by: Marc Romeyn <[email protected]>
    Signed-off-by: marcromeyn <[email protected]>
    Co-authored-by: marcromeyn <[email protected]>
    
    * Wrap CPU model init with megatron_lazy_init_context (#10219)
    
    * Wrap CPU model init with megatron_lazy_init_context
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Cleanup checkpoint-dir if saving fails
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    
    * Bump `Dockerfile.ci` (2024-08-22) (#10227)
    
    * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 124bcff !
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    
    * fix bert flags
    
    Signed-off-by: Oliver Koenig <[email protected]>
    
    ---------
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: Oliver Koenig <[email protected]>
    Co-authored-by: pablo-garay <[email protected]>
    
    * salm export trtllm (#10245)
    
    Signed-off-by: slyne deng <[email protected]>
    Co-authored-by: slyne deng <[email protected]>
    
    * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to ef85bc9 ! (#10250)
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    
    * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 01ca03f ! (#10266)
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: oliver könig <[email protected]>
    Co-authored-by: pablo-garay <[email protected]>
    
    * Load model in the target export precision by default in PTQ (#10267)
    
    * Load model in the target export precision by default
    
    Signed-off-by: Jan Lasek <[email protected]>
    
    * Enable megatron_amp_O2=true to actually use half-precision
    
    Signed-off-by: Jan Lasek <[email protected]>
    
    ---------
    
    Signed-off-by: Jan Lasek <[email protected]>
    Signed-off-by: Jan Lasek <[email protected]>
    
    * Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins (#10223)
    
    * Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    * Remove duplicate
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Add entity to wandb logger
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Add documentation
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    * Add warning
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    * PR feedback
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    * Add comments
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    ---------
    
    Signed-off-by: Hemil Desai <[email protected]>
    Signed-off-by: hemildesai <[email protected]>
    Co-authored-by: hemildesai <[email protected]>
    
    * [NeMo-UX] Handle absolute logger directories in nemo_logger (#10259)
    
    * handle absolute and relative logger directories
    
    Signed-off-by: Anna Shors <[email protected]>
    
    * merge lines
    
    Signed-off-by: ashors1 <[email protected]>
    
    ---------
    
    Signed-off-by: Anna Shors <[email protected]>
    Signed-off-by: ashors1 <[email protected]>
    
    * Add sdxl notebook (#10139)
    
    * Add sdxl notebook
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Rename
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * final Update SDXL notebook
    
    Signed-off-by: mingyuanm <[email protected]>
    
    ---------
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Updating some comments
    
    * Apply isort and black reformatting
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    
    * Updating some comments
    
    * Apply isort and black reformatting
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    
    * Updating some comments
    
    * Small change
    
    * Apply isort and black reformatting
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    
    * ADD support for layernorm1p
    
    * Apply isort and black reformatting
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    
    * Update Dockerfile.ci
    
    Signed-off-by: Shanmugam Ramasamy <[email protected]>
    
    * Update Dockerfile.ci
    
    Signed-off-by: Shanmugam Ramasamy <[email protected]>
    
    * Update Dockerfile.ci
    
    Signed-off-by: Shanmugam Ramasamy <[email protected]>
    
    ---------
    
    Signed-off-by: shanmugamr1992 <[email protected]>
    Signed-off-by: Hemil Desai <[email protected]>
    Signed-off-by: Piotr Żelasko <[email protected]>
    Signed-off-by: ashors1 <[email protected]>
    Signed-off-by: ashors1 <[email protected]>
    Signed-off-by: Shriya Palsamudram <[email protected]>
    Signed-off-by: ShriyaPalsamudram <[email protected]>
    Signed-off-by: Shriya Rishab <[email protected]>
    Signed-off-by: Dong Hyuk Chang <[email protected]>
    Signed-off-by: Pin-Jui Ku <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: Marc Romeyn <[email protected]>
    Signed-off-by: marcromeyn <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: Oliver Koenig <[email protected]>
    Signed-off-by: slyne deng <[email protected]>
    Signed-off-by: oliver könig <[email protected]>
    Signed-off-by: Jan Lasek <[email protected]>
    Signed-off-by: Jan Lasek <[email protected]>
    Signed-off-by: hemildesai <[email protected]>
    Signed-off-by: Anna Shors <[email protected]>
    Signed-off-by: mingyuanm <[email protected]>
    Signed-off-by: Shanmugam Ramasamy <[email protected]>
    Co-authored-by: Shanmugam Ramasamy <[email protected]>
    Co-authored-by: shanmugamr1992 <[email protected]>
    Co-authored-by: Hemil Desai <[email protected]>
    Co-authored-by: Piotr Żelasko <[email protected]>
    Co-authored-by: Anna Shors <[email protected]>
    Co-authored-by: ashors1 <[email protected]>
    Co-authored-by: Shriya Rishab <[email protected]>
    Co-authored-by: ShriyaPalsamudram <[email protected]>
    Co-authored-by: Dong Hyuk Chang <[email protected]>
    Co-authored-by: Dong Hyuk Chang <[email protected]>
    Co-authored-by: Kuray107 <[email protected]>
    Co-authored-by: Kuray107 <[email protected]>
    Co-authored-by: Alexandros Koumparoulis <[email protected]>
    Co-authored-by: Marc Romeyn <[email protected]>
    Co-authored-by: marcromeyn <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Co-authored-by: oliver könig <[email protected]>
    Co-authored-by: pablo-garay <[email protected]>
    Co-authored-by: Slyne Deng <[email protected]>
    Co-authored-by: slyne deng <[email protected]>
    Co-authored-by: Jan Lasek <[email protected]>
    Co-authored-by: hemildesai <[email protected]>
    Co-authored-by: Ming <[email protected]>
    Co-authored-by: Shanmugam Ramasamy <[email protected]>
    25 people authored and yashaswikarnati committed Oct 24, 2024
    131d14e
  2. Fix artifact saving (#10914)

    Signed-off-by: Hemil Desai <[email protected]>
    hemildesai authored and yashaswikarnati committed Oct 24, 2024
    b4bb088
  3. Lora improvement (#10918)

    * pull out freeze model
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * add wildcard match to lora target modules
    
    Signed-off-by: Chen Cui <[email protected]>
    
    ---------
    
    Signed-off-by: Chen Cui <[email protected]>
    cuichenx authored and yashaswikarnati committed Oct 24, 2024
    ea08767
  4. Huvu/t5 nemo2.0 peft (#10916)

    * adding peft test and cicd
    
    * add setting mcore model to train in peft.py
    
    * adding test for T5 lora
    
    * fix follow Chen's fix
    
    * restore cicd-main.yml
    
    ---------
    
    Co-authored-by: Huy Vu2 <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    ff80ad8
  5. Add tie_word_embeddings=True (#10710)

    Signed-off-by: Yoshi Suhara <[email protected]>
    suhara authored and yashaswikarnati committed Oct 24, 2024
    c17a554
  6. Use a context-manager when opening files (#10895)

    * Use a context-manager when opening files
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: artbataev <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Signed-off-by: artbataev <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Co-authored-by: artbataev <[email protected]>
    3 people authored and yashaswikarnati committed Oct 24, 2024
    710e6f0
  7. long context performance numbers in doc (#10784)

    * long context perf
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * update the long context perf
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Akoumparouli/mcore microbatch calculator fix (#10780)
    
    * move tests/lightning/{,_}io
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add microbatch calculator context manager
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * use microbatch calculator context manager
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add on_load_checkpoint test to ValidateModelRestoration; use ctx manager to reconfigure microbatch calculator; update save/restore path; add cleanup step at the end
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove unused var
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * remove 8x3b recipes (#10764)
    
    * remove 8x3b recipes
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove 8x3b from test_nemo_run
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rm from __init__
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * change the figure file name
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Accommodating the reviewer's comment
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * update the y-axis title
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 3f90b98 ! (#10789)
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Add ModelOpt transformer model pruning example for Llama models, default to llama3.1-8b-base (#10294)
    
    * Add ModelOpt transformer model pruning example for Llama3 model
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * examples code is at wrong dir, move them
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * changes as suggested in comment
    
    remove some logging and unused config code, update example model to
    llama3.1
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Add pruning of hidden_size into example
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Update examples/nlp/language_modeling/conf/megatron_gpt_prune.yaml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Add pruning test to cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    ---------
    
    Signed-off-by: Shengliang Xu <[email protected]>
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Keval Morabia <[email protected]>
    Co-authored-by: shengliangxu <[email protected]>
    Co-authored-by: Keval Morabia <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Update mamba.rst after dist ckpt addition (#10800)
    
    Signed-off-by: Ali Taghibakhshi <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * fix chunked infer (#10581)
    
    Signed-off-by: stevehuang52 <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * fix state transform (#10728)
    
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * use ckpt_to_weights_subdir in restore (#10786)
    
    * use ckpt_to_weights_subdir in restore
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * make ckpt_to_{weight,context}_subdir idempotent
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Mixtral set seq_length=4k (#10704)
    
    * enable SP & set seq_length=4k
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update test expected values
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * 8x22b 4k
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Fix for crashes with tensorboard_logger=false and VP + LoRA (#10792)
    
    * Fix for crashes with tensorboard_logger=false and virtual pipeline parallel + LoRA
    
    Signed-off-by: Valerie Sarge <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: vysarge <[email protected]>
    
    ---------
    
    Signed-off-by: Valerie Sarge <[email protected]>
    Signed-off-by: vysarge <[email protected]>
    Co-authored-by: vysarge <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Disable checkpoint conversion inside AutoResume (#10645)
    
    * Disable checkpoint conversion inside AutoResume
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    * Update resume docstrings
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * fix
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * add default finetuning recipe and refactor llama3 8b recipe
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * address comment
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * refactor other recipes
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * remove 8x3b finetuning recipe for now because HF version not available
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * add copyright header
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * adjust unit tests based on recipe fixes
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * fix failed unit test
    
    Signed-off-by: Chen Cui <[email protected]>
    
    ---------
    
    Signed-off-by: Hemil Desai <[email protected]>
    Signed-off-by: hemildesai <[email protected]>
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: cuichenx <[email protected]>
    Co-authored-by: hemildesai <[email protected]>
    Co-authored-by: Chen Cui <[email protected]>
    Co-authored-by: cuichenx <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * replace png file to github assets
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * change image url to github release
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    ---------
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: Shengliang Xu <[email protected]>
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Keval Morabia <[email protected]>
    Signed-off-by: Ali Taghibakhshi <[email protected]>
    Signed-off-by: stevehuang52 <[email protected]>
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: Valerie Sarge <[email protected]>
    Signed-off-by: vysarge <[email protected]>
    Signed-off-by: Hemil Desai <[email protected]>
    Signed-off-by: hemildesai <[email protected]>
    Signed-off-by: cuichenx <[email protected]>
    Co-authored-by: Alexandros Koumparoulis <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Co-authored-by: oliver könig <[email protected]>
    Co-authored-by: pablo-garay <[email protected]>
    Co-authored-by: Shengliang Xu <[email protected]>
    Co-authored-by: shengliangxu <[email protected]>
    Co-authored-by: Keval Morabia <[email protected]>
    Co-authored-by: Ali Taghibakhshi <[email protected]>
    Co-authored-by: He Huang (Steve) <[email protected]>
    Co-authored-by: Chen Cui <[email protected]>
    Co-authored-by: Valerie Sarge <[email protected]>
    Co-authored-by: vysarge <[email protected]>
    Co-authored-by: Hemil Desai <[email protected]>
    Co-authored-by: hemildesai <[email protected]>
    Co-authored-by: cuichenx <[email protected]>
    16 people authored and yashaswikarnati committed Oct 24, 2024
    aa797d3
  8. perf recipes and Mcore DistOpt params (#10883)

    * 175b gpt3 recipe
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * dist opt params
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * 405b dist opt params
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * perf recipes and dist opt params
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * MoE dist opt params
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * gpt bias fusion params
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * 175b recipe
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * perf params comments
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * MoE perf params comments
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * perf recipes suffix
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * specific models fusion params
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    ---------
    
    Signed-off-by: Malay Nagda <[email protected]>
    Signed-off-by: malay-nagda <[email protected]>
    Co-authored-by: malay-nagda <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    52d5ef8
  9. ci: Fix cherry pick team (#10945)

    Signed-off-by: Oliver Koenig <[email protected]>
    ko3n1g authored and yashaswikarnati committed Oct 24, 2024
    2be9dc5
  10. Packed sequence bug fixes (#10898)

    * save prepared dataset to different folders according to tokenizer name
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * fix hang
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: artbataev <[email protected]>
    
    * fix hang
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * raise mbs>1 error and provide suggestion to user instead of automatically changing config
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * add ci for packed seq
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * fix bug
    
    Signed-off-by: Chen Cui <[email protected]>
    
    ---------
    
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: cuichenx <[email protected]>
    Signed-off-by: artbataev <[email protected]>
    Co-authored-by: cuichenx <[email protected]>
    Co-authored-by: artbataev <[email protected]>
    3 people authored and yashaswikarnati committed Oct 24, 2024
    186b946
  11. Fix requirements for MacOS (#10930)

    Signed-off-by: Vladimir Bataev <[email protected]>
    artbataev authored and yashaswikarnati committed Oct 24, 2024
    9e6e117
  12. Fix nemo 2.0 recipes (#10915)

    * Fix recipe num_nodes and long context docstring
    
    * Fix typo
    
    * Fix PP issue
    
    * Fix unit test
    
    * Change recipes
    
    * fix test
    
    * Fix unit tests
    
    * Fix recipes
    
    * Add general legal test on parallelization settings
    
    * Rename test
    
    * Apply isort and black reformatting
    
    Signed-off-by: BoxiangW <[email protected]>
    
    ---------
    
    Signed-off-by: BoxiangW <[email protected]>
    Co-authored-by: BoxiangW <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    481e380
  13. Akoumparouli/nemo ux fix dir or string artifact (#10936)

    * Add __repr__ to Artifact
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * nemo.lightning.io.artifact: represent strings as fdl.Config to avoid path adjustment during restoration
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * t5 test minification
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    52c89b9
  14. ckpt convert bug fixes (#10878)

    * Mistral-NeMo-12B recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rename mistral to mistral_7b
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * include mistral_nemo_12b in __init__
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * add to __init__
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * Remove stale imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * TP=2
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove finetune_recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Rename MistralNeMo2407Config12B to MistralNeMoConfig12B per review's suggestion
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update config names in tests
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * mistral-nemo-12b from llama_8b
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * TP=2; SP=True
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix overlap value
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * update mistral-nemo-base-12b finetune recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * bug fix
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * remove extra file
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * remove extra changes
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * revert changes
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * add ckpt_format configurable
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: artbataev <[email protected]>
    
    * revert changes
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: dimapihtar <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Signed-off-by: dimapihtar <[email protected]>
    Signed-off-by: dimapihtar <[email protected]>
    Signed-off-by: artbataev <[email protected]>
    Co-authored-by: Alexandros Koumparoulis <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Co-authored-by: dimapihtar <[email protected]>
    Co-authored-by: artbataev <[email protected]>
    5 people authored and yashaswikarnati committed Oct 24, 2024
    ca40849
  15. fix typo in docstring (#10955)

    Signed-off-by: ashors1 <[email protected]>
    ashors1 authored and yashaswikarnati committed Oct 24, 2024
    5a3932e
  16. remove deprecated ci tests (#10922)

    * remove deprecated tutorial
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * remove deprecated ci tests
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * add deprecation note
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * add deprecation note
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * remove bart tests
    
    Signed-off-by: dimapihtar <[email protected]>
    
    ---------
    
    Signed-off-by: dimapihtar <[email protected]>
    dimapihtar authored and yashaswikarnati committed Oct 24, 2024
    3684fb3
  17. [Nemo CICD] Remove deprecated tests (#10960)

    * remove deprecated tutorial
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * remove deprecated ci tests
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * add deprecation note
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * add deprecation note
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * remove bart tests
    
    Signed-off-by: dimapihtar <[email protected]>
    
    * Remove deleted CI tests
    
    ---------
    
    Signed-off-by: dimapihtar <[email protected]>
    Signed-off-by: Pablo Garay <[email protected]>
    Co-authored-by: dimapihtar <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    7a5d96a
  18. Adithyare/oai chat completion (#10785)

    * updates
    
    Signed-off-by: adithyare <[email protected]>
    
    * open ai chat completion wip
    
    Signed-off-by: adithyare <[email protected]>
    
    * responding with model responses
    
    Signed-off-by: adithyare <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: arendu <[email protected]>
    
    * also support general completion
    
    Signed-off-by: adithyare <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: arendu <[email protected]>
    
    ---------
    
    Signed-off-by: adithyare <[email protected]>
    Signed-off-by: arendu <[email protected]>
    Co-authored-by: arendu <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    c6813ce
  19. Update megatron_t5_pretraining.py (#10952)

    Signed-off-by: Huy Vu <[email protected]>
    huvunvidia authored and yashaswikarnati committed Oct 24, 2024
    739a15d
  20. Convert perf plugin env vars to strings (#10947)

    Signed-off-by: Hemil Desai <[email protected]>
    hemildesai authored and yashaswikarnati committed Oct 24, 2024
    6ecee6b
  21. disable dynamo for ddp checker (#10961)

    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    akoumpa authored and yashaswikarnati committed Oct 24, 2024
    38ccc9c
  22. [🤠]: Howdy folks, let's bump Dockerfile.ci to db7d37b ! (#10965)

    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    f4aebf3
  23. Mistral-NeMo-12B recipe (#10607)

    * Mistral-NeMo-12B recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rename mistral to mistral_7b
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * include mistral_nemo_12b in __init__
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * add to __init__
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * Remove stale imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * TP=2
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove finetune_recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Rename MistralNeMo2407Config12B to MistralNeMoConfig12B per review's suggestion
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update config names in tests
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * mistral-nemo-12b from llama_8b
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * TP=2; SP=True
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix overlap value
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * update mistral-nemo-base-12b finetune recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    03434b0
  24. Make nemo text processing optional in TTS (#10584)

    * move TN guard to better location; make guard print error message rather than throwing error
    
    Signed-off-by: Jason <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: blisc <[email protected]>
    
    * Forgot to add the actual normalizer
    
    Signed-off-by: Jason <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: blisc <[email protected]>
    
    ---------
    
    Signed-off-by: Jason <[email protected]>
    Signed-off-by: blisc <[email protected]>
    Co-authored-by: blisc <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    d4b3adf
  25. respect warnings' filters (#10953)

    * respect warnings' filters
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    c457d45
  26. Update T5 tokenizer (adding additional tokens to tokenizer config) (#10972)
    
    * initial commit
    
    * restore t5_pretraining
    
    * Apply isort and black reformatting
    
    Signed-off-by: huvunvidia <[email protected]>
    
    ---------
    
    Signed-off-by: huvunvidia <[email protected]>
    Co-authored-by: Huy Vu2 <[email protected]>
    Co-authored-by: huvunvidia <[email protected]>
    3 people authored and yashaswikarnati committed Oct 24, 2024
    9b3f602
  27. Alit/mamba recipe (#10935)

    * add some mamba recipe
    
    * add 130m
    
    * add the rest of the recipes
    
    * add tokenizer
    
    * add tokenizer
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * minor fix
    
    * add fixes to ssm for nemorun recipes
    
    * add hybrid tokenizer
    
    * updating some recipes
    
    * Apply isort and black reformatting
    
    Signed-off-by: JRD971000 <[email protected]>
    
    * remove comments
    
    * update gbs
    
    * fix ckpt resume
    
    * fix ckpt resume
    
    * fix ckpt resume
    
    * update recipes final
    
    * Apply isort and black reformatting
    
    Signed-off-by: JRD971000 <[email protected]>
    
    * remove redundant imports
    
    * ckpt convertor dtype fix
    
    * Apply isort and black reformatting
    
    Signed-off-by: JRD971000 <[email protected]>
    
    ---------
    
    Signed-off-by: JRD971000 <[email protected]>
    Signed-off-by: Ali Taghibakhshi <[email protected]>
    Co-authored-by: JRD971000 <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    cb88c41
  28. Long context performance doc hot fix (#10946)

    * long context perf
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * update the long context perf
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Akoumparouli/mcore microbatch calculator fix (#10780)
    
    * move tests/lightning/{,_}io
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add microbatch calculator context manager
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * use microbatch calculator context manager
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add on_load_checkpoint test to ValidateModelRestoration; use ctx manager to reconfigure microbatch calculator; update save/restore path; add cleanup step at the end
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove unused var
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * remove 8x3b recipes (#10764)
    
    * remove 8x3b recipes
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove 8x3b from test_nemo_run
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rm from __init__
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * change the figure file name
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Accommodating the reviewer's comment
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * update the y-axis title
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 3f90b98 ! (#10789)
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Add ModelOpt transformer model pruning example for Llama models, default to llama3.1-8b-base (#10294)
    
    * Add ModelOpt transformer model pruning example for Llama3 model
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * examples code is at wrong dir, move them
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * changes as suggested in comment
    
    remove some logging and unused config code, update example model to
    llama3.1
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Add pruning of hidden_size into example
    
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Shengliang Xu <[email protected]>
    
    * Update examples/nlp/language_modeling/conf/megatron_gpt_prune.yaml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Add pruning test to cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    * Update cicd-main.yml
    
    Signed-off-by: Keval Morabia <[email protected]>
    
    ---------
    
    Signed-off-by: Shengliang Xu <[email protected]>
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Keval Morabia <[email protected]>
    Co-authored-by: shengliangxu <[email protected]>
    Co-authored-by: Keval Morabia <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Update mamba.rst after dist ckpt addition (#10800)
    
    Signed-off-by: Ali Taghibakhshi <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * fix chunked infer (#10581)
    
    Signed-off-by: stevehuang52 <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * fix state transform (#10728)
    
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * use ckpt_to_weights_subdir in restore (#10786)
    
    * use ckpt_to_weights_subdir in restore
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * make ckpt_to_{weight,context}_subdir idempotent
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Mixtral set seq_length=4k (#10704)
    
    * enable SP & set seq_length=4k
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update test expected values
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * 8x22b 4k
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Fix for crashes with tensorboard_logger=false and VP + LoRA (#10792)
    
    * Fix for crashes with tensorboard_logger=false and virtual pipeline parallel + LoRA
    
    Signed-off-by: Valerie Sarge <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: vysarge <[email protected]>
    
    ---------
    
    Signed-off-by: Valerie Sarge <[email protected]>
    Signed-off-by: vysarge <[email protected]>
    Co-authored-by: vysarge <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * Disable checkpoint conversion inside AutoResume (#10645)
    
    * Disable checkpoint conversion inside AutoResume
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <[email protected]>
    
    * Update resume docstrings
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * fix
    
    Signed-off-by: Hemil Desai <[email protected]>
    
    * add default finetuning recipe and refactor llama3 8b recipe
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * address comment
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * refactor other recipes
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    * remove 8x3b finetuning recipe for now because HF version not available
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * add copyright header
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * adjust unit tests based on recipe fixes
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * fix failed unit test
    
    Signed-off-by: Chen Cui <[email protected]>
    
    ---------
    
    Signed-off-by: Hemil Desai <[email protected]>
    Signed-off-by: hemildesai <[email protected]>
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: cuichenx <[email protected]>
    Co-authored-by: hemildesai <[email protected]>
    Co-authored-by: Chen Cui <[email protected]>
    Co-authored-by: cuichenx <[email protected]>
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * replace png file with github assets
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * change image url to github release
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    * hot fix on table style
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    
    ---------
    
    Signed-off-by: Youngeun Kwon <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: Shengliang Xu <[email protected]>
    Signed-off-by: shengliangxu <[email protected]>
    Signed-off-by: Keval Morabia <[email protected]>
    Signed-off-by: Ali Taghibakhshi <[email protected]>
    Signed-off-by: stevehuang52 <[email protected]>
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: Valerie Sarge <[email protected]>
    Signed-off-by: vysarge <[email protected]>
    Signed-off-by: Hemil Desai <[email protected]>
    Signed-off-by: hemildesai <[email protected]>
    Signed-off-by: cuichenx <[email protected]>
    Co-authored-by: Alexandros Koumparoulis <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    Co-authored-by: oliver könig <[email protected]>
    Co-authored-by: pablo-garay <[email protected]>
    Co-authored-by: Shengliang Xu <[email protected]>
    Co-authored-by: shengliangxu <[email protected]>
    Co-authored-by: Keval Morabia <[email protected]>
    Co-authored-by: Ali Taghibakhshi <[email protected]>
    Co-authored-by: He Huang (Steve) <[email protected]>
    Co-authored-by: Chen Cui <[email protected]>
    Co-authored-by: Valerie Sarge <[email protected]>
    Co-authored-by: vysarge <[email protected]>
    Co-authored-by: Hemil Desai <[email protected]>
    Co-authored-by: hemildesai <[email protected]>
    Co-authored-by: cuichenx <[email protected]>
    16 people authored and yashaswikarnati committed Oct 24, 2024
    b5c84cf
  29. Performance mode (#10926)

    * llama3 performance mode
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * llama3 performance mode tests
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * mixtral performance mode
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * remove unused
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * nemotron perf mode
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * 405b, 174b perf mode
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * perf mode comment
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    ---------
    
    Signed-off-by: Malay Nagda <[email protected]>
    Signed-off-by: malay-nagda <[email protected]>
    Signed-off-by: malay-nagda <[email protected]>
    Co-authored-by: malay-nagda <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    0ff77b5
  30. Add flux inference pipeline (#10752)

    * Vae added and matched flux checkpoint
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Flux model added.
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Copying FlowMatchEulerScheduler over
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * WIP: Start to test the pipeline forward pass
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Vae added and matched flux checkpoint
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Inference pipeline runs with offloading function
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Start to test image generation
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Decoding with VAE part has been verified. Still need to check the denoising loop.
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * The inference pipeline is verified.
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Add arg parsers and refactoring
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Tested on multi batch sizes and prompts.
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Add headers
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    * Renaming
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Move scheduler to sampler folder
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Merging folders.
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    * Tested after path changing.
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    * Move MMDIT block to NeMo
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    * Add joint attention and single attention to NeMo
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    * Joint attention updated
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    * Remove redundant importing
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Refactor to inherit megatron module
    
    Signed-off-by: mingyuanm <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: Victor49152 <[email protected]>
    
    ---------
    
    Signed-off-by: mingyuanm <[email protected]>
    Signed-off-by: Victor49152 <[email protected]>
    Co-authored-by: Victor49152 <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    1ca44d7
  31. Add assertion for always save nemo add model parallel size (#10690)

    * Add assertion for always save nemo add model parallel size
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    * Add assertions
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    * Fix typo
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: BoxiangW <[email protected]>
    
    * Revert nemo_model_checkpoint.py changes
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    * Add test
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    * Fix typo
    
    * Fix test bug
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    * Fix test
    
    Signed-off-by: Boxiang Wang <[email protected]>
    
    ---------
    
    Signed-off-by: Boxiang Wang <[email protected]>
    Signed-off-by: BoxiangW <[email protected]>
    Co-authored-by: BoxiangW <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    df41eac
  32. [🤠]: Howdy folks, let's bump Dockerfile.ci to 563d5d1 ! (#10979)

    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    02cfe4c
  33. Reflect CLI change nemorun -> nemo (#10443)

    Signed-off-by: Marc Romeijn <[email protected]>
    Co-authored-by: Alexandros Koumparoulis <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    7cf1907
  34. minor fix (#10990)

    Co-authored-by: Ali Taghibakhshi <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    b92866e
  35. Fixed sampler override and audio_key in prepare_audio_data (#10980)

    Signed-off-by: Ante Jukić <[email protected]>
    anteju authored and yashaswikarnati committed Oct 24, 2024
    7788041
  36. Add more recipes (#10957)

    * add recipes
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * adjust finetuning recipe
    
    Signed-off-by: Chen Cui <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: cuichenx <[email protected]>
    
    ---------
    
    Signed-off-by: Chen Cui <[email protected]>
    Signed-off-by: cuichenx <[email protected]>
    Co-authored-by: cuichenx <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    520f3cb
  37. e50cc14
  38. Upgrade transformers (#10854)

    Signed-off-by: Chen Cui <[email protected]>
    cuichenx authored and yashaswikarnati committed Oct 24, 2024
    7ca0bf8
  39. Add support and recipes for HF models via AutoModelForCausalLM (#10962)

    * initial hf_lit_module
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * make sft gpt dataset sanity check optional
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * HF sft example
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Rename HfLitModule to HfAutoModel
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update default model id
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * move rank&world_size as params
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix mbs in example
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix for fsdp and logger
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * make loss_fn configurable
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * remove optim from HfAutoModel
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add pytorch native optim
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add hfAutoModel pretrain nemorun recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove debug
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove stale imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove stale import
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rm stale imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rm stale imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * tokenizer fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update example
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rename pytorch_adam to pytorch_adam_with_cosine_annealing
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * small refactor
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix no_weight_decay_cond
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * switch to flat_lr optim for example
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * remove imports & update docstrings
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add a tokenizer setter to allow it to work with nemo/collections/llm/api.py::_use_tokenizer
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove unused import
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * allow loss_mask to be none
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Add HF-dataset lightning module
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * check if pad_token_id is None
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rename hf_lit_module.py to hf_auto_model.py
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * class rename
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rename
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * update example
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * HfAutoModelForCausalLM
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rm stale import
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add option to start with random weights
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add check in megatron-strategy
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * rename param
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * drop mcore sampler from squadmodule
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * make megatron_sampler optional in HfDatasetDataModule
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * copyright
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * use is_hf_model to mark hf classes
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    3f464b7
  40. ci: Update tests (#10987)

    * ci: Re-enable `L0_Unit_Tests_GPU_Lightning`
    
    Signed-off-by: Oliver Koenig <[email protected]>
    
    * ci: Disable `L2_Megatron_GPT_Pretraining_and_Resume_Training_PP2`
    
    Signed-off-by: Oliver Koenig <[email protected]>
    
    ---------
    
    Signed-off-by: Oliver Koenig <[email protected]>
    ko3n1g authored and yashaswikarnati committed Oct 24, 2024
    25133a9
  41. [🤠]: Howdy folks, let's bump Dockerfile.ci to 425cdd4 ! (#11001)

    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    c39d620
  42. gpt3 175b cli (#10985)

    * gpt3 175b cli
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    ---------
    
    Signed-off-by: Malay Nagda <[email protected]>
    Signed-off-by: malay-nagda <[email protected]>
    Signed-off-by: malay-nagda <[email protected]>
    Co-authored-by: malay-nagda <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    046b422
  43. Fix for crash with LoRA + tp_overlap_comm=false + sequence_parallel=true (#10920)
    
    * Add fusion defaults for llama2
    
    Signed-off-by: Valerie Sarge <[email protected]>
    
    * Alter ParallelLinearAdapter condition to account for tp_comm_overlap=false
    
    Signed-off-by: Valerie Sarge <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: vysarge <[email protected]>
    
    * Clean up unneeded defaults
    
    Signed-off-by: Valerie Sarge <[email protected]>
    
    * gpt3 175b cli
    
    Signed-off-by: Malay Nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: malay-nagda <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: vysarge <[email protected]>
    
    ---------
    
    Signed-off-by: Valerie Sarge <[email protected]>
    Signed-off-by: vysarge <[email protected]>
    Signed-off-by: Malay Nagda <[email protected]>
    Signed-off-by: malay-nagda <[email protected]>
    Signed-off-by: Eric Harper <[email protected]>
    Co-authored-by: vysarge <[email protected]>
    Co-authored-by: Malay Nagda <[email protected]>
    Co-authored-by: malay-nagda <[email protected]>
    Co-authored-by: Eric Harper <[email protected]>
    5 people authored and yashaswikarnati committed Oct 24, 2024
    2704487
  44. llm.generate fixes (#10983)

    * fix context path, disable optimizer init, add tp
    
    Signed-off-by: HuiyingLi <[email protected]>
    
    * format
    
    Signed-off-by: HuiyingLi <[email protected]>
    
    * address comments, require user to provide trainer
    
    Signed-off-by: HuiyingLi <[email protected]>
    
    * minor fix
    
    Signed-off-by: HuiyingLi <[email protected]>
    
    * minor fixes
    
    Signed-off-by: HuiyingLi <[email protected]>
    
    ---------
    
    Signed-off-by: HuiyingLi <[email protected]>
    HuiyingLi authored and yashaswikarnati committed Oct 24, 2024
    05273b4
  45. use __dict__ in check (#11012)

    * check is_hf_model in leaf module
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    * disable getattr alternative path
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * undo;
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    f668f94
  46. LoRA support for HF::AutoModelForCausalLM (#10982)

    * add LinearAdapter
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * add hf lora example
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove unused imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * subclass mixin
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * remove stale imports
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * undo
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fix scale
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * regex selector for peft
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * move lora
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * fmt
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * hf_auto_model_for_causal_lm finetune recipe
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    68e8968
  47. Change default for always_save_context to True (#11014)

    Signed-off-by: Abhishree <[email protected]>
    Co-authored-by: Pablo Garay <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    a3630de
  48. Add a build option to load_context (#10713)

    * Add a build option to load_context
    
    Signed-off-by: Marc Romeijn <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Adding test
    
    Signed-off-by: Marc Romeijn <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Trying to fix failing CPU test
    
    Signed-off-by: Marc Romeijn <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * cherry-pick fix
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    ---------
    
    Signed-off-by: Marc Romeijn <[email protected]>
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Co-authored-by: Alexandros Koumparoulis <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    b5686c2
  49. Fix pip install (#11026)

    * Move AutoTokenizer inline
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Move einops to common requirements
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Move AutoTokenizer import to top-level again in fine_tuning
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Move megatron init inside nemo.lightning
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Make megatron_lazy_init_context work when transformer-engine is not installed
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Only import get_nmt_tokenizer when needed
    
    Signed-off-by: Marc Romeyn <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: marcromeyn <[email protected]>
    
    ---------
    
    Signed-off-by: Marc Romeyn <[email protected]>
    Signed-off-by: marcromeyn <[email protected]>
    Co-authored-by: marcromeyn <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    a07902a
  50. [WIP] Add docs for NEST SSL (#10804)

    * add docs
    
    Signed-off-by: stevehuang52 <[email protected]>
    
    * update doc and fix missing param
    
    Signed-off-by: stevehuang52 <[email protected]>
    
    ---------
    
    Signed-off-by: stevehuang52 <[email protected]>
    stevehuang52 authored and yashaswikarnati committed Oct 24, 2024
    e127994
  51. Change dist ckpt defaults (#10913)

    * Enable ckpt features by default (async ckpt), ckpt every 15mins and reduce preemption time to 1min
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * fix ssm tests
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Make note that ckpt_async_save is disabled for SSMs
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Enable async ckpt for SSMs with fix
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Disable async ckpt in the peft test as it is a known bug, add note.
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Fix failing unit tests
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Ashors/peft async ckpt (#11010)
    
    * [WIP] prototype for supporting async checkpointing with peft
    
    Signed-off-by: ashors1 <[email protected]>
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Enable async ckpt for the peft test
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    * Fix peft setup test
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    
    ---------
    
    Signed-off-by: Shriya Palsamudram <[email protected]>
    Signed-off-by: ashors1 <[email protected]>
    Co-authored-by: ataghibakhsh <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    8eaf5a9
  52. Akoumparouli/mixtral recipe fix r2.0.0 (#10994)

    * Mixtral TP8 EP1
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <[email protected]>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <[email protected]>
    Signed-off-by: akoumpa <[email protected]>
    Co-authored-by: akoumpa <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    cde2e02
  53. e2db0be
  54. 5eb00b0
  55. neva model changes to support llava-next

    Yashaswi Karnati authored and yashaswikarnati committed Oct 24, 2024
    d263a60
  56. remove accidentally checked in files

    Signed-off-by: Yashaswi Karnati <[email protected]>
    Yashaswi Karnati authored and yashaswikarnati committed Oct 24, 2024
    97025ee
  57. Apply isort and black reformatting

    Signed-off-by: yashaswikarnati <[email protected]>
    yashaswikarnati committed Oct 24, 2024
    37c6c55
  58. remove unused imports

    Yashaswi Karnati authored and yashaswikarnati committed Oct 24, 2024
    bac0f64
  59. added io_init to not save task_encoder and image_processor

    Yashaswi Karnati authored and yashaswikarnati committed Oct 24, 2024
    da05cf1
  60. Apply isort and black reformatting

    Signed-off-by: yashaswikarnati <[email protected]>
    yashaswikarnati committed Oct 24, 2024
    cfb521c
  61. added scripts for pretrain and finetune

    Signed-off-by: Yashaswi Karnati <[email protected]>
    Yashaswi Karnati authored and yashaswikarnati committed Oct 24, 2024
    d3a718f
  62. Apply isort and black reformatting

    Signed-off-by: yashaswikarnati <[email protected]>
    yashaswikarnati committed Oct 24, 2024
    438c573
  63. [🤠]: Howdy folks, let's bump Dockerfile.ci to 73e7b58 ! (#10779)

    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <[email protected]>
    2 people authored and yashaswikarnati committed Oct 24, 2024
    29a2ed8
  64. generation example

    Yashaswi Karnati authored and yashaswikarnati committed Oct 24, 2024
    c93cda7
  65. Apply isort and black reformatting

    Signed-off-by: yashaswikarnati <[email protected]>
    yashaswikarnati committed Oct 24, 2024
    b2689fd
  66. 302afb7
  67. edited merge conflict

    yashaswikarnati committed Oct 24, 2024
    accc256

Commits on Oct 28, 2024

  1. llava next end-end train

    Yashaswi Karnati committed Oct 28, 2024
    590e7cd
  2. Apply isort and black reformatting

    Signed-off-by: yashaswikarnati <[email protected]>
    yashaswikarnati committed Oct 28, 2024
    397aa80

Commits on Oct 29, 2024

  1. finetune changes

    Yashaswi Karnati committed Oct 29, 2024
    f6e9255
  2. Apply isort and black reformatting

    Signed-off-by: yashaswikarnati <[email protected]>
    yashaswikarnati committed Oct 29, 2024
    e9d4b98

Commits on Nov 2, 2024

  1. finetune debug changes

    Yashaswi Karnati committed Nov 2, 2024
    9a58842