Add LLama32 Vision Model Support in Nemo 2.0 (#10763)
* add initial code for llama vlm

Signed-off-by: yaoyu-33 <[email protected]>

* some restructure

Signed-off-by: yaoyu-33 <[email protected]>

* add mock data placeholder

Signed-off-by: yaoyu-33 <[email protected]>

* Fix some importing

Signed-off-by: yaoyu-33 <[email protected]>

* add language component for vlm llama

* update code

Signed-off-by: yaoyu-33 <[email protected]>

* now match num of params

* update language part and fix vision part

Signed-off-by: yaoyu-33 <[email protected]>

* minor fix

Signed-off-by: yaoyu-33 <[email protected]>

* model can now init

Signed-off-by: yaoyu-33 <[email protected]>

* minor update for llama32 text config

Signed-off-by: yaoyu-33 <[email protected]>

* make checkpoint loading work

* missing import

* match vision part tensor shapes with configs

Signed-off-by: yaoyu-33 <[email protected]>

* solve some fwd issues and mismatch issues

Signed-off-by: yaoyu-33 <[email protected]>

* add vision import

* fixes

Signed-off-by: yaoyu-33 <[email protected]>

* update importer to convert both text and image weights

* importer typos and reduce clutter

* fix import qkv

* some fixes for LLM

Signed-off-by: yaoyu-33 <[email protected]>

* Add embedding

* some updates

Signed-off-by: yaoyu-33 <[email protected]>

* enable loading only text or only vision

* add example script

* TP fix

Signed-off-by: yaoyu-33 <[email protected]>

* update

* upload examples

Signed-off-by: yaoyu-33 <[email protected]>

* update generate

Signed-off-by: yaoyu-33 <[email protected]>

* update to newer version

Signed-off-by: yaoyu-33 <[email protected]>

* upload for sharing

* update to new pyt ckpt

* xattn_caches matches (except small differences due to TE RMSNorm)

* cleanup

* embeddings match

* match precision of weights

* update sharded state dict

Signed-off-by: yaoyu-33 <[email protected]>

* change xattn layer num to 3 7 11 etc

* upload llama generation

* minor fix

* fix dummy layer input format

* fix vision qkv order

* fix sharded state dict

Signed-off-by: yaoyu-33 <[email protected]>

* fix vision precision

* fix rope

* match cross attn layer

* remove nrep

* Remove cross attention in ImageTransformerLayer and fix _gate_ffn

* PP draft

Signed-off-by: yaoyu-33 <[email protected]>

* Fix intermediate tensor

* temp save for pp2 is working

Signed-off-by: yaoyu-33 <[email protected]>

* fix pp issues

Signed-off-by: yaoyu-33 <[email protected]>

* merge

* update mcore parallelism initialization

Signed-off-by: yaoyu-33 <[email protected]>

* small update to pretrain script

Signed-off-by: yaoyu-33 <[email protected]>

* update mcore parallelism initialization

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* added energon dataloader for neva training (#10451)

* added energon dataloader for neva training

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <[email protected]>

* specify global batch size to support grad accumulation

* adding neva pretrain example

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <[email protected]>

* change pretrain example to handle new ckpt reloading

* fixed code quality warnings and unused imports

Signed-off-by: ykarnati <[email protected]>

* minor changes for PR comments

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <[email protected]>

* refactor conversation template config

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <[email protected]>

* remove optional import

---------

Signed-off-by: yashaswikarnati <[email protected]>
Signed-off-by: ykarnati <[email protected]>
Co-authored-by: yashaswikarnati <[email protected]>
(cherry picked from commit 7354740)

* llama energon dataloader

* have tokenizer for base task encoder class

* Update megatron_init.py

Signed-off-by: Yu Yao <[email protected]>

* Add simple inference

* evian3 update

Signed-off-by: yaoyu-33 <[email protected]>

* add encoder parallel default config

Signed-off-by: yaoyu-33 <[email protected]>

* add encoder parallel default config

Signed-off-by: yaoyu-33 <[email protected]>

* clean up

Signed-off-by: yaoyu-33 <[email protected]>

* add aspect ratio in model

* support energon dataloader

* some pp update

Signed-off-by: yaoyu-33 <[email protected]>

* fixes

Signed-off-by: yaoyu-33 <[email protected]>

* fix kv merging

Signed-off-by: yaoyu-33 <[email protected]>

* fix get_key_value_tensors

Signed-off-by: yaoyu-33 <[email protected]>

* rename files

Signed-off-by: yaoyu-33 <[email protected]>

* update to HF style position embedding

Signed-off-by: yaoyu-33 <[email protected]>

* fix energon dataloader and support batching

* update forward args

Signed-off-by: yaoyu-33 <[email protected]>

* clean up and move to aspect_ratio_ids

Signed-off-by: yaoyu-33 <[email protected]>

* rename back to language.py

Signed-off-by: yaoyu-33 <[email protected]>

* fix loss function

Signed-off-by: yaoyu-33 <[email protected]>

* update and fix energon

Signed-off-by: yaoyu-33 <[email protected]>

* Add hf import

* Fix type

* Change config

* update energon pretrain

Signed-off-by: yaoyu-33 <[email protected]>

* clean up

* clean up

* reformat

Signed-off-by: yaoyu-33 <[email protected]>

* update inference files for new code

* update to instruct

* update to instruct

* update few names

Signed-off-by: yaoyu-33 <[email protected]>

* update generation

Signed-off-by: yaoyu-33 <[email protected]>

* fix importer embedding.weight

* few fixes

Signed-off-by: yaoyu-33 <[email protected]>

* add hf script

Signed-off-by: yaoyu-33 <[email protected]>

* fix kv import

* remove interleaved

* fixes and updates

Signed-off-by: yaoyu-33 <[email protected]>

* lora fixes

Signed-off-by: yaoyu-33 <[email protected]>

* some code clean ups

Signed-off-by: yaoyu-33 <[email protected]>

* update training scripts

Signed-off-by: yaoyu-33 <[email protected]>

* refactors

Signed-off-by: yaoyu-33 <[email protected]>

* add LoRA finetuning

* fixes and nemo update

Signed-off-by: yaoyu-33 <[email protected]>

* fix importer registering issue by adding 11B and 90B configs

* update `decoder_seq_len`

Signed-off-by: yaoyu-33 <[email protected]>

* science vqa script

Signed-off-by: yaoyu-33 <[email protected]>

* clean up script name

Signed-off-by: yaoyu-33 <[email protected]>

* fix ckpt save serialization issue

* fix predefined config classes

* add num_chunks in input

Signed-off-by: yaoyu-33 <[email protected]>

* fix format

Signed-off-by: yaoyu-33 <[email protected]>

* update finetuning scripts for PEFT

* add 11b recipe (need #10645 to test)

* fix mask generation

Signed-off-by: yaoyu-33 <[email protected]>

* minor fix code style

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* Support no image inference

* add llama svqa eval

* fix masking

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* fix generation

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* add 90b recipe and revise 11b recipe

* Apply isort and black reformatting

Signed-off-by: cuichenx <[email protected]>

* clean up typing

* add option to disable vision padding

* Apply isort and black reformatting

Signed-off-by: cuichenx <[email protected]>

* base model finetuning (does not work yet)

* Apply isort and black reformatting

Signed-off-by: cuichenx <[email protected]>

* fixed default conversation template config for MLLama

* Update svqa

* add multinode

* bot happy

* Apply isort and black reformatting

Signed-off-by: cuichenx <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: artbataev <[email protected]>

* Perf improvements. Mainly from XAttn mask calculation (#10901)

* Perf improvements. Mainly from XAttn mask calculation

* Apply isort and black reformatting

Signed-off-by: parthmannan <[email protected]>

---------

Signed-off-by: parthmannan <[email protected]>
Co-authored-by: parthmannan <[email protected]>

* fix existing issues

Signed-off-by: yaoyu-33 <[email protected]>

* fix scripts

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* fix lora

* few fixes for non image support

Signed-off-by: yaoyu-33 <[email protected]>

* update masking gen

Signed-off-by: yaoyu-33 <[email protected]>

* update lazy dataset

Signed-off-by: yaoyu-33 <[email protected]>

* fix data sampler and loading issue

Signed-off-by: yaoyu-33 <[email protected]>

* Add vlm generation

* Apply isort and black reformatting

Signed-off-by: meatybobby <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* generation update

Signed-off-by: yaoyu-33 <[email protected]>

* update lazy dataset

Signed-off-by: yaoyu-33 <[email protected]>

* Fix _strategy_lib.py

Signed-off-by: Yu Yao <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* fix warning

Signed-off-by: yaoyu-33 <[email protected]>

* hide vlm examples

Signed-off-by: yaoyu-33 <[email protected]>

* Revert "Add vlm generation"

This reverts commit 4711c75

Signed-off-by: yaoyu-33 <[email protected]>

* Fix VisionEncoder multi-batch bug

* update mcore parallelism initialization

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* Update megatron_init.py

Signed-off-by: Yu Yao <[email protected]>

* add encoder parallel default config

Signed-off-by: yaoyu-33 <[email protected]>

* Fix _strategy_lib.py

Signed-off-by: Yu Yao <[email protected]>

* llm.generate fixes (#10983)

* fix context path, disable optimizer init, add tp

Signed-off-by: HuiyingLi <[email protected]>

* format

Signed-off-by: HuiyingLi <[email protected]>

* address comments, require user to provide trainer

Signed-off-by: HuiyingLi <[email protected]>

* minor fix

Signed-off-by: HuiyingLi <[email protected]>

* minor fixes

Signed-off-by: HuiyingLi <[email protected]>

---------

Signed-off-by: HuiyingLi <[email protected]>

* use __dict__ in check (#11012)

* check is_hf_model in leaf module

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* Apply isort and black reformatting

Signed-off-by: akoumpa <[email protected]>

* disable getattr alternative path

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* fix

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* undo;

Signed-off-by: Alexandros Koumparoulis <[email protected]>

---------

Signed-off-by: Alexandros Koumparoulis <[email protected]>
Signed-off-by: akoumpa <[email protected]>
Co-authored-by: akoumpa <[email protected]>

* LoRA support for HF::AutoModelForCausalLM (#10982)

* add LinearAdapter

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* add hf lora example

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* remove unused imports

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* fix

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* fix

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* subclass mixin

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* remove stale imports

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* undo

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* fix scale

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* regex selector for peft

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* move lora

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* fmt

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* hf_auto_model_for_causal_lm finetune recipe

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* Apply isort and black reformatting

Signed-off-by: akoumpa <[email protected]>

---------

Signed-off-by: Alexandros Koumparoulis <[email protected]>
Signed-off-by: akoumpa <[email protected]>
Co-authored-by: akoumpa <[email protected]>

* Change default for always_save_context to True (#11014)

Signed-off-by: Abhishree <[email protected]>
Co-authored-by: Pablo Garay <[email protected]>

* Add a build option to load_context (#10713)

* Add a build option to load_context

Signed-off-by: Marc Romeijn <[email protected]>
Signed-off-by: Alexandros Koumparoulis <[email protected]>

* Adding test

Signed-off-by: Marc Romeijn <[email protected]>
Signed-off-by: Alexandros Koumparoulis <[email protected]>

* Trying to fix failing CPU test

Signed-off-by: Marc Romeijn <[email protected]>
Signed-off-by: Alexandros Koumparoulis <[email protected]>

* cherry-pick fix

Signed-off-by: Alexandros Koumparoulis <[email protected]>

---------

Signed-off-by: Marc Romeijn <[email protected]>
Signed-off-by: Alexandros Koumparoulis <[email protected]>
Co-authored-by: Alexandros Koumparoulis <[email protected]>

* Fix pip install (#11026)

* Move AutoTokenizer inline

Signed-off-by: Marc Romeyn <[email protected]>

* Move einops to common requirements

Signed-off-by: Marc Romeyn <[email protected]>

* Move AutoTokenizer import to top-level again in fine_tuning

Signed-off-by: Marc Romeyn <[email protected]>

* Move megatron init inside nemo.lightning

Signed-off-by: Marc Romeyn <[email protected]>

* Make megatron_lazy_init_context work when transformer-engine is not installed

Signed-off-by: Marc Romeyn <[email protected]>

* Only import get_nmt_tokenizer when needed

Signed-off-by: Marc Romeyn <[email protected]>

* Apply isort and black reformatting

Signed-off-by: marcromeyn <[email protected]>

---------

Signed-off-by: Marc Romeyn <[email protected]>
Signed-off-by: marcromeyn <[email protected]>
Co-authored-by: marcromeyn <[email protected]>

* [WIP] Add docs for NEST SSL (#10804)

* add docs

Signed-off-by: stevehuang52 <[email protected]>

* update doc and fix missing param

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>

* Change dist ckpt defaults (#10913)

* Enable ckpt features by default (async ckpt), ckpt every 15mins and reduce preemption time to 1min

Signed-off-by: Shriya Palsamudram <[email protected]>

* fix ssm tests

Signed-off-by: Shriya Palsamudram <[email protected]>

* Make note that ckpt_async_save is disabled for SSMs

Signed-off-by: Shriya Palsamudram <[email protected]>

* Enable async ckpt for SSMs with fix

Signed-off-by: Shriya Palsamudram <[email protected]>

* Disable async ckpt in the peft test as it is a known bug, add note.

Signed-off-by: Shriya Palsamudram <[email protected]>

* Fix failing unit tests

Signed-off-by: Shriya Palsamudram <[email protected]>

* Ashors/peft async ckpt (#11010)

* [WIP] prototype for supporting async checkpointing with peft

Signed-off-by: ashors1 <[email protected]>
Signed-off-by: Shriya Palsamudram <[email protected]>

* Enable async ckpt for the peft test

Signed-off-by: Shriya Palsamudram <[email protected]>

* Fix peft setup test

Signed-off-by: Shriya Palsamudram <[email protected]>

---------

Signed-off-by: Shriya Palsamudram <[email protected]>
Signed-off-by: ashors1 <[email protected]>
Co-authored-by: ataghibakhsh <[email protected]>

* Akoumparouli/mixtral recipe fix r2.0.0 (#10994)

* Mixtral TP8 EP1

Signed-off-by: Alexandros Koumparoulis <[email protected]>

* Apply isort and black reformatting

Signed-off-by: akoumpa <[email protected]>

---------

Signed-off-by: Alexandros Koumparoulis <[email protected]>
Signed-off-by: akoumpa <[email protected]>
Co-authored-by: akoumpa <[email protected]>

* Fix _strategy_lib tests (#11033)

* fix world size and don't mock

Signed-off-by: Maanu Grover <[email protected]>

* cleanup global state

Signed-off-by: Maanu Grover <[email protected]>

* check app state instead

Signed-off-by: Maanu Grover <[email protected]>

* fix syntax nemo logger test

Signed-off-by: Maanu Grover <[email protected]>

---------

Signed-off-by: Maanu Grover <[email protected]>

* Update `BaseMegatronSampler` for compatibility with PTL's `_BatchProgress` (#11016)

* Revert "[NeMo-UX] Use custom `BatchProgress` class which does not restore states (#10383)"

This reverts commit b5798de.

* make megatron sampler return the total number of batches in the dataset

Signed-off-by: ashors1 <[email protected]>

---------

Signed-off-by: ashors1 <[email protected]>

* PTQ example for NeMo 2.0 (#10642)

* initial commit

Signed-off-by: Piotr Kaminski <[email protected]>

* create Quantizer for NeMo 2.0

Signed-off-by: Piotr Kaminski <[email protected]>

* refactor

Signed-off-by: Piotr Kaminski <[email protected]>

* Call quantize on an unwrapped mcore model

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* Add tests, adjust unwrapping

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* fix export

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: artbataev <[email protected]>

* Fix output_path argument for HF import

Signed-off-by: Piotr Kamiński <[email protected]>

* fix fabric ckpt loading

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* code review suggestions

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* remove unused import

Signed-off-by: Piotr Kaminski <[email protected]>

* use cnn dataset in github ci

Signed-off-by: Piotr Kaminski <[email protected]>

* applied code review

Signed-off-by: Piotr Kaminski <[email protected]>

* code review changes

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* simplify interface for data iterator

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

* (partial) PP fix

Signed-off-by: Piotr Kaminski <[email protected]>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <[email protected]>

---------

Signed-off-by: Piotr Kaminski <[email protected]>
Signed-off-by: Laplasjan107 <[email protected]>
Signed-off-by: Piotr Kamiński <[email protected]>
Signed-off-by: artbataev <[email protected]>
Co-authored-by: Piotr Kaminski <[email protected]>
Co-authored-by: Laplasjan107 <[email protected]>
Co-authored-by: artbataev <[email protected]>

* TDT compute timestamps option and Extra Whitespace handling for SPE (#10875)

* add token duration

Signed-off-by: monica-sekoyan <[email protected]>

* revert rnnt change

Signed-off-by: monica-sekoyan <[email protected]>

* add remove_extra_whitespaces arg to spe tokenizer

Signed-off-by: monica-sekoyan <[email protected]>

* add token duration retrieval

Signed-off-by: monica-sekoyan <[email protected]>

* add ignore_extra_whitespace to spe

Signed-off-by: monica-sekoyan <[email protected]>

* add compute_timestamp support for tdt

Signed-off-by: monica-sekoyan <[email protected]>

* fix config field name

Signed-off-by: monica-sekoyan <[email protected]>

* add refinement for tdt timestamps

Signed-off-by: monica-sekoyan <[email protected]>

* add segments timestamp support and refinement for ctc

Signed-off-by: monica-sekoyan <[email protected]>

* modify tests for ctc decoding timestamps

Signed-off-by: monica-sekoyan <[email protected]>

* add rnnt timestamp tests

Signed-off-by: monica-sekoyan <[email protected]>

* updated doc

Signed-off-by: monica-sekoyan <[email protected]>

* fix in test

Signed-off-by: monica-sekoyan <[email protected]>

* Apply isort and black reformatting

Signed-off-by: monica-sekoyan <[email protected]>

* fix of unicode char

Signed-off-by: monica-sekoyan <[email protected]>

* fix rnnt_decoding test

Signed-off-by: monica-sekoyan <[email protected]>

* workaround for test tokenizer

Signed-off-by: monica-sekoyan <[email protected]>

* Apply isort and black reformatting

Signed-off-by: monica-sekoyan <[email protected]>

* modify segments formation

Signed-off-by: monica-sekoyan <[email protected]>

* modify segments for ctc

Signed-off-by: monica-sekoyan <[email protected]>

* fix in ctc refinement

Signed-off-by: monica-sekoyan <[email protected]>

* Apply isort and black reformatting

Signed-off-by: monica-sekoyan <[email protected]>

* minor changes

Signed-off-by: monica-sekoyan <[email protected]>

* reverse offset change

Signed-off-by: monica-sekoyan <[email protected]>

* Apply isort and black reformatting

Signed-off-by: monica-sekoyan <[email protected]>

* warning mode=once

Signed-off-by: monica-sekoyan <[email protected]>

* Apply isort and black reformatting

Signed-off-by: monica-sekoyan <[email protected]>

* make ignore_extrawhitespaces false

Signed-off-by: monica-sekoyan <[email protected]>

* minor changes

Signed-off-by: monica-sekoyan <[email protected]>

* adjust changes to the tests

Signed-off-by: monica-sekoyan <[email protected]>

* modify prompt_formatter tests

Signed-off-by: monica-sekoyan <[email protected]>

* Apply isort and black reformatting

Signed-off-by: monica-sekoyan <[email protected]>

---------

Signed-off-by: monica-sekoyan <[email protected]>
Signed-off-by: monica-sekoyan <[email protected]>
Co-authored-by: monica-sekoyan <[email protected]>

* Basic online dynamic FP8 quantization with vLLM (#10904)

* Basic online dynamic quantization with vLLM

Signed-off-by: Jan Lasek <[email protected]>

* Apply isort and black reformatting

Signed-off-by: janekl <[email protected]>

* vllm 0.6.3 updates

Signed-off-by: Jan Lasek <[email protected]>

* Pass quantization param in deploy_vllm_triton.py script

Signed-off-by: Jan Lasek <[email protected]>

---------

Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: janekl <[email protected]>
Co-authored-by: janekl <[email protected]>

* ci: Improve VM maintenance (#10758)

* ci: Improve VM maintenance

Signed-off-by: Oliver Koenig <[email protected]>

* rename stuff

Signed-off-by: Oliver Koenig <[email protected]>

* title

Signed-off-by: Oliver Koenig <[email protected]>

* use team

Signed-off-by: Oliver Koenig <[email protected]>

* run on failure too

Signed-off-by: Oliver Koenig <[email protected]>

* fix

Signed-off-by: Oliver Koenig <[email protected]>

* yrdy

Signed-off-by: Oliver Koenig <[email protected]>

* f

Signed-off-by: Oliver Koenig <[email protected]>

* test

Signed-off-by: Oliver Koenig <[email protected]>

* fix

Signed-off-by: Oliver Koenig <[email protected]>

* f

Signed-off-by: Oliver Koenig <[email protected]>

* f

Signed-off-by: Oliver Koenig <[email protected]>

* f

Signed-off-by: Oliver Koenig <[email protected]>

---------

Signed-off-by: Oliver Koenig <[email protected]>

* Add comment for vision transpose

* update megatron_init.py inside lightning

Signed-off-by: yaoyu-33 <[email protected]>

* rename llama to mllama folder name

Signed-off-by: yaoyu-33 <[email protected]>

* update to attention bias

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* update dropout to 0

Signed-off-by: yaoyu-33 <[email protected]>

* fix attention bias

Signed-off-by: yaoyu-33 <[email protected]>

* remove disable_vision_padding since we now have a fix

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* Update init for mllama

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* Address comments

Signed-off-by: yaoyu-33 <[email protected]>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <[email protected]>

* fix copyright title

Signed-off-by: yaoyu-33 <[email protected]>

* fix code scan

Signed-off-by: yaoyu-33 <[email protected]>

* update vision code

Signed-off-by: yaoyu-33 <[email protected]>

* revert attention bias changes until latest MLM code got merged

Signed-off-by: yaoyu-33 <[email protected]>

* fix warning

Signed-off-by: yaoyu-33 <[email protected]>

* Turn off system message check, as it's "" now

Signed-off-by: yaoyu-33 <[email protected]>

* Rollback megatron_parallel.py

Signed-off-by: Yu Yao <[email protected]>

---------

Signed-off-by: yaoyu-33 <[email protected]>
Signed-off-by: yaoyu-33 <[email protected]>
Signed-off-by: Yu Yao <[email protected]>
Signed-off-by: cuichenx <[email protected]>
Signed-off-by: Chen Cui <[email protected]>
Signed-off-by: artbataev <[email protected]>
Signed-off-by: parthmannan <[email protected]>
Signed-off-by: meatybobby <[email protected]>
Signed-off-by: HuiyingLi <[email protected]>
Signed-off-by: Alexandros Koumparoulis <[email protected]>
Signed-off-by: akoumpa <[email protected]>
Signed-off-by: Abhishree <[email protected]>
Signed-off-by: Marc Romeijn <[email protected]>
Signed-off-by: Marc Romeyn <[email protected]>
Signed-off-by: marcromeyn <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Signed-off-by: Shriya Palsamudram <[email protected]>
Signed-off-by: ashors1 <[email protected]>
Signed-off-by: Maanu Grover <[email protected]>
Signed-off-by: Piotr Kaminski <[email protected]>
Signed-off-by: Laplasjan107 <[email protected]>
Signed-off-by: Piotr Kamiński <[email protected]>
Signed-off-by: monica-sekoyan <[email protected]>
Signed-off-by: monica-sekoyan <[email protected]>
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: janekl <[email protected]>
Signed-off-by: Oliver Koenig <[email protected]>
Co-authored-by: Ao Tang <[email protected]>
Co-authored-by: Chen Cui <[email protected]>
Co-authored-by: Bobby Chen <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Yashaswi Karnati <[email protected]>
Co-authored-by: ykarnati <[email protected]>
Co-authored-by: cuichenx <[email protected]>
Co-authored-by: Yashaswi Karnati <[email protected]>
Co-authored-by: artbataev <[email protected]>
Co-authored-by: Parth Mannan <[email protected]>
Co-authored-by: parthmannan <[email protected]>
Co-authored-by: meatybobby <[email protected]>
Co-authored-by: Huiying <[email protected]>
Co-authored-by: Alexandros Koumparoulis <[email protected]>
Co-authored-by: akoumpa <[email protected]>
Co-authored-by: Abhishree Thittenamane <[email protected]>
Co-authored-by: Pablo Garay <[email protected]>
Co-authored-by: Marc Romeyn <[email protected]>
Co-authored-by: Alexandros Koumparoulis <[email protected]>
Co-authored-by: marcromeyn <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: Shriya Rishab <[email protected]>
Co-authored-by: ataghibakhsh <[email protected]>
Co-authored-by: Maanu Grover <[email protected]>
Co-authored-by: Anna Shors <[email protected]>
Co-authored-by: Piotr Kamiński <[email protected]>
Co-authored-by: Piotr Kaminski <[email protected]>
Co-authored-by: Laplasjan107 <[email protected]>
Co-authored-by: monica-sekoyan <[email protected]>
Co-authored-by: monica-sekoyan <[email protected]>
Co-authored-by: Jan Lasek <[email protected]>
Co-authored-by: janekl <[email protected]>
Co-authored-by: oliver könig <[email protected]>
1 parent 124aa06 commit 4afa427
Showing 31 changed files with 3,986 additions and 52 deletions.
14 changes: 12 additions & 2 deletions nemo/collections/multimodal/data/energon/base.py
@@ -11,8 +11,9 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from copy import deepcopy
from typing import TYPE_CHECKING, Any, Dict, Literal, Optional
from typing import Any, Dict, Literal, Optional

import fiddle as fdl
import pytorch_lightning as pl
@@ -66,6 +67,7 @@ def __init__(
pin_memory: bool = True,
multimodal_sample_config: Optional[MultiModalSampleConfig] = MultiModalSampleConfig(),
task_encoder: Optional[MultiModalTaskEncoder] = None,
decoder_seq_length: Optional[int] = None,
) -> None:
"""
Initialize the SimpleMultiModalDataModule.
@@ -87,6 +89,7 @@ def __init__(
self.tokenizer = tokenizer
self.image_processor = image_processor
self.seq_length = seq_length
self.decoder_seq_length = decoder_seq_length
self.micro_batch_size = micro_batch_size
self.global_batch_size = global_batch_size
self.num_workers = num_workers
@@ -99,13 +102,18 @@ def __init__(
)
self.init_global_step = 0
self.data_sampler = SequentialMegatronSampler(
seq_len=self.seq_length, micro_batch_size=self.micro_batch_size, global_batch_size=self.global_batch_size
seq_len=self.seq_length,
decoder_seq_len=self.decoder_seq_length,
micro_batch_size=self.micro_batch_size,
global_batch_size=self.global_batch_size,
)
self.train_dataloader_object = None
self.val_dataloader_object = None

def io_init(self, **kwargs) -> fdl.Config[Self]:
# (pleasefixme) image_processor and task_encoder are problematic with Fiddle so we skip serializing them for now
cfg_kwargs = {k: deepcopy(v) for k, v in kwargs.items() if k not in ['image_processor', 'task_encoder']}

for val in cfg_kwargs.values():
if not serialization.find_node_traverser(type(val)):
track_io(type(val))
@@ -323,6 +331,7 @@ def __init__(
micro_batch_size: int = 4,
global_batch_size: int = 8,
init_consumed_samples: int = 0,
decoder_seq_len: Optional[int] = None,
init_global_step=0,
):
"""
@@ -336,6 +345,7 @@ def __init__(
"""
super().__init__(
seq_len=seq_len,
decoder_seq_len=decoder_seq_len,
micro_batch_size=micro_batch_size,
global_batch_size=global_batch_size,
init_consumed_samples=init_consumed_samples,
Expand Down
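
The hunks above thread a new optional `decoder_seq_length` argument from the data module into its sampler, where it is named `decoder_seq_len`. A minimal sketch of that plumbing follows. It assumes nothing about NeMo beyond what the diff shows: `SamplerSketch` and `DataModuleSketch` are hypothetical stand-ins, not the real `SequentialMegatronSampler` or `SimpleMultiModalDataModule`, and the numeric values are arbitrary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplerSketch:
    """Hypothetical stand-in for a Megatron-style data sampler (not the NeMo class)."""

    seq_len: int
    micro_batch_size: int = 4
    global_batch_size: int = 8
    # None means the caller did not request a separate decoder sequence length.
    decoder_seq_len: Optional[int] = None


class DataModuleSketch:
    """Hypothetical stand-in for the data module; only the argument plumbing is modeled."""

    def __init__(
        self,
        seq_length: int,
        micro_batch_size: int = 1,
        global_batch_size: int = 1,
        decoder_seq_length: Optional[int] = None,
    ) -> None:
        self.seq_length = seq_length
        self.decoder_seq_length = decoder_seq_length
        # The optional value is forwarded as-is, so None propagates to the sampler.
        self.data_sampler = SamplerSketch(
            seq_len=self.seq_length,
            decoder_seq_len=self.decoder_seq_length,
            micro_batch_size=micro_batch_size,
            global_batch_size=global_batch_size,
        )


# Existing call sites keep working without the new argument.
text_only = DataModuleSketch(seq_length=2048)
# New call sites can pass a separate decoder length (arbitrary example value).
with_decoder = DataModuleSketch(seq_length=2048, decoder_seq_length=512)
```

Defaulting the new argument to `None` keeps existing callers unchanged; only callers that need a distinct decoder length have to pass one.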
8 changes: 1 addition & 7 deletions nemo/collections/multimodal/data/energon/config.py
@@ -15,7 +15,7 @@
from dataclasses import dataclass, field
from typing import List
import torch
from nemo.collections.multimodal.data.energon.conversation import BaseConversationTemplateConfig
from nemo.collections.multimodal.data.energon.conversation import LLaVATemplateConfig


@dataclass
@@ -56,12 +56,6 @@ class ImageTextRawBatch:
loss_mask: torch.Tensor = field(default_factory=lambda: torch.empty(0, dtype=torch.float))


class LLaVATemplateConfig(BaseConversationTemplateConfig):
"""LLava specific template configuration which extends the base config"""

pass


@dataclass
class MultiModalSampleConfig:
image_token: ImageToken = field(default_factory=ImageToken)
20 changes: 20 additions & 0 deletions nemo/collections/multimodal/data/energon/conversation.py
@@ -19,6 +19,15 @@
class BaseConversationTemplateConfig:
"""Conversation template config related parameters"""

system: Optional[str] = "".format() # fmt: off
roles: List[str] = field(default_factory=lambda: ['user', 'assistant'])
stop_string: Optional[str] = None
chat_template = None


class LLaVATemplateConfig(BaseConversationTemplateConfig):
"""LLava specific template configuration which extends the base config"""

system: Optional[str] = (
"A chat between a curious user and artificial assistant agent. The assistant gives helpful, detailed and polite answers to user's questions.".format()
) # fmt: off
@@ -36,3 +45,14 @@ class BaseConversationTemplateConfig:
{%- endif %}
{%- endfor -%}
"""


class MLlamaTemplateConfig(BaseConversationTemplateConfig):
"""LLava specific template configuration which extends the base config"""

system: Optional[str] = None
roles: List[str] = field(default_factory=lambda: ['user', 'assistant'])
stop_string: Optional[str] = None
chat_template = """
'{{- bos_token }}\n{%- if custom_tools is defined %}\n {%- set tools = custom_tools %}\n{%- endif %}\n{%- if not tools_in_user_message is defined %}\n {%- set tools_in_user_message = true %}\n{%- endif %}\n{%- if not date_string is defined %}\n {%- if strftime_now is defined %}\n {%- set date_string = strftime_now("%d %b %Y") %}\n {%- else %}\n {%- set date_string = "26 Jul 2024" %}\n {%- endif %}\n{%- endif %}\n{%- if not tools is defined %}\n {%- set tools = none %}\n{%- endif %}\n\n{#- This block extracts the system message, so we can slot it into the right place. #}\n{%- if messages[0][\'role\'] == \'system\' %}\n {%- set system_message = messages[0][\'content\']|trim %}\n {%- set messages = messages[1:] %}\n{%- else %}\n {%- set system_message = "" %}\n{%- endif %}\n\n{#- Find out if there are any images #}\n{% set image_ns = namespace(has_images=false) %} \n{%- for message in messages %}\n {%- for content in message[\'content\'] %}\n {%- if content[\'type\'] == \'image\' %}\n {%- set image_ns.has_images = true %}\n {%- endif %}\n {%- endfor %}\n{%- endfor %}\n\n{#- Error out if there are images and system message #}\n{%- if image_ns.has_images and not system_message == "" %}\n {{- raise_exception("Prompting with images is incompatible with system messages.") }}\n{%- endif %}\n\n{#- System message if there are no images #}\n{%- if not image_ns.has_images %}\n {{- "<|start_header_id|>system<|end_header_id|>\\n\\n" }}\n {%- if tools is not none %}\n {{- "Environment: ipython\\n" }}\n {%- endif %}\n {{- "Cutting Knowledge Date: December 2023\\n" }}\n {{- "Today Date: " + date_string + "\\n\\n" }}\n {%- if tools is not none and not tools_in_user_message %}\n {{- "You have access to the following functions. To call a function, please respond with JSON for a function call." 
}}\n {{- \'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.\' }}\n {{- "Do not use variables.\\n\\n" }}\n {%- for t in tools %}\n {{- t | tojson(indent=4) }}\n {{- "\\n\\n" }}\n {%- endfor %}\n {%- endif %}\n {{- system_message }}\n {{- "<|eot_id|>" }}\n{%- endif %}\n\n{#- Custom tools are passed in a user message with some extra guidance #}\n{%- if tools_in_user_message and not tools is none %}\n {#- Extract the first user message so we can plug it in here #}\n {%- if messages | length != 0 %}\n {%- set first_user_message = messages[0][\'content\']|trim %}\n {%- set messages = messages[1:] %}\n {%- else %}\n {{- raise_exception("Cannot put tools in the first user message when there\'s no first user message!") }}\n{%- endif %}\n {{- \'<|start_header_id|>user<|end_header_id|>\\n\\n\' -}}\n {{- "Given the following functions, please respond with a JSON for a function call " }}\n {{- "with its proper arguments that best answers the given prompt.\\n\\n" }}\n {{- \'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.\' }}\n {{- "Do not use variables.\\n\\n" }}\n {%- for t in tools %}\n {{- t | tojson(indent=4) }}\n {{- "\\n\\n" }}\n {%- endfor %}\n {{- first_user_message + "<|eot_id|>"}}\n{%- endif %}\n\n{%- for message in messages %}\n {%- if not (message.role == \'ipython\' or message.role == \'tool\' or \'tool_calls\' in message) %}\n {{- \'<|start_header_id|>\' + message[\'role\'] + \'<|end_header_id|>\\n\\n\' }}\n {%- if message[\'content\'] is string %}\n {{- message[\'content\'] }}\n {%- else %}\n {%- for content in message[\'content\'] %}\n {%- if content[\'type\'] == \'image\' %}\n {{- \'<|image|>\' }}\n {%- elif content[\'type\'] == \'text\' %}\n {{- content[\'text\'] }}\n {%- endif %}\n {%- endfor %}\n {%- endif %}\n {{- \'<|eot_id|>\' }}\n {%- elif \'tool_calls\' in message %}\n {%- if not message.tool_calls|length == 1 %}\n {{- raise_exception("This 
model only supports single tool-calls at once!") }}\n {%- endif %}\n {%- set tool_call = message.tool_calls[0].function %}\n {{- \'<|start_header_id|>assistant<|end_header_id|>\\n\\n\' -}}\n {{- \'{"name": "\' + tool_call.name + \'", \' }}\n {{- \'"parameters": \' }}\n {{- tool_call.arguments | tojson }}\n {{- "}" }}\n {{- "<|eot_id|>" }}\n {%- elif message.role == "tool" or message.role == "ipython" %}\n {{- "<|start_header_id|>ipython<|end_header_id|>\\n\\n" }}\n {%- if message.content is mapping or message.content is iterable %}\n {{- message.content | tojson }}\n {%- else %}\n {{- message.content }}\n {%- endif %}\n {{- "<|eot_id|>" }}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- \'<|start_header_id|>assistant<|end_header_id|>\\n\\n\' }}\n{%- endif %}\n'
"""
2 changes: 1 addition & 1 deletion nemo/collections/multimodal/data/energon/task_encoder.py
@@ -62,7 +62,7 @@ def __init__(self, tokenizer, image_processor, multimodal_sample_config):
image_processor (ImageProcessor): The image processor used for preprocessing images across different sample types.
multimodal_sample_config (MultiModalSampleConfig): Configuration object for multimodal samples, including tokens and placeholders.
"""

self.tokenizer = tokenizer
self.encoders: Dict[str, SampleEncoder] = {
VQASample.__name__: VQASampleEncoder(
tokenizer=tokenizer,
51 changes: 45 additions & 6 deletions nemo/collections/vlm/__init__.py
@@ -1,3 +1,30 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from nemo.collections.vlm.mllama.data import MLlamaLazyDataModule, MLlamaMockDataModule
from nemo.collections.vlm.mllama.model.base import (
CrossAttentionTextConfig,
CrossAttentionVisionConfig,
MLlamaModel,
MLlamaModelConfig,
)
from nemo.collections.vlm.mllama.model.mllama import (
MLlamaConfig11B,
MLlamaConfig11BInstruct,
MLlamaConfig90B,
MLlamaConfig90BInstruct,
)
from nemo.collections.vlm.neva.data import (
DataConfig,
ImageDataConfig,
@@ -6,24 +33,26 @@
MockDataModule,
MultiModalToken,
NevaLazyDataModule,
NevaMockDataModule,
VideoDataConfig,
VideoToken,
)
from nemo.collections.vlm.neva.model import (
from nemo.collections.vlm.neva.model.base import (
CLIPViTConfig,
HFCLIPVisionConfig,
Llava1_5Config7B,
Llava1_5Config13B,
LlavaConfig,
LlavaModel,
MultimodalProjectorConfig,
NevaConfig,
NevaModel,
)
from nemo.collections.vlm.neva.model.llava import Llava1_5Config7B, Llava1_5Config13B, LlavaConfig, LlavaModel
from nemo.collections.vlm.peft import LoRA
from nemo.collections.vlm.recipes import *

__all__ = [
"MockDataModule",
"NevaMockDataModule",
"NevaLazyDataModule",
"MLlamaMockDataModule",
"MLlamaLazyDataModule",
"DataConfig",
"ImageDataConfig",
"VideoDataConfig",
@@ -39,5 +68,15 @@
"Llava1_5Config7B",
"Llava1_5Config13B",
"LlavaModel",
"MLlamaModel",
"MLlamaModelConfig",
"CrossAttentionTextConfig",
"CrossAttentionVisionConfig",
"MLlamaConfig11B",
"MLlamaConfig11BInstruct",
"MLlamaConfig90B",
"MLlamaConfig90BInstruct",
"mllama_11b",
"mllama_90b",
"LlavaNextTaskEncoder",
]
17 changes: 17 additions & 0 deletions nemo/collections/vlm/mllama/__init__.py
@@ -0,0 +1,17 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from transformers import PreTrainedTokenizerFast
from nemo.lightning.io import track_io

track_io(PreTrainedTokenizerFast)
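The call above registers a third-party tokenizer class with NeMo's IO tracking. As a rough illustration of the general idea only, and not NeMo's actual implementation, the sketch below wraps a class's `__init__` so that each instance remembers the arguments it was constructed with. The names `track_init_args` and `init_args` are hypothetical.

```python
import functools
import inspect


def track_init_args(cls):
    """Wrap cls.__init__ so every instance records its constructor arguments."""
    # Check the class's own namespace so subclasses are wrapped independently.
    if cls.__dict__.get("_init_args_tracked", False):
        return cls

    original_init = cls.__init__
    signature = inspect.signature(original_init)
    self_name = next(iter(signature.parameters))

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()
        captured = dict(bound.arguments)
        captured.pop(self_name, None)
        original_init(self, *args, **kwargs)
        # Assign after construction so the outermost class's arguments win.
        self.init_args = captured

    cls.__init__ = wrapped_init
    cls._init_args_tracked = True
    return cls


class ToyTokenizer:
    """Stand-in for a third-party class that we cannot edit directly."""

    def __init__(self, vocab_file, lowercase=False):
        self.vocab_file = vocab_file
        self.lowercase = lowercase


track_init_args(ToyTokenizer)
```

With the arguments recorded, an equivalent object can be rebuilt later via `ToyTokenizer(**tok.init_args)`, which is the kind of round trip a checkpointing system needs.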
21 changes: 21 additions & 0 deletions nemo/collections/vlm/mllama/data/__init__.py
@@ -0,0 +1,21 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from nemo.collections.vlm.mllama.data.lazy import MLlamaLazyDataModule
from nemo.collections.vlm.mllama.data.mock import MockDataModule as MLlamaMockDataModule

__all__ = [
"MLlamaMockDataModule",
"MLlamaLazyDataModule",
]