🐛 Describe the bug
I attempted to train LLaVA (base LLM = LLaMA 3) using the Liger kernel. The loss curve was similar to when I trained LLaVA without the Liger kernel. However, the model trained with the Liger kernel showed lower performance on MLLM benchmarks such as ChartQA. Since I used LLaMA 3, which is supported by Liger, I didn't expect any issues. Has anyone else tried training LLaVA with the Liger kernel?
Reproduce
import torch
from liger_kernel.transformers import apply_liger_kernel_to_llama
from llava.model import LlavaLlamaForCausalLM  # from the LLaVA repo

# Patch the Hugging Face LLaMA modeling code with Liger kernels
# before the model is instantiated.
print("Apply liger_kernel_to_llama")
apply_liger_kernel_to_llama()

model = LlavaLlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
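If the regression reproduces, one way to narrow it down is to re-train with individual Liger kernels disabled and see which one changes the benchmark scores. The keyword arguments below (rope, rms_norm, swiglu, cross_entropy, fused_linear_cross_entropy) match recent liger-kernel releases, but treat the exact names as an assumption and check the signature of your installed version. This is a sketch, not a confirmed fix:

from liger_kernel.transformers import apply_liger_kernel_to_llama

# Sketch: bisect the regression by toggling one kernel at a time.
# Flag names assume a recent liger-kernel release; verify against
# the installed version's signature before relying on them.
apply_liger_kernel_to_llama(
    rope=True,
    rms_norm=True,
    swiglu=True,
    cross_entropy=False,
    fused_linear_cross_entropy=False,  # disable fused loss, fall back to the stock HF cross entropy
)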
Versions
transformers = 4.45.1
torch = 2.4.0
GPU: A100
I used XTuner to train LLaVA, and there was no decrease in performance. I find this feature very useful and highly recommend it!
The training time remains almost unchanged, and GPU memory usage is reduced by about 20%. If the sequence length is increased or a smaller model is used, memory usage can be reduced by up to 50%.
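For anyone trying to reproduce the memory numbers above, a minimal way to compare peak GPU memory with and without the Liger patch is to record torch.cuda.max_memory_allocated around a training step. The snippet below is only a sketch; the training-step placeholder is hypothetical and should be replaced with your actual LLaVA loop:

import torch

def report_peak_memory(tag: str) -> None:
    # Peak allocated memory since the last reset, in GiB.
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    print(f"{tag}: peak GPU memory = {peak_gib:.2f} GiB")

# Sketch: run once with and once without apply_liger_kernel_to_llama()
# and compare the reported peaks.
torch.cuda.reset_peak_memory_stats()
# ... run one forward/backward step of your LLaVA training loop here ...
report_peak_memory("after one step")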