This repository contains code to finetune and run Aurora-M, an open-source model based on StarCoderPlus, trained on 400B additional tokens of multilingual and multi-domain data and adapted for multimodal understanding using the BakLLaVA/LLaVA 1.5 codebase. The 400B additional tokens were trained with BigCode's Megatron fork. The model is intended for mixture-of-experts (MoE) adaptation using the M*DEL MoE adaptation method. See our M*DEL project page for more details.
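As a quick orientation, here is a minimal sketch of loading and prompting the text-only Aurora-M checkpoint with Hugging Face `transformers`. The model ID, dtype, and device settings are assumptions and should be replaced with the checkpoint you actually intend to use:

```python
# Minimal sketch: load Aurora-M (StarCoderPlus-based causal LM) and generate text.
# The model ID below is a placeholder assumption; point it at your local
# checkpoint or the Hub ID of the Aurora-M weights you want to run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "aurora-m/aurora-m-base"  # assumption: replace with your checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference on a recent GPU
    device_map="auto",
)

prompt = "Translate to Finnish: The northern lights are visible tonight."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Multimodal finetuning and inference follow the BakLLaVA/LLaVA 1.5 training and serving scripts rather than this plain-text path.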
Compute was provided by the LUMI supercomputer center and the JUWELS supercomputer center. Thank you!
Also check out our BakLLaVA project, a collaboration between the open-source AI organizations LAION, Ontocord, Skunkworks OSS AI group, and AI Alignment Lab.