We propose MeteoRA (Multiple-Tasks Embedded LoRA) for large language models. MeteoRA is a scalable and efficient framework that embeds multiple task-specific LoRA adapters into the base LLM via a full-mode Mixture-of-Experts (MoE) architecture. The framework also includes novel MoE forward acceleration strategies to address the efficiency challenges of traditional MoE implementations. Our evaluations, using the LLaMA2-13B and LLaMA3-8B base models equipped with 28 off-the-shelf LoRA adapters through MeteoRA, demonstrate performance equivalent to the traditional PEFT method. Moreover, the LLM equipped with MeteoRA achieves superior performance on composite tasks, solving ten sequential problems in a single inference pass, which demonstrates the framework's enhanced capability for timely adapter switching and multi-LoRA fusion.
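For readers unfamiliar with the idea, here is a minimal, hypothetical sketch of what a gated mixture of LoRA adapters inside a linear layer can look like. This is our illustration only, not the actual MeteoRA implementation or its acceleration kernels; the class name, parameter names, and top-k routing choice below are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELoRALinear(nn.Module):
    """Illustrative sketch: a frozen linear layer whose output is augmented by a
    token-wise, top-k gated mixture of embedded LoRA adapters."""

    def __init__(self, base: nn.Linear, num_adapters: int, rank: int = 8, top_k: int = 2):
        super().__init__()
        self.base = base                      # frozen pretrained projection
        self.top_k = top_k
        in_f, out_f = base.in_features, base.out_features
        # One (A, B) low-rank pair per embedded LoRA adapter
        self.A = nn.Parameter(torch.randn(num_adapters, in_f, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_adapters, rank, out_f))
        # Gating network that selects adapters per token
        self.gate = nn.Linear(in_f, num_adapters)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., in_features)
        y = self.base(x)
        logits = self.gate(x)                                   # (..., num_adapters)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)   # pick top-k adapters per token
        weights = F.softmax(weights, dim=-1)
        for slot in range(self.top_k):
            a = self.A[idx[..., slot]]                          # (..., in_f, rank)
            b = self.B[idx[..., slot]]                          # (..., rank, out_f)
            delta = torch.einsum('...i,...ir,...ro->...o', x, a, b)
            y = y + weights[..., slot:slot + 1] * delta         # weighted LoRA update
        return y
```

The sketch only conveys the routing-over-adapters idea; the actual framework additionally addresses the efficiency of this forward pass, as noted above.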
We believe this work matches your survey's focus; please consider including this paper.
We have also released the code for inference and training, as well as the models on Hugging Face (MeteoRA with LLaMA2-13B and MeteoRA with LLaMA3-8B).