Add Mixture Of Agents paper implementation (run-llama#14112)
1 parent 7c512fb · commit 5cd850f
Showing 13 changed files with 438 additions and 0 deletions.
153 changes: 153 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/.gitignore

llama_index/_static
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
bin/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
etc/
include/
lib/
lib64/
parts/
sdist/
share/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
.ruff_cache

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints
notebooks/

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
pyvenv.cfg

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Jetbrains
.idea
modules/
*.swp

# VsCode
.vscode

# pipenv
Pipfile
Pipfile.lock

# pyright
pyrightconfig.json
7 changes: 7 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/BUILD

poetry_requirements(
    name="poetry",
)

python_requirements(
    name="reqs",
)
5 changes: 5 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/CHANGELOG.md

# CHANGELOG

## [0.1.2] - 2024-02-13

- Add maintainers and keywords from library.json (llamahub)
17 changes: 17 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/Makefile

GIT_ROOT ?= $(shell git rev-parse --show-toplevel)

help:	## Show all Makefile targets.
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[33m%-30s\033[0m %s\n", $$1, $$2}'

format:	## Run code autoformatters (black).
	pre-commit install
	git ls-files | xargs pre-commit run black --files

lint:	## Run linters: pre-commit (black, ruff, codespell) and mypy
	pre-commit install && git ls-files | xargs pre-commit run --show-diff-on-failure --files

test:	## Run tests via pytest.
	pytest tests

watch-docs:	## Build and watch documentation.
	sphinx-autobuild docs/ docs/_build/html --open-browser --watch $(GIT_ROOT)/llama_index/
57 changes: 57 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/README.md

# Mixture-Of-Agents Pack

Implementation of the [Mixture-Of-Agents](https://arxiv.org/abs/2406.04692) paper from Together AI as a LlamaPack.

Disclaimer: While the paper names the method "Mixture of Agents", the "agents" are the LLMs themselves; the method does not involve actual agentic behaviour.

### Approach

The capabilities of LLMs have advanced significantly, and a growing number of these models is now available. To maximize their potential, we need to harness the collective expertise of multiple LLMs. This is where the Mixture-of-Agents (MoA) approach comes in.

The MoA approach is a layered architecture in which each layer consists of multiple LLM agents. These agents collaborate by taking the outputs of agents in the previous layer as auxiliary information when generating their own responses, which allows answers to be refined and enhanced as the agents build on each other's strengths. The process involves two roles: Proposers (the reference LLMs), which generate diverse context and perspectives, and an Aggregator (the main LLM), which synthesizes these proposals into a single, high-quality output. By stacking layers and iteratively refining the responses, MoA aims to maximize the collaborative potential of multiple LLMs, leading to superior outcomes. A minimal sketch of this control flow is shown below.

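As a rough illustration of the layering (independent of this pack's API), the control flow looks roughly like this; `propose` and `aggregate` are hypothetical stand-ins for single LLM calls, not functions from the library:

```python
from typing import Callable, List

# Hypothetical stand-ins: each takes (query, previous_responses) -> response.
Proposer = Callable[[str, List[str]], str]
Aggregator = Callable[[str, List[str]], str]


def mixture_of_agents(
    query: str, proposers: List[Proposer], aggregate: Aggregator, num_layers: int = 3
) -> str:
    prev_responses: List[str] = []
    for _ in range(num_layers):
        # Every proposer answers the query, seeing the previous layer's answers.
        prev_responses = [propose(query, prev_responses) for propose in proposers]
    # The aggregator synthesizes the final layer's proposals into one answer.
    return aggregate(query, prev_responses)
```

In this pack, the `reference_llms` fill the proposer role and `llm` fills the aggregator role.
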
## CLI Usage

You can download llamapacks directly using `llamaindex-cli`, which comes installed with the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack MixtureOfAgentsPack --download-dir ./mixture_of_agents_pack
```

You can then inspect the files at `./mixture_of_agents_pack` and use them as a template for your own project.

## Code Usage

You can download the pack to the `./mixture_of_agents_pack` directory:

```python
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
MixtureOfAgentsPack = download_llama_pack(
    "MixtureOfAgentsPack", "./mixture_of_agents_pack"
)

from llama_index.llms.openai import OpenAI
from llama_index.llms.mistralai import MistralAI

# Add OPENAI_API_KEY and MISTRAL_API_KEY to your environment variables

mixture_of_agents_pack = MixtureOfAgentsPack(
    llm=OpenAI(model="gpt-4"),  # Aggregator
    reference_llms=[
        OpenAI(model="gpt-3.5-turbo"),
        MistralAI(model="mistral-medium"),
    ],  # Proposers
    num_layers=3,
    temperature=0.1,
)
```

From here, you can use the pack, or inspect and modify the pack in `./mixture_of_agents_pack`.

The `run()` function is a light wrapper around the approach proposed in the paper.

```python
response = mixture_of_agents_pack.run("What is LlamaIndex?")
```
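
Note that `run()` simply wraps the async `get_answer()` in `asyncio.run()`, so calling it inside an already-running event loop (in a Jupyter notebook, for instance) will raise a `RuntimeError`; in that case you can await the coroutine directly:

```python
# In an async context (e.g. a notebook cell with top-level await):
response = await mixture_of_agents_pack.get_answer("What is LlamaIndex?")
```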
1 change: 1 addition & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/llama_index/packs/mixture_of_agents/BUILD

python_sources()
3 changes: 3 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/llama_index/packs/mixture_of_agents/__init__.py

from llama_index.packs.mixture_of_agents.base import MixtureOfAgentsPack

__all__ = ["MixtureOfAgentsPack"]
123 changes: 123 additions & 0 deletions
llama-index-packs/llama-index-packs-mixture-of-agents/llama_index/packs/mixture_of_agents/base.py

# Reference: https://github.com/togethercomputer/MoA

import asyncio
import copy
import logging
import sys
from typing import Any, Dict, List

from llama_index.core.llama_pack.base import BaseLlamaPack
from llama_index.core.llms import ChatMessage
from llama_index.core.llms.llm import LLM

# Log progress to stdout (basicConfig installs a stdout stream handler).
logger = logging.getLogger(__name__)
logging.basicConfig(stream=sys.stdout, level=logging.INFO)


class MixtureOfAgentsPack(BaseLlamaPack):
    """Mixture-of-Agents: reference LLMs propose answers, the main LLM aggregates."""

    def __init__(
        self,
        llm: LLM,
        reference_llms: List[LLM],
        num_layers: int = 3,
        max_tokens: int = 2048,
        temperature: float = 0.7,
    ) -> None:
        self.llm = llm
        self.reference_llms = reference_llms
        self.num_layers = num_layers
        self.max_tokens = max_tokens
        self.temperature = temperature

    def inject_references_to_messages(
        self,
        messages: List[ChatMessage],
        references: List[str],
    ) -> List[ChatMessage]:
        # Prepend (or extend) a system message that lists the reference
        # responses for the model to synthesize.
        messages = copy.deepcopy(messages)

        system = f"""You have been provided with a set of responses from various open-source models to the latest user query. Your task is to synthesize these responses into a single, high-quality response. It is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect. Your response should not simply replicate the given answers but should offer a refined, accurate, and comprehensive reply to the instruction. Ensure your response is well-structured, coherent, and adheres to the highest standards of accuracy and reliability.
Responses from models:"""

        for i, reference in enumerate(references):
            system += f"\n{i+1}. {reference}"

        if messages[0].role == "system":
            messages[0].content += "\n\n" + system
        else:
            messages = [ChatMessage(role="system", content=system), *messages]

        return messages

    async def agenerate_with_references(
        self,
        llm: LLM,
        messages: List[ChatMessage],
        references: List[str],
        max_tokens: int,
        temperature: float,
    ) -> str:
        if len(references) > 0:
            messages = self.inject_references_to_messages(messages, references)

        return str(
            await llm.achat(messages, max_tokens=max_tokens, temperature=temperature)
        ).strip()

    async def get_answer(self, query_str: str) -> str:
        messages = [ChatMessage(role="user", content=query_str)]

        references = []

        if len(self.reference_llms) > 0:
            prev_references = []

            for layer in range(self.num_layers):
                logger.info(
                    f"Round {layer+1}/{self.num_layers}: collecting reference responses."
                )

                # Each reference LLM (proposer) answers concurrently,
                # conditioned on the previous layer's responses.
                jobs = [
                    self.agenerate_with_references(
                        llm=reference_llm,
                        messages=messages,
                        references=prev_references,
                        max_tokens=self.max_tokens,
                        temperature=self.temperature,
                    )
                    for reference_llm in self.reference_llms
                ]

                references = await asyncio.gather(*jobs)

                if layer < self.num_layers - 1:
                    prev_references = references
                    references = []

        # Final aggregation: the main LLM synthesizes the last layer's responses.
        return await self.agenerate_with_references(
            llm=self.llm,
            messages=messages,
            references=references,
            max_tokens=self.max_tokens,
            temperature=self.temperature,
        )

    def get_modules(self) -> Dict[str, Any]:
        """Get modules."""
        return {
            "llm": self.llm,
            "reference_llms": self.reference_llms,
            "num_layers": self.num_layers,
        }

    def run(self, query_str: str, **kwargs: Any) -> Any:
        """Run the pipeline."""
        return asyncio.run(self.get_answer(query_str))
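
For a quick offline look at the reference-injection step implemented in `base.py` above, here is a minimal sketch; it assumes `MockLLM` (from `llama_index.core.llms`) is available as a stand-in model, and the reference strings are invented for illustration:

```python
from llama_index.core.llms import ChatMessage, MockLLM
from llama_index.packs.mixture_of_agents import MixtureOfAgentsPack

# MockLLM stands in for real models, so no API keys are needed.
pack = MixtureOfAgentsPack(llm=MockLLM(), reference_llms=[MockLLM()])

msgs = pack.inject_references_to_messages(
    messages=[ChatMessage(role="user", content="What is LlamaIndex?")],
    references=["Answer from model 1", "Answer from model 2"],
)

# A new system message now carries the numbered reference answers:
print(msgs[0].role)     # MessageRole.SYSTEM
print(msgs[0].content)  # ends with "Responses from models:\n1. ...\n2. ..."
```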