v0.0.37 (#14)
pkelaita authored Dec 10, 2024
2 parents cbe5ee3 + fe2d38f commit dd33bd1
Showing 8 changed files with 336 additions and 296 deletions.
23 changes: 22 additions & 1 deletion CHANGELOG.md
@@ -1,9 +1,30 @@
# Changelog

_Current version: 0.0.36_
_Current version: 0.0.37_

[PyPi link](https://pypi.org/project/l2m2/)

### 0.0.37 - December 9, 2024

> [!CAUTION]
> This release has _significant_ breaking changes! Please read the changelog carefully.

#### Added

- Support for provider [Cerebras](https://cerebras.ai/), offering `llama-3.1-8b` and `llama-3.1-70b`.
- Support for Mistral's `mistral-small`, `ministral-8b`, and `ministral-3b` models via La Plateforme.

#### Changed

- `mistral-large-2` has been renamed to `mistral-large` to match Mistral's current naming scheme. **This is a breaking change!** Calls to `mistral-large-2` will fail.

#### Removed

- `mixtral-8x22b`, `mixtral-8x7b`, and `mistral-7b` are no longer available from provider Mistral, as they have been [deprecated](https://docs.mistral.ai/getting-started/models/models_overview/). **This is a breaking change!** Calls to `mixtral-8x22b` and `mistral-7b` will fail, and calls to `mixtral-8x7b` via provider Mistral will fail.

> [!NOTE]
> The model `mixtral-8x7b` is still available via Groq.
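
A downstream codebase can absorb these renames and removals with a small shim. The mappings below mirror this changelog entry; the `migrate_model_name` helper itself is purely illustrative and not part of l2m2's API:

```python
# Hypothetical migration shim for the 0.0.37 breaking changes above.
RENAMED = {"mistral-large-2": "mistral-large"}
REMOVED_FROM_MISTRAL = {"mixtral-8x22b", "mixtral-8x7b", "mistral-7b"}

def migrate_model_name(name: str) -> str:
    """Map a pre-0.0.37 model name to its 0.0.37 equivalent, or raise
    if the model was dropped from provider Mistral with no rename."""
    if name in RENAMED:
        return RENAMED[name]
    if name in REMOVED_FROM_MISTRAL:
        raise ValueError(f"'{name}' is no longer served via Mistral as of 0.0.37")
    return name
```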
### 0.0.36 - November 21, 2024

#### Changed
27 changes: 15 additions & 12 deletions README.md
@@ -1,12 +1,12 @@
# L2M2: A Simple Python LLM Manager 💬👍

[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1732217169)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1732217169)](https://badge.fury.io/py/l2m2)
[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1733808328)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1733808328)](https://badge.fury.io/py/l2m2)

**L2M2** ("LLM Manager" → "LLMM" → "L2M2") is a tiny and very simple LLM manager for Python that exposes many models through a unified API. This is useful for evaluations, demos, and production applications that need to stay model-agnostic.

### Features

- <!--start-count-->25<!--end-count--> supported models (see below) – regularly updated and with more on the way.
- <!--start-count-->27<!--end-count--> supported models (see below) – regularly updated and with more on the way.
- Session chat memory – even across multiple models or with concurrent memory streams.
- JSON mode
- Prompt loading tools
@@ -32,21 +32,23 @@ L2M2 currently supports the following models:
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `gemini-1.0-pro` | [Google](https://ai.google.dev/) | `gemini-1.0-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-latest` |
| `claude-3.5-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-haiku-latest` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mistral-large-2` | [Mistral](https://mistral.ai/) | `mistral-large-latest` |
| `mixtral-8x22b` | [Mistral](https://mistral.ai/) | `open-mixtral-8x22b` |
| `mixtral-8x7b` | [Mistral](https://mistral.ai/), [Groq](https://wow.groq.com/) | `open-mixtral-8x7b`, `mixtral-8x7b-32768` |
| `mistral-7b` | [Mistral](https://mistral.ai/) | `open-mistral-7b` |
| `mistral-large` | [Mistral](https://mistral.ai/) | `mistral-large-latest` |
| `ministral-3b` | [Mistral](https://mistral.ai/) | `ministral-3b-latest` |
| `ministral-8b` | [Mistral](https://mistral.ai/) | `ministral-8b-latest` |
| `mistral-small` | [Mistral](https://mistral.ai/) | `mistral-small-latest` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/) | `mixtral-8x7b-32768` |
| `gemma-7b` | [Groq](https://wow.groq.com/) | `gemma-7b-it` |
| `gemma-2-9b` | [Groq](https://wow.groq.com/) | `gemma2-9b-it` |
| `llama-3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama-3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct` |
| `llama-3.1-8b` | [Groq](https://wow.groq.com/) | `llama-3.1-8b-instant` |
| `llama-3.1-70b` | [Groq](https://wow.groq.com/) | `llama-3.1-70b-versatile` |
| `llama-3.1-8b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-8b-instant`, `llama3.1-8b` |
| `llama-3.1-70b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-70b-versatile`, `llama3.1-70b` |
| `llama-3.1-405b` | [Replicate](https://replicate.com/) | `meta/meta-llama-3.1-405b-instruct` |
| `llama-3.2-1b` | [Groq](https://wow.groq.com/) | `llama-3.2-1b-preview` |
| `llama-3.2-3b` | [Groq](https://wow.groq.com/) | `llama-3.2-3b-preview` |
@@ -105,6 +107,7 @@ To activate any of the providers, set the provider's API key in the corresponding environment variable:
| Groq | `GROQ_API_KEY` |
| Replicate | `REPLICATE_API_TOKEN` |
| Mistral (La Plateforme) | `MISTRAL_API_KEY` |
| Cerebras | `CEREBRAS_API_KEY` |
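
The table above maps directly to a lookup. A minimal sketch, covering only the four rows shown in this hunk (the `active_providers` helper is illustrative, not l2m2's API):

```python
# Env-var names per provider, as listed in the table above (partial).
PROVIDER_ENV_VARS = {
    "groq": "GROQ_API_KEY",
    "replicate": "REPLICATE_API_TOKEN",
    "mistral": "MISTRAL_API_KEY",
    "cerebras": "CEREBRAS_API_KEY",
}

def active_providers(env: dict) -> list:
    """Return providers whose API key is set in the given environment mapping."""
    return sorted(p for p, var in PROVIDER_ENV_VARS.items() if env.get(var))
```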

Additionally, you can activate providers programmatically as follows:

@@ -510,10 +513,10 @@ The following models natively support JSON mode via the given provider:
- `gpt-4-turbo` (via OpenAI)
- `gpt-3.5-turbo` (via OpenAI)
- `gemini-1.5-pro` (via Google)
- `mistral-large-2` (via Mistral)
- `mixtral-8x22b` (via Mistral)
- `mixtral-8x7b` (via Mistral)
- `mistral-7b` (via Mistral)
- `mistral-large` (via Mistral)
- `ministral-3b` (via Mistral)
- `ministral-8b` (via Mistral)
- `mistral-small` (via Mistral)

<!--end-json-native-->
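
For the Mistral models listed above, native JSON mode is enabled by merging a `response_format` argument into the request, mirroring the `json_mode_arg` extras in `model_info.py`. A sketch of that merge (the `build_payload` helper is illustrative, not l2m2's API):

```python
# Shape taken from the "json_mode_arg" extras in model_info.py.
JSON_MODE_ARG = {"response_format": {"type": "json_object"}}

def build_payload(model_id: str, prompt: str, json_mode: bool = False) -> dict:
    """Build an OpenAI-spec chat payload, optionally requesting JSON output."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }
    if json_mode:
        payload.update(JSON_MODE_ARG)
    return payload
```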

2 changes: 1 addition & 1 deletion l2m2/__init__.py
@@ -1 +1 @@
__version__ = "0.0.36"
__version__ = "0.0.37"
7 changes: 7 additions & 0 deletions l2m2/client/base_llm_client.py
@@ -36,6 +36,7 @@
"groq": "GROQ_API_KEY",
"replicate": "REPLICATE_API_TOKEN",
"mistral": "MISTRAL_API_KEY",
"cerebras": "CEREBRAS_API_KEY",
}


@@ -668,6 +669,12 @@ async def _call_replicate(
)
return "".join(result["output"])

async def _call_cerebras(
self,
*args: Any,
) -> str:
return await self._generic_openai_spec_call("cerebras", *args)

async def _generic_openai_spec_call(
self,
provider: str,
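The new `_call_cerebras` above simply forwards to the shared OpenAI-spec code path. Stripped of HTTP details, the delegation pattern looks roughly like this (the demo class and its return value are stand-ins, not l2m2's actual implementation):

```python
import asyncio
from typing import Any

class SpecCallDemo:
    """Illustrative stand-in: per-provider wrappers delegate to one
    generic OpenAI-spec method, as _call_cerebras does above."""

    async def _generic_openai_spec_call(self, provider: str, *args: Any) -> str:
        # The real method would POST to the provider's chat-completions endpoint.
        return f"called {provider} with {len(args)} arg(s)"

    async def _call_cerebras(self, *args: Any) -> str:
        return await self._generic_openai_spec_call("cerebras", *args)
```

Adding a provider that speaks the OpenAI wire format then costs only a one-line wrapper plus a `PROVIDER_INFO` entry.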
87 changes: 71 additions & 16 deletions l2m2/model_info.py
@@ -111,6 +111,15 @@ class ModelEntry(TypedDict):
"Content-Type": "application/json",
},
},
"cerebras": {
"name": "Cerebras",
"homepage": "https://cerebras.ai/",
"endpoint": "https://api.cerebras.ai/v1/chat/completions",
"headers": {
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json",
},
},
}
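
Given the `cerebras` entry above, a chat-completions request can be assembled as follows. This sketch only builds the request (nothing is sent); the placeholder key and the `build_request` helper are illustrative, while the endpoint, headers, and model id come from this file:

```python
# Sketch: assembling a Cerebras chat-completions request from the
# PROVIDER_INFO entry above. The request is only built here, never sent.
API_KEY = "sk-demo"  # placeholder, not a real key

CEREBRAS = {
    "endpoint": "https://api.cerebras.ai/v1/chat/completions",
    "headers": {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
}

def build_request(prompt: str) -> dict:
    return {
        "url": CEREBRAS["endpoint"],
        "headers": CEREBRAS["headers"],
        "json": {
            "model": "llama3.1-8b",  # Cerebras model id from MODEL_INFO below
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```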

MODEL_INFO: Dict[str, Dict[str, ModelEntry]] = {
@@ -230,6 +239,22 @@ class ModelEntry(TypedDict):
"extras": {},
},
},
"claude-3.5-haiku": {
"anthropic": {
"model_id": "claude-3-5-haiku-latest",
"params": {
"temperature": {
"default": 0.0,
"max": 1.0,
},
"max_tokens": {
"default": 1000, # L2M2 default, field is required
"max": 4096,
},
},
"extras": {},
},
},
"claude-3-opus": {
"anthropic": {
"model_id": "claude-3-opus-20240229",
@@ -310,7 +335,7 @@ class ModelEntry(TypedDict):
"extras": {},
},
},
"mistral-large-2": {
"mistral-large": {
"mistral": {
"model_id": "mistral-large-latest",
"params": {
@@ -326,9 +351,9 @@
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
},
},
"mixtral-8x22b": {
"ministral-3b": {
"mistral": {
"model_id": "open-mixtral-8x22b",
"model_id": "ministral-3b-latest",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
@@ -342,9 +367,9 @@
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
},
},
"mixtral-8x7b": {
"ministral-8b": {
"mistral": {
"model_id": "open-mixtral-8x7b",
"model_id": "ministral-8b-latest",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
@@ -357,35 +382,37 @@
},
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
},
"groq": {
"model_id": "mixtral-8x7b-32768",
},
"mistral-small": {
"mistral": {
"model_id": "mistral-small-latest",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
"max": 1.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": 2**16 - 1,
"max": INF,
},
},
"extras": {},
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
},
},
"mistral-7b": {
"mistral": {
"model_id": "open-mistral-7b",
"mixtral-8x7b": {
"groq": {
"model_id": "mixtral-8x7b-32768",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 1.0,
"max": 2.0,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": INF,
"max": 2**16 - 1,
},
},
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
"extras": {},
},
},
"gemma-7b": {
@@ -497,6 +524,20 @@
},
"extras": {},
},
"cerebras": {
"model_id": "llama3.1-8b",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 1.5,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": 2**31 - 1,
},
},
"extras": {},
},
},
"llama-3.1-70b": {
"groq": {
@@ -513,6 +554,20 @@
},
"extras": {},
},
"cerebras": {
"model_id": "llama3.1-70b",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 1.5,
},
"max_tokens": {
"default": PROVIDER_DEFAULT,
"max": 2**31 - 1,
},
},
"extras": {},
},
},
"llama-3.1-405b": {
"replicate": {
1 change: 0 additions & 1 deletion requirements-dev.txt
@@ -9,5 +9,4 @@ python-dotenv>=1.0.1
build>=1.2.1
mypy>=1.9.0
requests-mock>=1.12.1
respx>=0.21.1
types-requests>=2.32.0.20240602