Gemini Updates (#16)
pkelaita authored Dec 13, 2024
2 parents 475a2f4 + 8ca3138 commit ed7d77a
Showing 6 changed files with 89 additions and 54 deletions.
12 changes: 10 additions & 2 deletions CHANGELOG.md
@@ -1,14 +1,22 @@
# Changelog

_Current version: 0.0.37_
_Current version: 0.0.38_

[PyPi link](https://pypi.org/project/l2m2/)

### In Development
### v0.0.38 - December 12, 2024

> [!CAUTION]
> This release has breaking changes! Please read the changelog carefully.

#### Added

- Support for [Python 3.13](https://www.python.org/downloads/release/python-3130/).
- Support for Google's [Gemini 2.0 Flash](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash), [Gemini 1.5 Flash](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.5-flash), and [Gemini 1.5 Flash 8B](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.5-flash-8b) models.

#### Removed

- Gemini 1.0 Pro is no longer supported, as it has been [deprecated](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.0-pro) by Google. **This is a breaking change!** Calls to Gemini 1.0 Pro will now fail.
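  As a migration sketch (hedged: the `LLMClient` entry point and `call` signature below are typical-usage assumptions, not confirmed by this changelog), moving off Gemini 1.0 Pro looks like:

  ```python
  from l2m2.client import LLMClient  # assumed entry point

  client = LLMClient()
  client.add_provider("google", "<YOUR_GOOGLE_API_KEY>")  # assumed setup call

  # Before (<= 0.0.37): client.call(model="gemini-1.0-pro", ...) now fails.
  # After (>= 0.0.38): switch to a still-supported Gemini model.
  response = client.call(model="gemini-1.5-flash", prompt="Hello!")
  print(response)
  ```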

### 0.0.37 - December 9, 2024

67 changes: 36 additions & 31 deletions README.md
@@ -1,14 +1,14 @@
# L2M2: A Simple Python LLM Manager 💬👍

[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1733808328)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1733808328)](https://badge.fury.io/py/l2m2)
[![Tests](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml/badge.svg?timestamp=1734052060)](https://github.com/pkelaita/l2m2/actions/workflows/tests.yml) [![codecov](https://codecov.io/github/pkelaita/l2m2/graph/badge.svg?token=UWIB0L9PR8)](https://codecov.io/github/pkelaita/l2m2) [![PyPI version](https://badge.fury.io/py/l2m2.svg?timestamp=1734052060)](https://badge.fury.io/py/l2m2)

**L2M2** ("LLM Manager" → "LLMM" → "L2M2") is a tiny and very simple LLM manager for Python that exposes lots of models through a unified API. This is useful for evaluation, demos, production applications etc. that need to easily be model-agnostic.

![](assets/l2m2_demo.gif)
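For a quick feel of the unified API, here is a minimal usage sketch. The `LLMClient` import path, `add_provider` setup call, and `call` parameters are assumptions based on typical usage of this client, since the relevant README lines are collapsed in this diff:

```python
import os

from l2m2.client import LLMClient  # assumed entry point

client = LLMClient()
client.add_provider("openai", os.environ["OPENAI_API_KEY"])  # assumed setup call

# The same call shape is assumed to work for any model in the table below.
response = client.call(
    model="gpt-4o",
    prompt="What is the capital of France?",
    system_prompt="Answer in one word.",
)
print(response)
```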

### Features

- <!--start-count-->27<!--end-count--> supported models (see below) – regularly updated and with more on the way.
- <!--start-count-->29<!--end-count--> supported models (see below) – regularly updated and with more on the way.
- Session chat memory – even across multiple models or with concurrent memory streams.
- JSON mode
- Prompt loading tools
@@ -25,35 +25,37 @@ L2M2 currently supports the following models:

<!--start-model-table-->

| Model Name | Provider(s) | Model Version(s) |
| ------------------- | ------------------------------------------------------------------ | --------------------------------------------------- |
| `gpt-4o` | [OpenAI](https://openai.com/product) | `gpt-4o-2024-11-20` |
| `gpt-4o-mini` | [OpenAI](https://openai.com/product) | `gpt-4o-mini-2024-07-18` |
| `gpt-4-turbo` | [OpenAI](https://openai.com/product) | `gpt-4-turbo-2024-04-09` |
| `gpt-3.5-turbo` | [OpenAI](https://openai.com/product) | `gpt-3.5-turbo-0125` |
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `gemini-1.0-pro` | [Google](https://ai.google.dev/) | `gemini-1.0-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-latest` |
| `claude-3.5-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-haiku-latest` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mistral-large` | [Mistral](https://mistral.ai/) | `mistral-large-latest` |
| `ministral-3b` | [Mistral](https://mistral.ai/) | `ministral-3b-latest` |
| `ministral-8b` | [Mistral](https://mistral.ai/) | `ministral-8b-latest` |
| `mistral-small` | [Mistral](https://mistral.ai/) | `mistral-small-latest` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/) | `mixtral-8x7b-32768` |
| `gemma-7b` | [Groq](https://wow.groq.com/) | `gemma-7b-it` |
| `gemma-2-9b` | [Groq](https://wow.groq.com/) | `gemma2-9b-it` |
| `llama-3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama-3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct` |
| `llama-3.1-8b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-8b-instant`, `llama3.1-8b` |
| `llama-3.1-70b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-70b-versatile`, `llama3.1-70b` |
| `llama-3.1-405b` | [Replicate](https://replicate.com/) | `meta/meta-llama-3.1-405b-instruct` |
| `llama-3.2-1b` | [Groq](https://wow.groq.com/) | `llama-3.2-1b-preview` |
| `llama-3.2-3b` | [Groq](https://wow.groq.com/) | `llama-3.2-3b-preview` |
| Model Name | Provider(s) | Model Version(s) |
| --------------------- | ------------------------------------------------------------------ | --------------------------------------------------- |
| `gpt-4o` | [OpenAI](https://openai.com/product) | `gpt-4o-2024-11-20` |
| `gpt-4o-mini` | [OpenAI](https://openai.com/product) | `gpt-4o-mini-2024-07-18` |
| `gpt-4-turbo` | [OpenAI](https://openai.com/product) | `gpt-4-turbo-2024-04-09` |
| `gpt-3.5-turbo` | [OpenAI](https://openai.com/product) | `gpt-3.5-turbo-0125` |
| `gemini-2.0-flash` | [Google](https://ai.google.dev/) | `gemini-2.0-flash-exp` |
| `gemini-1.5-flash` | [Google](https://ai.google.dev/) | `gemini-1.5-flash` |
| `gemini-1.5-flash-8b` | [Google](https://ai.google.dev/) | `gemini-1.5-flash-8b` |
| `gemini-1.5-pro` | [Google](https://ai.google.dev/) | `gemini-1.5-pro` |
| `claude-3.5-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-sonnet-latest` |
| `claude-3.5-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-5-haiku-latest` |
| `claude-3-opus` | [Anthropic](https://www.anthropic.com/api) | `claude-3-opus-20240229` |
| `claude-3-sonnet` | [Anthropic](https://www.anthropic.com/api) | `claude-3-sonnet-20240229` |
| `claude-3-haiku` | [Anthropic](https://www.anthropic.com/api) | `claude-3-haiku-20240307` |
| `command-r` | [Cohere](https://docs.cohere.com/) | `command-r` |
| `command-r-plus` | [Cohere](https://docs.cohere.com/) | `command-r-plus` |
| `mistral-large` | [Mistral](https://mistral.ai/) | `mistral-large-latest` |
| `ministral-3b` | [Mistral](https://mistral.ai/) | `ministral-3b-latest` |
| `ministral-8b` | [Mistral](https://mistral.ai/) | `ministral-8b-latest` |
| `mistral-small` | [Mistral](https://mistral.ai/) | `mistral-small-latest` |
| `mixtral-8x7b` | [Groq](https://wow.groq.com/) | `mixtral-8x7b-32768` |
| `gemma-7b` | [Groq](https://wow.groq.com/) | `gemma-7b-it` |
| `gemma-2-9b` | [Groq](https://wow.groq.com/) | `gemma2-9b-it` |
| `llama-3-8b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-8b-8192`, `meta/meta-llama-3-8b-instruct` |
| `llama-3-70b` | [Groq](https://wow.groq.com/), [Replicate](https://replicate.com/) | `llama3-70b-8192`, `meta/meta-llama-3-70b-instruct` |
| `llama-3.1-8b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-8b-instant`, `llama3.1-8b` |
| `llama-3.1-70b` | [Groq](https://wow.groq.com/), [Cerebras](https://cerebras.ai/) | `llama-3.1-70b-versatile`, `llama3.1-70b` |
| `llama-3.1-405b` | [Replicate](https://replicate.com/) | `meta/meta-llama-3.1-405b-instruct` |
| `llama-3.2-1b` | [Groq](https://wow.groq.com/) | `llama-3.2-1b-preview` |
| `llama-3.2-3b` | [Groq](https://wow.groq.com/) | `llama-3.2-3b-preview` |

<!--end-model-table-->

@@ -514,6 +516,9 @@ The following models natively support JSON mode via the given provider:
- `gpt-4o-mini` (via OpenAI)
- `gpt-4-turbo` (via OpenAI)
- `gpt-3.5-turbo` (via OpenAI)
- `gemini-2.0-flash` (via Google)
- `gemini-1.5-flash` (via Google)
- `gemini-1.5-flash-8b` (via Google)
- `gemini-1.5-pro` (via Google)
- `mistral-large` (via Mistral)
- `ministral-3b` (via Mistral)
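To illustrate the JSON-mode wiring this PR adds for the new Gemini models, here is a hedged sketch: the `json_mode` flag and the client entry points are assumptions, while the underlying `{"response_mime_type": "application/json"}` extra comes from the `model_info.py` changes below.

```python
import json

from l2m2.client import LLMClient  # assumed entry point

client = LLMClient()
client.add_provider("google", "<YOUR_GOOGLE_API_KEY>")  # assumed setup call

# json_mode is assumed to inject {"response_mime_type": "application/json"}
# into the request, per the json_mode_arg entries added in model_info.py.
response = client.call(
    model="gemini-2.0-flash",
    prompt='Return a JSON object {"colors": [...]} listing three primary colors.',
    json_mode=True,
)
print(json.loads(response)["colors"])
```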
2 changes: 1 addition & 1 deletion l2m2/__init__.py
@@ -1 +1 @@
__version__ = "0.0.37"
__version__ = "0.0.38"
6 changes: 1 addition & 5 deletions l2m2/client/base_llm_client.py
@@ -521,11 +521,7 @@ async def _call_google(
    data: Dict[str, Any] = {}

    if system_prompt is not None:
        # Earlier models don't support system prompts, so prepend it to the prompt
        if model_id not in ["gemini-1.5-pro"]:
            prompt = f"{system_prompt}\n{prompt}"
        else:
            data["system_instruction"] = {"parts": {"text": system_prompt}}
        data["system_instruction"] = {"parts": {"text": system_prompt}}

    messages: List[Dict[str, Any]] = []
    if isinstance(memory, ChatMemory):
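Since every supported Gemini model now accepts a top-level `system_instruction`, the prompt-prepending fallback above is removed. Below is a self-contained sketch of the resulting request body: the `system_instruction` shape comes from the diff, while the `contents` assembly is an assumption about the rest of `_call_google`.

```python
from typing import Any, Dict, List, Optional


def build_google_payload(prompt: str, system_prompt: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper mirroring the updated _call_google flow."""
    data: Dict[str, Any] = {}
    if system_prompt is not None:
        # All supported Gemini models accept this field, so no fallback is needed.
        data["system_instruction"] = {"parts": {"text": system_prompt}}
    messages: List[Dict[str, Any]] = [{"role": "user", "parts": [{"text": prompt}]}]
    data["contents"] = messages
    return data


# build_google_payload("Hi!", "Be terse.")
# -> {"system_instruction": {"parts": {"text": "Be terse."}},
#     "contents": [{"role": "user", "parts": [{"text": "Hi!"}]}]}
```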
46 changes: 41 additions & 5 deletions l2m2/model_info.py
@@ -187,9 +187,9 @@ class ModelEntry(TypedDict):
"extras": {"json_mode_arg": {"response_format": {"type": "json_object"}}},
},
},
"gemini-1.5-pro": {
"gemini-2.0-flash": {
"google": {
"model_id": "gemini-1.5-pro",
"model_id": "gemini-2.0-flash-exp",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
@@ -205,9 +205,9 @@ class ModelEntry(TypedDict):
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"gemini-1.0-pro": {
"gemini-1.5-flash": {
"google": {
"model_id": "gemini-1.0-pro",
"model_id": "gemini-1.5-flash",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
@@ -220,7 +220,43 @@ class ModelEntry(TypedDict):
"max": 8192,
},
},
"extras": {},
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"gemini-1.5-flash-8b": {
"google": {
"model_id": "gemini-1.5-flash-8b",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"custom_key": "max_output_tokens",
"default": PROVIDER_DEFAULT,
# https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models
"max": 8192,
},
},
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"gemini-1.5-pro": {
"google": {
"model_id": "gemini-1.5-pro",
"params": {
"temperature": {
"default": PROVIDER_DEFAULT,
"max": 2.0,
},
"max_tokens": {
"custom_key": "max_output_tokens",
"default": PROVIDER_DEFAULT,
# https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models
"max": 8192,
},
},
"extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
},
},
"claude-3.5-sonnet": {
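For context on how these entries are consumed, here is a hedged sketch of a helper that applies the `custom_key` rename and the `json_mode_arg` extra. The entry shape mirrors the diff above, but the helper itself is an assumption, and `PROVIDER_DEFAULT` is stood in by `None`.

```python
from typing import Any, Dict

# Mirrors the gemini-1.5-flash entry above, with PROVIDER_DEFAULT stood in by None.
GEMINI_FLASH_ENTRY: Dict[str, Any] = {
    "model_id": "gemini-1.5-flash",
    "params": {
        "temperature": {"default": None, "max": 2.0},
        "max_tokens": {"custom_key": "max_output_tokens", "default": None, "max": 8192},
    },
    "extras": {"json_mode_arg": {"response_mime_type": "application/json"}},
}


def build_request_params(
    entry: Dict[str, Any], json_mode: bool = False, **overrides: Any
) -> Dict[str, Any]:
    """Hypothetical consumer: validate overrides against each param's max and
    rename generic keys (e.g. max_tokens -> max_output_tokens) via custom_key."""
    out: Dict[str, Any] = {}
    for name, spec in entry["params"].items():
        if name not in overrides:
            continue  # omit the key to fall back to the provider default
        value = overrides[name]
        if "max" in spec and value > spec["max"]:
            raise ValueError(f"{name}={value} exceeds provider max {spec['max']}")
        out[spec.get("custom_key", name)] = value
    if json_mode:
        out.update(entry["extras"].get("json_mode_arg", {}))
    return out


# build_request_params(GEMINI_FLASH_ENTRY, json_mode=True, max_tokens=1024)
# -> {"max_output_tokens": 1024, "response_mime_type": "application/json"}
```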
10 changes: 0 additions & 10 deletions tests/l2m2/client/test_base_llm_client.py
@@ -267,16 +267,6 @@ async def test_call_google_1_5(mock_get_extra_message, mock_llm_post, llm_client
    await _generic_test_call(llm_client, "google", "gemini-1.5-pro")


@pytest.mark.asyncio
@patch(LLM_POST_PATH)
@patch(GET_EXTRA_MESSAGE_PATH)
async def test_call_google_1_0(mock_get_extra_message, mock_llm_post, llm_client):
    mock_get_extra_message.return_value = "extra message"
    mock_return_value = {"candidates": [{"content": {"parts": [{"text": "response"}]}}]}
    mock_llm_post.return_value = mock_return_value
    await _generic_test_call(llm_client, "google", "gemini-1.0-pro")


@pytest.mark.asyncio
@patch(LLM_POST_PATH)
@patch(GET_EXTRA_MESSAGE_PATH)
