Commit 424e50d

Merge branch 'main' into feat/opensearch/efficient_filtering

tstadel authored Oct 29, 2024
2 parents cdd2fb6 + 3220330 commit 424e50d
Showing 67 changed files with 526 additions and 313 deletions.
15 changes: 15 additions & 0 deletions .github/workflows/CI_stale.yml
@@ -0,0 +1,15 @@
name: 'Stalebot'
on:
  schedule:
    - cron: '30 1 * * *'

jobs:
  makestale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          any-of-labels: 'community-triage'
          stale-pr-message: 'This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 10 days.'
          days-before-stale: 30
          days-before-close: 10
2 changes: 1 addition & 1 deletion .github/workflows/weaviate.yml
@@ -30,7 +30,7 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
python-version: ["3.9", "3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v4
26 changes: 13 additions & 13 deletions CONTRIBUTING.md
@@ -48,14 +48,14 @@ By participating, you are expected to uphold this code. Please report unacceptab
## I Have a Question

> [!TIP]
-> If you want to ask a question, we assume that you have read the available [Documentation](https://docs.haystack.deepset.ai/v2.0/docs/intro).
+> If you want to ask a question, we assume that you have read the available [documentation](https://docs.haystack.deepset.ai/docs/intro).
-Before you ask a question, it is best to search for existing [Issues](/issues) that might help you. In case you have
+Before you ask a question, it is best to search for existing [issues](/../../issues) that might help you. In case you have
found a suitable issue and still need clarification, you can write your question in this issue. It is also advisable to
search the internet for answers first.

If you then still feel the need to ask a question and need clarification, you can use one of our
-[Community Channels](https://haystack.deepset.ai/community), Discord in particular is often very helpful.
+[community channels](https://haystack.deepset.ai/community). Discord in particular is often very helpful.

## Reporting Bugs

@@ -67,8 +67,8 @@ investigate carefully, collect information and describe the issue in detail in y
following steps in advance to help us fix any potential bug as fast as possible.

- Make sure that you are using the latest version.
-- Determine if your bug is really a bug and not an error on your side e.g. using incompatible environment components/versions (Make sure that you have read the [documentation](https://docs.haystack.deepset.ai/v2.0/docs/intro). If you are looking for support, you might want to check [this section](#i-have-a-question)).
-- To see if other users have experienced (and potentially already solved) the same issue you are having, check if there is not already a bug report existing for your bug or error in the [bug tracker](/issues).
+- Determine if your bug is really a bug and not an error on your side, e.g. using incompatible environment components/versions (make sure that you have read the [documentation](https://docs.haystack.deepset.ai/docs/intro); if you are looking for support, you might want to check [this section](#i-have-a-question)).
+- To see if other users have experienced (and potentially already solved) the same issue you are having, check whether a bug report for your bug or error already exists in the [bug tracker](/../../issues?labels=bug).
- Also make sure to search the internet (including Stack Overflow) to see if users outside of the GitHub community have discussed the issue.
- Collect information about the bug:
- OS, Platform and Version (Windows, Linux, macOS, x86, ARM)
@@ -85,7 +85,7 @@ following steps in advance to help us fix any potential bug as fast as possible.

We use GitHub issues to track bugs and errors. If you run into an issue with the project:

-- Open an [Issue of type Bug Report](/issues/new?assignees=&labels=bug&projects=&template=bug_report.md&title=).
+- Open an [issue of type Bug Report](/../../issues/new?assignees=&labels=bug&projects=&template=bug_report.md&title=).
- Explain the behavior you would expect and the actual behavior.
- Please provide as much context as possible and describe the *reproduction steps* that someone else can follow to recreate the issue on their own. This usually includes your code. For good bug reports you should isolate the problem and create a reduced test case.
- Provide the information you collected in the previous section.
@@ -94,7 +94,7 @@ Once it's filed:

- The project team will label the issue accordingly.
- A team member will try to reproduce the issue with your provided steps. If there are no reproduction steps or no obvious way to reproduce the issue, the team will ask you for those steps.
-- If the team is able to reproduce the issue, the issue will scheduled for a fix, or left to be [implemented by someone](#your-first-code-contribution).
+- If the team can reproduce the issue, it will either be scheduled for a fix or made available for [community contribution](#contribute-code).


## Suggesting Enhancements
@@ -106,14 +106,14 @@ to existing ones. Following these guidelines will help maintainers and the commu
### Before Submitting an Enhancement

- Make sure that you are using the latest version.
-- Read the [documentation](https://docs.haystack.deepset.ai/v2.0/docs/intro) carefully and find out if the functionality is already covered, maybe by an individual configuration.
-- Perform a [search](/issues) to see if the enhancement has already been suggested. If it has, add a comment to the existing issue instead of opening a new one.
+- Read the [documentation](https://docs.haystack.deepset.ai/docs/intro) carefully and find out if the functionality is already covered, maybe by an individual configuration.
+- Perform a [search](/../../issues) to see if the enhancement has already been suggested. If it has, add a comment to the existing issue instead of opening a new one.
- Find out whether your idea fits with the scope and aims of the project. It's up to you to make a strong case to convince the project's developers of the merits of this feature. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset. If you're just targeting a minority of users, consider writing and distributing the integration on your own.


### How Do I Submit a Good Enhancement Suggestion?

-Enhancement suggestions are tracked as GitHub issues of type [Feature request for existing integrations](/issues/new?assignees=&labels=feature+request&projects=&template=feature-request-for-existing-integrations.md&title=).
+Enhancement suggestions are tracked as GitHub issues of type [Feature request for existing integrations](/../../issues/new?assignees=&labels=feature+request&projects=&template=feature-request-for-existing-integrations.md&title=).

- Use a **clear and descriptive title** for the issue to identify the suggestion.
- Fill in the issue following the template
@@ -129,8 +129,8 @@ Enhancement suggestions are tracked as GitHub issues of type [Feature request fo
If this is your first contribution, a good starting point is looking for an open issue that's marked with the label
["good first issue"](https://github.com/deepset-ai/haystack-core-integrations/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).
The core contributors periodically mark certain issues as good for first-time contributors. Those issues are usually
-limited in scope, easy fixable and low priority, so there is absolutely no reason why you should not try fixing them,
-it's a good excuse to start looking into the project and a safe space for experimenting failure: if you don't get the
+limited in scope, easily fixable, and low priority, so there is absolutely no reason why you should not try fixing them.
+It's also a good excuse to start looking into the project and a safe space for experimenting and failing: if you don't get the
grasp of something, pick another one!

### Setting up your development environment
@@ -279,7 +279,7 @@ The Python API docs detail the source code: classes, functions, and parameters t
This type of documentation is extracted from the source code itself, and contributors should pay attention when they
change the code to also change relevant comments and docstrings. This type of documentation is mostly useful to
developers, but it can be handy for users at times. You can browse it on the dedicated section in the
-[documentation website](https://docs.haystack.deepset.ai/v2.0/reference/integrations-chroma).
+[documentation website](https://docs.haystack.deepset.ai/reference/integrations-chroma).

We use `pydoc-markdown` to convert docstrings into properly formatted Markdown files, and while the CI takes care of
generating and publishing the updated documentation at every merge on the `main` branch, you can generate the docs
22 changes: 22 additions & 0 deletions integrations/amazon_bedrock/CHANGELOG.md
@@ -1,5 +1,27 @@
# Changelog

## [integrations/amazon_bedrock-v1.1.0] - 2024-10-23

### 🚜 Refactor

- Avoid downloading tokenizer if `truncate` is `False` (#1152)

### ⚙️ Miscellaneous Tasks

- Adopt uv as installer (#1142)

## [integrations/amazon_bedrock-v1.0.5] - 2024-10-17

### 🚀 Features

- Add prefixes to supported model patterns to allow cross region model ids (#1127)

## [integrations/amazon_bedrock-v1.0.4] - 2024-10-16

### 🐛 Bug Fixes

- Avoid bedrock read timeout (add boto3_config param) (#1135)

## [integrations/amazon_bedrock-v1.0.3] - 2024-10-04

### 🐛 Bug Fixes
4 changes: 3 additions & 1 deletion integrations/amazon_bedrock/pyproject.toml
@@ -42,6 +42,7 @@ root = "../.."
git_describe_command = 'git describe --tags --match="integrations/amazon_bedrock-v[0-9]*"'

[tool.hatch.envs.default]
installer = "uv"
dependencies = [
"coverage[toml]>=6.5",
"pytest",
@@ -60,8 +61,9 @@ docs = ["pydoc-markdown pydoc/config.yml"]
python = ["3.8", "3.9", "3.10", "3.11", "3.12"]

[tool.hatch.envs.lint]
installer = "uv"
detached = true
dependencies = ["black>=23.1.0", "mypy>=1.0.0", "ruff>=0.0.243"]
dependencies = ["pip", "black>=23.1.0", "mypy>=1.0.0", "ruff>=0.0.243"]
[tool.hatch.envs.lint.scripts]
typing = "mypy --install-types --non-interactive --explicit-package-bases {args:src/ tests}"

@@ -58,9 +58,9 @@ class AmazonBedrockChatGenerator:
"""

SUPPORTED_MODEL_PATTERNS: ClassVar[Dict[str, Type[BedrockModelChatAdapter]]] = {
r"(.+\.)?anthropic.claude.*": AnthropicClaudeChatAdapter,
r"meta.llama2.*": MetaLlama2ChatAdapter,
r"mistral.*": MistralChatAdapter,
r"([a-z]{2}\.)?anthropic.claude.*": AnthropicClaudeChatAdapter,
r"([a-z]{2}\.)?meta.llama2.*": MetaLlama2ChatAdapter,
r"([a-z]{2}\.)?mistral.*": MistralChatAdapter,
}

def __init__(
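The pattern change above replaces the broad `(.+\.)?` prefix with `([a-z]{2}\.)?`, so only a two-letter region prefix (for example `us.` or `eu.`, as used by Bedrock cross-region inference profiles) is accepted in front of a model ID. A minimal sketch of the effect, using `re` directly — the model IDs are illustrative, and the exact matching call inside the generator may differ:

```python
import re

new_pattern = r"([a-z]{2}\.)?anthropic.claude.*"  # accepts an optional two-letter prefix
old_pattern = r"(.+\.)?anthropic.claude.*"        # accepted any dotted prefix

# Plain and cross-region model ids both match the new pattern:
assert re.fullmatch(new_pattern, "anthropic.claude-v2")
assert re.fullmatch(new_pattern, "us.anthropic.claude-v2")

# The old pattern was looser: an arbitrary dotted prefix also matched,
# which could misroute unrelated ids to the Claude adapter.
assert re.fullmatch(old_pattern, "something.anthropic.claude-v2")
assert not re.fullmatch(new_pattern, "something.anthropic.claude-v2")
```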
@@ -3,6 +3,7 @@
import re
from typing import Any, Callable, ClassVar, Dict, List, Optional, Type

+from botocore.config import Config
from botocore.exceptions import ClientError
from haystack import component, default_from_dict, default_to_dict
from haystack.dataclasses import StreamingChunk
@@ -65,13 +66,13 @@ class AmazonBedrockGenerator:
"""

SUPPORTED_MODEL_PATTERNS: ClassVar[Dict[str, Type[BedrockModelAdapter]]] = {
r"amazon.titan-text.*": AmazonTitanAdapter,
r"ai21.j2.*": AI21LabsJurassic2Adapter,
r"cohere.command-[^r].*": CohereCommandAdapter,
r"cohere.command-r.*": CohereCommandRAdapter,
r"(.+\.)?anthropic.claude.*": AnthropicClaudeAdapter,
r"meta.llama.*": MetaLlamaAdapter,
r"mistral.*": MistralAdapter,
r"([a-z]{2}\.)?amazon.titan-text.*": AmazonTitanAdapter,
r"([a-z]{2}\.)?ai21.j2.*": AI21LabsJurassic2Adapter,
r"([a-z]{2}\.)?cohere.command-[^r].*": CohereCommandAdapter,
r"([a-z]{2}\.)?cohere.command-r.*": CohereCommandRAdapter,
r"([a-z]{2}\.)?anthropic.claude.*": AnthropicClaudeAdapter,
r"([a-z]{2}\.)?meta.llama.*": MetaLlamaAdapter,
r"([a-z]{2}\.)?mistral.*": MistralAdapter,
}

def __init__(
@@ -87,6 +88,7 @@ def __init__(
max_length: Optional[int] = 100,
truncate: Optional[bool] = True,
streaming_callback: Optional[Callable[[StreamingChunk], None]] = None,
+boto3_config: Optional[Dict[str, Any]] = None,
**kwargs,
):
"""
@@ -102,6 +104,7 @@ def __init__(
:param truncate: Whether to truncate the prompt or not.
:param streaming_callback: A callback function that is called when a new token is received from the stream.
The callback function accepts StreamingChunk as an argument.
+:param boto3_config: The configuration for the boto3 client.
:param kwargs: Additional keyword arguments to be passed to the model.
These arguments are specific to the model. You can find them in the model's documentation.
:raises ValueError: If the model name is empty or None.
@@ -120,6 +123,7 @@ def __init__(
self.aws_region_name = aws_region_name
self.aws_profile_name = aws_profile_name
self.streaming_callback = streaming_callback
+self.boto3_config = boto3_config
self.kwargs = kwargs

def resolve_secret(secret: Optional[Secret]) -> Optional[str]:
@@ -133,7 +137,10 @@ def resolve_secret(secret: Optional[Secret]) -> Optional[str]:
aws_region_name=resolve_secret(aws_region_name),
aws_profile_name=resolve_secret(aws_profile_name),
)
-self.client = session.client("bedrock-runtime")
+config: Optional[Config] = None
+if self.boto3_config:
+    config = Config(**self.boto3_config)
+self.client = session.client("bedrock-runtime", config=config)
except Exception as exception:
msg = (
"Could not connect to Amazon Bedrock. Make sure the AWS environment is configured correctly. "
@@ -145,15 +152,16 @@ def resolve_secret(secret: Optional[Secret]) -> Optional[str]:
# We pop the model_max_length as it is not sent to the model but used to truncate the prompt if needed
model_max_length = kwargs.get("model_max_length", 4096)

-# Truncate prompt if prompt tokens > model_max_length-max_length
-# (max_length is the length of the generated text)
-# we use GPT2 tokenizer which will likely provide good token count approximation
-
-self.prompt_handler = DefaultPromptHandler(
-    tokenizer="gpt2",
-    model_max_length=model_max_length,
-    max_length=self.max_length or 100,
-)
+# we initialize the prompt handler only if truncate is True: we avoid unnecessarily downloading the tokenizer
+if self.truncate:
+    # Truncate prompt if prompt tokens > model_max_length-max_length
+    # (max_length is the length of the generated text)
+    # we use GPT2 tokenizer which will likely provide good token count approximation
+    self.prompt_handler = DefaultPromptHandler(
+        tokenizer="gpt2",
+        model_max_length=model_max_length,
+        max_length=self.max_length or 100,
+    )

model_adapter_cls = self.get_model_adapter(model=model)
if not model_adapter_cls:
@@ -273,6 +281,7 @@ def to_dict(self) -> Dict[str, Any]:
max_length=self.max_length,
truncate=self.truncate,
streaming_callback=callback_name,
+boto3_config=self.boto3_config,
**self.kwargs,
)

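Together with the docstring and `to_dict` updates above, the new `boto3_config` parameter lets callers tune the underlying boto3 client, e.g. raising `read_timeout` to avoid the read timeouts reported in #1135. A hedged usage sketch, assuming AWS credentials are configured in the environment — the timeout value is illustrative:

```python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockGenerator

# boto3_config is forwarded as botocore.config.Config(**boto3_config),
# so any botocore Config keyword should work here.
generator = AmazonBedrockGenerator(
    model="anthropic.claude-v2",
    boto3_config={"read_timeout": 1000},  # illustrative value, in seconds
)

result = generator.run("Summarize Amazon Bedrock in one sentence.")
print(result["replies"][0])
```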
6 changes: 4 additions & 2 deletions integrations/amazon_bedrock/tests/test_chat_generator.py
@@ -243,7 +243,7 @@ def test_long_prompt_is_not_truncated_when_truncate_false(mock_boto3_session):
generator.run(messages=messages)

# Ensure _ensure_token_limit was not called
-mock_ensure_token_limit.assert_not_called(),
+mock_ensure_token_limit.assert_not_called()

# Check the prompt passed to prepare_body
generator.model_adapter.prepare_body.assert_called_with(messages=messages, stop_words=[], stream=False)
@@ -261,6 +261,9 @@ def test_long_prompt_is_not_truncated_when_truncate_false(mock_boto3_session):
("meta.llama2-13b-chat-v1", MetaLlama2ChatAdapter),
("meta.llama2-70b-chat-v1", MetaLlama2ChatAdapter),
("meta.llama2-130b-v5", MetaLlama2ChatAdapter), # artificial
("us.meta.llama2-13b-chat-v1", MetaLlama2ChatAdapter), # cross-region inference
("eu.meta.llama2-70b-chat-v1", MetaLlama2ChatAdapter), # cross-region inference
("de.meta.llama2-130b-v5", MetaLlama2ChatAdapter), # cross-region inference
("unknown_model", None),
],
)
@@ -517,7 +520,6 @@ def test_get_responses(self) -> None:
@pytest.mark.parametrize("model_name", MODELS_TO_TEST)
@pytest.mark.integration
def test_default_inference_params(self, model_name, chat_messages):

client = AmazonBedrockChatGenerator(model=model_name)
response = client.run(chat_messages)

20 changes: 19 additions & 1 deletion integrations/amazon_bedrock/tests/test_generator.py
@@ -36,6 +36,7 @@ def test_to_dict(mock_boto3_session):
"truncate": False,
"temperature": 10,
"streaming_callback": None,
"boto3_config": None,
},
}

@@ -57,12 +58,16 @@ def test_from_dict(mock_boto3_session):
"aws_profile_name": {"type": "env_var", "env_vars": ["AWS_PROFILE"], "strict": False},
"model": "anthropic.claude-v2",
"max_length": 99,
"boto3_config": {
"read_timeout": 1000,
},
},
}
)

assert generator.max_length == 99
assert generator.model == "anthropic.claude-v2"
+assert generator.boto3_config == {"read_timeout": 1000}


def test_default_constructor(mock_boto3_session, set_env_variables):
@@ -103,6 +108,14 @@ def test_constructor_prompt_handler_initialized(mock_boto3_session, mock_prompt_
assert layer.prompt_handler.model_max_length == 4096


+def test_prompt_handler_absent_when_truncate_false(mock_boto3_session):
+    """
+    Test that the prompt_handler is not initialized when truncate is set to False.
+    """
+    generator = AmazonBedrockGenerator(model="anthropic.claude-v2", truncate=False)
+    assert not hasattr(generator, "prompt_handler")


def test_constructor_with_model_kwargs(mock_boto3_session):
"""
Test that model_kwargs are correctly set in the constructor
Expand Down Expand Up @@ -220,7 +233,7 @@ def test_long_prompt_is_not_truncated_when_truncate_false(mock_boto3_session):
generator.run(prompt=long_prompt_text)

# Ensure _ensure_token_limit was not called
-mock_ensure_token_limit.assert_not_called(),
+mock_ensure_token_limit.assert_not_called()

# Check the prompt passed to prepare_body
generator.model_adapter.prepare_body.assert_called_with(prompt=long_prompt_text, stream=False)
@@ -246,17 +259,22 @@ def test_long_prompt_is_not_truncated_when_truncate_false(mock_boto3_session):
("ai21.j2-mega-v5", AI21LabsJurassic2Adapter), # artificial
("amazon.titan-text-lite-v1", AmazonTitanAdapter),
("amazon.titan-text-express-v1", AmazonTitanAdapter),
("us.amazon.titan-text-express-v1", AmazonTitanAdapter), # cross-region inference
("amazon.titan-text-agile-v1", AmazonTitanAdapter),
("amazon.titan-text-lightning-v8", AmazonTitanAdapter), # artificial
("meta.llama2-13b-chat-v1", MetaLlamaAdapter),
("meta.llama2-70b-chat-v1", MetaLlamaAdapter),
("eu.meta.llama2-13b-chat-v1", MetaLlamaAdapter), # cross-region inference
("us.meta.llama2-70b-chat-v1", MetaLlamaAdapter), # cross-region inference
("meta.llama2-130b-v5", MetaLlamaAdapter), # artificial
("meta.llama3-8b-instruct-v1:0", MetaLlamaAdapter),
("meta.llama3-70b-instruct-v1:0", MetaLlamaAdapter),
("meta.llama3-130b-instruct-v5:9", MetaLlamaAdapter), # artificial
("mistral.mistral-7b-instruct-v0:2", MistralAdapter),
("mistral.mixtral-8x7b-instruct-v0:1", MistralAdapter),
("mistral.mistral-large-2402-v1:0", MistralAdapter),
("eu.mistral.mixtral-8x7b-instruct-v0:1", MistralAdapter), # cross-region inference
("us.mistral.mistral-large-2402-v1:0", MistralAdapter), # cross-region inference
("mistral.mistral-medium-v8:0", MistralAdapter), # artificial
("unknown_model", None),
],
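The `truncate=False` path these tests exercise also means the GPT-2 tokenizer behind `DefaultPromptHandler` is never downloaded. A small sketch of the observable behavior, under the same mocked-session assumption the tests use (outside a test suite this would try to create a real Bedrock client):

```python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockGenerator

# With truncate=False the constructor skips DefaultPromptHandler entirely,
# so no tokenizer download happens and the attribute is never set.
generator = AmazonBedrockGenerator(model="anthropic.claude-v2", truncate=False)
assert not hasattr(generator, "prompt_handler")

# With the default truncate=True the handler exists, configured from
# model_max_length (default 4096).
truncating = AmazonBedrockGenerator(model="anthropic.claude-v2")
assert truncating.prompt_handler.model_max_length == 4096
```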