Merge branch 'main' into raspawar/default-model
raspawar authored Aug 12, 2024
2 parents e6d0d6e + 7f5b12e commit dcdfbf6
Showing 8 changed files with 154 additions and 63 deletions.
@@ -35,13 +35,14 @@
@component
class AmazonBedrockGenerator:
"""
-`AmazonBedrockGenerator` enables text generation via Amazon Bedrock hosted LLMs.
+Generates text using models hosted on Amazon Bedrock.
-For example, to use the Anthropic Claude model, simply initialize the `AmazonBedrockGenerator` with the
-'anthropic.claude-v2' model name. Provide AWS credentials either via local AWS profile or directly via
+For example, to use the Anthropic Claude model, pass 'anthropic.claude-v2' in the `model` parameter.
+Provide AWS credentials either through the local AWS profile or directly through
`aws_access_key_id`, `aws_secret_access_key`, `aws_session_token`, and `aws_region_name` parameters.
-Usage example:
+### Usage example
```python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockGenerator
@@ -52,6 +53,16 @@ class AmazonBedrockGenerator:
print(generator.run("Who is the best American actor?"))
```
+AmazonBedrockGenerator uses AWS for authentication. You can use the AWS CLI to authenticate through your IAM.
+For more information on setting up an IAM identity-based policy, see [Amazon Bedrock documentation]
+(https://docs.aws.amazon.com/bedrock/latest/userguide/security_iam_id-based-policy-examples.html).
+If the AWS environment is configured correctly, the AWS credentials are not required as they're loaded
+automatically from the environment or the AWS configuration file.
+If the AWS environment is not configured, set `aws_access_key_id`, `aws_secret_access_key`,
+`aws_session_token`, and `aws_region_name` as environment variables or pass them as
+[Secret](https://docs.haystack.deepset.ai/v2.0/docs/secret-management) arguments. Make sure the region you set
+supports Amazon Bedrock.
"""

SUPPORTED_MODEL_PATTERNS: ClassVar[Dict[str, Type[BedrockModelAdapter]]] = {
@@ -85,11 +96,12 @@ def __init__(
:param aws_access_key_id: The AWS access key ID.
:param aws_secret_access_key: The AWS secret access key.
:param aws_session_token: The AWS session token.
-:param aws_region_name: The AWS region name.
+:param aws_region_name: The AWS region name. Make sure the region you set supports Amazon Bedrock.
:param aws_profile_name: The AWS profile name.
:param max_length: The maximum length of the generated text.
:param truncate: Whether to truncate the prompt or not.
:param kwargs: Additional keyword arguments to be passed to the model.
+These arguments are specific to the model. You can find them in the model's documentation.
:raises ValueError: If the model name is empty or None.
:raises AmazonBedrockConfigurationError: If the AWS environment is not configured correctly or the model is
not supported.
@@ -236,8 +248,9 @@ def run(self, prompt: str, generation_kwargs: Optional[Dict[str, Any]] = None):
"""
Generates a list of string responses to the given prompt.
-:param prompt: The prompt to generate a response for.
-:param generation_kwargs: Additional keyword arguments passed to the generator.
+:param prompt: Instructions for the model.
+:param generation_kwargs: Additional keyword arguments to customize text generation.
+These arguments are specific to the model. You can find them in the model's documentation.
:returns: A dictionary with the following keys:
- `replies`: A list of generated responses.
:raises ValueError: If the prompt is empty or None.
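As an illustration of `generation_kwargs`, the sketch below reuses the `generator` from the usage example above and passes a sampling temperature; whether a given argument is accepted depends on the model, so check the model's documentation first:

```python
# Sketch: `temperature` is an assumed, model-specific argument; supported arguments vary by model.
result = generator.run(
    "Who is the best American actor?",
    generation_kwargs={"temperature": 0.7},
)
print(result["replies"])  # a list of generated responses
```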
@@ -15,18 +15,16 @@
@component
class CohereChatGenerator:
"""
-Enables text generation using Cohere's chat endpoint.
+Completes chats using Cohere's models through the Cohere `chat` endpoint.
-This component is designed to inference Cohere's chat models.
+You can customize how the text is generated by passing parameters to the
+Cohere API through the `**generation_kwargs` parameter. You can do this when
+initializing or running the component. Any parameter that works with
+`cohere.Client.chat` will work here too.
+For details, see [Cohere API](https://docs.cohere.com/reference/chat).
-Users can pass any text generation parameters valid for the `cohere.Client,chat` method
-directly to this component via the `**generation_kwargs` parameter in __init__ or the `**generation_kwargs`
-parameter in `run` method.
+### Usage example
-Invocations are made using 'cohere' package.
-See [Cohere API](https://docs.cohere.com/reference/chat) for more details.
-Example usage:
```python
from haystack_integrations.components.generators.cohere import CohereChatGenerator
@@ -49,32 +47,34 @@ def __init__(
"""
Initialize the CohereChatGenerator instance.
-:param api_key: the API key for the Cohere API.
-:param model: The name of the model to use. Available models are: [command, command-r, command-r-plus, etc.]
-:param streaming_callback: a callback function to be called with the streaming response.
-:param api_base_url: the base URL of the Cohere API.
-:param generation_kwargs: additional model parameters. These will be used during generation. Refer to
-https://docs.cohere.com/reference/chat for more details.
+:param api_key: The API key for the Cohere API.
+:param model: The name of the model to use. You can use models from the `command` family.
+:param streaming_callback: A callback function that is called when a new token is received from the stream.
+The callback function accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
+as an argument.
+:param api_base_url: The base URL of the Cohere API.
+:param generation_kwargs: Other parameters to use for the model during generation. For a list of parameters,
+see [Cohere Chat endpoint](https://docs.cohere.com/reference/chat).
Some of the parameters are:
- 'chat_history': A list of previous messages between the user and the model, meant to give the model
conversational context for responding to the user's message.
-- 'preamble_override': When specified, the default Cohere preamble will be replaced with the provided one.
-- 'conversation_id': An alternative to chat_history. Previous conversations can be resumed by providing
-the conversation's identifier. The contents of message and the model's response will be stored
-as part of this conversation.If a conversation with this id does not already exist,
-a new conversation will be created.
-- 'prompt_truncation': Defaults to AUTO when connectors are specified and OFF in all other cases.
-Dictates how the prompt will be constructed.
-- 'connectors': Accepts {"id": "web-search"}, and/or the "id" for a custom connector, if you've created one.
-When specified, the model's reply will be enriched with information found by
+- 'preamble': When specified, replaces the default Cohere preamble with the provided one.
+- 'conversation_id': An alternative to `chat_history`. Previous conversations can be resumed by providing
+the conversation's identifier. The contents of message and the model's response are stored
+as part of this conversation. If a conversation with this ID doesn't exist,
+a new conversation is created.
+- 'prompt_truncation': Defaults to `AUTO` when connectors are specified and to `OFF` in all other cases.
+Dictates how the prompt is constructed.
+- 'connectors': Accepts {"id": "web-search"}, and the "id" for a custom connector, if you created one.
+When specified, the model's reply is enriched with information found by
querying each of the connectors (RAG).
- 'documents': A list of relevant documents that the model can use to enrich its reply.
-- 'search_queries_only': Defaults to false. When true, the response will only contain a
-list of generated search queries, but no search will take place, and no reply from the model to the
-user's message will be generated.
-- 'citation_quality': Defaults to "accurate". Dictates the approach taken to generating citations
+- 'search_queries_only': Defaults to `False`. When `True`, the response only contains a
+list of generated search queries, but no search takes place, and no reply from the model to the
+user's message is generated.
+- 'citation_quality': Defaults to `accurate`. Dictates the approach taken to generating citations
as part of the RAG flow by allowing the user to specify whether they want
-"accurate" results or "fast" results.
+`accurate` results or `fast` results.
- 'temperature': A non-negative float that tunes the degree of randomness in generation. Lower temperatures
mean less random generations.
"""
@@ -15,12 +15,12 @@

@component
class CohereGenerator(CohereChatGenerator):
"""LLM Generator compatible with Cohere's generate endpoint.
"""Generates text using Cohere's models through Cohere's `generate` endpoint.
NOTE: Cohere discontinued the `generate` API, so this generator is a mere wrapper
around `CohereChatGenerator` provided for backward compatibility.
-Example usage:
+### Usage example
```python
from haystack_integrations.components.generators.cohere import CohereGenerator
@@ -40,6 +40,15 @@ def __init__(
):
"""
Instantiates a `CohereGenerator` component.
+:param api_key: Cohere API key.
+:param model: Cohere model to use for generation.
+:param streaming_callback: Callback function that is called when a new token is received from the stream.
+The callback function accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
+as an argument.
+:param api_base_url: Cohere base URL.
+:param **kwargs: Additional arguments passed to the model. These arguments are specific to the model.
+You can check them in the model's documentation.
"""

# Note we have to call super() like this because of the way components are dynamically built with the decorator
@@ -52,8 +61,8 @@ def run(self, prompt: str):
:param prompt: the prompt to be sent to the generative model.
:returns: A dictionary with the following keys:
-- `replies`: the list of replies generated by the model.
-- `meta`: metadata about the request.
+- `replies`: A list of replies generated by the model.
+- `meta`: Information about the request.
"""
chat_message = ChatMessage(content=prompt, role=ChatRole.USER, name="", meta={})
# Note we have to call super() like this because of the way components are dynamically built with the decorator
@@ -18,10 +18,16 @@
@component
class GoogleAIGeminiChatGenerator:
"""
-`GoogleAIGeminiChatGenerator` is a multimodal generator supporting Gemini via Google AI Studio.
-It uses the `ChatMessage` dataclass to interact with the model.
+Completes chats using multimodal Gemini models through Google AI Studio.
+It uses the [`ChatMessage`](https://docs.haystack.deepset.ai/docs/data-classes#chatmessage)
+dataclass to interact with the model. You can use the following models:
+- gemini-pro
+- gemini-ultra
+- gemini-pro-vision
+### Usage example
-Usage example:
```python
from haystack.utils import Secret
from haystack.dataclasses.chat_message import ChatMessage
@@ -42,7 +48,8 @@ class GoogleAIGeminiChatGenerator:
```
-Usage example with function calling:
+#### With function calling:
```python
from haystack.utils import Secret
from haystack.dataclasses.chat_message import ChatMessage
@@ -111,11 +118,15 @@ def __init__(
* `gemini-pro-vision`
* `gemini-ultra`
-:param api_key: Google AI Studio API key.
-:param model: Name of the model to use.
-:param generation_config: The generation config to use.
-Can either be a `GenerationConfig` object or a dictionary of parameters.
-For the available parameters, see
+:param api_key: Google AI Studio API key. To get a key,
+see [Google AI Studio](https://makersuite.google.com).
+:param model: Name of the model to use. Supported models are:
+- gemini-pro
+- gemini-ultra
+- gemini-pro-vision
+:param generation_config: The generation configuration to use.
+This can either be a `GenerationConfig` object or a dictionary of parameters.
+For available parameters, see
[the `GenerationConfig` API reference](https://ai.google.dev/api/python/google/generativeai/GenerationConfig).
:param safety_settings: The safety settings to use.
A dictionary with `HarmCategory` as keys and `HarmBlockThreshold` as values.
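To show how these two parameters fit together, a sketch that builds a `GenerationConfig` and a safety-settings dictionary follows; the chosen category, threshold, and parameter values are assumptions for illustration:

```python
from google.generativeai.types import GenerationConfig, HarmBlockThreshold, HarmCategory
from haystack.utils import Secret
from haystack_integrations.components.generators.google_ai import GoogleAIGeminiChatGenerator

# Sketch: parameter values are illustrative, not recommendations.
generation_config = GenerationConfig(temperature=0.2, max_output_tokens=256)
safety_settings = {HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_ONLY_HIGH}

generator = GoogleAIGeminiChatGenerator(
    api_key=Secret.from_env_var("GOOGLE_API_KEY"),
    model="gemini-pro",
    generation_config=generation_config,
    safety_settings=safety_settings,
)
```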
@@ -17,9 +17,10 @@
@component
class GoogleAIGeminiGenerator:
"""
-`GoogleAIGeminiGenerator` is a multimodal generator supporting Gemini via Google AI Studio.
+Generates text using multimodal Gemini models through Google AI Studio.
+### Usage example
-Usage example:
```python
from haystack.utils import Secret
from haystack_integrations.components.generators.google_ai import GoogleAIGeminiGenerator
@@ -30,7 +31,8 @@ class GoogleAIGeminiGenerator:
print(answer)
```
-Multimodal usage example:
+#### Multimodal example
```python
import requests
from haystack.utils import Secret
@@ -81,9 +83,9 @@ def __init__(
:param api_key: Google AI Studio API key.
:param model: Name of the model to use.
-:param generation_config: The generation config to use.
-Can either be a `GenerationConfig` object or a dictionary of parameters.
-For the available parameters, see
+:param generation_config: The generation configuration to use.
+This can either be a `GenerationConfig` object or a dictionary of parameters.
+For available parameters, see
[the `GenerationConfig` API reference](https://ai.google.dev/api/python/google/generativeai/GenerationConfig).
:param safety_settings: The safety settings to use.
A dictionary with `HarmCategory` as keys and `HarmBlockThreshold` as values.
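Because `generation_config` also accepts a plain dictionary, here is a sketch of the dictionary form; the keys are assumed to mirror `GenerationConfig` fields and the values are illustrative:

```python
from haystack.utils import Secret
from haystack_integrations.components.generators.google_ai import GoogleAIGeminiGenerator

# Sketch: dictionary keys mirror GenerationConfig fields; values are illustrative.
generator = GoogleAIGeminiGenerator(
    api_key=Secret.from_env_var("GOOGLE_API_KEY"),
    model="gemini-pro",
    generation_config={"temperature": 0.2, "top_p": 0.9, "max_output_tokens": 256},
)
```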
50 changes: 50 additions & 0 deletions integrations/llama_cpp/CHANGELOG.md
@@ -0,0 +1,50 @@
# Changelog

## [integrations/llama_cpp-v0.4.1] - 2024-08-08

### 🐛 Bug Fixes

- Replace DynamicChatPromptBuilder with ChatPromptBuilder (#940)

### ⚙️ Miscellaneous Tasks

- Retry tests to reduce flakiness (#836)
- Update ruff invocation to include check parameter (#853)
- Pin `llama-cpp-python>=0.2.87` (#955)

## [integrations/llama_cpp-v0.4.0] - 2024-05-13

### 🐛 Bug Fixes

- Fix commit (#436)


- Fix order of API docs (#447)

This PR will also push the docs to Readme

### 📚 Documentation

- Update category slug (#442)
- Small consistency improvements (#536)
- Disable-class-def (#556)

### ⚙️ Miscellaneous Tasks

- [**breaking**] Rename model_path to model in the Llama.cpp integration (#243)

### Llama.cpp

- Generate api docs (#353)

## [integrations/llama_cpp-v0.2.1] - 2024-01-18

## [integrations/llama_cpp-v0.2.0] - 2024-01-17

## [integrations/llama_cpp-v0.1.0] - 2024-01-09

### 🚀 Features

- Add Llama.cpp Generator (#179)

<!-- generated by git-cliff -->
2 changes: 1 addition & 1 deletion integrations/llama_cpp/pyproject.toml
@@ -26,7 +26,7 @@ classifiers = [
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
]
-dependencies = ["haystack-ai", "llama-cpp-python<0.2.84"]
+dependencies = ["haystack-ai", "llama-cpp-python>=0.2.87"]

[project.urls]
Documentation = "https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/llama_cpp#readme"
@@ -14,10 +14,11 @@
@component
class NvidiaGenerator:
"""
-A component for generating text using generative models provided by
-[NVIDIA NIMs](https://ai.nvidia.com).
+Generates text using generative models hosted with
+[NVIDIA NIM](https://ai.nvidia.com) on the [NVIDIA API Catalog](https://build.nvidia.com/explore/discover).
+### Usage example
-Usage example:
```python
from haystack_integrations.components.generators.nvidia import NvidiaGenerator
@@ -36,6 +37,8 @@ class NvidiaGenerator:
print(result["meta"])
print(result["usage"])
```
+You need an NVIDIA API key for this component to work.
"""

def __init__(
@@ -54,14 +57,17 @@ def __init__(
for more information on the supported models.
`Note`: If no specific model is provided along with a locally hosted API URL,
the system defaults to the available model found using the /models API.
+Check supported models at [NVIDIA NIM](https://ai.nvidia.com).
:param api_key:
-API key for the NVIDIA NIM.
+API key for the NVIDIA NIM. Set it as the `NVIDIA_API_KEY` environment
+variable or pass it here.
:param api_url:
Custom API URL for the NVIDIA NIM.
:param model_arguments:
-Additional arguments to pass to the model provider. Different models accept different arguments.
-Search your model in the [NVIDIA NIMs](https://ai.nvidia.com)
-to know the supported arguments.
+Additional arguments to pass to the model provider. These arguments are
+specific to a model.
+Search your model in the [NVIDIA NIM](https://ai.nvidia.com)
+to find the arguments it accepts.
"""
self._model = model
self._api_url = url_validation(api_url, _DEFAULT_API_URL, ["v1/chat/completions"])
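A sketch tying the parameters above together; the model name and argument values are assumptions, so check the catalog for available models and the arguments each accepts:

```python
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaGenerator

# Sketch: a hosted catalog model; the key is read from the NVIDIA_API_KEY
# environment variable. Model name and arguments are illustrative.
generator = NvidiaGenerator(
    model="meta/llama3-8b-instruct",
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
    model_arguments={"temperature": 0.2},
)
generator.warm_up()
result = generator.run(prompt="What is a NIM?")
print(result["replies"])
```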
