
Commit

feat: add support for version specification in Azure Group-level Config (#170)

* feat: add support for version specification in Azure Group-level Configuration, update docs

* Update azure.mdx typo

* Update azure.mdx
danny-avila authored Nov 25, 2024
1 parent eec5608 commit d968123
Showing 3 changed files with 32 additions and 25 deletions.
1 change: 1 addition & 0 deletions components/changelog/content/config_v1.1.8.mdx
@@ -0,0 +1 @@
- Added support for specifying `version` in [Azure Group-level Configuration](/docs/configuration/azure#group-level-configuration) when using [Serverless Inference Endpoints](/docs/configuration/azure#serverless-inference-endpoints)
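
To illustrate the entry above, here is a minimal sketch of a group-level config using the new `version` field with a serverless endpoint (the resource URL, env var name, and model name are placeholders); it mirrors the example added to `azure.mdx` in this commit, and the `version` value is sent as the `api-version` query parameter:

```yaml filename="librechat.yaml"
endpoints:
  azureOpenAI:
    groups:
      - group: "serverless-example"
        apiKey: "${SERVERLESS_API_KEY}" # arbitrary env var name
        baseURL: "https://example.services.ai.azure.com/models/"
        version: "2024-05-01-preview" # optional API version
        serverless: true
        models:
          Meta-Llama-3.1-8B-Instruct: true # must match the deployment name
# Requests are then routed to:
#   https://example.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
```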
13 changes: 13 additions & 0 deletions pages/changelog/config_v1.1.8.mdx
@@ -0,0 +1,13 @@
---
date: 2024/11/25
title: ⚙️ Config v1.1.8
---

import { ChangelogHeader } from '@/components/changelog/ChangelogHeader'
import Content from '@/components/changelog/content/config_v1.1.8.mdx'

<ChangelogHeader />

---

<Content />
43 changes: 18 additions & 25 deletions pages/docs/configuration/azure.mdx
@@ -531,49 +531,42 @@ Remember to replace placeholder text with actual prompts or instructions and pro

### Serverless Inference Endpoints

- Through the `librechat.yaml` file, you can configure Azure AI Studio serverless inference endpoints to access models from the [Azure Model Catalog.](https://ai.azure.com/explore) Only a model identifier, `baseURL`, and `apiKey` are needed along with the `serverless` field to indicate the special handling these endpoints need.
+ Through the `librechat.yaml` file, you can configure Azure AI Studio serverless inference endpoints to access models from the [Azure AI Foundry.](https://ai.azure.com/explore) Only a model identifier, `baseURL`, and `apiKey` are needed along with the `serverless` field to indicate the special handling these endpoints need.

- You will need to follow the instructions in the compatible model cards to set up **MaaS** ("Models as a Service") access on Azure AI Studio.

- For reference, here are some known compatible model cards:

- - [Mistral-large](https://aka.ms/aistudio/landing/mistral-large) | [Llama-2-70b-chat](https://aka.ms/aistudio/landing/Llama-2-70b-chat) | [Phi-3-medium-128k-instruct](https://ai.azure.com/explore/models/Phi-3-medium-128k-instruct/version/1/registry/azureml)
+ - [Mistral-large](https://aka.ms/aistudio/landing/mistral-large) | [Meta-Llama-3.1-8B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-8B-Instruct/version/4/) | [Phi-3-medium-128k-instruct](https://ai.azure.com/explore/models/Phi-3-medium-128k-instruct/version/1/registry/azureml)

- You can also review [the technical blog for the "Mistral-large" model release](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/mistral-large-mistral-ai-s-flagship-llm-debuts-on-azure-ai/ba-p/4066996) for more info.

- - Then, you will need to add them to your azureOpenAI config in the librechat.yaml file.
+ - Then, you will need to add them to your `azureOpenAI` config in the librechat.yaml file.

- - Here are example configurations for Mistral-large, LLama-2-70b-chat, and Phi-3-medium-128k-instruct:
+ - Here is an example configuration for `Meta-Llama-3.1-8B-Instruct`:

```yaml filename="librechat.yaml"
 endpoints:
   azureOpenAI:
     groups:
-      # serverless examples
-      - group: "mistral-inference"
-        apiKey: "${AZURE_MISTRAL_API_KEY}" # arbitrary env var name
-        baseURL: "https://Mistral-large-vnpet-serverless.region.inference.ai.azure.com/v1/chat/completions"
-        serverless: true
-        models:
-          mistral-large: true
-      - group: "llama-70b-chat"
-        apiKey: "${AZURE_LLAMA2_70B_API_KEY}" # arbitrary env var name
-        baseURL: "https://Llama-2-70b-chat-qmvyb-serverless.region.inference.ai.azure.com/v1/chat/completions"
-        serverless: true
-        models:
-          llama-70b-chat: true
-      - group: "phi-3-medium-128k-instruct"
-        apiKey: "${AZURE_PHI3_MEDIUM_API_KEY}" # arbitrary env var name
-        baseURL: "https://Phi-3-medium-128k-instruct-abcde-serverless.eastus2.inference.ai.azure.com/v1/chat/completions"
+      - group: "serverless-example"
+        apiKey: "${LLAMA318B_API_KEY}" # arbitrary env var name
+        baseURL: "https://example.services.ai.azure.com/models/"
+        version: "2024-05-01-preview" # Optional: specify API version
         serverless: true
         models:
-          phi-3-medium-128k-instruct: true
+          # Must match the deployment name of the model
+          Meta-Llama-3.1-8B-Instruct: true
```

**Notes**:

- - Make sure to add the appropriate suffix for your deployment, either "/v1/chat/completions" or "/v1/completions"
- - If using "/v1/completions" (without "chat"), you need to set the `forcePrompt` field to `true` in your [group config.](#group-level-configuration)
- - Compatibility with LibreChat relies on parity with OpenAI API specs, which at the time of writing, are typically **"Pay-as-you-go"** or "Models as a Service" (MaaS) deployments on Azure AI Studio, that are OpenAI-SDK-compatible with either v1/completions or v1/chat/completions endpoint handling.
+ - Azure AI Foundry models now provision endpoints under `/models/chat/completions?api-version=version` for serverless inference.
+ - The `baseURL` field should be set to the root of the endpoint, without anything after `/models/`, i.e., the `/chat/completions` path.
+   - Example: `https://example.services.ai.azure.com/models/` for `https://example.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview`
+ - The `version` query parameter is optional and can be specified in the `baseURL` field.
+ - The model name used in the `models` field must match the deployment name of the model in the Azure AI Foundry.
+ - Compatibility with LibreChat relies on parity with OpenAI API specs, which at the time of writing, are typically **"Pay-as-you-go"** or "Models as a Service" (MaaS) deployments on Azure AI Studio, that are OpenAI-SDK-compatible with either `v1/completions` or `models/chat/completions` endpoint handling.
- All models that offer serverless deployments ("Serverless APIs") are compatible from the Azure model catalog. You can filter by "Serverless API" under Deployment options and "Chat completion" under inference tasks to see the full list; however, real time endpoint models have not been tested.
- - These serverless inference endpoint/models are likely not compatible with OpenAI function calling, which enables the use of Plugins. As they have yet been tested, they are available on the Plugins endpoint, although they are not expected to work.
+ - These serverless inference endpoint/models may or may not support function calling according to OpenAI API specs, which enables their use with Agents.
+ - If using legacy "/v1/completions" (without "chat"), you need to set the `forcePrompt` field to `true` in your [group config.](#group-level-configuration)
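
For the legacy case in the last note, a minimal sketch of a group pointing at a `/v1/completions` serverless deployment with `forcePrompt` enabled (the endpoint URL, env var name, and model name are hypothetical placeholders):

```yaml filename="librechat.yaml"
endpoints:
  azureOpenAI:
    groups:
      - group: "legacy-completions-example"
        apiKey: "${LEGACY_SERVERLESS_API_KEY}" # arbitrary env var name
        baseURL: "https://example-model-serverless.region.inference.ai.azure.com/v1/completions"
        serverless: true
        forcePrompt: true # required when the endpoint is "/v1/completions" (without "chat")
        models:
          example-model: true
```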
