feat(cosmosdbnosql): Add Semantic Cache Integration (#7033)
Co-authored-by: Yohan Lasorsa <[email protected]>
Co-authored-by: jacoblee93 <[email protected]>
3 people authored Nov 5, 2024
1 parent e13925f commit c6440b6
Showing 10 changed files with 643 additions and 3 deletions.
40 changes: 40 additions & 0 deletions docs/core_docs/docs/integrations/llm_caching/azure_cosmosdb_nosql.mdx
@@ -0,0 +1,40 @@
# Azure Cosmos DB NoSQL Semantic Cache

> The Semantic Cache feature is supported by the Azure Cosmos DB for NoSQL integration. It returns cached responses when a new prompt is semantically similar to a previously cached one, rather than requiring an exact match. It is built on [AzureCosmosDBNoSQLVectorStore](/docs/integrations/vectorstores/azure_cosmosdb_nosql), which stores vector embeddings of the cached prompts and uses similarity search to retrieve relevant cached results.

If you don't have an Azure account, you can [create a free account](https://azure.microsoft.com/free/) to get started.

## Setup

You'll first need to install the [`@langchain/azure-cosmosdb`](https://www.npmjs.com/package/@langchain/azure-cosmosdb) package:

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @langchain/azure-cosmosdb @langchain/core
```

You'll also need an Azure Cosmos DB for NoSQL instance running. You can deploy one for free from the Azure Portal by following [this guide](https://learn.microsoft.com/azure/cosmos-db/nosql/quickstart-portal).

Once your instance is running, retrieve its connection string, or its endpoint if you are using Managed Identity. You can find both in the Azure Portal, under the "Settings / Keys" section of your instance.
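
For reference, here is a minimal configuration sketch (with placeholder values) showing both options. The connection string and endpoint can also be supplied through the `AZURE_COSMOSDB_NOSQL_CONNECTION_STRING` and `AZURE_COSMOSDB_NOSQL_ENDPOINT` environment variables instead of the config object.

```typescript
import { AzureCosmosDBNoSQLSemanticCache } from "@langchain/azure-cosmosdb";
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings();

// Option 1: key-based authentication, using the connection string
const cacheWithKey = new AzureCosmosDBNoSQLSemanticCache(embeddings, {
  databaseName: "<DATABASE_NAME>",
  containerName: "<CONTAINER_NAME>",
  connectionString: "<CONNECTION_STRING>",
});

// Option 2: Managed Identity, using only the account endpoint
const cacheWithManagedIdentity = new AzureCosmosDBNoSQLSemanticCache(embeddings, {
  databaseName: "<DATABASE_NAME>",
  containerName: "<CONTAINER_NAME>",
  endpoint: "https://<ACCOUNT_NAME>.documents.azure.com",
});
```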

import CodeBlock from "@theme/CodeBlock";

:::info

When using Azure Managed Identity and role-based access control, you must ensure that the database and container have been created beforehand. RBAC does not provide permissions to create databases and containers. You can get more information about the permission model in the [Azure Cosmos DB documentation](https://learn.microsoft.com/azure/cosmos-db/how-to-setup-rbac#permission-model).

:::
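
If you prefer to create the database and container from code rather than the Azure Portal, here is a minimal sketch using the `@azure/cosmos` SDK with a key-based client. The partition key path below is only an assumption; use whatever your container actually requires.

```typescript
import { CosmosClient } from "@azure/cosmos";

// Run this once with key-based credentials, since RBAC alone cannot create
// databases or containers.
const adminClient = new CosmosClient("<CONNECTION_STRING>");

const { database } = await adminClient.databases.createIfNotExists({
  id: "<DATABASE_NAME>",
});
await database.containers.createIfNotExists({
  id: "<CONTAINER_NAME>",
  // Assumed partition key path, adjust it to match your configuration.
  partitionKey: { paths: ["/id"] },
});
```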

## Usage example

import Example from "@examples/caches/azure_cosmosdb_nosql/azure_cosmosdb_nosql.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>

## Related

- Vector store [conceptual guide](/docs/concepts/#vectorstores)
- Vector store [how-to guides](/docs/how_to/#vectorstores)
14 changes: 14 additions & 0 deletions docs/core_docs/docs/integrations/llm_caching/index.mdx
@@ -0,0 +1,14 @@
---
sidebar_class_name: hidden
hide_table_of_contents: true
---

# Model caches

[Caching LLM calls](/docs/how_to/chat_model_caching) can be useful for testing, cost savings, and speed.

Below are some integrations that allow you to cache results of individual LLM calls using different caches with different strategies.
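
All of these caches plug into a model the same way, through its `cache` option. As a minimal illustration, here is a sketch using the in-memory cache from `@langchain/core` (the model shown is just an example):

```typescript
import { InMemoryCache } from "@langchain/core/caches";
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ cache: new InMemoryCache() });

// The first call hits the model API; the second identical call is answered
// from the cache.
await model.invoke("Tell me a joke");
await model.invoke("Tell me a joke");
```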

import { IndexTable } from "@theme/FeatureTables";

<IndexTable />
18 changes: 18 additions & 0 deletions docs/core_docs/docs/integrations/platforms/microsoft.mdx
@@ -132,6 +132,24 @@ See a [usage example](/docs/integrations/vectorstores/azure_cosmosdb_mongodb).
import { AzureCosmosDBMongoDBVectorStore } from "@langchain/azure-cosmosdb";
```

## Semantic Cache

### Azure Cosmos DB NoSQL Semantic Cache

> The Semantic Cache feature is supported by the Azure Cosmos DB for NoSQL integration. It returns cached responses when a new prompt is semantically similar to a previously cached one, rather than requiring an exact match. It is built on [AzureCosmosDBNoSQLVectorStore](/docs/integrations/vectorstores/azure_cosmosdb_nosql), which stores vector embeddings of the cached prompts and uses similarity search to retrieve relevant cached results.

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @langchain/azure-cosmosdb @langchain/core
```

See a [usage example](/docs/integrations/llm_caching/azure_cosmosdb_nosql).

```typescript
import { AzureCosmosDBNoSQLSemanticCache } from "@langchain/azure-cosmosdb";
```

## Document loaders

### Azure Blob Storage
16 changes: 16 additions & 0 deletions docs/core_docs/sidebars.js
@@ -347,6 +347,22 @@ module.exports = {
slug: "integrations/document_transformers",
},
},
{
type: "category",
label: "Model caches",
collapsible: false,
items: [
{
type: "autogenerated",
dirName: "integrations/llm_caching",
className: "hidden",
},
],
link: {
type: "doc",
id: "integrations/llm_caching/index",
},
},
{
type: "category",
label: "Graphs",
49 changes: 49 additions & 0 deletions examples/src/caches/azure_cosmosdb_nosql/azure_cosmosdb_nosql.ts
@@ -0,0 +1,49 @@
import {
AzureCosmosDBNoSQLConfig,
AzureCosmosDBNoSQLSemanticCache,
} from "@langchain/azure-cosmosdb";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings();
const config: AzureCosmosDBNoSQLConfig = {
databaseName: "<DATABASE_NAME>",
containerName: "<CONTAINER_NAME>",
// Or use "endpoint" instead of "connectionString" to authenticate with managed identity
connectionString: "<CONNECTION_STRING>",
};

/**
* Sets the threshold similarity score for returning cached results based on vector distance.
* Cached output is returned only when the score passes this threshold: for cosine and other
* non-euclidean distance functions the score must be greater than or equal to the threshold,
* while for euclidean distance it must be less than or equal to it. Otherwise, a new result
* is generated. The default is 0.6 and can be adjusted via the constructor to suit different
* distance functions and use cases
* (see: https://learn.microsoft.com/azure/cosmos-db/nosql/query/vectordistance).
*/

const similarityScoreThreshold = 0.5;
const cache = new AzureCosmosDBNoSQLSemanticCache(
embeddings,
config,
similarityScoreThreshold
);

const model = new ChatOpenAI({ cache });

// Invoke the model to perform an action
const response1 = await model.invoke("Do something random!");
console.log(response1);
/*
AIMessage {
content: "Sure! I'll generate a random number for you: 37",
additional_kwargs: {}
}
*/

// This prompt is semantically identical to the first one, so the response is
// served from the semantic cache instead of calling the model again.
const response2 = await model.invoke("Do something random!");
console.log(response2);
/*
AIMessage {
content: "Sure! I'll generate a random number for you: 37",
additional_kwargs: {}
}
*/
6 changes: 3 additions & 3 deletions libs/langchain-azure-cosmosdb/src/azure_cosmosdb_nosql.ts
@@ -78,7 +78,7 @@ export interface AzureCosmosDBNoSQLConfig
readonly metadataKey?: string;
}

-const USER_AGENT_PREFIX = "langchainjs-azure-cosmosdb-nosql";
+const USER_AGENT_SUFFIX = "langchainjs-cdbnosql-vectorstore-javascript";

/**
* Azure Cosmos DB for NoSQL vCore vector store.
@@ -151,14 +151,14 @@ export class AzureCosmosDBNoSQLVectorStore extends VectorStore {
this.client = new CosmosClient({
endpoint,
key,
-userAgentSuffix: USER_AGENT_PREFIX,
+userAgentSuffix: USER_AGENT_SUFFIX,
});
} else {
// Use managed identity
this.client = new CosmosClient({
endpoint,
aadCredentials: dbConfig.credentials ?? new DefaultAzureCredential(),
-userAgentSuffix: USER_AGENT_PREFIX,
+userAgentSuffix: USER_AGENT_SUFFIX,
} as CosmosClientOptions);
}
}
191 changes: 191 additions & 0 deletions libs/langchain-azure-cosmosdb/src/caches.ts
@@ -0,0 +1,191 @@
import {
BaseCache,
deserializeStoredGeneration,
getCacheKey,
serializeGeneration,
} from "@langchain/core/caches";
import { Generation } from "@langchain/core/outputs";
import { Document } from "@langchain/core/documents";
import { EmbeddingsInterface } from "@langchain/core/embeddings";
import { CosmosClient, CosmosClientOptions } from "@azure/cosmos";
import { DefaultAzureCredential } from "@azure/identity";
import { getEnvironmentVariable } from "@langchain/core/utils/env";
import {
AzureCosmosDBNoSQLConfig,
AzureCosmosDBNoSQLVectorStore,
} from "./azure_cosmosdb_nosql.js";

const USER_AGENT_SUFFIX = "langchainjs-cdbnosql-semanticcache-javascript";
const DEFAULT_CONTAINER_NAME = "semanticCacheContainer";

/**
* Represents a semantic cache that uses Azure Cosmos DB for NoSQL as the
* underlying storage system.
*
* @example
* ```typescript
* const embeddings = new OpenAIEmbeddings();
* const cache = new AzureCosmosDBNoSQLSemanticCache(embeddings, {
* databaseName: DATABASE_NAME,
* containerName: CONTAINER_NAME
* });
* const model = new ChatOpenAI({cache});
*
* // Invoke the model to perform an action
* const response = await model.invoke("Do something random!");
* console.log(response);
* ```
*/
export class AzureCosmosDBNoSQLSemanticCache extends BaseCache {
private embeddings: EmbeddingsInterface;

private config: AzureCosmosDBNoSQLConfig;

private similarityScoreThreshold: number;

private cacheDict: { [key: string]: AzureCosmosDBNoSQLVectorStore } = {};

private vectorDistanceFunction: string;

constructor(
embeddings: EmbeddingsInterface,
dbConfig: AzureCosmosDBNoSQLConfig,
similarityScoreThreshold: number = 0.6
) {
super();
let client: CosmosClient;

const connectionString =
dbConfig.connectionString ??
getEnvironmentVariable("AZURE_COSMOSDB_NOSQL_CONNECTION_STRING");

const endpoint =
dbConfig.endpoint ??
getEnvironmentVariable("AZURE_COSMOSDB_NOSQL_ENDPOINT");

if (!dbConfig.client && !connectionString && !endpoint) {
throw new Error(
"AzureCosmosDBNoSQLSemanticCache client, connection string or endpoint must be set."
);
}

if (!dbConfig.client) {
if (connectionString) {
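// Parse the endpoint URI and account key from a connection string of the form
// "AccountEndpoint=<endpoint>;AccountKey=<key>".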
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
let [endpoint, key] = connectionString!.split(";");
[, endpoint] = endpoint.split("=");
[, key] = key.split("=");

client = new CosmosClient({
endpoint,
key,
userAgentSuffix: USER_AGENT_SUFFIX,
});
} else {
// Use managed identity
client = new CosmosClient({
endpoint,
aadCredentials: dbConfig.credentials ?? new DefaultAzureCredential(),
userAgentSuffix: USER_AGENT_SUFFIX,
} as CosmosClientOptions);
}
} else {
client = dbConfig.client;
}

// Use the distance function from the vector embedding policy, defaulting to cosine.
this.vectorDistanceFunction =
dbConfig.vectorEmbeddingPolicy?.vectorEmbeddings[0].distanceFunction ??
"cosine";

this.config = {
...dbConfig,
client,
databaseName: dbConfig.databaseName,
containerName: dbConfig.containerName ?? DEFAULT_CONTAINER_NAME,
};
this.embeddings = embeddings;
this.similarityScoreThreshold = similarityScoreThreshold;
}

/**
* Returns the vector store backing the cache for the given LLM key,
* creating it lazily on first access.
*/
private getLlmCache(llmKey: string) {
const key = getCacheKey(llmKey);
if (!this.cacheDict[key]) {
this.cacheDict[key] = new AzureCosmosDBNoSQLVectorStore(
this.embeddings,
this.config
);
}
return this.cacheDict[key];
}

/**
* Retrieves data from the cache.
*
* @param prompt The prompt for lookup.
* @param llmKey The LLM key used to construct the cache key.
* @returns An array of Generations if found, null otherwise.
*/
public async lookup(prompt: string, llmKey: string) {
const llmCache = this.getLlmCache(llmKey);

// Retrieve the single closest cached prompt along with its similarity score.
const results = await llmCache.similaritySearchWithScore(prompt, 1);
if (!results.length) return null;

const generations = results
.flatMap(([document, score]) => {
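// With euclidean distance, lower scores mean closer vectors; with cosine and
// other distance functions, higher scores mean closer vectors, so the check flips.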
const isSimilar =
(this.vectorDistanceFunction === "euclidean" &&
score <= this.similarityScoreThreshold) ||
(this.vectorDistanceFunction !== "euclidean" &&
score >= this.similarityScoreThreshold);

if (!isSimilar) return undefined;

return document.metadata.return_value.map((gen: string) =>
deserializeStoredGeneration(JSON.parse(gen))
);
})
.filter((gen) => gen !== undefined);

return generations.length > 0 ? generations : null;
}

/**
* Updates the cache with new data.
*
* @param prompt The prompt for update.
* @param llmKey The LLM key used to construct the cache key.
* @param value The value to be stored in the cache.
*/
public async update(
prompt: string,
llmKey: string,
returnValue: Generation[]
) {
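// Serialize each generation so it can be stored in the cached document's
// metadata and deserialized again on lookup.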
const serializedGenerations = returnValue.map((generation) =>
JSON.stringify(serializeGeneration(generation))
);
const llmCache = this.getLlmCache(llmKey);
const metadata = {
llm_string: llmKey,
prompt,
return_value: serializedGenerations,
};
const doc = new Document({
pageContent: prompt,
metadata,
});
await llmCache.addDocuments([doc]);
}

/**
* Deletes the semantic cache for the given LLM key.
* @param llmKey The LLM key whose cache should be cleared.
*/
public async clear(llmKey: string) {
const key = getCacheKey(llmKey);
if (this.cacheDict[key]) {
await this.cacheDict[key].delete();
}
}
}
1 change: 1 addition & 0 deletions libs/langchain-azure-cosmosdb/src/index.ts
@@ -1,2 +1,3 @@
export * from "./azure_cosmosdb_mongodb.js";
export * from "./azure_cosmosdb_nosql.js";
export * from "./caches.js";
