Merge
jacoblee93 committed Oct 13, 2023
2 parents c064bd3 + 70774b7 commit 2801dd6
Showing 68 changed files with 2,801 additions and 391 deletions.
3 changes: 3 additions & 0 deletions docs/docs_skeleton/docs/expression_language/index.mdx
@@ -13,3 +13,6 @@ The base interface shared by all LCEL objects

#### [Cookbook](/docs/expression_language/cookbook)
Examples of common LCEL usage patterns

#### [Why use LCEL](/docs/expression_language/why)
A deeper dive into the benefits of LCEL
14 changes: 11 additions & 3 deletions docs/docs_skeleton/docs/modules/chains/popular/chat_vector_db.mdx
@@ -3,11 +3,19 @@ sidebar_position: 2
---

# Conversational Retrieval QA
:::info
Looking for the older, non-LCEL version? Click [here](/docs/modules/chains/popular/chat_vector_db_legacy).
:::

A common requirement for retrieval-augmented generation chains is support for followup questions.
Followup questions can contain references to past chat history (e.g. "What did Biden say about Justice Breyer?", followed by "Was that nice?"), which makes them ill-suited
to direct retriever similarity search.

To support followups, you can add an additional step prior to retrieval that combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question.
The chain then performs the standard retrieval steps of looking up relevant documents from the retriever and passing those documents and the question into a question answering chain to return a response.

To create a conversational question-answering chain, you will need a retriever. In the below example, we will create one from a vector store, which can be created from embeddings.
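For illustration, here is a rough sketch of that two-step flow written with LCEL. It is not the exact example imported below; the vector store contents, prompts, and model are placeholders.

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { PromptTemplate } from "langchain/prompts";
import { RunnableSequence } from "langchain/schema/runnable";
import { StringOutputParser } from "langchain/schema/output_parser";
import type { Document } from "langchain/document";

const model = new ChatOpenAI({});

// Create a retriever from a small in-memory vector store.
const vectorStore = await HNSWLib.fromTexts(
  ["Justice Breyer was praised for his service."],
  [{ id: 1 }],
  new OpenAIEmbeddings()
);
const retriever = vectorStore.asRetriever();

// Join retrieved documents into a single context string.
const serializeDocs = (docs: Document[]) =>
  docs.map((doc) => doc.pageContent).join("\n\n");

// Step 1: condense the chat history and followup question into a standalone question.
const standaloneQuestionChain = RunnableSequence.from([
  PromptTemplate.fromTemplate(`Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat history: {chatHistory}
Follow up question: {question}
Standalone question:`),
  model,
  new StringOutputParser(),
]);

// Step 2: retrieve documents for the standalone question and answer it.
const answerChain = RunnableSequence.from([
  {
    context: async (input: { question: string }) =>
      serializeDocs(await retriever.getRelevantDocuments(input.question)),
    question: (input: { question: string }) => input.question,
  },
  PromptTemplate.fromTemplate(`Answer the question based only on the following context:
{context}

Question: {question}`),
  model,
  new StringOutputParser(),
]);

const standaloneQuestion = await standaloneQuestionChain.invoke({
  question: "Was that nice?",
  chatHistory:
    "Human: What did Biden say about Justice Breyer?\nAI: He praised Justice Breyer's service.",
});

const answer = await answerChain.invoke({ question: standaloneQuestion });
```

Because the two steps are separate runnables, you could also use a cheaper/faster model for the standalone-question step than for the final answer.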

import Example from "@snippets/modules/chains/popular/chat_vector_db.mdx"

@@ -0,0 +1,15 @@
# Conversational Retrieval QA

:::info
Looking for the LCEL version? Click [here](/docs/modules/chains/popular/chat_vector_db).
:::

The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.

It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response.

To create one, you will need a retriever. In the below example, we will create one from a vector store, which can be created from embeddings.
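As a minimal sketch of the legacy chain (the text, metadata, and vector store below are placeholder values):

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const model = new ChatOpenAI({});

// Create a retriever from a small in-memory vector store.
const vectorStore = await HNSWLib.fromTexts(
  ["Mitochondria are the powerhouse of the cell"],
  [{ id: 1 }],
  new OpenAIEmbeddings()
);

const chain = ConversationalRetrievalQAChain.fromLLM(model, vectorStore.asRetriever());

// With no memory configured, chat history is passed in explicitly.
const res = await chain.call({
  question: "What is the powerhouse of the cell?",
  chat_history: "",
});

console.log(res.text);
```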

import Example from "@snippets/modules/chains/popular/chat_vector_db.mdx"

<Example/>
2 changes: 1 addition & 1 deletion docs/docs_skeleton/docs/modules/chains/popular/sqlite.mdx
@@ -1,6 +1,6 @@
# SQL

This example demonstrates how to use `Runnables` to answer questions over a SQL database.
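As a rough sketch of that pattern (assuming a local SQLite database file named `Chinook.db` and an OpenAI model; the prompt and question are illustrative):

```typescript
import { DataSource } from "typeorm";
import { SqlDatabase } from "langchain/sql_db";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
import { StringOutputParser } from "langchain/schema/output_parser";
import { RunnableSequence } from "langchain/schema/runnable";

// Connect LangChain's SqlDatabase wrapper to a local SQLite file.
const datasource = new DataSource({ type: "sqlite", database: "Chinook.db" });
const db = await SqlDatabase.fromDataSourceParams({ appDataSource: datasource });

const model = new ChatOpenAI({});

// Ask the model to write a SQL query given the schema and a question.
const prompt = PromptTemplate.fromTemplate(`Based on the table schema below, write a SQL query that would answer the user's question:
{schema}

Question: {question}
SQL Query:`);

const sqlQueryChain = RunnableSequence.from([
  {
    schema: async () => db.getTableInfo(),
    question: (input: { question: string }) => input.question,
  },
  prompt,
  model.bind({ stop: ["\nSQLResult:"] }),
  new StringOutputParser(),
]);

const query = await sqlQueryChain.invoke({ question: "How many employees are there?" });

// Run the generated query against the database.
const result = await db.run(query);
```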

import Example from "@snippets/modules/chains/popular/sqlite.mdx"

11 changes: 11 additions & 0 deletions docs/docs_skeleton/docs/modules/chains/popular/sqlite_legacy.mdx
@@ -0,0 +1,11 @@
---
sidebar_class_name: hidden
---

# SQL

This example demonstrates the use of the `SQLDatabaseChain` for answering questions over a SQL database.

import Example from "@snippets/modules/chains/popular/sqlite_legacy.mdx"

<Example/>
8 changes: 8 additions & 0 deletions docs/extras/expression_language/why.mdx
@@ -0,0 +1,8 @@
# Why use LCEL?

The LangChain Expression Language was designed from day 1 to **support putting prototypes in production, with no code changes**, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully running LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:

- optimised parallel execution: whenever your LCEL chains have steps that can be executed in parallel (e.g. if you fetch documents from multiple retrievers), we automatically do it for the smallest possible latency (see the sketch after this list).
- support for retries and fallbacks: more recently we’ve added support for configuring retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale. We’re currently working on adding streaming support for retries/fallbacks, so you can get the added reliability without any latency cost.
- accessing intermediate results: for more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain. We’ve added support for [streaming intermediate results](https://x.com/LangChainAI/status/1711806009097044193?s=20), and it’s available on every LangServe server.
- tracing with LangSmith: all chains built with LCEL have first-class tracing support, which can be used to debug your chains or to understand what’s happening in production. To enable this, all you have to do is add your [LangSmith](https://www.langchain.com/langsmith) API key as an environment variable.
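As a brief sketch of what some of this looks like in code (the models and prompts below are placeholders; retries, fallbacks, and parallel steps can be configured on any runnable):

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
import { StringOutputParser } from "langchain/schema/output_parser";
import { RunnableMap } from "langchain/schema/runnable";

// A primary model that retries on failure, then falls back to another model.
const model = new ChatOpenAI({ modelName: "gpt-4" })
  .withRetry({ stopAfterAttempt: 3 })
  .withFallbacks({ fallbacks: [new ChatOpenAI({ modelName: "gpt-3.5-turbo" })] });

const jokeChain = PromptTemplate.fromTemplate("Tell me a joke about {topic}")
  .pipe(model)
  .pipe(new StringOutputParser());

const poemChain = PromptTemplate.fromTemplate("Write a two-line poem about {topic}")
  .pipe(model)
  .pipe(new StringOutputParser());

// Steps in a RunnableMap run in parallel.
const mapChain = RunnableMap.from({ joke: jokeChain, poem: poemChain });

const result = await mapChain.invoke({ topic: "bears" });
```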
@@ -37,47 +37,43 @@ npm install cassandra-driver
import { CassandraStore } from "langchain/vectorstores/cassandra";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const config = {
  cloud: {
    secureConnectBundle: process.env.CASSANDRA_SCB as string,
  },
  credentials: {
    username: "token",
    password: process.env.CASSANDRA_TOKEN as string,
  },
  keyspace: "test",
  dimensions: 1536,
  table: "test",
  primaryKey: {
    name: "id",
    type: "int",
  },
  metadataColumns: [
    {
      name: "name",
      type: "text",
    },
  ],
};

const vectorStore = await CassandraStore.fromTexts(
  ["I am blue", "Green yellow purple", "Hello there hello"],
  [
    { id: 2, name: "2" },
    { id: 1, name: "1" },
    { id: 3, name: "3" },
  ],
  new OpenAIEmbeddings(),
  config
);

const results = await vectorStore.similaritySearch("Green yellow purple", 1);
```
@@ -17,10 +17,82 @@ directly to the model and call it, as shown below.

## Usage

There are two main ways to apply functions to your OpenAI calls.

The first and simplest is by attaching a function directly to the `.invoke({})` method:

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanMessage } from "langchain/schema";

/* Define your function schema */
const extractionFunctionSchema = {...}

/* Instantiate ChatOpenAI class */
const model = new ChatOpenAI({ modelName: "gpt-4" });

/**
* Call the .invoke method on the model, directly passing
* the function arguments as call args.
*/
const result = await model.invoke([new HumanMessage("What a beautiful day!")], {
functions: [extractionFunctionSchema],
function_call: { name: "extractor" },
});

console.log({ result });
```

The second way is by binding the function directly to your model. Binding function arguments to your model is useful when you want to reuse the same function across multiple calls.
Calling the `.bind({})` method attaches any call arguments passed in to all future calls to the model.

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanMessage } from "langchain/schema";

/* Define your function schema */
const extractionFunctionSchema = {...}

/* Instantiate ChatOpenAI class and bind function arguments to the model */
const model = new ChatOpenAI({ modelName: "gpt-4" }).bind({
functions: [extractionFunctionSchema],
function_call: { name: "extractor" },
});

/* Now we can call the model without having to pass the function arguments in again */
const result = await model.invoke([new HumanMessage("What a beautiful day!")]);

console.log({ result });
```

OpenAI requires parameter schemas in the format below, where `parameters` must be [JSON Schema](https://json-schema.org/).
When adding call arguments to your model, specifying the `function_call` argument will force the model to return a response using the specified function.
This is useful if you have multiple schemas you'd like the model to pick from.

Example function schema:

```typescript
const extractionFunctionSchema = {
name: "extractor",
description: "Extracts fields from the input.",
parameters: {
type: "object",
properties: {
tone: {
type: "string",
enum: ["positive", "negative"],
description: "The overall tone of the input",
},
word_count: {
type: "number",
description: "The number of words in the input",
},
chat_response: {
type: "string",
description: "A response to the human's input",
},
},
required: ["tone", "word_count", "chat_response"],
},
};
```

Now to put it all together:

<CodeBlock language="typescript">{OpenAIFunctionsExample}</CodeBlock>

## Usage with Zod
21 changes: 21 additions & 0 deletions docs/extras/modules/model_io/models/llms/integrations/yandex.mdx
@@ -0,0 +1,21 @@
# YandexGPT

LangChain.js supports calling [YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) LLMs.

## Setup

First, you should [create a service account](https://cloud.yandex.com/en/docs/iam/operations/sa/create) with the `ai.languageModels.user` role.

Next, you have two authentication options:

- [IAM token](https://cloud.yandex.com/en/docs/iam/operations/iam-token/create-for-sa).
You can specify the token in a constructor parameter `iam_token` or in an environment variable `YC_IAM_TOKEN`.
- [API key](https://cloud.yandex.com/en/docs/iam/operations/api-key/create).
You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`.

## Usage

import CodeBlock from "@theme/CodeBlock";
import YandexGPTExample from "@examples/models/llm/yandex.ts";

<CodeBlock language="typescript">{YandexGPTExample}</CodeBlock>
65 changes: 10 additions & 55 deletions docs/snippets/modules/chains/popular/chat_vector_db.mdx
@@ -1,36 +1,20 @@
import CodeBlock from "@theme/CodeBlock";
import ConvoRetrievalQAExample from "@examples/chains/conversational_qa.ts";

<CodeBlock language="typescript">{ConvoRetrievalQAExample}</CodeBlock>

Here's an explanation of each step in the `RunnableSequence.from()` call above:

- The first input passed is an object containing a `question` key. This key is used as the main input for whatever question a user may ask.
- The next key is `chatHistory`. This is a string of all previous chats (human & AI) concatenated together. This is used to help the model understand the context of the question.
- The `context` key is used to fetch relevant documents from the loaded context (in this case the State Of The Union speech). It performs a call to the `getRelevantDocuments` method on the retriever, passing in the user's question as the query. We then pass the result to our `serializeDocs` util, which maps over all returned documents, joins them with newlines, and returns a string.

After getting and formatting all inputs, we pipe them through the following operations:

- `questionPrompt` - this is the prompt template which we pass to the model in the next step. Behind the scenes it takes the inputs outlined above and formats them into the proper spots in our template.
- The formatted prompt with context then gets passed to the LLM and a response is generated.
- Finally, we pipe the result of the LLM call to an output parser which formats the response into a readable string.

Using this `RunnableSequence` we can pass questions and chat history to the model for informed conversational question answering.

## Built-in Memory

@@ -44,37 +28,8 @@ import ConvoQABuiltInExample from "@examples/chains/conversational_qa_built_in_m

## Streaming

You can also stream results from the chain. This is useful if you want to stream the chain's output to a client, or pipe its output into another chain.

import ConvoQAStreamingExample from "@examples/chains/conversational_qa_streaming.ts";

<CodeBlock language="typescript">{ConvoQAStreamingExample}</CodeBlock>

## Externally-Managed Memory

For this chain, if you'd like to format the chat history in a custom way (or pass in chat messages directly for convenience), you can also pass the chat history in explicitly by omitting the `memory` option and supplying
a `chat_history` string or array of [HumanMessages](/docs/api/schema/classes/HumanMessage) and [AIMessages](/docs/api/schema/classes/AIMessage) directly into the `chain.call` method:

import ConvoQAExternalMemoryExample from "@examples/chains/conversational_qa_external_memory.ts";

<CodeBlock language="typescript">{ConvoQAExternalMemoryExample}</CodeBlock>

## Prompt Customization

If you want to further change the chain's behavior, you can change the prompts for both the underlying question generation chain and the QA chain.

One case where you might want to do this is to improve the chain's ability to answer meta questions about the chat history.
By default, the only input to the QA chain is the standalone question generated from the question generation chain.
This poses a challenge when asking meta questions about information in previous interactions from the chat history.

For example, if you introduce a friend Bob and mention his age as 28, the chain is unable to provide his age when asked a question like "How old is Bob?".
This limitation occurs because the bot searches for Bob in the vector store, rather than considering the message history.

You can pass an alternative prompt for the question generation chain that also returns parts of the chat history relevant to the answer,
allowing the QA chain to answer meta questions with the additional context:

import ConvoRetrievalQAWithCustomPrompt from "@examples/chains/conversation_qa_custom_prompt.ts";

<CodeBlock language="typescript">{ConvoRetrievalQAWithCustomPrompt}</CodeBlock>

Keep in mind that adding more context to the prompt in this way may distract the LLM from other relevant retrieved information.
