From 3f76d142a07afdda0a2d26fbb0679da99bcf8ae5 Mon Sep 17 00:00:00 2001
From: Brace Sproul
Date: Fri, 2 Aug 2024 13:47:26 -0700
Subject: [PATCH] docs[minor]: Updated AWS Knowledge retriever doc (#6352)

---
 .../retrievers/bedrock-knowledge-bases.ipynb  | 273 ++++++++++++++++++
 .../retrievers/bedrock-knowledge-bases.mdx    |  26 --
 2 files changed, 273 insertions(+), 26 deletions(-)
 create mode 100644 docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.ipynb
 delete mode 100644 docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.mdx

diff --git a/docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.ipynb b/docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.ipynb
new file mode 100644
index 000000000000..fbf57c6eb66a
--- /dev/null
+++ b/docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.ipynb
@@ -0,0 +1,273 @@
+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "id": "afaf8039",
+   "metadata": {
+    "vscode": {
+     "languageId": "raw"
+    }
+   },
+   "source": [
+    "---\n",
+    "sidebar_label: Knowledge Bases for Amazon Bedrock\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e49f1e0d",
+   "metadata": {},
+   "source": [
+    "# Knowledge Bases for Amazon Bedrock\n",
+    "\n",
+    "## Overview\n",
+    "\n",
+    "This guide will help you get started with the [AmazonKnowledgeBaseRetriever](/docs/concepts/#retrievers). For detailed documentation of all AmazonKnowledgeBaseRetriever features and configurations, head to the [API reference](https://api.js.langchain.com/classes/langchain_aws.AmazonKnowledgeBaseRetriever.html).\n",
+    "\n",
+    "Knowledge Bases for Amazon Bedrock is a fully managed service from Amazon Web Services (AWS) that supports end-to-end RAG workflows.\n",
+    "It provides an entire ingestion workflow that converts your documents into embeddings (vectors) and stores the embeddings in a specialized vector database.\n",
+    "Knowledge Bases for Amazon Bedrock supports popular databases for vector storage, including vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon).\n",
+    "\n",
+    "### Integration details\n",
+    "\n",
+    "| Retriever | Self-host | Cloud offering | Package | [Py support](https://python.langchain.com/docs/integrations/retrievers/bedrock/) |\n",
+    "| :--- | :--- | :---: | :---: | :---: |\n",
+    "| [AmazonKnowledgeBaseRetriever](https://api.js.langchain.com/classes/langchain_aws.AmazonKnowledgeBaseRetriever.html) | 🟠 (see details below) | ✅ | @langchain/aws | ✅ |\n",
+    "\n",
+    "> The AWS Knowledge Base retriever can be 'self-hosted' in the sense that you can run it on your own AWS infrastructure. However, it cannot run on another cloud provider or on-premises.\n",
+    "\n",
+    "## Setup\n",
+    "\n",
+    "To use the AmazonKnowledgeBaseRetriever, you need an AWS account where you can manage your indexes and documents. Once you've set up your account, set the following environment variables:\n",
+    "\n",
+    "```bash\n",
+    "export AWS_KNOWLEDGE_BASE_ID=your-knowledge-base-id\n",
+    "export AWS_ACCESS_KEY_ID=your-access-key-id\n",
+    "export AWS_SECRET_ACCESS_KEY=your-secret-access-key\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "72ee0c4b-9764-423a-9dbf-95129e185210",
+   "metadata": {},
+   "source": [
+    "If you want to get automated tracing from individual queries, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting the lines below:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// process.env.LANGSMITH_API_KEY = \"\";\n",
+    "// process.env.LANGSMITH_TRACING = \"true\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0730d6a1-c893-4840-9817-5e5251676d5d",
+   "metadata": {},
+   "source": [
+    "### Installation\n",
+    "\n",
+    "This retriever lives in the `@langchain/aws` package:\n",
+    "\n",
+    "```{=mdx}\n",
+    "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n",
+    "import Npm2Yarn from \"@theme/Npm2Yarn\";\n",
+    "\n",
+    "<IntegrationInstallTooltip></IntegrationInstallTooltip>\n",
+    "\n",
+    "<Npm2Yarn>\n",
+    "  @langchain/aws\n",
+    "</Npm2Yarn>\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a38cde65-254d-4219-a441-068766c0d4b5",
+   "metadata": {},
+   "source": [
+    "## Instantiation\n",
+    "\n",
+    "Now we can instantiate our retriever:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "70cc8e65-2a02-408a-bbc6-8ef649057d82",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import { AmazonKnowledgeBaseRetriever } from \"@langchain/aws\";\n",
+    "\n",
+    "const retriever = new AmazonKnowledgeBaseRetriever({\n",
+    "  topK: 10,\n",
+    "  knowledgeBaseId: process.env.AWS_KNOWLEDGE_BASE_ID,\n",
+    "  region: \"us-east-2\",\n",
+    "  clientOptions: {\n",
+    "    credentials: {\n",
+    "      accessKeyId: process.env.AWS_ACCESS_KEY_ID,\n",
+    "      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,\n",
+    "    },\n",
+    "  },\n",
+    "});"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5c5f2839-4020-424e-9fc9-07777eede442",
+   "metadata": {},
+   "source": [
+    "## Usage"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "const query = \"...\"\n",
+    "\n",
+    "await retriever.invoke(query);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e",
+   "metadata": {},
+   "source": [
+    "## Use within a chain\n",
+    "\n",
+    "Like other retrievers, AmazonKnowledgeBaseRetriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).\n",
+    "\n",
+    "We will need an LLM or chat model:\n",
+    "\n",
+    "```{=mdx}\n",
+    "import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
+    "\n",
+    "<ChatModelTabs customVarName=\"llm\" />\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "25b647a3-f8f2-4541-a289-7a241e43f9df",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// @ls-docs-hide-cell\n",
+    "\n",
+    "import { ChatOpenAI } from \"@langchain/openai\";\n",
+    "\n",
+    "const llm = new ChatOpenAI({\n",
+    "  model: \"gpt-4o-mini\",\n",
+    "  temperature: 0,\n",
+    "});"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import { ChatPromptTemplate } from \"@langchain/core/prompts\";\n",
+    "import { RunnablePassthrough, RunnableSequence } from \"@langchain/core/runnables\";\n",
+    "import { StringOutputParser } from \"@langchain/core/output_parsers\";\n",
+    "\n",
+    "import type { Document } from \"@langchain/core/documents\";\n",
+    "\n",
+    "const prompt = ChatPromptTemplate.fromTemplate(`\n",
+    "Answer the question based only on the context provided.\n",
+    "\n",
+    "Context: {context}\n",
+    "\n",
+    "Question: {question}`);\n",
+    "\n",
+    "const formatDocs = (docs: Document[]) => {\n",
+    "  return docs.map((doc) => doc.pageContent).join(\"\\n\\n\");\n",
+    "}\n",
+    "\n",
+    "// See https://js.langchain.com/v0.2/docs/tutorials/rag\n",
+    "const ragChain = RunnableSequence.from([\n",
+    "  {\n",
+    "    context: retriever.pipe(formatDocs),\n",
+    "    question: new RunnablePassthrough(),\n",
+    "  },\n",
+    "  prompt,\n",
+    "  llm,\n",
+    "  new StringOutputParser(),\n",
+    "]);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22b1d6f8",
+   "metadata": {},
+   "source": [
+    "```{=mdx}\n",
+    "\n",
+    ":::tip\n",
+    "\n",
+    "See [our RAG tutorial](/docs/tutorials/rag) for more information and examples of `RunnableSequence`s like the one above.\n",
+    "\n",
+    ":::\n",
+    "\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "await ragChain.invoke(\"...\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
+   "metadata": {},
+   "source": [
+    "## API reference\n",
+    "\n",
+    "For detailed documentation of all AmazonKnowledgeBaseRetriever features and configurations, head to the [API reference](https://api.js.langchain.com/classes/langchain_aws.AmazonKnowledgeBaseRetriever.html)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "TypeScript",
+   "language": "typescript",
+   "name": "tslab"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "typescript",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
+ 
\ No newline at end of file
diff --git a/docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.mdx b/docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.mdx
deleted file mode 100644
index 01cb48c6e1af..000000000000
--- a/docs/core_docs/docs/integrations/retrievers/bedrock-knowledge-bases.mdx
+++ /dev/null
@@ -1,26 +0,0 @@
----
-hide_table_of_contents: true
----
-
-# Knowledge Bases for Amazon Bedrock
-
-Knowledge Bases for Amazon Bedrock is a fully managed support for end-to-end RAG workflow provided by Amazon Web Services (AWS).
-It provides an entire ingestion workflow of converting your documents into embeddings (vector) and storing the embeddings in a specialized vector database.
-Knowledge Bases for Amazon Bedrock supports popular databases for vector storage, including vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon).
-
-## Setup
-
-import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
-
-<IntegrationInstallTooltip></IntegrationInstallTooltip>
-
-```bash npm2yarn
-npm i @langchain/aws
-```
-
-## Usage
-
-import CodeBlock from "@theme/CodeBlock";
-import Example from "@examples/retrievers/amazon_knowledge_bases.ts";
-
-<CodeBlock language="typescript">{Example}</CodeBlock>
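
The `formatDocs` helper that the new notebook pipes the retriever into can be exercised without AWS credentials. The sketch below is a minimal standalone version under stated assumptions: the `Doc` interface is a hypothetical stand-in for `Document` from `@langchain/core/documents` (only `pageContent` is modelled), and the sample strings are invented for illustration.

```typescript
// Hypothetical stand-in for Document from "@langchain/core/documents";
// only the field that formatDocs reads is modelled here.
interface Doc {
  pageContent: string;
}

// Same logic as the notebook's helper: join page contents with a blank line
// so each retrieved chunk appears as its own paragraph in the prompt context.
function formatDocs(docs: Doc[]): string {
  return docs.map((doc) => doc.pageContent).join("\n\n");
}

// Illustrative input only; in the notebook the documents would come from
// `await retriever.invoke(query)`.
const sampleDocs: Doc[] = [
  { pageContent: "First chunk." },
  { pageContent: "Second chunk." },
];

const context: string = formatDocs(sampleDocs);
console.log(context);
```

Separating chunks with a blank line keeps chunk boundaries visible once the string is substituted into the `{context}` slot of the prompt.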