Merge branch 'master' into cb-return-id-in-docs
Showing 21 changed files with 1,904 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# NAVER | ||
|
||
All functionality related to `Naver`, including HyperCLOVA X models, particularly those accessible through `Naver Cloud` [CLOVA Studio](https://clovastudio.ncloud.com/). | ||
|
||
> [Naver](https://navercorp.com/) is a global technology company with cutting-edge technologies and a diverse business portfolio including search, commerce, fintech, content, cloud, and AI. | ||
> [Naver Cloud](https://www.navercloudcorp.com/lang/en/) is the cloud computing arm of Naver, a leading cloud service provider offering a comprehensive suite of cloud services to businesses through its [Naver Cloud Platform (NCP)](https://www.ncloud.com/). | ||
Please refer to the [NCP User Guide](https://guide.ncloud-docs.com/docs/clovastudio-overview) for more detailed instructions (also available in Korean). | ||
|
||
## Installation and Setup | ||
|
||
- Get both a CLOVA Studio API Key and an API Gateway Key by [creating your app](https://guide.ncloud-docs.com/docs/en/clovastudio-playground01#create-test-app), then set them as the environment variables `NCP_CLOVASTUDIO_API_KEY` and `NCP_APIGW_API_KEY`, respectively. | ||
- Install the integration Python package with: | ||
|
||
```bash | ||
pip install -U langchain-community | ||
``` | ||
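
Before constructing any CLOVA Studio client, it can help to confirm that both keys are actually present in the environment. The following is a minimal sketch using only the standard library; the helper name `missing_ncp_keys` is hypothetical and is not part of `langchain-community`.

```python
import os

# Environment variable names taken from the setup step above.
REQUIRED_KEYS = ("NCP_CLOVASTUDIO_API_KEY", "NCP_APIGW_API_KEY")


def missing_ncp_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration only; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_ncp_keys()` with no arguments inspects the current process environment; an empty list means both keys are set.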
|
||
## Chat models | ||
|
||
### ChatClovaX | ||
|
||
See a [usage example](/docs/integrations/chat/naver). | ||
|
||
```python | ||
from langchain_community.chat_models import ChatClovaX | ||
``` | ||
|
||
## Embedding models | ||
|
||
### ClovaXEmbeddings | ||
|
||
See a [usage example](/docs/integrations/text_embedding/naver). | ||
|
||
```python | ||
from langchain_community.embeddings import ClovaXEmbeddings | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,318 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "raw", | ||
"id": "afaf8039", | ||
"metadata": {}, | ||
"source": [ | ||
"---\n", | ||
"sidebar_label: Naver\n", | ||
"---" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e49f1e0d", | ||
"metadata": {}, | ||
"source": [ | ||
"# ClovaXEmbeddings\n", | ||
"\n", | ||
"This notebook covers how to get started with embedding models provided by CLOVA Studio. For detailed documentation on `ClovaXEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/community/embeddings/langchain_community.naver.ClovaXEmbeddings.html).\n", | ||
"\n", | ||
"## Overview\n", | ||
"### Integration details\n", | ||
"\n", | ||
"| Provider | Package |\n", | ||
"|:--------:|:-------:|\n", | ||
"| [Naver](/docs/integrations/providers/naver.mdx) | [langchain-community](https://python.langchain.com/api_reference/community/embeddings/langchain_community.naver.ClovaXEmbeddings.html) |\n", | ||
"\n", | ||
"## Setup\n", | ||
"\n", | ||
"Before using embedding models provided by CLOVA Studio, you must complete the three steps below.\n", | ||
"\n", | ||
"1. Create a [NAVER Cloud Platform](https://www.ncloud.com/) account\n", | ||
"2. Apply to use [CLOVA Studio](https://www.ncloud.com/product/aiService/clovaStudio)\n", | ||
"3. Create a CLOVA Studio Test App or Service App, then find its API keys (see [here](https://guide.ncloud-docs.com/docs/en/clovastudio-playground01#테스트앱생성))\n", | ||
"\n", | ||
"### Credentials\n", | ||
"\n", | ||
"CLOVA Studio requires three credentials for embeddings: `NCP_CLOVASTUDIO_API_KEY`, `NCP_APIGW_API_KEY`, and `NCP_CLOVASTUDIO_APP_ID`.\n", | ||
"- `NCP_CLOVASTUDIO_API_KEY` and `NCP_CLOVASTUDIO_APP_ID` are issued per Service App or Test App\n", | ||
"- `NCP_APIGW_API_KEY` is issued per account\n", | ||
"\n", | ||
"The two API keys can be found in [CLOVA Studio](https://clovastudio.ncloud.com/studio-application/service-app) by clicking `App Request Status` > `Service App, Test App List` > `‘Details’ button for each app`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "c52e8a50-3e67-4272-bc80-3954d98f8dea", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import getpass\n", | ||
"import os\n", | ||
"\n", | ||
"if not os.getenv(\"NCP_CLOVASTUDIO_API_KEY\"):\n", | ||
" os.environ[\"NCP_CLOVASTUDIO_API_KEY\"] = getpass.getpass(\n", | ||
" \"Enter NCP CLOVA Studio API Key: \"\n", | ||
" )\n", | ||
"if not os.getenv(\"NCP_APIGW_API_KEY\"):\n", | ||
" os.environ[\"NCP_APIGW_API_KEY\"] = getpass.getpass(\"Enter NCP API Gateway API Key: \")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "83520d8e-ecf8-4e47-b3bc-1ac205b3a2ab", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"os.environ[\"NCP_CLOVASTUDIO_APP_ID\"] = input(\"Enter NCP CLOVA Studio App ID: \")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "ff00653e", | ||
"metadata": {}, | ||
"source": [ | ||
"### Installation\n", | ||
"\n", | ||
"The `ClovaXEmbeddings` integration lives in the `langchain_community` package:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "99400c9b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# install package\n", | ||
"!pip install -U langchain-community" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "2651e611-9d5b-4315-9bbd-f99f56be4e19", | ||
"metadata": {}, | ||
"source": [ | ||
"## Instantiation\n", | ||
"\n", | ||
"Now we can instantiate our embeddings object and embed queries or documents:\n", | ||
"\n", | ||
"- Several embedding models are available in CLOVA Studio. Please refer to [this guide](https://guide.ncloud-docs.com/docs/en/clovastudio-explorer03#임베딩API) for further details.\n", | ||
"- Note that you might need to normalize the embeddings depending on your specific use case." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"id": "62e0dbc3", | ||
"metadata": { | ||
"scrolled": true, | ||
"tags": [] | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_community.embeddings import ClovaXEmbeddings\n", | ||
"\n", | ||
"embeddings = ClovaXEmbeddings(\n", | ||
" model=\"clir-emb-dolphin\", # model name for the app that matches your app ID; defaults to `clir-emb-dolphin`\n", | ||
" # app_id=\"...\" # set this to pass the app ID directly instead of using the environment variable\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0493b4a8", | ||
"metadata": {}, | ||
"source": [ | ||
"## Indexing and Retrieval\n", | ||
"\n", | ||
"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n", | ||
"\n", | ||
"Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"id": "d4d59653", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"'CLOVA Studio is an AI development tool that allows you to customize your own HyperCLOVA X models.'" | ||
] | ||
}, | ||
"execution_count": 8, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"# Create a vector store with a sample text\n", | ||
"from langchain_core.vectorstores import InMemoryVectorStore\n", | ||
"\n", | ||
"text = \"CLOVA Studio is an AI development tool that allows you to customize your own HyperCLOVA X models.\"\n", | ||
"\n", | ||
"vectorstore = InMemoryVectorStore.from_texts(\n", | ||
" [text],\n", | ||
" embedding=embeddings,\n", | ||
")\n", | ||
"\n", | ||
"# Use the vectorstore as a retriever\n", | ||
"retriever = vectorstore.as_retriever()\n", | ||
"\n", | ||
"# Retrieve the most similar text\n", | ||
"retrieved_documents = retriever.invoke(\"What is CLOVA Studio?\")\n", | ||
"\n", | ||
"# show the retrieved document's content\n", | ||
"retrieved_documents[0].page_content" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "b1a249e1", | ||
"metadata": {}, | ||
"source": [ | ||
"## Direct Usage\n", | ||
"\n", | ||
"Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", | ||
"\n", | ||
"You can directly call these methods to get embeddings for your own use cases.\n", | ||
"\n", | ||
"### Embed single texts\n", | ||
"\n", | ||
"You can embed single texts or documents with `embed_query`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 9, | ||
"id": "12fcfb4b", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"[-0.094717406, -0.4077411, -0.5513184, 1.6024436, -1.3235079, -1.0720996, -0.44471845, 1.3665184, 0.\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"single_vector = embeddings.embed_query(text)\n", | ||
"print(str(single_vector)[:100]) # Show the first 100 characters of the vector" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "8b383b53", | ||
"metadata": {}, | ||
"source": [ | ||
"### Embed multiple texts\n", | ||
"\n", | ||
"You can embed multiple texts with `embed_documents`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 10, | ||
"id": "1f2e6104", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"[-0.094717406, -0.4077411, -0.5513184, 1.6024436, -1.3235079, -1.0720996, -0.44471845, 1.3665184, 0.\n", | ||
"[-0.25525448, -0.84877056, -0.6928286, 1.5867524, -1.2930486, -0.8166254, -0.17934391, 1.4236152, 0.\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"text2 = \"LangChain is the framework for building context-aware reasoning applications\"\n", | ||
"two_vectors = embeddings.embed_documents([text, text2])\n", | ||
"for vector in two_vectors:\n", | ||
" print(str(vector)[:100]) # Show the first 100 characters of the vector" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "eee40d32367cc5c4", | ||
"metadata": {}, | ||
"source": [ | ||
"## Additional functionalities\n", | ||
"\n", | ||
"### Service App\n", | ||
"\n", | ||
"When going live with a production-level application using CLOVA Studio, you should apply for and use a Service App. (See [here](https://guide.ncloud-docs.com/docs/en/clovastudio-playground01#서비스앱신청).)\n", | ||
"\n", | ||
"A Service App is issued its own `NCP_CLOVASTUDIO_API_KEY` and `NCP_CLOVASTUDIO_APP_ID`, and it can only be called with them." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "08f9f44e-c6a4-4163-8caf-27a0cda345b7", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Update environment variables\n", | ||
"\n", | ||
"os.environ[\"NCP_CLOVASTUDIO_API_KEY\"] = getpass.getpass(\n", | ||
" \"Enter NCP CLOVA Studio API Key for Service App: \"\n", | ||
")\n", | ||
"os.environ[\"NCP_CLOVASTUDIO_APP_ID\"] = input(\"Enter NCP CLOVA Studio Service App ID: \")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "86f59698-b3f4-4b19-a9d4-4facfcea304b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"embeddings = ClovaXEmbeddings(\n", | ||
" service_app=True,\n", | ||
" model=\"clir-emb-dolphin\", # model name for the app that matches your Service App ID\n", | ||
" # app_id=\"...\" # set this to pass the app ID directly instead of using the environment variable\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "1ddeaee9", | ||
"metadata": {}, | ||
"source": [ | ||
"## API Reference\n", | ||
"\n", | ||
"For detailed documentation on `ClovaXEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/latest/api_reference/community/embeddings/langchain_community.embeddings.naver.ClovaXEmbeddings.html)." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.12.3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
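
The notebook's Instantiation section notes that embeddings may need to be normalized depending on the use case. A minimal sketch of L2 normalization in plain Python follows. The helper names `l2_normalize` and `cosine_similarity` are hypothetical, and the sample vectors are made up rather than real `ClovaXEmbeddings` output. Once two vectors are scaled to unit length, their dot product equals their cosine similarity.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    """Cosine similarity computed as the dot product of the unit vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Made-up vectors standing in for the lists of floats that the notebook
# obtains from embed_query / embed_documents.
v1 = [3.0, 4.0]
v2 = [4.0, 3.0]
unit = l2_normalize(v1)
similarity = cosine_similarity(v1, v2)
```

With real embeddings, the same helpers could be applied to the lists returned by `embeddings.embed_query(...)` or to each item returned by `embeddings.embed_documents(...)`.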