ibm: Add support for Embedding Models (#20647)
---------

Co-authored-by: Erick Friis <[email protected]>
MateuszOssGit and efriis authored Apr 19, 2024
1 parent 7380981 commit 75ffe51
Showing 9 changed files with 804 additions and 246 deletions.
10 changes: 10 additions & 0 deletions docs/docs/integrations/providers/ibm.mdx
@@ -37,3 +37,13 @@ See a [usage example](/docs/integrations/llms/ibm_watsonx).
```python
from langchain_ibm import WatsonxLLM
```

## Embedding Models

### WatsonxEmbeddings

See a [usage example](/docs/integrations/text_embedding/ibm_watsonx).

```python
from langchain_ibm import WatsonxEmbeddings
```
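For orientation, here is a minimal sketch of how the two methods used in the linked usage example, `embed_query` and `embed_documents`, are typically combined to rank documents by cosine similarity. `ToyEmbeddings`, its fixed vocabulary, and `cosine_similarity` are hypothetical stand-ins so the snippet runs without IBM credentials; they are not part of `langchain_ibm`. Under that assumption, a configured `WatsonxEmbeddings` instance could be used where `ToyEmbeddings()` is created.

```python
import math


class ToyEmbeddings:
    """Hypothetical stand-in with the same two method names as WatsonxEmbeddings."""

    # Assumed toy vocabulary; a real model returns dense learned vectors instead.
    VOCAB = ["watsonx", "embedding", "soup", "recipe", "model"]

    def embed_query(self, text):
        # Count how often each vocabulary term occurs in the text.
        words = text.lower().split()
        return [float(words.count(term)) for term in self.VOCAB]

    def embed_documents(self, texts):
        # One vector per input document, in input order.
        return [self.embed_query(text) for text in texts]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


embedder = ToyEmbeddings()
docs = ["watsonx embedding guide", "soup recipe"]
query_vec = embedder.embed_query("watsonx embedding model")
doc_vecs = embedder.embed_documents(docs)

# Pair each document with its similarity to the query, best match first.
ranked = sorted(
    zip(docs, (cosine_similarity(query_vec, vec) for vec in doc_vecs)),
    key=lambda pair: pair[1],
    reverse=True,
)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

The ranking logic depends only on the two method names, so it should not need to change when the stand-in is swapped for a real embedding class.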
243 changes: 243 additions & 0 deletions docs/docs/integrations/text_embedding/ibm_watsonx.ipynb
@@ -0,0 +1,243 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# IBM watsonx.ai\n",
"\n",
">WatsonxEmbeddings is a wrapper for IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) foundation models.\n",
"\n",
"This example shows how to communicate with `watsonx.ai` models using `LangChain`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up\n",
"\n",
"Install the package `langchain-ibm`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -qU langchain-ibm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This cell defines the WML credentials required to work with watsonx Embeddings.\n",
"\n",
"**Action:** Provide the IBM Cloud user API key. For details, see\n",
"[documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"watsonx_api_key = getpass()\n",
"os.environ[\"WATSONX_APIKEY\"] = watsonx_api_key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Additionally, you can pass other secrets as environment variables."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"WATSONX_URL\"] = \"your service instance url\"\n",
"os.environ[\"WATSONX_TOKEN\"] = \"your token for accessing the CPD cluster\"\n",
"os.environ[\"WATSONX_PASSWORD\"] = \"your password for accessing the CPD cluster\"\n",
"os.environ[\"WATSONX_USERNAME\"] = \"your username for accessing the CPD cluster\"\n",
"os.environ[\"WATSONX_INSTANCE_ID\"] = \"your instance_id for accessing the CPD cluster\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the model\n",
"\n",
"You might need to adjust model `parameters` for different models."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames\n",
"\n",
"embed_params = {\n",
"    EmbedTextParamsMetaNames.TRUNCATE_INPUT_TOKENS: 3,\n",
"    EmbedTextParamsMetaNames.RETURN_OPTIONS: {\"input_text\": True},\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Initialize the `WatsonxEmbeddings` class with the previously set parameters.\n",
"\n",
"\n",
"**Note**: \n",
"\n",
"- To provide context for the API call, you must add `project_id` or `space_id`. For more information, see the [documentation](https://www.ibm.com/docs/en/watsonx-as-a-service?topic=projects).\n",
"- Depending on the region of your provisioned service instance, use one of the URLs described [here](https://ibm.github.io/watsonx-ai-python-sdk/setup_cloud.html#authentication).\n",
"\n",
"In this example, we’ll use the `project_id` and the Dallas URL.\n",
"\n",
"\n",
"You need to specify the `model_id` that will be used for inference."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from langchain_ibm import WatsonxEmbeddings\n",
"\n",
"watsonx_embedding = WatsonxEmbeddings(\n",
"    model_id=\"ibm/slate-125m-english-rtrvr\",\n",
"    url=\"https://us-south.ml.cloud.ibm.com\",\n",
"    project_id=\"PASTE YOUR PROJECT_ID HERE\",\n",
"    params=embed_params,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, you can use Cloud Pak for Data credentials. For details, see the [documentation](https://ibm.github.io/watsonx-ai-python-sdk/setup_cpd.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"watsonx_embedding = WatsonxEmbeddings(\n",
"    model_id=\"ibm/slate-125m-english-rtrvr\",\n",
"    url=\"PASTE YOUR URL HERE\",\n",
"    username=\"PASTE YOUR USERNAME HERE\",\n",
"    password=\"PASTE YOUR PASSWORD HERE\",\n",
"    instance_id=\"openshift\",\n",
"    version=\"5.0\",\n",
"    project_id=\"PASTE YOUR PROJECT_ID HERE\",\n",
"    params=embed_params,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage\n",
"\n",
"### Embed query"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0.0094472, -0.024981909, -0.026013248, -0.040483925, -0.057804465]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text = \"This is a test document.\"\n",
"\n",
"query_result = watsonx_embedding.embed_query(text)\n",
"query_result[:5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Embed documents"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0.009447193, -0.024981918, -0.026013244, -0.040483937, -0.057804447]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"texts = [\"This is a content of the document\", \"This is another document\"]\n",
"\n",
"doc_result = watsonx_embedding.embed_documents(texts)\n",
"doc_result[0][:5]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
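A note on the recorded outputs: the first five components printed for `embed_query` (cell 4) and for the first document in `embed_documents` (cell 5) are almost identical, although the two input texts differ. A plausible reading, which is an assumption and not something this commit states, is that `TRUNCATE_INPUT_TOKENS: 3` trims both inputs to the same leading tokens. The sketch below only quantifies how closely the two recorded prefixes agree; the values are copied from the notebook outputs, and `max_abs_diff` is a helper name introduced here.

```python
# First five components, copied from the notebook's recorded outputs.
query_prefix = [0.0094472, -0.024981909, -0.026013248, -0.040483925, -0.057804465]
doc_prefix = [0.009447193, -0.024981918, -0.026013244, -0.040483937, -0.057804447]


def max_abs_diff(a, b):
    """Largest absolute component-wise difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


diff = max_abs_diff(query_prefix, doc_prefix)
print(f"largest component difference: {diff:.1e}")
```

If truncation is indeed the cause, raising or removing `TRUNCATE_INPUT_TOKENS` should make the two vectors diverge; that has not been verified here.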
3 changes: 2 additions & 1 deletion libs/partners/ibm/langchain_ibm/__init__.py
@@ -1,3 +1,4 @@
+from langchain_ibm.embeddings import WatsonxEmbeddings
 from langchain_ibm.llms import WatsonxLLM
 
-__all__ = ["WatsonxLLM"]
+__all__ = ["WatsonxLLM", "WatsonxEmbeddings"]
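One way to sanity-check this change without installing the package is to parse the updated module text and confirm that every name listed in `__all__` is also imported. The sketch below uses only the standard library `ast` module; `INIT_SOURCE` is the post-change file content as implied by this hunk, and `exports_and_imports` is a helper name introduced here, not something from the repository.

```python
import ast

# Post-change content of langchain_ibm/__init__.py, as implied by the hunk.
INIT_SOURCE = '''
from langchain_ibm.embeddings import WatsonxEmbeddings
from langchain_ibm.llms import WatsonxLLM

__all__ = ["WatsonxLLM", "WatsonxEmbeddings"]
'''


def exports_and_imports(source):
    """Return (names listed in __all__, names bound by top-level 'from' imports)."""
    exported = []
    imported = set()
    for node in ast.parse(source).body:
        if isinstance(node, ast.ImportFrom):
            # An alias binds its 'as' name if present, else the imported name.
            imported.update(alias.asname or alias.name for alias in node.names)
        elif isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "__all__" in targets:
                # __all__ is a literal list of strings, so literal_eval is safe here.
                exported = ast.literal_eval(node.value)
    return exported, imported


exported, imported = exports_and_imports(INIT_SOURCE)
missing = [name for name in exported if name not in imported]
print("exported:", exported)
print("missing imports:", missing)
```

An empty `missing` list means each public name has a matching import; the check says nothing about whether the imported modules themselves load.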