Showing 12 changed files with 379 additions and 37 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
# CrateDB | ||
|
||
> [CrateDB] is a distributed and scalable SQL database for storing and | ||
> analyzing massive amounts of data in near real-time, even with complex | ||
> queries. It is PostgreSQL-compatible, based on Lucene, and inherits | ||
> from Elasticsearch. | ||
|
||
## Installation and Setup | ||
|
||
### Set up CrateDB | ||
There are two ways to get started with CrateDB quickly. Alternatively, | ||
choose other [CrateDB installation options]. | ||
|
||
#### Start CrateDB on your local machine | ||
Example: Run a single-node CrateDB instance with security disabled, | ||
using Docker or Podman. This is not recommended for production use. | ||
|
||
```bash | ||
docker run --name=cratedb --rm \ | ||
--publish=4200:4200 --publish=5432:5432 --env=CRATE_HEAP_SIZE=2g \ | ||
crate:latest -Cdiscovery.type=single-node | ||
``` | ||
|
||
#### Deploy cluster on CrateDB Cloud | ||
[CrateDB Cloud] is a managed CrateDB service. Sign up for a | ||
[free trial][CrateDB Cloud Console]. | ||
|
||
### Install Client | ||
Install the most recent version of the `langchain-cratedb` package | ||
and a few others that are needed for this tutorial. | ||
```bash | ||
pip install --upgrade langchain-cratedb langchain-openai unstructured | ||
``` | ||
|
||
|
||
## Documentation | ||
For a more detailed walkthrough of the CrateDB wrapper, see | ||
[using LangChain with CrateDB]. See also [all features of CrateDB] | ||
to learn about other functionality provided by CrateDB. | ||
|
||
|
||
## Features | ||
The CrateDB adapter for LangChain provides APIs to use CrateDB as a vector store, | ||
document loader, and storage for chat messages. | ||
|
||
### Vector Store | ||
Use the CrateDB vector store, which builds on the `FLOAT_VECTOR` data type and the | ||
`KNN_MATCH` function, for similarity search. See also [CrateDBVectorStore Tutorial]. | ||
|
||
Make sure you've configured a valid OpenAI API key. | ||
```bash | ||
export OPENAI_API_KEY=sk-XJZ... | ||
``` | ||
```python | ||
from langchain_community.document_loaders import UnstructuredURLLoader | ||
from langchain_cratedb import CrateDBVectorStore | ||
from langchain_openai import OpenAIEmbeddings | ||
from langchain.text_splitter import CharacterTextSplitter | ||
|
||
loader = UnstructuredURLLoader(urls=["https://github.com/langchain-ai/langchain/raw/refs/tags/langchain-core==0.3.28/docs/docs/how_to/state_of_the_union.txt"]) | ||
documents = loader.load() | ||
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) | ||
docs = text_splitter.split_documents(documents) | ||
|
||
embeddings = OpenAIEmbeddings() | ||
|
||
# Connect to a self-managed CrateDB instance on localhost. | ||
CONNECTION_STRING = "crate://?schema=testdrive" | ||
|
||
store = CrateDBVectorStore.from_documents( | ||
documents=docs, | ||
embedding=embeddings, | ||
collection_name="state_of_the_union", | ||
connection=CONNECTION_STRING, | ||
) | ||
|
||
query = "What did the president say about Ketanji Brown Jackson" | ||
docs_with_score = store.similarity_search_with_score(query) | ||
``` | ||
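To inspect what the search returned, the results can be printed one per line. The following is a minimal sketch, assuming `similarity_search_with_score` follows the usual LangChain convention of returning a list of `(Document, score)` tuples. The `Document` stand-in, the `summarize_results` helper, and the sample data are hypothetical and exist only so the snippet runs without a database.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for a LangChain document (assumption: only these two fields are needed)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_results(docs_with_score, width=60):
    """Return one line per (document, score) pair, in the order received."""
    lines = []
    for doc, score in docs_with_score:
        # Collapse whitespace so multi-line chunks fit on a single line.
        snippet = " ".join(doc.page_content.split())[:width]
        lines.append(f"{score:.3f}  {snippet}")
    return lines


# Hypothetical sample; with a live store, pass `docs_with_score` instead.
sample = [
    (Document("first chunk of text"), 0.12),
    (Document("second chunk\nwith a line break"), 0.34),
]
for line in summarize_results(sample):
    print(line)
```

Whether a higher or a lower score means a closer match depends on the distance strategy the store is configured with, so the sketch deliberately does not sort or interpret the scores.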
|
||
### Document Loader | ||
Load documents from a CrateDB database table, using the document loader | ||
`CrateDBLoader`, which is based on SQLAlchemy. See also [CrateDBLoader Tutorial]. | ||
|
||
To use the document loader in your applications: | ||
```python | ||
import sqlalchemy as sa | ||
from langchain_community.utilities import SQLDatabase | ||
from langchain_cratedb import CrateDBLoader | ||
|
||
# Connect to a self-managed CrateDB instance on localhost. | ||
CONNECTION_STRING = "crate://?schema=testdrive" | ||
|
||
db = SQLDatabase(engine=sa.create_engine(CONNECTION_STRING)) | ||
|
||
loader = CrateDBLoader( | ||
'SELECT * FROM sys.summits LIMIT 42', | ||
db=db, | ||
) | ||
documents = loader.load() | ||
``` | ||
|
||
### Chat Message History | ||
Use CrateDB as the storage for your chat messages. | ||
See also [CrateDBChatMessageHistory Tutorial]. | ||
|
||
To use the chat message history in your applications: | ||
```python | ||
from langchain_cratedb import CrateDBChatMessageHistory | ||
|
||
# Connect to a self-managed CrateDB instance on localhost. | ||
CONNECTION_STRING = "crate://?schema=testdrive" | ||
|
||
message_history = CrateDBChatMessageHistory( | ||
session_id="test-session", | ||
connection=CONNECTION_STRING, | ||
) | ||
|
||
message_history.add_user_message("hi!") | ||
``` | ||
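The examples above all use `crate://?schema=testdrive`, which targets a CrateDB instance on localhost. For a remote cluster, such as one on CrateDB Cloud, the connection string also needs a host and credentials. The following is a minimal sketch of assembling such a string with the standard library. The `cratedb_url` helper is hypothetical and not part of `langchain-cratedb`, and the `ssl=true` query parameter and default port 4200 are assumptions to be checked against the SQLAlchemy dialect documentation for CrateDB.

```python
from urllib.parse import quote, urlencode


def cratedb_url(host="localhost", port=4200, user=None, password=None,
                schema=None, ssl=False):
    """Build a `crate://` connection string (hypothetical helper)."""
    auth = ""
    if user:
        # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
        auth = quote(user, safe="")
        if password:
            auth += ":" + quote(password, safe="")
        auth += "@"
    params = {}
    if ssl:
        params["ssl"] = "true"  # assumption: the dialect accepts `ssl=true`
    if schema:
        params["schema"] = schema
    query = "?" + urlencode(params) if params else ""
    return f"crate://{auth}{host}:{port}/{query}"


# Placeholder host and credentials; replace them with your cluster's values.
CONNECTION_STRING = cratedb_url(
    host="example.invalid",
    user="admin",
    password="p@ss/word",
    schema="testdrive",
    ssl=True,
)
print(CONNECTION_STRING)
```

Percent-encoding the password avoids a common failure mode where special characters in credentials are misread as URL delimiters.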
|
||
|
||
[all features of CrateDB]: https://cratedb.com/docs/guide/feature/ | ||
[CrateDB]: https://cratedb.com/database | ||
[CrateDB Cloud]: https://cratedb.com/database/cloud | ||
[CrateDB Cloud Console]: https://console.cratedb.cloud/?utm_source=langchain&utm_content=documentation | ||
[CrateDB installation options]: https://cratedb.com/docs/guide/install/ | ||
[CrateDBChatMessageHistory Tutorial]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/conversational_memory.ipynb | ||
[CrateDBLoader Tutorial]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/document_loader.ipynb | ||
[CrateDBVectorStore Tutorial]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/vector_search.ipynb | ||
[using LangChain with CrateDB]: https://cratedb.com/docs/guide/integrate/langchain/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
import React from 'react'; | ||
import clsx from 'clsx'; | ||
import {useWindowSize} from '@docusaurus/theme-common'; | ||
import {useDoc} from '@docusaurus/plugin-content-docs/client'; | ||
import DocItemPaginator from '@theme/DocItem/Paginator'; | ||
import DocVersionBanner from '@theme/DocVersionBanner'; | ||
import DocVersionBadge from '@theme/DocVersionBadge'; | ||
import DocItemFooter from '@theme/DocItem/Footer'; | ||
import DocItemTOCMobile from '@theme/DocItem/TOC/Mobile'; | ||
import DocItemTOCDesktop from '@theme/DocItem/TOC/Desktop'; | ||
import DocItemContent from '@theme/DocItem/Content'; | ||
import DocBreadcrumbs from '@theme/DocBreadcrumbs'; | ||
import ContentVisibility from '@theme/ContentVisibility'; | ||
import styles from './styles.module.css'; | ||
/** | ||
* Decide if the toc should be rendered, on mobile or desktop viewports | ||
*/ | ||
function useDocTOC() { | ||
const {frontMatter, toc} = useDoc(); | ||
const windowSize = useWindowSize(); | ||
const hidden = frontMatter.hide_table_of_contents; | ||
const canRender = !hidden && toc.length > 0; | ||
const mobile = canRender ? <DocItemTOCMobile /> : undefined; | ||
const desktop = | ||
canRender && (windowSize === 'desktop' || windowSize === 'ssr') ? ( | ||
<DocItemTOCDesktop /> | ||
) : undefined; | ||
return { | ||
hidden, | ||
mobile, | ||
desktop, | ||
}; | ||
} | ||
export default function DocItemLayout({children}) { | ||
const docTOC = useDocTOC(); | ||
const {metadata, frontMatter} = useDoc(); | ||
|
||
// Example of a GitHub blob URL and the Colab URL derived for the same notebook: | ||
//   https://github.com/langchain-ai/langchain/blob/master/docs/docs/introduction.ipynb | ||
//   https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/introduction.ipynb | ||
|
||
console.log({metadata, frontMatter}) | ||
|
||
const linkColab = frontMatter.link_colab || ( | ||
metadata.editUrl?.endsWith(".ipynb") | ||
? metadata.editUrl?.replace("https://github.com/langchain-ai/langchain/edit/", "https://colab.research.google.com/github/langchain-ai/langchain/blob/") | ||
: null | ||
); | ||
const linkGithub = frontMatter.link_github || metadata.editUrl?.replace("/edit/", "/blob/"); | ||
|
||
console.log({linkColab, linkGithub}) | ||
|
||
return ( | ||
<div className="row"> | ||
<div className={clsx('col', !docTOC.hidden && styles.docItemCol)}> | ||
<ContentVisibility metadata={metadata} /> | ||
<DocVersionBanner /> | ||
<div className={styles.docItemContainer}> | ||
<article> | ||
<DocBreadcrumbs /> | ||
<DocVersionBadge /> | ||
{docTOC.mobile} | ||
<div style={{ | ||
display: "flex", | ||
flexDirection: "column", | ||
alignItems: "flex-end", | ||
float: "right", | ||
}}> | ||
{linkColab && (<a href={linkColab} target="_blank" rel="noopener noreferrer"> | ||
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> | ||
</a>)} | ||
{linkGithub && (<a href={linkGithub} target="_blank" rel="noopener noreferrer"> | ||
<img src="https://img.shields.io/badge/Open%20on%20GitHub-grey?logo=github&logoColor=white" | ||
alt="Open on GitHub" /> | ||
</a>)} | ||
</div> | ||
<DocItemContent>{children}</DocItemContent> | ||
<DocItemFooter /> | ||
</article> | ||
<DocItemPaginator /> | ||
</div> | ||
</div> | ||
{docTOC.desktop && <div className="col col--3">{docTOC.desktop}</div>} | ||
</div> | ||
); | ||
} |
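The link derivation in `DocItemLayout` can be restated outside of React to make its rules easier to check. The following is a minimal sketch in Python (chosen to match the documentation examples in this commit), assuming the same behavior as the component: front matter values win, a Colab link is derived only for `.ipynb` pages, and only the first occurrence of each pattern is replaced. The `derive_links` function name is hypothetical.

```python
GITHUB_EDIT_PREFIX = "https://github.com/langchain-ai/langchain/edit/"
COLAB_BLOB_PREFIX = (
    "https://colab.research.google.com/github/langchain-ai/langchain/blob/"
)


def derive_links(edit_url, front_matter=None):
    """Return (link_colab, link_github) for a docs page; either may be None."""
    front_matter = front_matter or {}

    # Explicit front matter overrides the derived Colab link.
    link_colab = front_matter.get("link_colab")
    if not link_colab and edit_url and edit_url.endswith(".ipynb"):
        # Only notebooks can be opened in Colab.
        link_colab = edit_url.replace(GITHUB_EDIT_PREFIX, COLAB_BLOB_PREFIX, 1)

    # The GitHub link is the edit URL pointed at the blob view instead.
    link_github = front_matter.get("link_github")
    if not link_github and edit_url:
        link_github = edit_url.replace("/edit/", "/blob/", 1)

    return link_colab or None, link_github or None


edit_url = (
    "https://github.com/langchain-ai/langchain/edit/master/docs/docs/introduction.ipynb"
)
print(derive_links(edit_url))
```

For the notebook URL above, the derived pair corresponds to the two example URLs that appear at the top of `DocItemLayout`.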
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
.docItemContainer header + *, | ||
.docItemContainer article > *:first-child { | ||
margin-top: 0; | ||
} | ||
|
||
@media (min-width: 997px) { | ||
.docItemCol { | ||
max-width: 75% !important; | ||
} | ||
} |