Add Cleanlab's LLM integration (run-llama#13994)
* init

* complete methods

* add example notebook

* add simple test

* fix local/public variable

* call constructor

* fix key mismatch and privateAttr name

* formatting and linting

* change function name

* add BUILD files

* update BUILD

* update BUILD

* fix lint
AshishSardana authored Jun 30, 2024
1 parent fd34432 commit a7c7920
Showing 12 changed files with 567 additions and 0 deletions.
191 changes: 191 additions & 0 deletions docs/docs/examples/llm/cleanlab.ipynb
@@ -0,0 +1,191 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "4d1b897a",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/docs/examples/llm/cleanlab.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "2e33dced-e587-4397-81b3-d6606aa1738a",
"metadata": {},
"source": [
"# Cleanlab Trustworthy Language Model\n",
"\n",
"This notebook shows how to use Cleanlab's Trustworthy Language Model (TLM). TLM is a more reliable LLM that gives high-quality outputs and indicates when it is unsure of the answer to a question, making it suitable for applications where unchecked hallucinations are a show-stopper.\n",
"\n",
"Read more about the TLM API in [Cleanlab Studio's docs](https://help.cleanlab.ai/reference/python/trustworthy_language_model/). Refer to the [quickstart tutorial](https://help.cleanlab.ai/tutorials/tlm/) for more advanced usage.\n",
"\n",
"Visit https://cleanlab.ai and sign up to get an API key."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "5863dde9-84a0-4c33-ad52-cc767442f63f",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "markdown",
"id": "833bdb2b",
"metadata": {},
"source": [
"If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4aff387e",
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index-llms-cleanlab"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9bbbc106",
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad297f19-998f-4485-aa2f-d67020058b7d",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.cleanlab import CleanlabTLM"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "152ced37-9a42-47be-9a39-4218521f5e72",
"metadata": {},
"outputs": [],
"source": [
"# Set the API key either in the environment or directly on the LLM\n",
"# import os\n",
"# os.environ[\"CLEANLAB_API_KEY\"] = \"your_api_key\"\n",
"\n",
"llm = CleanlabTLM(api_key=\"your_api_key\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d61b10bb-e911-47fb-8e84-19828cf224be",
"metadata": {},
"outputs": [],
"source": [
"resp = llm.complete(\"Who is Paul Graham?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3bd14f4e-c245-4384-a471-97e4ddfcb40e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Paul Graham is a British-born computer scientist, venture capitalist, and essayist. He is best known for co-founding the startup incubator and investment firm, Y Combinator, which has provided funding and support to numerous successful tech startups including Dropbox, Airbnb, and Reddit.\n",
"\n",
"Before founding Y Combinator, Graham was a successful entrepreneur himself, having co-founded the company Viaweb in 1995, which was later acquired by Yahoo in 1998. Graham is also known for his essays on startups, technology, and programming, which have been widely read and influential in the tech industry.\n",
"\n",
"In addition to his work in the tech industry, Graham has a background in artificial intelligence and computer science, having earned a Ph.D. in computer science from Harvard University. He is also a prolific essayist and has written several books, including \"Hackers & Painters\" and \"The Hundred-Year Lie: How to Prevent Corporate Abuse and Save the World from Its Own Worst Appetites.\"\n"
]
}
],
"source": [
"print(resp)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "25ad1b00-28fc-4bcd-96c4-d5b35605721a",
"metadata": {},
"source": [
"### Streaming"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "13c641fa-345a-4dce-87c5-ab1f6dcf4757",
"metadata": {},
"source": [
"Using the `stream_complete` endpoint"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "06da1ef1-2f6b-497c-847b-62dd2df11491",
"metadata": {},
"outputs": [],
"source": [
"response = llm.stream_complete(\"Who is Paul Graham?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b851def-5160-46e5-a30c-5a3ef2356b79",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Paul Graham is a British-born computer scientist, entrepreneur, venture capitalist, and essayist. He is best known for co-founding the startup incubator and investment firm, Y Combinator, which has provided funding and support to numerous successful startups including Dropbox, Airbnb, and Reddit.\n",
"\n",
"Before founding Y Combinator, Graham was a successful entrepreneur himself, having co-founded the company Viaweb in 1995, which was later acquired by Yahoo in 1998. Graham is also known for his essays on startups, technology, and programming, which have been widely read and influential in the tech industry.\n",
"\n",
"In addition to his work in the tech industry, Graham has a background in computer science and artificial intelligence, having earned a PhD in this field from Harvard University. He has also taught programming and entrepreneurship at several universities, including Harvard and Stanford."
]
}
],
"source": [
"for r in response:\n",
"    print(r.delta, end=\"\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
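The streaming cell in the notebook above prints each chunk's `delta` as it arrives. The following is a minimal offline sketch of that consumption pattern, not the integration's own code: `FakeChunk`, `fake_stream`, and `collect_stream` are hypothetical names introduced here for illustration. The only assumption carried over from the notebook is that each streamed item exposes a `delta` string.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeChunk:
    """Hypothetical stand-in for an item yielded by `llm.stream_complete`."""

    delta: str  # the newly generated text for this step


def fake_stream(text: str, size: int = 8) -> Iterator[FakeChunk]:
    """Yield `text` in fixed-size pieces to imitate a streamed response."""
    for start in range(0, len(text), size):
        yield FakeChunk(delta=text[start:start + size])


def collect_stream(chunks: Iterable[FakeChunk]) -> str:
    """Concatenate the deltas, mirroring what the print loop shows on screen."""
    return "".join(chunk.delta for chunk in chunks)


# Assemble the full response from the simulated stream.
full_text = collect_stream(fake_stream("Paul Graham co-founded Y Combinator."))
print(full_text)
```

With the real integration, passing the result of `llm.stream_complete(...)` to `collect_stream` in place of `fake_stream(...)` should work the same way, provided every chunk carries a string `delta`.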
153 changes: 153 additions & 0 deletions llama-index-integrations/llms/llama-index-llms-cleanlab/.gitignore
@@ -0,0 +1,153 @@
llama_index/_static
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
bin/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
etc/
include/
lib/
lib64/
parts/
sdist/
share/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
.ruff_cache

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints
notebooks/

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
pyvenv.cfg

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Jetbrains
.idea
modules/
*.swp

# VsCode
.vscode

# pipenv
Pipfile
Pipfile.lock

# pyright
pyrightconfig.json
3 changes: 3 additions & 0 deletions llama-index-integrations/llms/llama-index-llms-cleanlab/BUILD
@@ -0,0 +1,3 @@
poetry_requirements(
name="poetry",
)
17 changes: 17 additions & 0 deletions llama-index-integrations/llms/llama-index-llms-cleanlab/Makefile
@@ -0,0 +1,17 @@
GIT_ROOT ?= $(shell git rev-parse --show-toplevel)

help:	## Show all Makefile targets.
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[33m%-30s\033[0m %s\n", $$1, $$2}'

format:	## Run code autoformatters (black).
	pre-commit install
	git ls-files | xargs pre-commit run black --files

lint:	## Run linters: pre-commit (black, ruff, codespell) and mypy
	pre-commit install && git ls-files | xargs pre-commit run --show-diff-on-failure --files

test:	## Run tests via pytest.
	pytest tests

watch-docs:	## Build and watch documentation.
	sphinx-autobuild docs/ docs/_build/html --open-browser --watch $(GIT_ROOT)/llama_index/