Feature(LLMLingua): add LangChain example (microsoft#97)
Co-authored-by: Ayo Ayibiowu <[email protected]>
iofu728 and thehapyone authored Feb 28, 2024
1 parent 042bd0e commit 1c2f5c0
Showing 6 changed files with 101 additions and 23 deletions.
24 changes: 24 additions & 0 deletions DOCUMENT.md
@@ -171,6 +171,30 @@ from llmlingua import PromptCompressor
llm_lingua = PromptCompressor("TheBloke/Llama-2-7b-Chat-GPTQ", model_config={"revision": "main"})
```

### Integration with LangChain

Thanks to the contributions of Ayo Ayibiowu (@thehapyone), (Long)LLMLingua can be seamlessly integrated into LangChain. Here's an example of how to initialize (Long)LLMLingua within LangChain:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.retrievers.document_compressors import LLMLinguaCompressor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

compressor = LLMLinguaCompressor(model_name="openai-community/gpt2", device_map="cpu")
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.get_relevant_documents(
"What did the president say about Ketanji Jackson Brown"
)
pretty_print_docs(compressed_docs)
```
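Note that `retriever` and `pretty_print_docs` are assumed to be defined elsewhere: any LangChain retriever works as the base, and `pretty_print_docs` is the small display helper used throughout the LangChain docs. A minimal, dependency-free sketch of such a helper — the `Document` class here is a stand-in for illustration, not the real LangChain class:

```python
class Document:
    """Minimal stand-in for a retrieved document; only .page_content is used."""

    def __init__(self, page_content: str):
        self.page_content = page_content


def format_docs(docs) -> str:
    """Join documents into one string, separated by a dashed rule."""
    return f"\n{'-' * 100}\n".join(
        f"Document {i + 1}:\n\n{d.page_content}" for i, d in enumerate(docs)
    )


def pretty_print_docs(docs) -> None:
    print(format_docs(docs))
```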

For a more detailed guide, please refer to [Notebook](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/llmlingua.ipynb).

### Integration with LlamaIndex

Thanks to the contributions of Jerry Liu (@jerryjliu), (Long)LLMLingua can be seamlessly integrated into LlamaIndex. Here's an example of how to initialize (Long)LLMLingua within LlamaIndex:
2 changes: 1 addition & 1 deletion README.md
@@ -18,12 +18,12 @@ https://github.com/microsoft/LLMLingua/assets/30883354/eb0ea70d-6d4c-4aa7-8977-6

## News

- 👾 LLMLingua has been integrated into [LangChain](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/llmlingua.ipynb) and [LlamaIndex](https://github.com/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/LongLLMLingua.ipynb), two widely-used RAG frameworks.
- 🤳 Talk slides are available in [AI Time Jan, 24](https://drive.google.com/file/d/1fzK3wOvy2boF7XzaYuq2bQ3jFeP1WMk3/view?usp=sharing).
- 🖥 EMNLP'23 slides are available in [Session 5](https://drive.google.com/file/d/1GxQLAEN8bBB2yiEdQdW4UKoJzZc0es9t/view) and [BoF-6](https://drive.google.com/file/d/1LJBUfJrKxbpdkwo13SgPOqugk-UjLVIF/view).
- 📚 Check out our new [blog post](https://medium.com/@iofu728/longllmlingua-bye-bye-to-middle-loss-and-save-on-your-rag-costs-via-prompt-compression-54b559b9ddf7) discussing RAG benefits and cost savings through prompt compression. See the script example [here](https://github.com/microsoft/LLMLingua/blob/main/examples/Retrieval.ipynb).
- 🎈 Visit our [project page](https://llmlingua.com/) for real-world case studies in RAG, Online Meetings, CoT, and Code.
- 👨‍🦯 Explore our ['./examples'](./examples) directory for practical applications, including [RAG](./examples/RAG.ipynb), [Online Meeting](./examples/OnlineMeeting.ipynb), [CoT](./examples/CoT.ipynb), [Code](./examples/Code.ipynb), and [RAG using LlamaIndex](./examples/RAGLlamaIndex.ipynb).

## TL;DR

86 changes: 69 additions & 17 deletions Transparency_FAQ.md
@@ -127,32 +127,84 @@ We release the parameter in the [issue1](https://github.com/microsoft/LLMLingua/
**LLMLingua**:

```python
prompt = compressor.compress_prompt(
    context=xxx,
    instruction=xxx,
    question=xxx,
    ratio=0.75,
    iterative_size=100,
    context_budget="*2",
)
```
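The `context_budget` argument is a small arithmetic modifier applied to the base context token budget — `"*2"` doubles it, while `"+100"` (used in the LongLLMLingua call below) adds 100 tokens of headroom. A rough sketch of that interpretation (our own illustration, not LLMLingua's actual code):

```python
def apply_context_budget(base_budget: int, context_budget: str) -> int:
    """Apply a modifier string like "*2" or "+100" to a base token budget."""
    op, value = context_budget[0], float(context_budget[1:])
    if op == "*":
        return int(base_budget * value)
    if op == "+":
        return int(base_budget + value)
    raise ValueError(f"unsupported modifier: {context_budget!r}")
```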

**LongLLMLingua**:

```python
compressed_prompt = llm_lingua.compress_prompt(
    demonstration.split("\n"),
    instruction,
    question,
    0.55,
    use_sentence_level_filter=False,
    condition_in_question="after_condition",
    reorder_context="sort",
    dynamic_context_compression_ratio=0.3,  # or 0.4
    condition_compare=True,
    context_budget="+100",
    rank_method="longllmlingua",
)
```
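Two of these arguments deserve a note: `reorder_context="sort"` reorders documents by their relevance rank before compression, and `dynamic_context_compression_ratio` gives higher-ranked documents a larger share of the keep budget. A toy sketch of the idea (illustrative only — not LongLLMLingua's actual algorithm):

```python
def reorder_and_budget(scores, base_rate=0.55, dynamic_ratio=0.3):
    """Sort context indices by relevance score (reorder_context="sort") and
    spread per-document keep rates around base_rate: the top-ranked document
    gets base_rate + dynamic_ratio / 2, the last gets base_rate - dynamic_ratio / 2
    (the intuition behind dynamic_context_compression_ratio)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    rates = [
        base_rate + dynamic_ratio * (0.5 - (k / (n - 1) if n > 1 else 0.0))
        for k in range(n)
    ]
    return order, rates
```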

Experiments in LLMLingua and most experiments in LongLLMLingua were conducted in completion mode, which is less sensitive to token-level compression than chat mode. However, OpenAI has disabled completion mode for GPT-3.5-turbo; you can use GPT-3.5-turbo-instruct or the Azure OpenAI Service instead.
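For instance, the request body for the legacy completions endpoint might look like this (a sketch assuming the OpenAI v1 Python SDK, where such a dict maps onto `client.completions.create(**body)`; the helper name is ours):

```python
def build_completion_request(compressed_prompt: str, max_tokens: int = 256) -> dict:
    """Assemble a request body for the legacy completions endpoint,
    which gpt-3.5-turbo-instruct serves (chat-only models do not)."""
    return {
        "model": "gpt-3.5-turbo-instruct",
        "prompt": compressed_prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
    }
```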


## How to use LLMLingua in LangChain and LlamaIndex?

### Integration with LangChain

Thanks to the contributions of Ayo Ayibiowu (@thehapyone), (Long)LLMLingua can be seamlessly integrated into LangChain. Here's an example of how to initialize (Long)LLMLingua within LangChain:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.retrievers.document_compressors import LLMLinguaCompressor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

compressor = LLMLinguaCompressor(model_name="openai-community/gpt2", device_map="cpu")
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.get_relevant_documents(
"What did the president say about Ketanji Jackson Brown"
)
pretty_print_docs(compressed_docs)
```

For a more detailed guide, please refer to [Notebook](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/llmlingua.ipynb).
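Conceptually, `ContextualCompressionRetriever` simply chains the base retriever with the compressor: documents are retrieved first, then shrunk against the query before they reach the LLM. A toy, dependency-free sketch of that control flow (all names here are ours, not LangChain's):

```python
class ToyCompressionRetriever:
    """Toy illustration of contextual compression: retrieve, then compress."""

    def __init__(self, base_retriever, compressor):
        self.base_retriever = base_retriever
        self.compressor = compressor

    def get_relevant_documents(self, query: str):
        docs = self.base_retriever(query)
        return [self.compressor(doc, query) for doc in docs]


def keyword_compressor(doc: str, query: str) -> str:
    """Stand-in compressor: keep only sentences sharing a word with the query."""
    words = {w.lower().strip("?") for w in query.split()}
    kept = [s for s in doc.split(". ") if words & {w.lower() for w in s.split()}]
    return ". ".join(kept)
```

A real compressor (like `LLMLinguaCompressor`) scores tokens with a small language model instead of matching keywords, but the retrieve-then-compress control flow is the same.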

### Integration with LlamaIndex

Thanks to the contributions of Jerry Liu (@jerryjliu), (Long)LLMLingua can be seamlessly integrated into LlamaIndex. Here's an example of how to initialize (Long)LLMLingua within LlamaIndex:

```python
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.response_synthesizers import CompactAndRefine
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor

node_postprocessor = LongLLMLinguaPostprocessor(
instruction_str="Given the context, please answer the final question",
target_token=300,
rank_method="longllmlingua",
additional_compress_kwargs={
"condition_compare": True,
"condition_in_question": "after",
"context_budget": "+100",
"reorder_context": "sort", # Enables document reordering
"dynamic_context_compression_ratio": 0.4, # Enables dynamic compression ratio
},
)
```

For a more detailed guide, please refer to [RAGLlamaIndex Example](https://github.com/microsoft/LLMLingua/blob/main/examples/RAGLlamaIndex.ipynb).
2 changes: 1 addition & 1 deletion examples/RAGLlamaIndex.ipynb
@@ -31,7 +31,7 @@
"id": "a6137de2-0e3f-4962-860c-680da4df2eae",
"metadata": {},
"source": [
"More specifically, [**LongLLMLinguaPostprocessor**](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/postprocessor/longllmlingua.py#L16) can be used as a **Postprocessor** in **LlamaIndex** by invoking it, with arguments consistent with those in the [**PromptCompressor**](https://github.com/microsoft/LLMLingua/blob/main/llmlingua/prompt_compressor.py) of [**LLMLingua**](https://github.com/microsoft/LLMLingua).\n",
"You can call the corresponding compression algorithms in LLMLingua and the question-aware prompt compression method in LongLLMLingua."
]
},
5 changes: 3 additions & 2 deletions tests/test_llmlingua.py
@@ -56,9 +56,10 @@ def __init__(self, *args, **kwargs):
        super(LLMLinguaTester, self).__init__(*args, **kwargs)
        try:
            import nltk

            nltk.download("punkt")
        except:
            print("nltk_data exists.")
        self.llmlingua = PromptCompressor("lgaalves/gpt2-dolly", device_map="cpu")

    def test_general_compress_prompt(self):
5 changes: 3 additions & 2 deletions tests/test_longllmlingua.py
@@ -60,9 +60,10 @@ def __init__(self, *args, **kwargs):
        super(LongLLMLinguaTester, self).__init__(*args, **kwargs)
        try:
            import nltk

            nltk.download("punkt")
        except:
            print("nltk_data exists.")
        self.llmlingua = PromptCompressor("lgaalves/gpt2-dolly", device_map="cpu")

    def test_general_compress_prompt(self):
