Feature(LLMLingua): add LangChain example (microsoft#97)
Co-authored-by: Ayo Ayibiowu <[email protected]>
iofu728 and thehapyone authored Feb 28, 2024
1 parent 042bd0e commit 1c2f5c0
Showing 6 changed files with 101 additions and 23 deletions.
24 changes: 24 additions & 0 deletions DOCUMENT.md
@@ -171,6 +171,30 @@ from llmlingua import PromptCompressor
llm_lingua = PromptCompressor("TheBloke/Llama-2-7b-Chat-GPTQ", model_config={"revision": "main"})
```

### Integration with LangChain

Thanks to the contributions of Ayo Ayibiowu (@thehapyone), (Long)LLMLingua can be seamlessly integrated into LangChain. Here's an example of how to initialize (Long)LLMLingua within LangChain:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.retrievers.document_compressors import LLMLinguaCompressor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

compressor = LLMLinguaCompressor(model_name="openai-community/gpt2", device_map="cpu")
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.get_relevant_documents(
"What did the president say about Ketanji Jackson Brown"
)
pretty_print_docs(compressed_docs)
```
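Note that `retriever` and `pretty_print_docs` are assumed to be defined elsewhere: any LangChain retriever works as the base, and `pretty_print_docs` is the small display helper used throughout the LangChain docs. A minimal, dependency-free sketch of such a helper — the `Document` class here is a stand-in for illustration, not the real LangChain class:

```python
class Document:
    """Minimal stand-in for a retrieved document; only .page_content is used."""

    def __init__(self, page_content: str):
        self.page_content = page_content


def format_docs(docs) -> str:
    """Join documents into one string, separated by a dashed rule."""
    return f"\n{'-' * 100}\n".join(
        f"Document {i + 1}:\n\n{d.page_content}" for i, d in enumerate(docs)
    )


def pretty_print_docs(docs) -> None:
    print(format_docs(docs))
```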

For a more detailed guide, please refer to [Notebook](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/llmlingua.ipynb).

### Integration with LlamaIndex

Thanks to the contributions of Jerry Liu (@jerryjliu), (Long)LLMLingua can be seamlessly integrated into LlamaIndex. Here's an example of how to initialize (Long)LLMLingua within LlamaIndex:
2 changes: 1 addition & 1 deletion README.md
@@ -18,12 +18,12 @@ https://github.com/microsoft/LLMLingua/assets/30883354/eb0ea70d-6d4c-4aa7-8977-6

## News

- 👾 LLMLingua has been integrated into [LangChain](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/llmlingua.ipynb) and [LlamaIndex](https://github.com/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/LongLLMLingua.ipynb), two widely-used RAG frameworks.
- 🤳 Talk slides are available in [AI Time Jan, 24](https://drive.google.com/file/d/1fzK3wOvy2boF7XzaYuq2bQ3jFeP1WMk3/view?usp=sharing).
- 🖥 EMNLP'23 slides are available in [Session 5](https://drive.google.com/file/d/1GxQLAEN8bBB2yiEdQdW4UKoJzZc0es9t/view) and [BoF-6](https://drive.google.com/file/d/1LJBUfJrKxbpdkwo13SgPOqugk-UjLVIF/view).
- 📚 Check out our new [blog post](https://medium.com/@iofu728/longllmlingua-bye-bye-to-middle-loss-and-save-on-your-rag-costs-via-prompt-compression-54b559b9ddf7) discussing RAG benefits and cost savings through prompt compression. See the script example [here](https://github.com/microsoft/LLMLingua/blob/main/examples/Retrieval.ipynb).
- 🎈 Visit our [project page](https://llmlingua.com/) for real-world case studies in RAG, Online Meetings, CoT, and Code.
- 👨‍🦯 Explore our ['./examples'](./examples) directory for practical applications, including [RAG](./examples/RAG.ipynb), [Online Meeting](./examples/OnlineMeeting.ipynb), [CoT](./examples/CoT.ipynb), [Code](./examples/Code.ipynb), and [RAG using LlamaIndex](./examples/RAGLlamaIndex.ipynb).

## TL;DR

86 changes: 69 additions & 17 deletions Transparency_FAQ.md
@@ -127,32 +127,84 @@ We release the parameter in the [issue1](https://github.com/microsoft/LLMLingua/
**LLMLingua**:

```python
prompt = compressor.compress_prompt(
    context=xxx,
    instruction=xxx,
    question=xxx,
    ratio=0.75,
    iterative_size=100,
    context_budget="*2",
)
```
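The `context_budget` argument is a small arithmetic modifier applied to the base context token budget — `"*2"` doubles it, while `"+100"` (used in the LongLLMLingua call below) adds 100 tokens of headroom. A rough sketch of that interpretation (our own illustration, not LLMLingua's actual code):

```python
def apply_context_budget(base_budget: int, context_budget: str) -> int:
    """Apply a modifier string like "*2" or "+100" to a base token budget."""
    op, value = context_budget[0], float(context_budget[1:])
    if op == "*":
        return int(base_budget * value)
    if op == "+":
        return int(base_budget + value)
    raise ValueError(f"unsupported modifier: {context_budget!r}")
```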

**LongLLMLingua**:

```python
compressed_prompt = llm_lingua.compress_prompt(
    demonstration.split("\n"),
    instruction,
    question,
    0.55,
    use_sentence_level_filter=False,
    condition_in_question="after_condition",
    reorder_context="sort",
    dynamic_context_compression_ratio=0.3,  # or 0.4
    condition_compare=True,
    context_budget="+100",
    rank_method="longllmlingua",
)
```
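Two of these arguments deserve a note: `reorder_context="sort"` reorders documents by their relevance rank before compression, and `dynamic_context_compression_ratio` gives higher-ranked documents a larger share of the keep budget. A toy sketch of the idea (illustrative only — not LongLLMLingua's actual algorithm):

```python
def reorder_and_budget(scores, base_rate=0.55, dynamic_ratio=0.3):
    """Sort context indices by relevance score (reorder_context="sort") and
    spread per-document keep rates around base_rate: the top-ranked document
    gets base_rate + dynamic_ratio / 2, the last gets base_rate - dynamic_ratio / 2
    (the intuition behind dynamic_context_compression_ratio)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    rates = [
        base_rate + dynamic_ratio * (0.5 - (k / (n - 1) if n > 1 else 0.0))
        for k in range(n)
    ]
    return order, rates
```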

Experiments in LLMLingua and most experiments in LongLLMLingua were conducted in completion mode, which is less sensitive to token-level compression than chat mode. However, OpenAI has disabled completion mode for GPT-3.5-turbo; you can use GPT-3.5-turbo-instruct or the Azure OpenAI Service instead.
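For instance, the request body for the legacy completions endpoint might look like this (a sketch assuming the OpenAI v1 Python SDK, where such a dict maps onto `client.completions.create(**body)`; the helper name is ours):

```python
def build_completion_request(compressed_prompt: str, max_tokens: int = 256) -> dict:
    """Assemble a request body for the legacy completions endpoint,
    which gpt-3.5-turbo-instruct serves (chat-only models do not)."""
    return {
        "model": "gpt-3.5-turbo-instruct",
        "prompt": compressed_prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
    }
```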


## How to use LLMLingua in LangChain and LlamaIndex?

### Integration with LangChain

Thanks to the contributions of Ayo Ayibiowu (@thehapyone), (Long)LLMLingua can be seamlessly integrated into LangChain. Here's an example of how to initialize (Long)LLMLingua within LangChain:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.retrievers.document_compressors import LLMLinguaCompressor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

compressor = LLMLinguaCompressor(model_name="openai-community/gpt2", device_map="cpu")
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.get_relevant_documents(
"What did the president say about Ketanji Jackson Brown"
)
pretty_print_docs(compressed_docs)
```

For a more detailed guide, please refer to [Notebook](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/llmlingua.ipynb).
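Conceptually, `ContextualCompressionRetriever` simply chains the base retriever with the compressor: documents are retrieved first, then shrunk against the query before they reach the LLM. A toy, dependency-free sketch of that control flow (all names here are ours, not LangChain's):

```python
class ToyCompressionRetriever:
    """Toy illustration of contextual compression: retrieve, then compress."""

    def __init__(self, base_retriever, compressor):
        self.base_retriever = base_retriever
        self.compressor = compressor

    def get_relevant_documents(self, query: str):
        docs = self.base_retriever(query)
        return [self.compressor(doc, query) for doc in docs]


def keyword_compressor(doc: str, query: str) -> str:
    """Stand-in compressor: keep only sentences sharing a word with the query."""
    words = {w.lower().strip("?") for w in query.split()}
    kept = [s for s in doc.split(". ") if words & {w.lower() for w in s.split()}]
    return ". ".join(kept)
```

A real compressor (like `LLMLinguaCompressor`) scores tokens with a small language model instead of matching keywords, but the retrieve-then-compress control flow is the same.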

### Integration with LlamaIndex

Thanks to the contributions of Jerry Liu (@jerryjliu), (Long)LLMLingua can be seamlessly integrated into LlamaIndex. Here's an example of how to initialize (Long)LLMLingua within LlamaIndex:

```python
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.response_synthesizers import CompactAndRefine
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor

node_postprocessor = LongLLMLinguaPostprocessor(
instruction_str="Given the context, please answer the final question",
target_token=300,
rank_method="longllmlingua",
additional_compress_kwargs={
"condition_compare": True,
"condition_in_question": "after",
"context_budget": "+100",
"reorder_context": "sort", # Enables document reordering
"dynamic_context_compression_ratio": 0.4, # Enables dynamic compression ratio
},
)
```

For a more detailed guide, please refer to [RAGLlamaIndex Example](https://github.com/microsoft/LLMLingua/blob/main/examples/RAGLlamaIndex.ipynb).
2 changes: 1 addition & 1 deletion examples/RAGLlamaIndex.ipynb
@@ -31,7 +31,7 @@
"id": "a6137de2-0e3f-4962-860c-680da4df2eae",
"metadata": {},
"source": [
"More specifically, [**LongLLMLinguaPostprocessor**](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/postprocessor/longllmlingua.py#L16) can be used as a **Postprocessor** in **LlamaIndex** by invoking it, with arguments consistent with those in the [**PromptCompressor**](https://github.com/microsoft/LLMLingua/blob/main/llmlingua/prompt_compressor.py) of [**LLMLingua**](https://github.com/microsoft/LLMLingua).\n",
"You can call the corresponding compression algorithms in LLMLingua and the question-aware prompt compression method in LongLLMLingua."
]
},
5 changes: 3 additions & 2 deletions tests/test_llmlingua.py
@@ -56,9 +56,10 @@ def __init__(self, *args, **kwargs):
        super(LLMLinguaTester, self).__init__(*args, **kwargs)
        try:
            import nltk

            nltk.download("punkt")
        except:
            print("nltk_data exists.")
        self.llmlingua = PromptCompressor("lgaalves/gpt2-dolly", device_map="cpu")

    def test_general_compress_prompt(self):
5 changes: 3 additions & 2 deletions tests/test_longllmlingua.py
@@ -60,9 +60,10 @@ def __init__(self, *args, **kwargs):
        super(LongLLMLinguaTester, self).__init__(*args, **kwargs)
        try:
            import nltk

            nltk.download("punkt")
        except:
            print("nltk_data exists.")
        self.llmlingua = PromptCompressor("lgaalves/gpt2-dolly", device_map="cpu")

    def test_general_compress_prompt(self):
