
Commit

Feature(LLMLingua-2): fix the title
iofu728 committed Mar 24, 2024
1 parent 5828ac3 commit 4794c6c
Showing 2 changed files with 4 additions and 6 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -42,7 +42,7 @@ LongLLMLingua mitigates the 'lost in the middle' issue in LLMs, enhancing long-c

LLMLingua-2, a small-size yet powerful prompt compression method trained via data distillation from GPT-4 for token classification with a BERT-level encoder, excels in task-agnostic compression. It surpasses LLMLingua in handling out-of-domain data, offering 3x-6x faster performance.

-- [LLMLingua-2: Context-Aware Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression](https://arxiv.org/abs/2403.12968) (Under Review)<br>
+- [LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression](https://arxiv.org/abs/2403.12968) (Under Review)<br>
_Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Ruhle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang_

## 🎥 Overview
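For context, a minimal sketch of how the LLMLingua-2 compressor described above is typically invoked; the `microsoft/llmlingua-2-xlm-roberta-large-meetingbank` checkpoint and the `compress_prompt` call are assumptions taken from the project's wider usage docs, not from this diff:

```python
# Sketch only: assumes the LLMLingua-2 checkpoint name and compress_prompt API
# from the project's usage docs; neither appears in this commit.
from llmlingua import PromptCompressor

compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",  # assumed LLMLingua-2 checkpoint
    use_llmlingua2=True,  # switch to the BERT-level token-classification compressor
)

long_prompt = "...your long context here..."
result = compressor.compress_prompt(long_prompt, rate=0.33)  # keep roughly a third of the tokens
print(result["compressed_prompt"])
```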
8 changes: 3 additions & 5 deletions llmlingua/prompt_compressor.py
@@ -38,19 +38,17 @@ class PromptCompressor:
The PromptCompressor class is versatile and can be adapted for various models and specific requirements in prompt processing.
Users can specify different model names and configurations as needed for their particular use case. The architecture is
based on the paper "LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models". Jiang, Huiqiang, Qianhui Wu,
-Chin-Yew Lin, Yuqing Yang, and Lili Qiu. "Llmlingua: Compressing prompts for accelerated inference of large language models."
-arXiv preprint arXiv:2310.05736 (2023).
+Chin-Yew Lin, Yuqing Yang, and Lili Qiu. arXiv preprint arXiv:2310.05736 (2023).
Args:
model_name (str, optional): The name of the language model to be loaded. Default is "NousResearch/Llama-2-7b-hf".
device_map (str, optional): The device to load the model onto, e.g., "cuda" for GPU. Default is "cuda".
model_config (dict, optional): A dictionary containing the configuration parameters for the model. Default is an empty dictionary.
open_api_config (dict, optional): A dictionary containing configuration for openai APIs that may be used in conjunction with the model. Default is an empty dictionary.
use_llmlingua2 (bool, optional): Whether to use llmlingua-2 compressor based on the paper
"LLMLingua-2: Context-Aware Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression".
"LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression".
Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Ruhle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang.
"LLMLingua-2: Context-Aware Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression". arXiv preprint arXiv:,
Default is True.
arXiv preprint arXiv:2403.2403.12968 (2024), Default is True.
llmlingua2_config (dict, optional): A dictionary containing the configuration parameters for llmlingua-2. Default is
{
"max_batch_size": 50,
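To make the documented constructor signature concrete, a hedged sketch that sticks to the parameters and defaults listed in the docstring above; the LLMLingua-2 checkpoint name is an assumption, and `llmlingua2_config` keys other than `max_batch_size` are omitted because the rest of the default dictionary is cut off in this hunk:

```python
from llmlingua import PromptCompressor

# Original LLMLingua/LongLLMLingua setup, spelling out the documented defaults.
lingua = PromptCompressor(
    model_name="NousResearch/Llama-2-7b-hf",  # documented default
    device_map="cuda",                        # documented default
    model_config={},                          # model configuration dict, empty by default
    open_api_config={},                       # OpenAI API configuration, empty by default
    use_llmlingua2=False,                     # use the original perplexity-based compressor
)

# LLMLingua-2 setup: enable use_llmlingua2 and pass the documented batch size.
lingua2 = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",  # assumed checkpoint, not in this diff
    use_llmlingua2=True,
    llmlingua2_config={"max_batch_size": 50},  # default value shown in the docstring
)
```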
