Commit
Feature(LLMLingua): add temperature & test scripts
Showing 13 changed files with 380 additions and 7 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
name: "\U0001F41B Bug Report" | ||
description: Submit a bug report to help us improve LLMLingua | ||
title: "[Bug]: " | ||
labels: ["bug"] | ||
|
||
body: | ||
- type: textarea | ||
id: description | ||
attributes: | ||
label: Describe the bug | ||
description: A clear and concise description of what the bug is. | ||
placeholder: What went wrong? | ||
- type: textarea | ||
id: reproduce | ||
attributes: | ||
label: Steps to reproduce | ||
description: | | ||
Steps to reproduce the behavior: | ||
1. Step 1 | ||
2. Step 2 | ||
3. ... | ||
4. See error | ||
placeholder: How can we replicate the issue? | ||
- type: textarea | ||
id: expected_behavior | ||
attributes: | ||
label: Expected Behavior | ||
description: A clear and concise description of what you expected to happen. | ||
placeholder: What should have happened? | ||
- type: textarea | ||
id: logs | ||
attributes: | ||
label: Logs | ||
description: If applicable, add logs or screenshots to help explain your problem. | ||
placeholder: Add logs here | ||
- type: textarea | ||
id: additional_information | ||
attributes: | ||
label: Additional Information | ||
description: | | ||
- LLMLingua Version: <!-- Specify the LLMLingua version (e.g., v0.1.0) --> | ||
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) --> | ||
- Python Version: <!-- Specify the Python version (e.g., 3.8) --> | ||
- Related Issues: <!-- Link to any related issues here (e.g., #1) --> | ||
- Any other relevant information. | ||
placeholder: Any additional details |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
blank_issues_enabled: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
name: "\U0001F680 Feature request" | ||
description: Submit a proposal/request for a new LLMLingua feature | ||
labels: ["feature request"] | ||
title: "[Feature Request]: " | ||
|
||
body: | ||
- type: textarea | ||
id: problem_description | ||
attributes: | ||
label: Is your feature request related to a problem? Please describe. | ||
description: A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] | ||
placeholder: What problem are you trying to solve? | ||
|
||
- type: textarea | ||
id: solution_description | ||
attributes: | ||
label: Describe the solution you'd like | ||
description: A clear and concise description of what you want to happen. | ||
placeholder: How do you envision the solution? | ||
|
||
- type: textarea | ||
id: additional_context | ||
attributes: | ||
label: Additional context | ||
description: Add any other context or screenshots about the feature request here. | ||
placeholder: Any additional information |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
name: "\U0001F31F General Question" | ||
description: File a general question | ||
title: "[Question]: " | ||
labels: ["question"] | ||
|
||
body: | ||
- type: textarea | ||
id: description | ||
attributes: | ||
label: Describe the issue | ||
description: A clear and concise description of what the question is. | ||
placeholder: Details of your question. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# What does this PR do? | ||
|
||
<!-- | ||
Congratulations! You've made it this far! You're not quite done yet though. | ||
Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflects the extent of your awesome contribution. | ||
Then, please replace this with a description of the change and which issue is fixed (if applicable). Please also include relevant motivation and context. List any dependencies (if any) that are required for this change. | ||
Once you're done, someone will review your PR shortly (see the section "Who can review?" below to tag some potential reviewers). They may suggest changes to make the code even better. If no one reviewed your PR after a week has passed, don't hesitate to post a new comment @-mentioning the same persons---sometimes notifications get lost. | ||
--> | ||
|
||
<!-- Remove if not applicable --> | ||
|
||
Fixes # (issue) | ||
|
||
|
||
## Before submitting | ||
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). | ||
- [ ] Was this discussed/approved via a GitHub issue? Please add a link | ||
to it if that's the case. | ||
- [ ] Did you make sure to update the documentation with your changes? | ||
- [ ] Did you write any new necessary tests? | ||
|
||
|
||
## Who can review? | ||
|
||
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag | ||
members/contributors who may be interested in your PR. | ||
|
||
<!-- Your PR will be replied to more quickly if you can figure out the right person to tag with @ | ||
If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of **who to tag**. | ||
Please tag fewer than 3 people. | ||
LLMLingua/LongLLMLingua: | ||
- general: @iofu728, @QianhuiWu, @XufangLuo, and @mydmdm | ||
- new feature: @SiyunZhao | ||
Documentation: @SiyunZhao | ||
--> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# This workflow will build and upload a Python Package using Twine when a release is published | ||
# Conda-forge bot will pick up new PyPI version and automatically create new version | ||
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries | ||
|
||
name: release | ||
run-name: Release LLMLingua by @${{ github.actor }} | ||
|
||
on: | ||
release: | ||
types: [published] | ||
permissions: {} | ||
|
||
jobs: | ||
deploy: | ||
strategy: | ||
matrix: | ||
os: ['ubuntu-latest'] | ||
python-version: ["3.10"] | ||
runs-on: ${{ matrix.os }} | ||
environment: package | ||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v3 | ||
|
||
- name: Install from source | ||
# This is required for the pre-commit tests | ||
shell: pwsh | ||
run: pip install . | ||
|
||
- name: Build | ||
shell: pwsh | ||
run: | | ||
pip install twine | ||
python setup.py sdist bdist_wheel | ||
- name: Publish to PyPI | ||
env: | ||
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }} | ||
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }} | ||
shell: pwsh | ||
run: twine upload dist/* |
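A note on version entries in a matrix such as the one in this workflow: a YAML loader resolves an unquoted `3.10` as a floating-point number, so the trailing zero is lost and the value reads as `3.1`. The sketch below imitates that resolution with Python's own `float` parsing rather than a real YAML parser, so it illustrates the effect only; `as_yaml_scalar` is a hypothetical helper introduced for this example.

```python
def as_yaml_scalar(text: str):
    """Roughly mimic how an unquoted, number-like YAML scalar is resolved.

    Assumption: this is a simplification for illustration, not a YAML parser.
    """
    try:
        return float(text)  # "3.10" looks like a number, so it becomes a float
    except ValueError:
        return text  # anything else stays a string


unquoted = as_yaml_scalar("3.10")  # what an unquoted 3.10 turns into
quoted = "3.10"  # what a quoted "3.10" stays as

print(str(unquoted))  # 3.1
print(quoted)  # 3.10
```

Quoting every version string keeps `3.10` and `3.1` distinct wherever the value is later consumed.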
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
name: Unit Test | ||
|
||
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows | ||
on: # Trigger the workflow on pull request or merge | ||
pull_request: | ||
merge_group: | ||
types: [checks_requested] | ||
|
||
defaults: | ||
run: | ||
shell: bash | ||
permissions: {} | ||
|
||
jobs: | ||
UniTest: | ||
runs-on: ${{ matrix.os }} | ||
strategy: | ||
fail-fast: false | ||
matrix: | ||
os: [ubuntu-latest, macos-latest, windows-2019] | ||
python-version: ["3.9", "3.10", "3.11"] | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- uses: actions/setup-python@v5 | ||
name: Setup python ${{ matrix.python-version }} | ||
id: setup-python | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
|
||
- name: Install packages and dependencies for all tests | ||
run: | | ||
python -m pip install --upgrade pip wheel | ||
pip install pytest pytest-xdist | ||
- name: Install packages | ||
run: | | ||
pip install -e . | ||
- name: Run core tests | ||
shell: bash | ||
run: | | ||
make test |
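The `strategy.matrix` in this workflow expands into one job per combination of `os` and `python-version`. A minimal sketch of that expansion, assuming a plain Cartesian product with no `include` or `exclude` entries (none appear in the workflow); the variable names are illustrative only.

```python
from itertools import product

# Values copied from the workflow's matrix definition.
oses = ["ubuntu-latest", "macos-latest", "windows-2019"]
python_versions = ["3.9", "3.10", "3.11"]

# One job per (os, python-version) pair.
jobs = [
    {"os": os_name, "python-version": version}
    for os_name, version in product(oses, python_versions)
]

for job in jobs:
    print(job)
print(len(jobs))  # 9
```

With `fail-fast: false`, a failure in one of these nine jobs does not cancel the others.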
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
default_language_version: | ||
python: python3 | ||
exclude: 'dotnet' | ||
ci: | ||
autofix_prs: true | ||
autoupdate_commit_msg: '[pre-commit.ci] pre-commit suggestions' | ||
autoupdate_schedule: 'quarterly' | ||
|
||
repos: | ||
- repo: https://github.com/pre-commit/pre-commit-hooks | ||
rev: v4.4.0 | ||
hooks: | ||
- id: check-added-large-files | ||
- id: check-ast | ||
- id: check-yaml | ||
- id: check-toml | ||
- id: check-json | ||
- id: check-byte-order-marker | ||
exclude: .gitignore | ||
- id: check-merge-conflict | ||
- id: detect-private-key | ||
- id: trailing-whitespace | ||
- id: end-of-file-fixer | ||
- id: no-commit-to-branch | ||
- repo: https://github.com/psf/black | ||
rev: 23.3.0 | ||
hooks: | ||
- id: black | ||
- repo: https://github.com/charliermarsh/ruff-pre-commit | ||
rev: v0.0.261 | ||
hooks: | ||
- id: ruff | ||
args: ["--fix"] | ||
- repo: https://github.com/codespell-project/codespell | ||
rev: v2.2.6 | ||
hooks: | ||
- id: codespell | ||
args: ["-L", "ans,linar,nam,"] | ||
exclude: | | ||
(?x)^( | ||
pyproject.toml | | ||
website/static/img/ag.svg | | ||
website/yarn.lock | | ||
notebook/.* | ||
)$ | ||
- repo: https://github.com/nbQA-dev/nbQA | ||
rev: 1.7.1 | ||
hooks: | ||
- id: nbqa-ruff | ||
args: ["--fix"] | ||
- id: nbqa-black |
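The `exclude` value for the codespell hook is a Python regular expression written in verbose mode (`(?x)`), where whitespace and line breaks inside the pattern are ignored. The sketch below checks a few paths against the same pattern, assuming pre-commit applies it with `re.search` to each repository-relative path; `is_excluded` and the sample paths are hypothetical.

```python
import re

# Same pattern as the hook's `exclude`; (?x) lets it span several lines.
CODESPELL_EXCLUDE = re.compile(
    r"""(?x)^(
        pyproject.toml |
        website/static/img/ag.svg |
        website/yarn.lock |
        notebook/.*
    )$"""
)


def is_excluded(path: str) -> bool:
    """Return True if the path would be skipped by the pattern."""
    return CODESPELL_EXCLUDE.search(path) is not None


print(is_excluded("pyproject.toml"))  # True
print(is_excluded("notebook/demo.ipynb"))  # True
print(is_excluded("docs/pyproject.toml"))  # False
print(is_excluded("llmlingua/prompt_compressor.py"))  # False
```

Because the dots are not escaped, `pyproject.toml` in the pattern also matches names such as `pyprojectXtoml`; escaping them as `\.` would make the match exact.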
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
import unittest | ||
import unittest.mock as mock | ||
|
||
from llmlingua import PromptCompressor | ||
|
||
|
||
class LLMLinguaTester(unittest.TestCase): | ||
""" | ||
End-to-end test for LLMLingua | ||
""" | ||
|
||
GSM8K_PROMPT = "Question: Angelo and Melanie want to plan how many hours over the next week they should study together for their test next week. They have 2 chapters of their textbook to study and 4 worksheets to memorize. They figure out that they should dedicate 3 hours to each chapter of their textbook and 1.5 hours for each worksheet. If they plan to study no more than 4 hours each day, how many days should they plan to study total over the next week if they take a 10-minute break every hour, include 3 10-minute snack breaks each day, and 30 minutes for lunch each day?\nLet's think step by step\nAngelo and Melanie think they should dedicate 3 hours to each of the 2 chapters, 3 hours x 2 chapters = 6 hours total.\nFor the worksheets they plan to dedicate 1.5 hours for each worksheet, 1.5 hours x 4 worksheets = 6 hours total.\nAngelo and Melanie need to start with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days.\nHowever, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks.\nThey also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes.\nAnd they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours.\nSo Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total.\nThey want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75\nThey will need to plan to study 4 days to allow for all the time they need.\nThe answer is 4\n\nQuestion: You can buy 4 apples or 1 watermelon for the same price. You bought 36 fruits evenly split between oranges, apples and watermelons, and the price of 1 orange is $0.50. 
How much does 1 apple cost if your total bill was $66?\nLet's think step by step\nIf 36 fruits were evenly split between 3 types of fruits, then I bought 36/3 = 12 units of each fruit\nIf 1 orange costs $0.50 then 12 oranges will cost $0.50 * 12 = $6\nIf my total bill was $66 and I spent $6 on oranges then I spent $66 - $6 = $60 on the other 2 fruit types.\nAssuming the price of watermelon is W, and knowing that you can buy 4 apples for the same price and that the price of one apple is A, then 1W=4A\nIf we know we bought 12 watermelons and 12 apples for $60, then we know that $60 = 12W + 12A\nKnowing that 1W=4A, then we can convert the above to $60 = 12(4A) + 12A\n$60 = 48A + 12A\n$60 = 60A\nThen we know the price of one apple (A) is $60/60= $1\nThe answer is 1" | ||
GSM8K_150TOKENS_COMPRESSED_SINGLE_CONTEXT_PROMPT = "Question: Angelo and Melanie to plan how many hours they should together their test have 2 their textbook and 4 to They out should and 1 hours. they study, many they study total week they a break every hour, include 3minute and lunch day\n's think step\n Melanie should the chapters hours 2 = hours\n the to dedicate x\n Melanie to with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days.\nHowever, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks.\nThey also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes.\nAnd they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours.\nSo Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total.\nThey want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75\nThey will need to plan to study 4 days to allow for all the time they need.\nThe answer is 4" | ||
GSM8K_150TOKENS_COMPRESSED_MULTIPLE_CONTEXT_PROMPT = "Question: You can buy 4 apples or 1 for. You bought 36 fruits evenly split between, waterons and of 1 orange $.. much does cost if your total bill $\n's think step\nIf were between 3 of, then I 36/3 = 12 of fruitIf 1 orange50 then oranges50 * $If66 I $ oranges I $66 $60 on the other 2 fruit\nAssuming the of is W, and that you price and of is then 1W=4AIf we know we bought 12 and, then we know that $60 = 12W + 12A\nKnowing that 1W=4A, then we can convert the above to $60 = 12(4A) + 12A\n$60 = 48A + 12A\n$60 = 60A\nThen we know the price of one apple (A) is $60/60= $1\nThe answer is 1" | ||
|
||
def __init__(self, *args, **kwargs): | ||
super(LLMLinguaTester, self).__init__(*args, **kwargs) | ||
self.llmlingua = PromptCompressor("lgaalves/gpt2-dolly", device_map="cpu") | ||
|
||
def test_general_compress_prompt(self): | ||
# Single Context | ||
compressed_prompt = self.llmlingua.compress_prompt( | ||
self.GSM8K_PROMPT.split("\n\n")[0], target_token=150 | ||
) | ||
self.assertEqual( | ||
compressed_prompt["compressed_prompt"], | ||
self.GSM8K_150TOKENS_COMPRESSED_SINGLE_CONTEXT_PROMPT, | ||
) | ||
self.assertEqual(compressed_prompt["origin_tokens"], 422) | ||
self.assertEqual(compressed_prompt["compressed_tokens"], 293) | ||
self.assertEqual(compressed_prompt["ratio"], "1.4x") | ||
self.assertEqual(compressed_prompt["rate"], "69.4%") | ||
|
||
# Multiple Context | ||
compressed_prompt = self.llmlingua.compress_prompt( | ||
self.GSM8K_PROMPT.split("\n\n"), target_token=150 | ||
) | ||
self.assertEqual( | ||
compressed_prompt["compressed_prompt"], | ||
self.GSM8K_150TOKENS_COMPRESSED_MULTIPLE_CONTEXT_PROMPT, | ||
) | ||
self.assertEqual(compressed_prompt["origin_tokens"], 727) | ||
self.assertEqual(compressed_prompt["compressed_tokens"], 206) | ||
self.assertEqual(compressed_prompt["ratio"], "3.5x") | ||
self.assertEqual(compressed_prompt["rate"], "28.3%") |
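The `ratio` and `rate` strings asserted in this test are consistent with simple quotients of the two token counts. A minimal sketch of that arithmetic, assuming one-decimal formatting; `compression_stats` is a hypothetical helper written for this example, not LLMLingua's implementation.

```python
def compression_stats(origin_tokens: int, compressed_tokens: int) -> dict:
    """Derive display strings from token counts (illustrative only)."""
    ratio = origin_tokens / compressed_tokens  # how many times smaller
    rate = compressed_tokens / origin_tokens  # fraction of tokens kept
    return {"ratio": f"{ratio:.1f}x", "rate": f"{rate * 100:.1f}%"}


# Token counts taken from the assertions in the test.
print(compression_stats(422, 293))  # {'ratio': '1.4x', 'rate': '69.4%'}
print(compression_stats(727, 206))  # {'ratio': '3.5x', 'rate': '28.3%'}
```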