Unified cli (#86)
* feat: **Commit Message:**

Add CLI support for grammar-constrained generation and remove old examples

- Introduced  sub-command in CLI for text generation with grammar constraints.
- Added options for 4-bit and 8-bit model loading using bitsandbytes.
- Removed obsolete example scripts for generating C code, JSON arrays, and relation extraction triples.
- Refactored  to include device handling.
- Updated  to enhance device management within the grammar-constrained generation process.
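A minimal sketch of how a `generate` sub-command with 4-bit and 8-bit switches could be wired up with `argparse`. This is not the project's actual CLI code. The short flags and the `--use_4bit`, `--use_8bit`, `--max_new_tokens`, and `--repetition_penalty` options are the ones used in `examples/demo.sh`. The `dest` names, the defaults, the mutual exclusion of the two quantization flags, and the helper `quantization_kwargs` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with a `generate` sub-command (illustrative only)."""
    parser = argparse.ArgumentParser(prog="transformers-cfg-cli")
    sub = parser.add_subparsers(dest="command", required=True)

    gen = sub.add_parser("generate", help="grammar-constrained text generation")
    # Short flags as used in examples/demo.sh; the dest names are hypothetical.
    gen.add_argument("-m", dest="model", required=True)
    gen.add_argument("-g", dest="grammar", required=True)
    gen.add_argument("-p", dest="prompt", required=True)
    gen.add_argument("--max_new_tokens", type=int, default=20)  # assumed default
    gen.add_argument("--repetition_penalty", type=float, default=1.0)  # assumed default

    # Assumption: the two quantization modes cannot be combined.
    quant = gen.add_mutually_exclusive_group()
    quant.add_argument("--use_4bit", action="store_true")
    quant.add_argument("--use_8bit", action="store_true")
    return parser


def quantization_kwargs(args: argparse.Namespace) -> dict:
    """Map the flags to the matching BitsAndBytesConfig field names (hypothetical helper)."""
    if args.use_4bit:
        return {"load_in_4bit": True}
    if args.use_8bit:
        return {"load_in_8bit": True}
    return {}


if __name__ == "__main__":
    demo_args = build_parser().parse_args(
        ["generate", "-m", "microsoft/Phi-3-mini-4k-instruct",
         "-g", "examples/grammars/json.ebnf", "-p", "hello", "--use_4bit"]
    )
    print(quantization_kwargs(demo_args))
```

The returned dictionary uses the `load_in_4bit` and `load_in_8bit` field names of `transformers.BitsAndBytesConfig`, so it could be unpacked into that config before the model is loaded.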

* feat: Refactor grammar-constrained generation scripts and CLI options

- Updated  with new examples for JSON, code generation, semantic parsing, and Unicode support.
- Removed outdated scripts (, , , , , , , ) and merged functionalities into .
- Enhanced grammar examples, including more detailed entities and relations in .
- Added new CLI arguments (, ) in  to control contrast mode and save output to a file.
- Improved Unicode detection in  by adding a static method that automatically detects Unicode in grammar strings.
- Removed the  argument from various recognizer classes as it is now automatically detected.
- Added tests for Unicode detection in .
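The automatic Unicode detection mentioned above can be sketched as follows. The commit view does not show the real static method, so the function name `grammar_has_unicode` and the rule it applies (any non-ASCII character in the grammar text counts as Unicode) are assumptions for illustration only.

```python
def grammar_has_unicode(grammar: str) -> bool:
    """Return True if the grammar text contains at least one non-ASCII character.

    Hypothetical stand-in for the static method described in the commit message.
    """
    return not grammar.isascii()


if __name__ == "__main__":
    ascii_grammar = 'root ::= "hello"'
    chinese_grammar = 'root ::= "你好"'
    print(grammar_has_unicode(ascii_grammar))
    print(grammar_has_unicode(chinese_grammar))
```

With a check of this kind, callers would no longer need to pass an explicit Unicode flag to the recognizer. Note that this sketch does not look at escape sequences written in plain ASCII inside the grammar.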

* doc: Update README with CLI example for JSON generation and improve documentation structure

- Added a command-line example for generating a valid JSON object using .
- Updated descriptions for generating JSON objects with minimal changes to HF code.
- Improved section summaries for examples using the HF pipeline API.
- Renamed  to  to better reflect its purpose and location in the project structure.

* feat: Refactor and improve documentation and CLI interface

- **README Enhancements:**
  - Improved structure, readability, and consistency across sections.
  - Updated the Quick Start section with examples for generating JSON objects using .
  - Expanded explanations for grammar use cases and clarified documentation on automatic JSON schema grammar conversion.
  - Updated the list of supported models and provided better guidance for advanced grammar debugging.

- **CLI Improvements:**
  - Simplified CLI prompt argument by renaming  to .
  - Added device selection options (, ) for model execution.

- **Code Cleanup:**
  - Removed outdated  script, consolidating its functionality within the CLI and README examples.

- **Miscellaneous:**
  - Removed outdated comments and TODOs from the CLI code.
Saibo-creator authored Aug 27, 2024
1 parent 0586750 commit 0ce783b
Showing 21 changed files with 465 additions and 854 deletions.
188 changes: 102 additions & 86 deletions README.md

Large diffs are not rendered by default.

117 changes: 117 additions & 0 deletions examples/demo.sh
@@ -0,0 +1,117 @@

################
#
# JSON generation: object and array
#
################

# generate json object
transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/json.ebnf" \
-p "This is a valid json string for http request:" \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1

# generate json array

transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/json_arr.ebnf" \
-p "Put my shopping list into a json array:" \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1

################
#
# Code generation: Python, C
#
################

# generate C code
transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/c.ebnf" \
-p "#include <stdio.h>\n" \
--use_4bit \
--max_new_tokens 20 \
--repetition_penalty 3.0

################
#
# NLP tasks: relation extraction
#
################

# generate relation extraction triples
transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/cIE.ebnf" \
-p "Extract relations from the following sentence: René Descartes was a French philosopher, scientist, and mathematician" \
--use_8bit \
--max_new_tokens 60 \
--repetition_penalty 1.1


################
#
# Semantic parsing: CalFlow, GeoQuery, overnight, etc.
#
################

transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/calflow.ebnf" \
-p 'Generate 3 CalFlow strings: 1.(Yield (toRecipient (CurrentUser))) 2.(Yield (CreateCommitEventWrapper (CreatePreflightEventWrapper (Event.subject_? (?= "choose the meeting"))))) 3.' \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1

transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/geo_query.ebnf" \
-p "Translate the following sentence into GeoQuery: What is the population of the largest city in California?" \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1

transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/overnight.ebnf" \
-p """Translate natural language to DSL:
Q: which brick is no wider than 3 inches
A: (listValue (filter (getProperty (singleton en.block) !type) (ensureNumericProperty width) <= (ensureNumericEntity 3 en.inch)))
Q: which block is above block 1
A: (listValue (filter (filter (getProperty (singleton en.block) !type) (reverse above) = en.block.block1) above = en.block.block1))
Q: what block is longer than 3 inches
A: """ \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1



################
#
# Unicode support, Chinese, Emoji, etc.
#
################

transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/chinese.ebnf" \
-p "Translate the following sentence into Chinese: My neighbor is a very nice person. -> " \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1


transformers-cfg-cli generate \
-m "microsoft/Phi-3-mini-4k-instruct" \
-g "examples/grammars/emoji.ebnf" \
-p "Translate the following sentence into emoji: I am very happy today. -> " \
--use_4bit \
--max_new_tokens 60 \
--repetition_penalty 1.1
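
The following is not part of `examples/demo.sh`. Every command in the script has the same shape, so a small Python helper could assemble the argument list once and reuse it. This is only a sketch: `build_generate_command` is a hypothetical name, and it uses only the flags that appear in the script above, with the script's most common values as defaults.

```python
import shlex
from typing import List, Optional


def build_generate_command(
    grammar: str,
    prompt: str,
    model: str = "microsoft/Phi-3-mini-4k-instruct",
    quant_flag: Optional[str] = "--use_4bit",
    max_new_tokens: int = 60,
    repetition_penalty: float = 1.1,
) -> List[str]:
    """Assemble the argv list for one `transformers-cfg-cli generate` call."""
    cmd = ["transformers-cfg-cli", "generate", "-m", model, "-g", grammar, "-p", prompt]
    if quant_flag is not None:
        # Either "--use_4bit" or "--use_8bit", as in the demo script.
        cmd.append(quant_flag)
    cmd += [
        "--max_new_tokens", str(max_new_tokens),
        "--repetition_penalty", str(repetition_penalty),
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_generate_command(
        "examples/grammars/json.ebnf",
        "This is a valid json string for http request:",
    )
    # Print a copy-pasteable shell line; passing `cmd` to subprocess.run
    # would execute it, provided the CLI is installed.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run` avoids shell quoting issues with prompts that contain quotes, parentheses, or newlines.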
73 changes: 0 additions & 73 deletions examples/generate.py

This file was deleted.

45 changes: 0 additions & 45 deletions examples/generate_arabic.py

This file was deleted.

50 changes: 0 additions & 50 deletions examples/generate_cIE.py

This file was deleted.

58 changes: 0 additions & 58 deletions examples/generate_c_code.py

This file was deleted.
