Ape (AI prompt engineer) is a prompt optimization library with implementations of various state-of-the-art prompt optimization methods.
Ape focuses on making benchmarking, experimentation, and collaborative research on prompt optimization techniques easier within the community, so that different methods can be applied and compared with minimal effort.
- Modular Architecture: Easily extendable classes and types in `ape-common` for building custom prompt optimization techniques (a minimal sketch of a custom trainer appears after the method list below).
- Comprehensive Implementations: Clean, single-file implementations of state-of-the-art methods, each inheriting from a unified `Trainer` class.
- Benchmarking Suite: A diverse set of benchmarks to evaluate performance across different tasks:
  - bird-bench (SQL)
  - gpqa (Reasoning)
  - MATH (Mathematical Reasoning)
  - boolq (Question Answering)
  - NYT (Classification)
  - More benchmarks will be added soon
- Community Collaboration: A dedicated space in `ape-core/trainer/community` for proposing new architectures and sharing innovative ideas.
The following prompt optimization methods are currently implemented:
- DSPy-MIPRO
- EvoPrompt
- Few-Shot Trainer
- TextGradient Trainer
- TextGrad-Evo Trainer
- Optuna Trainer
- Expel Trainer
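To give a feel for the modular architecture, here is a minimal, hypothetical sketch of a custom trainer. The base-class import path, the `train` signature, the `evaluate` helper, and the `Prompt` attribute access are assumptions made for illustration only; check `ape-common` and the existing implementations in `ape-core/trainer/paper` for the actual interface.

```python
# Hypothetical sketch of a custom prompt optimization method.
# NOTE: the exact Trainer interface (import path, method names, signatures)
# is assumed here for illustration -- consult ape-common and the existing
# implementations in ape-core/trainer/paper for the real base class.
from ape import Prompt
from ape.trainer import Trainer  # assumed import path


class PrefixSearchTrainer(Trainer):
    """Toy trainer that tries a few hand-written instruction prefixes."""

    def __init__(self, prefixes: list[str], **kwargs):
        super().__init__(**kwargs)
        self.prefixes = prefixes

    async def train(self, prompt: Prompt, trainset, testset):
        best_prompt, best_score = prompt, float("-inf")
        for prefix in self.prefixes:
            # Build a candidate by prepending an extra system message.
            # (Assumes Prompt keeps its constructor arguments as attributes.)
            candidate = Prompt(
                messages=[{"role": "system", "content": prefix}, *prompt.messages],
                model=prompt.model,
                temperature=prompt.temperature,
            )
            score = await self.evaluate(candidate, trainset)  # assumed helper
            if score > best_score:
                best_prompt, best_score = candidate, score
        return best_prompt, {"best_score": best_score}
```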
For the experiment results of these methods across the benchmarks, please refer to the Experiment Results file.
```bash
pip install ape-core
```
```python
from ape import Prompt
from ape.trainer import FewShotTrainer

# `messages` is your chat prompt (e.g., a list of role/content messages) and
# `json_schema` describes the structured output you expect from the model.
student_prompt = Prompt(
    messages=messages,
    model="gpt-4o-mini",
    temperature=0.0,
    response_format=json_schema,
)

trainer = FewShotTrainer(
    generator=Generator(),  # You should implement your own generator
    metric=Metric(),  # You should implement your own metric
    # global_metric=GlobalMetric(),  # For dataset-level metrics such as micro-F1, implement your own global metric
)

# train() is a coroutine, so call it from within an async context.
optimized_prompt, report = await trainer.train(
    prompt=student_prompt,
    trainset=trainset,
    testset=testset,
)
```
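`Generator`, `Metric`, and `GlobalMetric` above are components you implement yourself. Their actual base classes and signatures are not shown here, so the following is only a rough sketch under the assumption that a generator runs the prompt on a single example and a metric scores one prediction against its gold label; adapt it to the real interfaces in `ape-common`.

```python
# Rough sketch only -- the real base classes in ape-common may expect
# different method names, signatures, and return types.
class MyGenerator:
    async def generate(self, prompt, inputs):  # assumed signature
        # Run `prompt` on a single example (e.g., via your LLM client)
        # and return the parsed model output.
        ...


class ExactMatchMetric:
    async def compute(self, prediction, gold):  # assumed signature
        # Return a score for one example; exact match is the simplest case.
        return 1.0 if prediction == gold else 0.0
```

A global metric would then aggregate the per-example results into a single dataset-level score (for example, micro-F1).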
To enable syntax highlighting for .prompt files, consider using the Promptfile IntelliSense extension for VS Code.
Explore the `ape-core/trainer/paper` directory to find implementations of various prompt optimization techniques. Each subdirectory corresponds to a specific paper and contains:
- `README.md`: An overview of the method.
- `paper_name_trainer.py`: The implementation of the technique, inheriting from the `Trainer` class.
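For example, an implementation for a hypothetical method named `example_method` would be laid out roughly as follows (everything except the `ape-core/trainer/paper` path, `README.md`, and the `_trainer.py` naming pattern is a placeholder):

```
ape-core/trainer/paper/
└── example_method/
    ├── README.md                  # overview of the method
    └── example_method_trainer.py  # Trainer subclass implementing the technique
```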
For tutorial code showing how to run Ape, please refer to the Example Experiment Code.
We welcome contributions to enhance Ape's capabilities and expand its collection of prompt optimization techniques. There are four main types of contributions you can make:
We aim to implement every paper on prompt optimization or automated prompt engineering.
If you want to implement a new paper, please refer to the `CONTRIBUTING.md` file for more information.
All prompt optimization methods will be evaluated on various benchmarks to understand the strengths and weaknesses of each approach. Currently, there are five benchmarks: bird-bench, gpqa, MATH, boolq, and NYT.
Community research contributions focus on innovating beyond existing methods.
These contributions include bug fixes, documentation improvements, experiment management, and more.
For more information on contributing, please see the `CONTRIBUTING.md` file.
If you have any questions, feedback, or suggestions, feel free to:
- Raise an issue in the issue tracker.
- Join the Weavel Community Discord to connect with other users and contributors.
Ape is released under the MIT License.
Special thanks to Stanford NLP's DSPy project for inspiration and foundational ideas.