Ape: Open-Source Hub for AI Prompt Engineering

About

Ape (AI prompt engineer) is a prompt optimization library that implements various state-of-the-art prompt optimization methods. It focuses on easy benchmarking, experimentation, and collaborative research within the community, making it simple to apply and compare different prompt optimization techniques.

Read the docs →

Features

  • Modular Architecture: Easily extendable classes and types in ape-common for building custom prompt optimization techniques.
  • Comprehensive Implementations: Clean, single-file implementations of state-of-the-art methods, each inheriting from a unified Trainer class (see the skeleton after this list).
  • Benchmarking Suite: A diverse set of benchmarks to evaluate performance across different tasks:
    • bird-bench (SQL)
    • gpqa (Reasoning)
    • MATH (Mathematical Reasoning)
    • boolq (Question Answering)
    • NYT (Classification)
    • More benchmarks will be added soon
  • Community Collaboration: A dedicated space in ape-core/trainer/community for proposing new architectures and sharing innovative ideas.
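
New techniques plug into the same Trainer interface as the built-in ones. A minimal skeleton, assuming the base class is importable alongside the built-in trainers (the import path and method signature here are assumptions modeled on the FewShotTrainer usage shown under "How to run" below):

from ape.trainer import Trainer  # assumed import path, alongside FewShotTrainer

class MyTrainer(Trainer):
    """Skeleton for a custom prompt optimization technique."""

    async def train(self, prompt, trainset, testset):
        # 1. Generate candidate prompts derived from `prompt`.
        # 2. Score each candidate on `trainset` with your metric.
        # 3. Return the best candidate and a report of the run.
        ...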

Implemented Techniques

Paper Implementations (ape-core/trainer/paper)

  • DSPy-MIPRO
  • EvoPrompt

Community Implementations (ape-core/trainer/community)

  • Few-Shot Trainer
  • TextGradient Trainer
  • TextGrad-Evo Trainer
  • Optuna Trainer
  • Expel Trainer

Experiment Results

For the experiment results of each method across the benchmarks, please refer to the Experiment Results file.

Installation

pip install ape-core

How to run
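
The example below assumes you have already defined messages, json_schema, trainset, and testset. A hypothetical sketch of what they might look like (the exact shapes that Prompt and the trainers expect are described in the docs; this is illustration only):

# Hypothetical inputs for the example below -- adapt these to your task.
messages = [
    {"role": "system", "content": "Classify the sentiment of the given text."},
    {"role": "user", "content": "{text}"},
]

# A JSON schema describing the structured output you expect from the model.
json_schema = {
    "type": "object",
    "properties": {"label": {"type": "string", "enum": ["positive", "negative"]}},
    "required": ["label"],
}

# Train/test sets: lists of labeled examples for optimization and evaluation.
trainset = [{"text": "I loved it", "label": "positive"}, ...]
testset = [{"text": "Not my thing", "label": "negative"}, ...]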

from ape import Prompt
from ape.trainer import FewShotTrainer

student_prompt = Prompt(
    messages=messages,           # your chat messages (see the sketch above)
    model="gpt-4o-mini",
    temperature=0.0,
    response_format=json_schema  # your output schema (see the sketch above)
)

trainer = FewShotTrainer(
    generator=Generator(), # You should implement your own generator
    metric=Metric(), # You should implement your own metric
    # global_metric=GlobalMetric(), # Implement a global metric if you need a dataset-level score such as micro-F1
)

# train() is a coroutine: call it from an async function, or wrap it with asyncio.run(...)
optimized_prompt, report = await trainer.train(
    prompt=student_prompt,
    trainset=trainset,
    testset=testset,
)
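
Generator, Metric, and GlobalMetric are user-implemented: the generator runs your prompt on an example to produce an output, and the metric scores that output. A hypothetical sketch (the real base classes live in ape-common and define the actual interface; the method names below are assumptions, so subclass the provided base classes and follow their signatures instead):

class Generator:
    """Runs the prompt on one example and returns the model's parsed output."""
    async def generate(self, prompt, inputs):
        ...

class Metric:
    """Scores a single prediction against its gold label (here, exact match)."""
    async def compute(self, prediction, gold):
        return 1.0 if prediction.get("label") == gold.get("label") else 0.0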

To enable syntax highlighting for .prompt files, consider using the Promptfile IntelliSense extension for VS Code.

Getting Started

Explore the ape-core/trainer/paper directory to find implementations of various prompt optimization techniques. Each subdirectory corresponds to a specific paper and contains:

  • README.md: An overview of the method.
  • paper_name_trainer.py: The implementation of the technique, inheriting from the Trainer class.

If you want to see the tutorial code to run Ape, please refer to the Example Experiment Code.

Contributing

We welcome contributions to enhance Ape's capabilities and expand its collection of prompt optimization techniques. There are four main types of contributions you can make:

1. Paper Implementation Contributions

We aim to implement every paper on prompt optimization or automated prompt engineering. If you want to implement a new paper, please refer to the CONTRIBUTING.md file for more information.

2. Benchmark Contributions

All prompt optimization methods will be evaluated on various benchmarks to understand the strengths and weaknesses of each approach. Currently, we have five benchmarks: bird-bench, gpqa, MATH, boolq, and NYT.

3. Community Research Contributions

Community research contributions focus on innovating beyond existing methods; new architectures and ideas belong in the ape-core/trainer/community space.

4. Other Contributions

These contributions include bug fixes, documentation improvements, experiment management, and more.

For more information on contributing, please see the CONTRIBUTING.md file.

Help and Support

If you have any questions, feedback, or suggestions, feel free to:

  • Raise an issue in the issue tracker.
  • Join the Weavel Community Discord to connect with other users and contributors.

License

Ape is released under the MIT License.

Acknowledgments

Special thanks to the Stanford NLP group's DSPy project for inspiration and foundational ideas.
