Release 0.3.4 · open-compass/opencompass

The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.4!

🎉 OpenCompass v0.3.4 brings major enhancements including new benchmarks, improved documentation, and numerous bug fixes.
🌈 Notable features include support for new datasets and the integration of lmdeploy pipeline API.

🔧 Support for New Datasets:

Addition of GaoKaoMath Dataset for Evaluation.
Support for MMMLU & MMMLU-lite Benchmark.
Integration of Judgerbench and reorganization of subeval.
Support for LiveCodeBench.

📝 Output Format Enhancements:

Support for printing and saving results as markdown format tables.

🔧 Pipeline and Integration Improvements:

Integration of lmdeploy pipeline API.
Update of TurboMindModel through integration of lmdeploy pipeline API.
Removal of prefix bos_token from messages when using lmdeploy as the accelerator.

🛠️ Miscellaneous Enhancements:

Updates to the common summarizer regex extraction.
Internal humaneval postprocess addition and updates.

📖 Documentation Updates

🐛 Bug Fixes

🎉 Welcome New Contributors
👋 New Contributors Joined the Team:

@BobTsang1995 - Contributed support for MMMLU & MMMLU-lite Benchmark.
@noemotiovon - Provided NPU support fixes.
@changlan - Fixed RULER datasets.
@BIGWangYuDong - Added support for printing and saving results as markdown format tables.
Thank you to all contributors who have made this release possible. For a complete list of changes, please see the full changelog linked below.

Full Changelog: 0.3.3...0.3.4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

0.3.4

Contributors