0.3.4
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.4!
🎉 OpenCompass v0.3.4 brings major enhancements including new benchmarks, improved documentation, and numerous bug fixes.
🌈 Notable features include support for new datasets and the integration of lmdeploy pipeline API.
🔧 Support for New Datasets:
- Addition of the GaoKaoMath dataset for evaluation.
- Support for the MMMLU & MMMLU-lite benchmarks.
- Integration of Judgerbench and reorganization of subeval.
- Support for LiveCodeBench.
📝 Output Format Enhancements:
- Support for printing and saving results as Markdown-formatted tables.
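
As a rough illustration of what a Markdown-formatted results table looks like, here is a minimal sketch in plain Python. It is not the OpenCompass implementation: the helper name `to_markdown_table`, the column names, and the scores are hypothetical placeholders chosen only for the example.

```python
def to_markdown_table(headers, rows):
    """Render headers and rows as a pipe-delimited Markdown table (illustrative helper)."""
    # Convert every cell to text so numbers and strings can be mixed freely.
    cells = [[str(c) for c in headers]] + [[str(c) for c in row] for row in rows]
    # Each column is as wide as its longest cell, with a minimum of 3 for the '---' rule.
    widths = [max(3, *(len(r[i]) for r in cells)) for i in range(len(headers))]

    def fmt(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    separator = "| " + " | ".join("-" * w for w in widths) + " |"
    return "\n".join([fmt(cells[0]), separator] + [fmt(r) for r in cells[1:]])


# Placeholder data, not real evaluation results.
table = to_markdown_table(["dataset", "score"], [["demo", 1.5]])
print(table)
```

The same string can be printed to the terminal or written to a `.md` file, which is the general idea behind offering both printing and saving.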
🔧 Pipeline and Integration Improvements:
- Integration of lmdeploy pipeline API.
- Update of TurboMindModel through integration of lmdeploy pipeline API.
- Removal of prefix bos_token from messages when using lmdeploy as the accelerator.
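
The bos_token change can be pictured with a small sketch. It assumes the motivation is to avoid the beginning-of-sequence token appearing twice when the inference backend adds its own; the release notes do not state the reason, and the function name `strip_bos_prefix` and the `<s>` token are hypothetical, not taken from the OpenCompass code.

```python
from typing import Optional


def strip_bos_prefix(prompt: str, bos_token: Optional[str]) -> str:
    """Drop a single leading BOS token string from a prompt, if one is present."""
    # Only a prefix match is removed; occurrences later in the prompt are left untouched.
    if bos_token and prompt.startswith(bos_token):
        return prompt[len(bos_token):]
    return prompt


# '<s>' is used here purely as an example BOS string.
print(strip_bos_prefix("<s>Hello", "<s>"))
```

Matching only at the start of the string keeps the operation safe for prompts that mention the token text elsewhere.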
🛠️ Miscellaneous Enhancements:
- Updates to the common summarizer regex extraction.
- Addition of, and updates to, the internal humaneval postprocessing.
📖 Documentation Updates
🐛 Bug Fixes
🎉 Welcome New Contributors:
- @BobTsang1995: Contributed support for the MMMLU & MMMLU-lite benchmarks.
- @noemotiovon: Provided NPU support fixes.
- @changlan: Fixed the RULER datasets.
- @BIGWangYuDong: Added support for printing and saving results as Markdown-formatted tables.
Thank you to all contributors who have made this release possible. For a complete list of changes, please see the full changelog linked below.
Full Changelog: 0.3.3...0.3.4