Release OpenCompass v0.1.4 · open-compass/opencompass

OpenCompass v0.1.4 is here with an array of features, documentation improvements, and key fixes! Dive in to see what's in store:

🆕 Highlights:

More Tools and Features: OpenCompass adds the update suffix tool, codellama support, preds collection tools, qwen & qwen-chat support, and more. Otter also joins the MMBench Evaluation.
Documentation Facelift: We've made several updates to our documentation, ensuring it stays relevant, user-friendly, and aesthetically pleasing.
Essential Bug Fixes: We’ve fixed several bugs, including the case where both pad and eos tokens are missing, triviaqa & nq postprocessing, and the qwen config.
Enhancements: From simplifying execution logic to suppressing warnings, we’re always on the lookout for ways to improve our product.

Dive deeper to learn more:

🌟 New Features:

📦 Tools and Integrations:

Added and applied the update suffix tool (#280).
Support for codellama and preds collection tools (#335).
Addition of qwen & qwen-chat support (#286).
Introduction of Otter to OpenCompass MMBench Evaluation (#232).
Support for LLaVA and mPLUG-Owl (#331).

🛠 Utilities and Functionality:

Added sample count support in prompt_viewer (#273).
ZeroRetriever errors are now ignored when id_list is provided (#340).
Increased default task size (#360).

📝 Documentation:

Updated communication channels: WeChat and Discord (#328).
Documentation theme revamped for a fresh look (#332).
Detailed documentation for the new entry script (#246).
MMBench documentation updated (#336).

🛠️ Bug Fixes:

Resolved an issue that occurred when both the pad and eos tokens are missing (#287).
Addressed triviaqa & nq postprocess glitches (#350).
Fixed qwen configuration inaccuracies (#358).
Added a default value for the zero retriever (#361).

⚙ Enhancements and Refactors:

Streamlined execution logic in run.py and ensured temp files cleanup (#337).
Suppressed unnecessary warnings raised by get_logger (#353).
Added an import check for multimodal components (#352).
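
The changelog entry for #337 says run.py now uses finally to clean up temp files. As a rough illustration of that general pattern only, and not the actual run.py code, here is a minimal Python sketch. The function name run_with_temp_config and its arguments are hypothetical and do not come from OpenCompass.

```python
import os
import tempfile


def run_with_temp_config(config_text, task):
    """Write config_text to a temp file, call task(path), and always delete the file.

    Hypothetical helper for illustration; not part of OpenCompass.
    """
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        # Take ownership of the descriptor so it is closed after writing.
        with os.fdopen(fd, "w") as f:
            f.write(config_text)
        return task(path)
    finally:
        # Runs whether task() returned or raised, so the temp file is not left behind.
        if os.path.exists(path):
            os.remove(path)
```

Putting the removal in finally rather than after the call means a failing task cannot leave stale temp files on disk.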

🎉 New Contributors:

Thank you to all our contributors for this release, with a special shoutout to our new contributors:

@Luodian (First PR)
@ZhangYuanhan-AI (First PR)
@HAOCHENYE (First PR)

Thank you to the entire community for pushing OpenCompass forward. Make sure to star 🌟 our GitHub repository if OpenCompass aids your endeavors! We treasure your feedback and contributions.

Changelog

[Feature] Add and apply update suffix tool by @Leymore in #280
support sample count in prompt_viewer by @cdpath in #273
docs: update wechat and discord by @vansin in #328
[Docs] Update doc theme by @gaotongxiao in #332
[Feat] support codellama and preds collection tools by @yingfhu in #335
[Feature] Add qwen & qwen-chat support by @Leymore in #286
[Feat] Add Otter to OpenCompass MMBench Evaluation by @Luodian in #232
[Docs] Update docs for new entry script by @gaotongxiao in #246
[Fix] Fix when missing both pad and eos token by @Leymore in #287
[Doc] Update MMBench.md by @kennymckormick in #336
[Feat] Support LLaVA and mPLUG-Owl by @ZhangYuanhan-AI in #331
[Feature] Ignore ZeroRetriever error when id_list provided by @Leymore in #340
[Enhance] Add import check of multimodal by @fangyixiao18 in #352
[Sync] [Enhancement] Simplify execution logic in run.py; use finally to clean up temp files by @gaotongxiao in #337
[Fix] Fix triviaqa & nq postprocess by @Leymore in #350
[Enhance] Suppress warning raised by get_logger by @HAOCHENYE in #353
[Fix] Update qwen config by @Leymore in #358
[Fix] zero retriever add default value by @Leymore in #361
[Enhancement] Increase default task size by @gaotongxiao in #360
[Fix] Quick lint fix by @Leymore in #362
[Docs] update code evaluator docs by @yingfhu in #354
[Feat] support wizardcoder series by @yingfhu in #344
[Feat] Support Qwen-VL-Chat on MMBench. by @yyk-wew in #312
[Feature] Update claude2 postprocessor by @gaotongxiao in #365
[Doc] Update Overview by @tonysy in #242
[Feat] Update URL by @tonysy in #368
[Feature] Update llama2 implement by @Leymore in #372
[Feature] Add open source dataset eval config of instruct-blip by @fangyixiao18 in #370
[Fix] Update bbh implement & Fix bbh suffix by @Leymore in #371
[Feature] Add new models: baichuan2, tigerbot, vicuna v1.5 by @Leymore in #373
Bump version to 0.1.4 by @gaotongxiao in #367

For an exhaustive list of changes, kindly check our Full Changelog.
