Frankgu3528/Awesome-Long-Context-Benchmarks

Long-Context LLM Benchmarks

🚀 A list of long-context LLM benchmarks. A better view is available here.

| Dataset | Release Date | Type | Domain | Token Length | Language | Data Released? | Answers Released? |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ZeroSCROLLS | 2023-05 | Realistic | Novel, Report, Meetings, TV, Wikipedia | Avg ~15k | EN | ✅ | ❌ |
| L-Eval | 2023-07 | Realistic | Math, Code, Paper, etc. (ACL'24 Outstanding) | Avg ~15k | ZH | ✅ | ✅ |
| LongBench | 2023-08 | Realistic | Code, Meeting, Wiki, Novel | Avg ~13k | ZH, EN | ✅ | ✅ |
| BAMBOO | 2023-09 | Realistic | Paper, TVshows, GovReport, Code, Meeting | Only 4k, 16k | EN | ✅ | ✅ |
| LooGLE | 2023-11 | Realistic | Paper, Wikipedia, TV&Movie | Avg ~24k | EN | ✅ | ✅ |
| LVEval | 2024-02 | Realistic | Mixup | 16k, 32k, 64k, 128k, 256k | ZH, EN | ✅ | ✅ |
| InfiniteBench | 2024-02 | Realistic | Code, Novel, Math, Dialogue | > 100k | ZH, EN | ✅ | ✅ |
| DocFinQA | 2024-02 | Realistic | Finance | > 100k | EN | ✅ | ✅ |
| Counting-Stars | 2024-03 | Needle | Essay, Novel | Any | ZH, EN | ✅ | ✅ |
| CLongEval | 2024-03 | Realistic | Story, News, Conversation | < 100k | ZH | ✅ | ✅ |
| NovelQA | 2024-03 | Realistic | Novel | > 100k | EN | ✅ | ❌ |
| RULER | 2024-04 | Needle | Essays | Any | EN | ✅ | ✅ |
| XL2Bench | 2024-04 | Realistic | Novel, Paper, Law | > 100k | ZH, EN | ❌ | ❌ |
| BABILong | 2024-06 | Needle | Books | Any | EN | ✅ | ✅ |
| MedOdyssey | 2024-06 | Realistic, Needle | Medical | 40k-180k | ZH, EN | ✅ | ✅ |
| Loong | 2024-06 | Realistic | Papers, Legal, Finance | 40k-230k | ZH, EN | ✅ | ✅ |
| LongIns | 2024-06 | Other | Multiple QA | 256-16k | EN | ❌ | ❌ |
| NOCHA | 2024-07 | Realistic | Novel | > 100k | EN | ❌ | ❌ |
| [SummaryStack](https://arxiv.org/abs/2407.01370) | 2024-07 | Other | News, Conversations | Avg ~92k | EN | ✅ | ✅ |
| NeedleBench | 2024-07 | Needle | Essays | Any | ZH, EN | ✅ | ✅ |
| ML-Needle | 2024-08 | Needle | Wikipedia | 4k-32k | ZH, EN, SP, GR, AR, VT | ✅ | ✅ |
