Scripts

Before preparing task-specific data, you MUST first download eval.zip. It contains custom annotations, scripts, and the LLaVA v1.5 prediction files. Extract it to ./playground/data/eval. The extracted tree also provides the general directory structure for all datasets. For more details, please refer to the doc.
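The extraction step can be sketched as a small shell helper. This is a minimal sketch, not part of the repository: the function name `extract_eval_zip` is hypothetical, and it assumes eval.zip has already been downloaded and that `unzip` is installed. Whether the archive contains its own top-level folder is not stated here, so inspect it with `unzip -l eval.zip` before extracting.

```shell
# Hypothetical helper (not part of the repo): extract eval.zip into the eval directory.
# Assumption: eval.zip was downloaded beforehand; default paths follow the text above.
extract_eval_zip() {
  local zip_path="${1:-eval.zip}"
  local eval_dir="${2:-./playground/data/eval}"

  # Fail early with a clear message if the archive is not there.
  if [ ! -f "$zip_path" ]; then
    echo "not found: $zip_path (download eval.zip first)" >&2
    return 1
  fi

  mkdir -p "$eval_dir"
  # -q keeps the output quiet; -d selects the extraction directory.
  unzip -q "$zip_path" -d "$eval_dir"
}

# Usage (uncomment once eval.zip is in the current directory):
# extract_eval_zip eval.zip ./playground/data/eval
```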

TextVQA

Download TextVQA_0.5.1_val.json and the images, and extract them to ./playground/data/eval/textvqa.
Run single-GPU inference and evaluation.

CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/textvqa.sh

MME

Download the data following the official instructions here.
Place the downloaded images in MME_Benchmark_release_version.
Put the official eval_tool and MME_Benchmark_release_version under ./playground/data/eval/MME.
Run single-GPU inference and evaluation.

CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mme.sh
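Because the MME steps depend on two directories sitting in the right place, a quick layout check before launching the script can save a failed run. The following is a minimal sketch: the function name `check_mme_layout` is hypothetical, and it only verifies that the two directories named above exist, not that their contents are complete.

```shell
# Hypothetical pre-flight check (not part of the repo).
# Returns the number of missing directories, so 0 means the layout looks right.
check_mme_layout() {
  local mme_dir="${1:-./playground/data/eval/MME}"
  local missing=0
  local d

  for d in eval_tool MME_Benchmark_release_version; do
    if [ ! -d "$mme_dir/$d" ]; then
      echo "missing: $mme_dir/$d" >&2
      missing=$((missing + 1))
    fi
  done

  return "$missing"
}

# Usage: run the evaluation only when both directories are in place.
# check_mme_layout && CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mme.sh
```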

MMBench

Download mmbench_dev_20230712.tsv and put it under ./playground/data/eval/mmbench.
Run single-GPU inference.

CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mmbench.sh

Submit the results to the evaluation server: ./playground/data/eval/mmbench/answers_upload/mmbench_dev_20230712.

MMBench-CN

Download mmbench_dev_cn_20231003.tsv and put it under ./playground/data/eval/mmbench.
Run single-GPU inference.

CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mmbench_cn.sh

Submit the results to the evaluation server: ./playground/data/eval/mmbench/answers_upload/mmbench_dev_cn_20231003.

MM-Vet

Extract mm-vet.zip to ./playground/data/eval/mmvet.
Run single-GPU inference.

CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mmvet.sh

Evaluate the predictions in ./playground/data/eval/mmvet/results using the official Jupyter notebook.
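The single-GPU commands from the sections above can be chained into one run. This is a minimal sketch, assuming all datasets are already prepared: the function name `run_all_evals` and the `DRY_RUN` variable are hypothetical, while the script paths are the ones used above. MMBench, MMBench-CN, and MM-Vet still need their manual submission or notebook step afterwards.

```shell
# Hypothetical wrapper (not part of the repo): run every evaluation script in sequence.
# With DRY_RUN=1 it only prints the commands instead of executing them.
run_all_evals() {
  local gpu="${1:-0}"
  local name script

  for name in textvqa mme mmbench mmbench_cn mmvet; do
    script="scripts/v1_5/eval/${name}.sh"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "CUDA_VISIBLE_DEVICES=$gpu bash $script"
    else
      # Stop at the first failing benchmark so the error is easy to spot.
      CUDA_VISIBLE_DEVICES="$gpu" bash "$script" || return 1
    fi
  done
}

# Usage: preview the commands first, then run them on GPU 0.
# DRY_RUN=1 run_all_evals 0
# run_all_evals 0
```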
