
Question about running inference #237

Open · SYuan03 opened this issue Aug 22, 2024 · 5 comments

SYuan03 commented Aug 22, 2024

Thank you for the work you've done—it's really awesome!

I'm trying to reproduce your results on some datasets, but I'm seeing some differences in accuracy. I'm using AutoModelForCausalLM to load the model, and I'm not sure whether this could affect the accuracy:

import torch
from transformers import AutoModelForCausalLM

# model_path points to the local mPLUG-Owl3 checkpoint directory
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    attn_implementation='flash_attention_2',
    torch_dtype=torch.half,
    trust_remote_code=True
)
LukeForeverYoung (Collaborator) commented:

Hi, the accuracy of most large language models can be affected by the prompt. We will release the evaluation pipeline in the coming days for easy reproducibility of the results. For a specific task, can you provide more information about your evaluation process, such as the prompts and processors used?


SYuan03 commented Aug 28, 2024

> Hi, the accuracy of most large language models can be affected by the prompt. We will release the evaluation pipeline in the coming days for easy reproducibility of the results. For a specific task, can you provide more information about your evaluation process, such as the prompts and processors used?

Thanks for the reply. I'm a developer of VLMEvalKit, and we're trying to add mPLUG-Owl3 to the project. I have submitted a PR to support the model, but the accuracy on, e.g., the MMBench dataset fails to meet expectations. Here is my code.

We'd appreciate it if you submitted a PR yourselves to add support, but that would require reading the developer's manual first, which takes some time. So if it's convenient for you, could you take a look at my code and see what's wrong? Thanks a lot.

chancharikmitra commented:

Side point regarding this question: can we get confirmation from the authors that what @SYuan03 showed is the proper way to load the model through Hugging Face? I ask because the README does not use AutoModelForCausalLM to load the model, for some reason. Would appreciate confirmation on that point.


SYuan03 commented Aug 29, 2024

> Side point regarding this question: can we get confirmation from the authors that what @SYuan03 showed is the proper way to load the model through Hugging Face? I ask because the README does not use AutoModelForCausalLM to load the model, for some reason. Would appreciate confirmation on that point.

Hi, thank you for your attention. I actually noticed this as well, and I tried loading the model both ways; the results were consistent in my tests.
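
For reference, here is a minimal sketch of how the two loading paths can be compared (illustrative only; model_path is a placeholder for the local mPLUG-Owl3 checkpoint directory). With trust_remote_code=True, both auto classes resolve the concrete model class through the auto_map entry in the checkpoint's config.json, so printing the resolved class names shows whether the two paths end up at the same custom class:

import torch
from transformers import AutoModel, AutoModelForCausalLM

model_path = 'path/to/mPLUG-Owl3'  # placeholder: local checkpoint directory

# Both auto classes dispatch to the class registered under auto_map in the
# checkpoint's config.json when trust_remote_code=True.
model_a = AutoModel.from_pretrained(
    model_path, torch_dtype=torch.half, trust_remote_code=True)
model_b = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.half, trust_remote_code=True)

# If the two names match, the choice of auto class should not affect accuracy.
print(type(model_a).__name__, type(model_b).__name__)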

LukeForeverYoung (Collaborator) commented:


> Thanks for the reply. I'm a developer of VLMEvalKit, and we're trying to add mPLUG-Owl3 to the project. I have submitted a PR to support the model, but the accuracy on, e.g., the MMBench dataset fails to meet expectations. Here is my code.
>
> We'd appreciate it if you submitted a PR yourselves to add support, but that would require reading the developer's manual first, which takes some time. So if it's convenient for you, could you take a look at my code and see what's wrong? Thanks a lot.

Thank you for supporting our models! We have recently released the evaluation pipelines at this link to help you reproduce the evaluation results.
