
Thank you for your work. How can we reproduce the V-NIAH benchmark results? #7

Open
fistyee opened this issue Jun 30, 2024 · 12 comments

Comments

@fistyee

fistyee commented Jun 30, 2024

No description provided.

@jzhang38
Collaborator

jzhang38 commented Jul 1, 2024

We cannot provide the haystack video ourselves, as we use an actual movie in our evaluation. Specifically, I use the movie "孤注一掷" (No More Bets) as the haystack :)

The rest of the instructions are in the README:

https://github.com/EvolvingLMMs-Lab/LongVA?tab=readme-ov-file#v-niah-evaluation

Let me know if you encounter any problems.

@fistyee
Author

fistyee commented Jul 1, 2024

Thanks. Could you describe the duration distribution of the clips extracted from the movie, and could you provide more query prompts for LongVA as a reference?

@jzhang38
Collaborator

jzhang38 commented Jul 1, 2024

could you describe what the duration distribution of clips extracted from movies is

We do not use clips from the movie. We load the entire movie as the haystack video and sample frames at 1 fps, as stated in the paper and also reflected in our code:

from decord import VideoReader, cpu

def load_video_batches(video_path, batch_size):
    vr = VideoReader(video_path, ctx=cpu(0))
    total_frame_num = len(vr)
    fps = round(vr.get_avg_fps())
    # Sample one frame per second of video (1 fps)
    frame_idx = [i for i in range(0, len(vr), fps)]
    # Yield the sampled frames in batches
    for start_idx in range(0, len(frame_idx), batch_size):
        end_idx = min(start_idx + batch_size, total_frame_num)
        frame_indices = frame_idx[start_idx:end_idx]
        batch_frames = vr.get_batch(frame_indices).asnumpy()
        yield batch_frames
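For intuition, the index arithmetic behind this 1 fps sampling and batching can be sketched without decord (the frame count, fps, and batch size below are made-up values for illustration):

```python
# Sketch of the 1 fps sampling + batching index math (no decord needed).
# total_frame_num, fps, and batch_size are hypothetical values.
total_frame_num = 9000   # e.g. a 5-minute video at 30 fps
fps = 30                 # average frames per second
batch_size = 32

# One frame index per second of video (1 fps sampling).
frame_idx = list(range(0, total_frame_num, fps))

# Yield the sampled indices in batches, mirroring load_video_batches;
# Python slicing clamps at the end, so the last batch is simply smaller.
batches = [frame_idx[s:s + batch_size] for s in range(0, len(frame_idx), batch_size)]

print(len(frame_idx))    # 300 sampled frames
print(len(batches))      # 10 batches
print(len(batches[-1]))  # last batch holds the remaining 12 frames
```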

can you provide more query prompts for LongVA as a reference?

I am not sure what you mean by "query prompts". If you are looking for needle images & questions, they are here: https://huggingface.co/datasets/lmms-lab/v_niah_needles.
If you are looking for the prompt template:

"qwen2": {
"preprompt": "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n",
"postprompt": "<|im_end|>\n<|im_start|>assistant\n",
},
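As a hedged sketch, here is how these two template pieces would wrap a needle question into a full prompt (the question string below is a hypothetical placeholder, not an actual V-NIAH needle):

```python
# Hypothetical sketch: wrapping a question with the Qwen2 chat template
# pieces above. The question text is a made-up placeholder.
template = {
    "preprompt": "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n",
    "postprompt": "<|im_end|>\n<|im_start|>assistant\n",
}

question = "What is written on the sign in the inserted frame?"  # placeholder
prompt = template["preprompt"] + question + template["postprompt"]

print(prompt.startswith("<|im_start|>system"))       # True
print(prompt.endswith("<|im_start|>assistant\n"))    # True
```

In the actual evaluation, the sampled video frames would be interleaved with this text before it is fed to the model.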

@hello-bluedog

[Screenshot: 2024-09-10 16:39:54]

@hello-bluedog

I got None logits. Is there any possible reason? I changed nothing.

@hello-bluedog

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 355, in <module>
[rank0]:     main(args.parse_args())
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 333, in main
[rank0]:     all_accuracies, accelerator = inference(args)
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 239, in inference
[rank0]:     correct = eval_forward(
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 106, in eval_forward
[rank0]:     pred = logits.argmax(dim=-1)
[rank0]: AttributeError: 'NoneType' object has no attribute 'argmax'

This is my log.

@hello-bluedog

I found that you set logits=None in the forward function of class Qwen2ForCausalLM_RingAttn, which is located in easy_context/modeling_qwen2.py.
I think this is a bug; can you fix it?


@xuanxianyou

I got None logits. Is there any possible reason? I changed nothing.

Hello, I got the same problem, have you solved it?

@hello-bluedog

I solved it by calculating the logits in the ring-attention forward function in easy_context/modeling_qwen2.py:

logits = self.lm_head(hidden_states).float()
[Screenshot: 2024-10-17 06:29:17]
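For anyone hitting the same NoneType error, the essence of the fix (projecting the final hidden states through the LM head instead of leaving logits as None) can be sketched with plain numpy; the shapes and values below are made up for illustration and do not match the real model dimensions:

```python
import numpy as np

# Hypothetical shapes, purely for illustration of the lm_head projection.
hidden_size, vocab_size, seq_len = 4, 10, 3
rng = np.random.default_rng(0)

hidden_states = rng.normal(size=(1, seq_len, hidden_size))   # model output
lm_head_weight = rng.normal(size=(vocab_size, hidden_size))  # lm_head.weight

# Equivalent of: logits = self.lm_head(hidden_states).float()
logits = hidden_states @ lm_head_weight.T

# This is what eval_forward does next; it crashes when logits is None.
pred = logits.argmax(axis=-1)

print(logits.shape)  # (1, 3, 10)
print(pred.shape)    # (1, 3)
```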

@jzhang38
Collaborator

@hello-bluedog Hi, yes, this is a bug introduced by a PR. I've just fixed it.

@liyucheng09

@jzhang38 Hi Peiyuan, would you consider adding V-NIAH to lmms-eval?
