
Thank you for your work. How can we reproduce the V-NIAH benchmark results? #7

Open
fistyee opened this issue Jun 30, 2024 · 12 comments

Comments

@fistyee

fistyee commented Jun 30, 2024

No description provided.

@jzhang38
Collaborator

jzhang38 commented Jul 1, 2024

We cannot provide the haystack video ourselves, as we use an actual movie in our evaluation. Specifically, I use the movie "孤注一掷" (No More Bets) as the haystack :)

The rest of the instructions are in the README:

https://github.com/EvolvingLMMs-Lab/LongVA?tab=readme-ov-file#v-niah-evaluation

Let me know if you encounter any problems.

@fistyee
Author

fistyee commented Jul 1, 2024

Thanks. Could you describe the duration distribution of the clips extracted from the movie, and could you provide more query prompts for LongVA as a reference?

@jzhang38
Collaborator

jzhang38 commented Jul 1, 2024

could you describe what the duration distribution of clips extracted from movies is

We do not use clips from the movie. We load the entire movie as the haystack video and sample frames at 1 fps, as stated in the paper and also reflected in our code:

from decord import VideoReader, cpu

def load_video_batches(video_path, batch_size):
    vr = VideoReader(video_path, ctx=cpu(0))
    total_frame_num = len(vr)
    fps = round(vr.get_avg_fps())
    # Sample one frame per second of video (1 fps)
    frame_idx = [i for i in range(0, len(vr), fps)]
    # Yield the sampled frames in batches
    for start_idx in range(0, len(frame_idx), batch_size):
        end_idx = min(start_idx + batch_size, total_frame_num)
        frame_indices = frame_idx[start_idx:end_idx]
        batch_frames = vr.get_batch(frame_indices).asnumpy()
        yield batch_frames
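For intuition, the index arithmetic behind this 1 fps sampling and batching can be sketched without decord (the frame count, fps, and batch size below are made-up values for illustration):

```python
# Sketch of the 1 fps sampling + batching index math (no decord needed).
# total_frame_num, fps, and batch_size are hypothetical values.
total_frame_num = 9000   # e.g. a 5-minute video at 30 fps
fps = 30                 # average frames per second
batch_size = 32

# One frame index per second of video (1 fps sampling).
frame_idx = list(range(0, total_frame_num, fps))

# Yield the sampled indices in batches, mirroring load_video_batches;
# Python slicing clamps at the end, so the last batch is simply smaller.
batches = [frame_idx[s:s + batch_size] for s in range(0, len(frame_idx), batch_size)]

print(len(frame_idx))    # 300 sampled frames
print(len(batches))      # 10 batches
print(len(batches[-1]))  # last batch holds the remaining 12 frames
```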

can you provide more query prompts for LongVA as a reference?

I am not sure what you mean by "query prompts". If you are looking for needle images & questions, they are here: https://huggingface.co/datasets/lmms-lab/v_niah_needles.
If you are looking for the prompt template:

"qwen2": {
"preprompt": "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n",
"postprompt": "<|im_end|>\n<|im_start|>assistant\n",
},
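As a hedged sketch, here is how these two template pieces would wrap a needle question into a full prompt (the question string below is a hypothetical placeholder, not an actual V-NIAH needle):

```python
# Hypothetical sketch: wrapping a question with the Qwen2 chat template
# pieces above. The question text is a made-up placeholder.
template = {
    "preprompt": "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n",
    "postprompt": "<|im_end|>\n<|im_start|>assistant\n",
}

question = "What is written on the sign in the inserted frame?"  # placeholder
prompt = template["preprompt"] + question + template["postprompt"]

print(prompt.startswith("<|im_start|>system"))       # True
print(prompt.endswith("<|im_start|>assistant\n"))    # True
```

In the actual evaluation, the sampled video frames would be interleaved with this text before it is fed to the model.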

@hello-bluedog

[Screenshot: 2024-09-10 16:39:54]

@hello-bluedog

I got None logits. Is there any possible reason? I changed nothing.

@hello-bluedog

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 355, in <module>
[rank0]:     main(args.parse_args())
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 333, in main
[rank0]:     all_accuracies, accelerator = inference(args)
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 239, in inference
[rank0]:     correct = eval_forward(
[rank0]:   File "/home/yunzhu/LongVA/vision_niah/eval_vision_niah.py", line 106, in eval_forward
[rank0]:     pred = logits.argmax(dim=-1)
[rank0]: AttributeError: 'NoneType' object has no attribute 'argmax'

This is my log.

@hello-bluedog

I found that you set logits=None in the forward function of class Qwen2ForCausalLM_RingAttn, which is located in easy_context/modeling_qwen2.py.
I think this is a bug; can you fix it?


@xuanxianyou

I got None logits. Is there any possible reason? I changed nothing.

Hello, I got the same problem, have you solved it?

@hello-bluedog

I solved it by calculating the logits in the ring-attention forward function in easy_context/modeling_qwen2.py:

logits = self.lm_head(hidden_states).float()
[Screenshot: 2024-10-17 06:29:17]
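For anyone hitting the same NoneType error, the essence of the fix (projecting the final hidden states through the LM head instead of leaving logits as None) can be sketched with plain numpy; the shapes and values below are made up for illustration and do not match the real model dimensions:

```python
import numpy as np

# Hypothetical shapes, purely for illustration of the lm_head projection.
hidden_size, vocab_size, seq_len = 4, 10, 3
rng = np.random.default_rng(0)

hidden_states = rng.normal(size=(1, seq_len, hidden_size))   # model output
lm_head_weight = rng.normal(size=(vocab_size, hidden_size))  # lm_head.weight

# Equivalent of: logits = self.lm_head(hidden_states).float()
logits = hidden_states @ lm_head_weight.T

# This is what eval_forward does next; it crashes when logits is None.
pred = logits.argmax(axis=-1)

print(logits.shape)  # (1, 3, 10)
print(pred.shape)    # (1, 3)
```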

@jzhang38
Collaborator

@hello-bluedog Hi, yes, this is a bug introduced by a PR. I've just fixed it.

@liyucheng09

@jzhang38 Hi Peiyuan, would you consider adding V-NIAH to lmms-eval?
