
Clarification on Fixed Visual Token Counts (192, 128, 64) in Table 1 #10

naajeehxe opened this issue Nov 1, 2024 · 2 comments

@naajeehxe
Hello, Thanks for your wonderful research.

I understand that the number of pruned tokens depends on the value of lambda multiplied by the rank, while the number of recycled tokens is influenced by the hyperparameter tau.

But Table 1 of the paper shows that the number of visual tokens is fixed at 192, 128, and 64.

Could you please clarify whether these token counts were hardcoded to select exactly 192, 128, or 64 visual tokens, or if there was another approach to maintaining a fixed token count for these experiments?

Thank you, Sincerely

@Gumpest
Owner

Gumpest commented Nov 10, 2024

I appreciate your interest in our work.

1. The retained tokens are controlled by changing the scaling factor in Formula 8.
2. The reported count is the equivalent number of tokens, i.e. a layer-weighted average of the retained counts. For example, T = ((L1 - L0) * T0 + (L2 - L1) * T1) / L2 (see the sketch below).
3. Therefore, we select exactly 192, 128, or 64 visual tokens (in this equivalent sense) to compare fairly with other methods.
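
A minimal sketch of the layer-weighted averaging in point 2, assuming a schedule where T0 tokens are retained from layer L0 to L1 and T1 tokens from L1 to L2 (function and variable names are illustrative, not the repository's API):

```python
def equivalent_token_count(stages, total_layers):
    """Layer-weighted average of retained visual tokens.

    stages: list of (start_layer, end_layer, tokens_retained) tuples,
            mirroring the (L0, L1, T0) and (L1, L2, T1) ranges in the formula.
    total_layers: L2, the total number of decoder layers.
    """
    weighted = sum((end - start) * tokens for start, end, tokens in stages)
    return weighted / total_layers

# Purely illustrative numbers (not the paper's actual schedule):
# keep 576 tokens for layers 0-2, then 96 tokens for layers 2-32.
print(equivalent_token_count([(0, 2, 576), (2, 32, 96)], total_layers=32))  # 126.0
```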

@YUECHE77

Hi,

I'm also very curious about the configuration in Table 1. Could you please give us an example?

For instance, when you retain exactly 128 tokens, what are your "--scale" and "--bias" values? Are they 9 and 6? Please correct me if I'm wrong.

Also, I'm kind of confused about the meaning of "scale" and "bias". Does "scale" stand for the lambda in Formula 8? And what is "bias" exactly?

And by the way, could you please share the configuration for FastV in Table 1? I'm trying to reproduce your results. What k (layer) and r (attention rank) were you using, for example for the "Retain 128 Tokens" setting in Table 1?
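
(Just to make the question concrete: if FastV is read through the same equivalent-count formula as in the comment above, with all visual tokens kept up to layer k and r tokens retained afterwards, one could back-solve r for a target budget as in this rough sketch. The 576-token / 32-layer numbers assume LLaVA-1.5 and are illustrative only, not the actual configuration.)

```python
def solve_r_for_target(target, k, full_tokens=576, total_layers=32):
    """Back out the post-pruning token budget r that yields `target` equivalent
    tokens when all `full_tokens` visual tokens are kept up to layer k and
    only r tokens are retained for the remaining layers."""
    return (target * total_layers - k * full_tokens) / (total_layers - k)

# e.g. a 128-token equivalent budget with pruning after layer 2:
print(solve_r_for_target(target=128, k=2))  # ~98.1, so round to an integer budget
```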

[image attachment]

I really like your work, it's amazing! I hope I can get your response.

Thank you!
