about terms of apply_rope (freqs_cis, pe ...) #17

Open

KinamSalad opened this issue Sep 9, 2024 · 0 comments

KinamSalad commented Sep 9, 2024

Hello. Thank you for your wonderful code :)
I have a question about the freqs_cis term in the apply_rope function in modules/layers.py.

This function is used for attention, and if we look at model.py, we can see that the embeddings of txt_id and img_id are used as the freqs_cis term.

What exactly are txt_id and img_id? Does training need any inputs besides the text/music pairs?

I commented out the apply_rope function and trained my model with just text/music pairs, but I didn't get good results.
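For context, this is my understanding of rotary position embeddings in general, written as a minimal pure-Python sketch. It is not the repository's code: `rope_angles`, `rotate_pairs`, `dot`, and `theta=10000.0` are names and values I am assuming for illustration, and I am also assuming that the ids act as integer positions.

```python
import math

def rope_angles(pos, dim, theta=10000.0):
    """Return one rotation angle per channel pair for a position id (hypothetical helper)."""
    assert dim % 2 == 0
    return [pos / (theta ** (2 * i / dim)) for i in range(dim // 2)]

def rotate_pairs(vec, angles):
    """Rotate each consecutive (even, odd) channel pair of vec by its angle."""
    out = []
    for i, a in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        out += [x * math.cos(a) - y * math.sin(a),
                x * math.sin(a) + y * math.cos(a)]
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

q = [1.0, 0.0, 0.5, -0.5]
k = [0.3, 0.7, -0.2, 0.1]

# Position 0 gives zero angles, so the vector comes back unchanged.
q_at_0 = rotate_pairs(q, rope_angles(0, 4))

# The attention score depends only on the position difference (2 in both cases).
score_a = dot(rotate_pairs(q, rope_angles(5, 4)), rotate_pairs(k, rope_angles(3, 4)))
score_b = dot(rotate_pairs(q, rope_angles(9, 4)), rotate_pairs(k, rope_angles(7, 4)))
```

If this understanding is right, skipping the rotation leaves attention with no position information at all, which might be why my results were poor.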

It would be great if you could tell me what format txt_id and img_id are expected to be in.

Thank you
