Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mask CLS at the end of seq #550

Closed
wants to merge 1 commit into from
Closed

Conversation

gpucce
Copy link
Contributor

@gpucce gpucce commented Jun 24, 2023

@yiren-jiran thanks for finding the bug, in #549 it looks like I made a mistake in creating the cls mask.

This PR should be the fix, will have to look how much changing this affects models performance.

@yiren-jian
Copy link

I believe fixing this issue requires re-training CoCa models.

However, it's not supervised to me that the current trained models working well: [CLS] token was still able to "see" all the relevant preceding text, but not itself.

Fixing the bug without re-training models would hurt retrieval/classification performance, I guess.

@gpucce
Copy link
Contributor Author

gpucce commented Jun 25, 2023

Indeed, likely needs retraining we this change.

@gpucce
Copy link
Contributor Author

gpucce commented Jun 26, 2023

@rwightman what would be the best way to address this? Does it make sense to retrain the model?

@rom1504
Copy link
Collaborator

rom1504 commented Jun 26, 2023

my opinion is

  1. do not change the default
  2. do small scale experiment to see if the fixes help
  3. if they don't help, have a comment explaining why things are how they are and don't change the default
  4. if they do help significantly, plan to retrain

@rom1504
Copy link
Collaborator

rom1504 commented Jun 26, 2023

but first fix the CI here

@gpucce
Copy link
Contributor Author

gpucce commented Jun 26, 2023

@rom1504, @iejMac it appears to be an issue with caching older test run times or smth similar, removing

touch .test_durations
cp .test_durations durations_1
mv .test_durations durations_2

seems to fix it, however I am not super confident about github ci,
if you or someone else who is can check it would probably help

@gpucce
Copy link
Contributor Author

gpucce commented Jun 27, 2023

@rom1504 @iejMac so I found that pluggy was not versioned, adding it to the requirements-test.txt file fixed the issue in the appropriate way I believe.

Would still be useful to have someone that knows more about pytest confirm this

@gpucce
Copy link
Contributor Author

gpucce commented Jun 27, 2023

@rom1504 sorry for several tags, I think it was a caching issue but honestly not sure how it got fixed

@gpucce
Copy link
Contributor Author

gpucce commented Jun 27, 2023

Moved to #551

@gpucce gpucce closed this Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants