what do we do about identical signatures when saving? #1501
Comments
(what I did in #1497 is have them append |
from #1574 (comment): it would be nice if we came up with some strategies for handling duplicated md5sums downstream (e.g. report them as alternative match results?). Seems particularly important for |
relevant: #1573
From sourmash-bio/sourmash_plugin_branchwater#136 (comment) @bluegenes: @ctb Thinking through some of the challenges:
For zips used as the database:
|
from @bluegenes comment on #1497,
worth discussing!
my current hot take is that truly identical signatures (hashes + metadata identical) should generally not be saved, but I think there are performance nuances to discuss around tracking duplicates in really large collections of signatures. Hence this issue, to discuss!
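To make the trade-off concrete, here is a minimal sketch of one way to skip exact duplicates at save time. It is not sourmash code: `Sig`, `fingerprint`, and `save_unique` are hypothetical names, and the metadata fields are assumptions chosen for illustration. The idea is to keep only a fixed-size digest of (metadata + hashes) per saved signature, so the tracking cost grows with the number of signatures rather than with their size.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Sig:
    """Hypothetical stand-in for a signature (NOT the sourmash class)."""
    name: str
    filename: str
    ksize: int
    hashes: tuple  # hash values of the sketch


def fingerprint(sig: Sig) -> bytes:
    """Digest covering both metadata and hashes (assumed definition of 'identical')."""
    payload = repr((sig.name, sig.filename, sig.ksize, tuple(sorted(sig.hashes))))
    return hashlib.md5(payload.encode("utf-8")).digest()


def save_unique(sigs):
    """Return signatures in input order, dropping exact (hashes + metadata) duplicates."""
    seen = set()  # one 16-byte digest per kept signature, plus Python object overhead
    kept = []
    for sig in sigs:
        fp = fingerprint(sig)
        if fp in seen:
            continue  # exact duplicate: same hashes and same metadata
        seen.add(fp)
        kept.append(sig)
    return kept


a = Sig("genome A", "a.fa", 31, (1, 2, 3))
a_again = Sig("genome A", "a.fa", 31, (1, 2, 3))   # identical in every field
renamed = Sig("genome B", "b.fa", 31, (1, 2, 3))   # same hashes, different metadata
unique = save_unique([a, a_again, renamed])
```

In this sketch `a_again` is dropped, while `renamed` is kept because only its hashes match. Signatures that share hashes but differ in metadata are a separate question from the exact-duplicate case.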