Pairwise producing empty csv #368
Comments
well, that is very weird - I have some ideas about identical sketches maybe causing problems with the way we load them - but I am traveling and can't debug in detail at the moment. One question for you - have you tried using potentially related issues / comments -
thank you for reporting this!
|
Adding
|
If I add an extra hash to "mins" in both sample_1 and sample_3, then I get a comparison between sample_1 and sample_3 in the output. However, it has a Jaccard of 1.0 even though the threshold is the default of 0.01?
New csv
New signature
|
I took a look at this just now - based on the I did a conversion and it looks like you are using a scaled of 1, right? Try specifying (I'll debug this more myself when I have a moment ;))
|
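A rough sketch of why the scaled value could matter here. This is a guess under stated assumptions, not a confirmed diagnosis: it assumes that downsampling to a scaled value keeps only hashes at or below roughly 2**64 / scaled, and it uses 1000 purely as a hypothetical target value. The helper names are made up for illustration and are not the sourmash API; the hash values are copied from example.sig in this issue.

```python
# Illustrative sketch only: plain Python, not the sourmash API.
MAX_HASH_64 = 2**64 - 1  # equals "max_hash" in example.sig, i.e. scaled = 1

# "mins" copied from example.sig in this issue
SKETCHES = {
    "sample_1": {4635994617403463936, 14326303558305821343},
    "sample_5": {749074125177198013, 4736707972582783194, 14326303558305821343},
    "sample_2": {4635994617403463936, 14326303558305821343},
    "sample_3": {428577040998145274, 749074125177198013, 4736707972582783194},
}


def max_hash_for_scaled(scaled):
    """Approximate hash cutoff for a scaled value (assumed to be 2**64 / scaled)."""
    return MAX_HASH_64 // scaled


def downsample(hashes, scaled):
    """Keep only the hashes at or below the cutoff for `scaled`."""
    cutoff = max_hash_for_scaled(scaled)
    return {h for h in hashes if h <= cutoff}


for name, hashes in SKETCHES.items():
    kept_at_1 = downsample(hashes, 1)
    kept_at_1000 = downsample(hashes, 1000)  # 1000 is a hypothetical value
    print(name, "original:", len(hashes),
          "scaled=1:", len(kept_at_1),
          "scaled=1000:", len(kept_at_1000))
```

Under these assumptions, every example sketch keeps all of its hashes at scaled=1 and none at scaled=1000, and empty sketches would leave nothing to compare. By the same logic, if one added hash were the only value small enough to survive, two otherwise different sketches would look identical.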
Perfect, thanks! That works. I now see
|
thanks for this - we really do need to document the distance vs similarity thing better! (ref sourmash-bio/sourmash#2406)
please leave this issue open for a bit - I'm going to create some new issues to fix and document the various things you've highlighted for us! thanks again!
|
I am using sourmash to compare some custom kmer sets. I've used sourmash compare successfully, but in scaling up, I hit memory issues. To fix this, I tried sourmash scripts pairwise, which is much faster and less memory intensive in general, but I am getting some odd results in testing.
I have an example signature-set with 5 signatures, two identical. When I run sourmash compare, I get expected results, a Jaccard distance of 0 between the identical signatures and other comparisons as expected. When I run sourmash scripts pairwise, I get an empty csv. Any ideas?
example.sig:
[{"class":"sourmash_signature","email":"","hash_function":"0.murmur64","filename":null,"name":"sample_1","license":"CC0","signatures":[{"num":0,"ksize":60,"seed":42,"max_hash":18446744073709551615,"mins":[4635994617403463936,14326303558305821343],"md5sum":"2ce198fcc7ea357fb5069a0e9448516c","molecule":"DNA"}],"version":0.4},{"class":"sourmash_signature","email":"","hash_function":"0.murmur64","filename":null,"name":"sample_5","license":"CC0","signatures":[{"num":0,"ksize":60,"seed":42,"max_hash":18446744073709551615,"mins":[749074125177198013,4736707972582783194,14326303558305821343],"md5sum":"d915856b77f4c6b57b612093d402decc","molecule":"DNA"}],"version":0.4},{"class":"sourmash_signature","email":"","hash_function":"0.murmur64","filename":null,"name":"sample_2","license":"CC0","signatures":[{"num":0,"ksize":60,"seed":42,"max_hash":18446744073709551615,"mins":[4635994617403463936,14326303558305821343],"md5sum":"2ce198fcc7ea357fb5069a0e9448516c","molecule":"DNA"}],"version":0.4},{"class":"sourmash_signature","email":"","hash_function":"0.murmur64","filename":null,"name":"sample_3","license":"CC0","signatures":[{"num":0,"ksize":60,"seed":42,"max_hash":18446744073709551615,"mins":[428577040998145274,749074125177198013,4736707972582783194],"md5sum":"f0a7bceac43003ed9287bec5b5003c22","molecule":"DNA"}],"version":0.4}]
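For reference, the Jaccard values for the signatures above can be checked by hand. The sketch below is plain Python rather than sourmash itself: it copies the "mins" lists from example.sig, computes Jaccard similarity as intersection size over union size, and reports distance as 1 minus similarity. The function names are made up for illustration.

```python
from itertools import combinations

# "mins" copied from example.sig above
SKETCHES = {
    "sample_1": {4635994617403463936, 14326303558305821343},
    "sample_5": {749074125177198013, 4736707972582783194, 14326303558305821343},
    "sample_2": {4635994617403463936, 14326303558305821343},
    "sample_3": {428577040998145274, 749074125177198013, 4736707972582783194},
}


def jaccard(a, b):
    """Jaccard similarity: |A & B| / |A | B| (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise(sketches):
    """Map each unordered pair of names to (similarity, distance)."""
    results = {}
    for (name_a, a), (name_b, b) in combinations(sketches.items(), 2):
        similarity = jaccard(a, b)
        results[(name_a, name_b)] = (similarity, 1.0 - similarity)
    return results


RESULTS = pairwise(SKETCHES)
for (name_a, name_b), (similarity, distance) in RESULTS.items():
    print(f"{name_a}\t{name_b}\tsimilarity={similarity:.2f}\tdistance={distance:.2f}")
```

The identical pair (sample_1, sample_2) comes out with similarity 1.0 and distance 0.0, which matches the Jaccard distance of 0 described above. A similarity of 1.0 and a distance of 0 describe the same pair, so it helps to know which of the two a given tool reports.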
Sourmash compare command
Result
Sourmash pairwise command
Log, produces empty file