-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Index.gather is not doing gather? #1263
Conversation
Codecov Report
@@ Coverage Diff @@
## latest #1263 +/- ##
==========================================
+ Coverage 88.71% 93.95% +5.24%
==========================================
Files 125 98 -27
Lines 18238 14615 -3623
Branches 1434 1434
==========================================
- Hits 16180 13732 -2448
+ Misses 1812 637 -1175
Partials 246 246
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
matches = lidx.gather(ss47) | ||
assert len(matches) == 2 | ||
assert len(matches) == 1 | ||
assert matches[0][0] == 1.0 | ||
assert matches[0][1] == ss47 | ||
assert round(matches[1][0], 2) == 0.49 | ||
assert matches[1][1] == ss63 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is a good example of the 'weird' behavior: if doing gather with ss47
, it should match ss47
completely, and nothing will be left to match other sigs. But the test is asserting that there is a second match (ss63
), at 0.49 containment. That's... not gather, that's a regular search using containment.
I need to look at this more closely (and I thought there was an issue on this, or maybe just a long-winded discussion in some PR...) I do vaguely recall running into a mental roadblock around the behavior of Which is all to say... yeah, I agree it's probably messed up, and it'd be great to revisit, and I'll do so :) |
what about a function ref #1310 for the proposed CLI functionality. (this may be me just appropriating your idea under a different name, @luizirber. apologies if so - on a bit of a holiday & wanted to write this down while I had the thought at the tip of my brain, so am not taking time to remind myself of the whole conversation :) |
See #1489 for an alternate approach to resolving this conundrum. |
this can probably be closed, yah? viz #1370 |
yup. |
The default implementation of
Index.gather
returns all signatures with a containment above a threshold, which... is not what gather is supposed to do =]SBT and LCA reimplement the method, but
LinearIndex
reuses it and generate the wrong results.There is also a discussion about the proper intended behavior of
Index.gather
. Sometimes it returns the best match (only), sometimes it returns a list of matches. I think we should have a separate method for the first use case (best_match
?), and another with a simplegather
based on linear scans over the signatures.cc @ctb
Checklist
make test
Did it pass the tests?make coverage
Is the new code covered?without a major version increment. Changing file formats also requires a
major version number increment.
changes were made?