-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MRG: add utility functions for zip reading; use in index, multisearch #105
Conversation
@ctb this is ready for review |
src/utils.rs
Outdated
sketchlist.push(sm); | ||
} else { | ||
// track number of paths that have no matching sigs | ||
skipped_paths.fetch_add(1, atomic::Ordering::SeqCst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need atomic types? This loop doesn't look like it's parallelized, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, not parallelized.
There is a parallel zip library we could use if we want, but probably not needed except when reading from genbank-scale zips.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed atomic in 1ea65be
I like overall, but want to dig in to code a bit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! good code simplification too 👍
Extracted from #99, because all the changes were getting unruly.
This should be easier to review - contains the functions & modifications to use them in
index
andmultisearch
, with tests.New in
utils.py
:Renamed:
read_signatures_from_zip
-->load_sketches_from_zip
load_sketch_fromfile
-->load_fasta_fromfile
Note: In the case of reading a pathlist from a zip file, this means we extract the sig files and store them in a temp directory which will be deleted when it goes out of scope. This adds a little overhead, but I think it could worth it for the simplicity of being able to specify a zip database. Open to removing, though.