[MRG] refactor & clean up database loading around MultiIndex class #1406

Merged (56 commits) on Apr 2, 2021


@ctb ctb commented Mar 24, 2021

This PR builds on #1374 to make database loading logic more generic. Specifically,

  • move load_from_directory and load_from_file_list to MultiIndex class
  • refactor sourmash_args._load_database accordingly

This results in substantial cleanup to sourmash_args, which is nice!
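
A minimal sketch of the constructor pattern described above, for readers unfamiliar with it. `ToyMultiIndex` is a hypothetical stand-in and not sourmash's `MultiIndex`; only the two method names come from this PR, and their signatures and behavior here are assumptions made for illustration.

```python
import os


class ToyMultiIndex:
    """Illustrative stand-in only; NOT the sourmash MultiIndex class."""

    def __init__(self, sources):
        # In this toy version a "source" is just a file path.
        self.sources = list(sources)

    @classmethod
    def load_from_directory(cls, dirname):
        # Assumed behavior: collect every file under dirname, recursively.
        paths = []
        for root, _dirs, files in os.walk(dirname):
            for name in files:
                paths.append(os.path.join(root, name))
        if not paths:
            raise ValueError(f"no files found under '{dirname}'")
        # Sort so the result does not depend on filesystem ordering.
        return cls(sorted(paths))

    @classmethod
    def load_from_file_list(cls, filename):
        # Assumed format: one path per line; blank lines are skipped.
        with open(filename) as fp:
            paths = [line.strip() for line in fp if line.strip()]
        if not paths:
            raise ValueError(f"file list '{filename}' is empty")
        return cls(paths)
```

Putting both loaders on the class as alternate constructors means calling code only has to pick a constructor, rather than carrying its own directory-walking and list-parsing logic.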

This is an intermediate step on the way to plugins, #1368 #1353.

This PR also:

notes

As currently written, this PR changes the exception types raised in two places, which may not be acceptable under semantic versioning.

  • load_file_as_index now raises ValueError instead of OSError. See test_api.py.
  • load_signatures now raises ValueError instead of a general Exception for a parse error.
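
For calling code that needs to work on both sides of this change, one option is to catch both exception types. The sketch below is only an illustration: `load_or_none`, `old_style_loader`, and `new_style_loader` are hypothetical names, and the two loaders are stand-ins that mimic the old and new behavior rather than calling sourmash itself.

```python
def load_or_none(loader, path):
    """Call loader(path) and return its result, or None if loading fails.

    Catching both types covers the old behavior (OSError) and the
    behavior after this PR (ValueError).
    """
    try:
        return loader(path)
    except (ValueError, OSError):
        return None


def old_style_loader(path):
    # Hypothetical stand-in for the pre-PR behavior.
    raise OSError(f"cannot open '{path}'")


def new_style_loader(path):
    # Hypothetical stand-in for the post-PR behavior.
    raise ValueError(f"cannot load '{path}' as an index")
```

Code that previously caught only `OSError` would let the new `ValueError` propagate, which is why the change matters for versioning.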

Fixes #1077
Fixes #810
Fixes #1376
Addresses #1072

TODO:

Checklist

  • Is it mergeable?
  • make test: did it pass the tests?
  • make coverage: is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@ctb ctb changed the base branch from latest to add/multi_index March 24, 2021 15:46

codecov bot commented Mar 24, 2021

Codecov Report

Merging #1406 (6d6eb42) into latest (ed3c809) will increase coverage by 5.23%.
The diff coverage is 94.11%.

@@            Coverage Diff             @@
##           latest    #1406      +/-   ##
==========================================
+ Coverage   89.27%   94.51%   +5.23%     
==========================================
  Files         123       96      -27     
  Lines       18790    15299    -3491     
  Branches     1447     1463      +16     
==========================================
- Hits        16775    14460    -2315     
+ Misses       1782      606    -1176     
  Partials      233      233              
Flag     Coverage Δ
python   94.51% <94.11%> (+0.02%) ⬆️
rust     ?

Flags with carried forward coverage won't be shown.

Impacted Files                          Coverage Δ
src/sourmash/logging.py                 28.70% <28.57%> (-0.01%) ⬇️
src/sourmash/lca/lca_db.py              91.34% <66.66%> (-1.09%) ⬇️
src/sourmash/sbt.py                     83.65% <75.00%> (-0.49%) ⬇️
src/sourmash/sourmash_args.py           95.95% <94.23%> (+3.02%) ⬆️
src/sourmash/index.py                   90.76% <96.87%> (-2.85%) ⬇️
src/sourmash/commands.py                83.33% <100.00%> (+0.69%) ⬆️
src/sourmash/lca/command_summarize.py   80.95% <100.00%> (+0.15%) ⬆️
src/sourmash/signature.py               90.73% <100.00%> (+0.97%) ⬆️
tests/test_api.py                       100.00% <100.00%> (ø)
tests/test_index.py                     100.00% <100.00%> (ø)
... and 30 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ed3c809...6d6eb42.

@ctb ctb changed the title from "[WIP] refactor & clean up database loading around MultiIndex class" to "[MRG] refactor & clean up database loading around MultiIndex class" on Mar 28, 2021

ctb commented Mar 31, 2021

Ready for review and merge @luizirber @bluegenes!

Review thread on src/sourmash/index.py (outdated, resolved)
ctb added 3 commits April 1, 2021 22:23
…ding code (#1420)

* refactor select, add scaled/num/abund
* fix scaled check for LCA database
* add debug_literal
* fix scaled check for SBT
* fix LCA database ksize message & test
* add 'containment' to 'select'
* added 'is_database' flag for nicer UX
* remove overly broad exception catching
* document downsampling foo

ctb commented Apr 2, 2021

@bluegenes sorry I missed that you were reviewing this along with #1420 :). I resolved the issues you raised, let me know if there's more!


ctb commented Apr 2, 2021

🎉
