MRG: upgrade sig overlap and sig subtract to load more than JSON signatures #3153

Merged
merged 10 commits into from
Jun 4, 2024

@ctb ctb commented May 12, 2024

Fix sig overlap and sig subtract to take more than just JSON signatures.

Also adds a function, `sourmash_args.load_one_signature`, that I think should (eventually) replace the now-deprecated `sourmash.signature.load_one_signature`. That replacement will be the topic of a new PR; for now, I think this is a nice quick fix!
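
The idea of a format-agnostic "load exactly one signature" helper can be sketched with the standard library alone. This is not sourmash's implementation: `iter_signatures` and `load_one` are hypothetical names, and the sketch assumes a signature file is either a JSON list of records or a zip archive whose `.sig` members each hold such a list.

```python
import json
import zipfile


def iter_signatures(path):
    """Yield signature records from a JSON file or a zip of JSON files.

    Hypothetical helper: the on-disk layout assumed here is illustrative only.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for name in zf.namelist():
                if name.endswith(".sig"):
                    yield from json.loads(zf.read(name))
    else:
        with open(path, "rt") as fp:
            yield from json.load(fp)


def load_one(path):
    """Return the only record in `path`; raise ValueError otherwise."""
    records = list(iter_signatures(path))
    if len(records) != 1:
        raise ValueError(
            f"expected exactly one signature in {path!r}, found {len(records)}"
        )
    return records[0]
```

The caller never needs to know which container it was handed; the "exactly one" check lives in a single place instead of being repeated in each command.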

Fixes #3136

Related issues:

TODO:

  • test uncovered code
  • do a bit more of a search and digest of related issues to see if there's other low hanging fruit


codecov bot commented May 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.65%. Comparing base (fabe76a) to head (6217360).
Report is 81 commits behind head on latest.

Additional details and impacted files
@@            Coverage Diff             @@
##           latest    #3153      +/-   ##
==========================================
+ Coverage   86.64%   86.65%   +0.01%     
==========================================
  Files         136      136              
  Lines       15807    15821      +14     
  Branches     2713     2713              
==========================================
+ Hits        13696    13710      +14     
  Misses       1801     1801              
  Partials      310      310              
Flag            Coverage Δ
hypothesis-py   25.30% <4.76%> (-0.02%) ⬇️
python          92.33% <100.00%> (+<0.01%) ⬆️
rust            62.05% <ø> (ø)


@ctb ctb force-pushed the upgrade_sig_cmds branch from 2c98573 to 6da2d3a Compare May 12, 2024 16:54
@ctb ctb changed the title WIP: upgrade sig overlap and sig subtract MRG: upgrade sig overlap and sig subtract May 13, 2024

ctb commented May 13, 2024

Ready for review @sourmash-bio/devs

@ctb ctb changed the title MRG: upgrade sig overlap and sig subtract MRG: upgrade sig overlap and sig subtract to load more than JSON signatures May 14, 2024

ctb commented May 23, 2024

@ccbaumler @AnneliektH would either of you be able to look at this and (potentially) approve it? lmk if you don't have review privileges.

@ctb ctb enabled auto-merge (squash) May 23, 2024 13:49

@bluegenes bluegenes left a comment

lgtm!

@ctb ctb disabled auto-merge June 4, 2024 18:03
@ctb ctb merged commit 2542c69 into latest Jun 4, 2024
39 of 40 checks passed
@ctb ctb deleted the upgrade_sig_cmds branch June 4, 2024 18:03
ctb added a commit that referenced this pull request Jun 4, 2024
Note: PR into #3153

Tackles some signature loading and saving cleanup throughout the
codebase. Most changes are in the tests, and this is a significant
cleanup of the test code!

Fixes #1062.

---

Goals:
* deprecate external use of `sourmash.signature` load/save functions,
because they are JSON-specific and inflexible;
* simplify and standardize signature load/save function usage during
tests;
* get rid of deprecation messages during tests.

In brief,

* rename `sourmash.signature.load_signatures` to
`load_signatures_from_json`;
* rename `sourmash.signature.load_one_signature` to
`load_one_signature_from_json`;
* rename `sourmash.signature.save_signatures` to
`save_signatures_to_json`;
* deprecate `sourmash.save_signatures` and `sourmash.load_one_signature`
for 5.0 (joining `load_signatures`, which was already deprecated);
* reduce/eliminate deprecations by transitioning internal test code to
use these three functions directly from `sourmash.signature` instead of
from the top-level sourmash import.
* **bonus**: eliminate zipfile UserWarning around overwriting files,
which causes lots of warnings when running tests.
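
The rename-then-deprecate pattern in the list above can be sketched as follows. This is a minimal standard-library illustration, not the mechanism sourmash uses: `deprecated_alias` is a hypothetical helper, and the body of `load_one_signature_from_json` is a stand-in that works on an in-memory list rather than real signature JSON.

```python
import functools
import warnings


def load_one_signature_from_json(data):
    """Stand-in for the renamed function: return the single record in `data`."""
    (record,) = data  # raises ValueError unless there is exactly one element
    return record


def deprecated_alias(new_func, old_name, removed_in):
    """Build a wrapper that emits a DeprecationWarning, then calls `new_func`."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated and will be removed in {removed_in}; "
            f"use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return new_func(*args, **kwargs)

    return wrapper


# The old name stays importable for external callers until the removal release.
load_one_signature = deprecated_alias(
    load_one_signature_from_json, "load_one_signature", "5.0"
)
```

Internal code and tests that call the new name directly never trigger the warning, which is how the transition removes deprecation noise from the test run.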

---

Done:
- [x] in the `sourmash.signature` submodule, rename `load_signatures` to
`load_signatures_from_json`, `load_one_signature` to
`load_one_signature_from_json`, and `save_signatures` to
`save_signatures_to_json`; make tests pass.
- [x] deprecate `sourmash.load_one_signature` and
`sourmash.save_signatures`.
- [x] catch zipfile UserWarning for duplicate filenames in
ZipStorage.save
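
The last item can be illustrated with a small sketch. Python's `zipfile` emits a `UserWarning` whose message starts with "Duplicate name" when the same member name is written twice; the function below silences only that warning around the write. `save_quietly` is a hypothetical name, and the real `ZipStorage.save` may be structured differently.

```python
import io
import warnings
import zipfile


def save_quietly(zf, name, data):
    """Write `data` under `name`, ignoring zipfile's duplicate-name warning."""
    with warnings.catch_warnings():
        # Filter narrowly: only UserWarnings whose message starts with
        # "Duplicate name" are suppressed, and only inside this block.
        warnings.filterwarnings(
            "ignore", message="Duplicate name", category=UserWarning
        )
        zf.writestr(name, data)


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    save_quietly(zf, "a.sig", "first")
    save_quietly(zf, "a.sig", "second")  # no warning reaches the caller
```

Note that suppressing the warning does not deduplicate anything: both entries are still written to the archive.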

TODO:
- [x] transition internal sourmash code+tests away from deprecated
functions
- [ ] create issue around changing API documentation prior to 5.0.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@ctb ctb mentioned this pull request Jun 10, 2024
ctb added a commit that referenced this pull request Jun 11, 2024
Minor new features:

* add `--set-name` to `sig intersect` and `sig subtract` (#3162)
* upgrade `sig overlap` and `sig subtract` to load more than JSON
signatures (#3153)
* force continue past `tax genome` classification errors (#3100)

Bug fixes:

* fix `remaining_bp` output from sourmash gather (#3195)
* fix RocksDB-based gather & other rust-based infelicities revealed by
plugins (#3193, #3197)
* use correct denominator in f_unique_to_query (#3138)

Cleanup and documentation updates:

* update JOSS for sourmash v4 (#3114, #3203, #3209)
* fix links to taxonomy spreadsheets (#3119)
* fix description of `f_unique_weighted` (#3164)

Developer updates:

* transition internal signature loading functions (#3161)
* allow get/set record.filename (#3121)
* round a number that is losing precision in 15th place in
`test_distance_utpy` (#3126)
* disable ppc64le wheel building (#3127)
* prepare to remove `sourmash compute` for sourmash v5.0 (#3103)
* add rustup target x86_64-apple-darwin (#3148)
* mv `.cargo/config` to `config.toml` (#3147)
* fix clippy warnings about max_value (#3146)
* bump to v4.8.9-dev (#3135)
* update src/core/CHANGELOG.md for sourmash-rs core release r0.14.0
(#3199)

Dependabot updates:

* Bump DeterminateSystems/nix-installer-action from 11 to 12 (#3184)
* Bump DeterminateSystems/magic-nix-cache-action from 6 to 7 (#3185)
* Bump statrs from 0.16.0 to 0.16.1 (#3186)
* Bump serde from 1.0.202 to 1.0.203 (#3175)
* Bump ouroboros from 0.18.3 to 0.18.4 (#3176)
* Bump itertools from 0.12.1 to 0.13.0 (#3166)
* Bump camino from 1.1.6 to 1.1.7 (#3169)
* Bump serde from 1.0.201 to 1.0.202 (#3168)
* Bump thiserror from 1.0.60 to 1.0.61 (#3167)
* Bump pypa/cibuildwheel from 2.18.0 to 2.18.1 (#3165)
* Bump DeterminateSystems/magic-nix-cache-action from 4 to 6 (#3157)
* Bump DeterminateSystems/nix-installer-action from 10 to 11 (#3156)
* Bump pypa/cibuildwheel from 2.17.0 to 2.18.0 (#3155)
* Bump serde_json from 1.0.116 to 1.0.117 (#3159)
* Bump thiserror from 1.0.59 to 1.0.60 (#3158)
* Bump serde from 1.0.200 to 1.0.201 (#3160)
* Bump roaring from 0.10.3 to 0.10.4 (#3142)
* Bump histogram from 0.10.0 to 0.10.1 (#3141)
* Bump getrandom from 0.2.14 to 0.2.15 (#3143)
* Bump num-iter from 0.1.44 to 0.1.45 (#3140)
* Bump jinja2 from 3.1.3 to 3.1.4 (#3145)
* Bump serde from 1.0.199 to 1.0.200 (#3144)
* Bump serde from 1.0.198 to 1.0.199 (#3130)
* Bump conda-incubator/setup-miniconda from 3.0.3 to 3.0.4 (#3131)
* Update pytest requirement from <8.2.0,>=6.2.4 to >=6.2.4,<8.3.0
(#3132)
* Bump myst-parser from 2.0.0 to 3.0.1 (#3133)
* Bump thiserror from 1.0.58 to 1.0.59 (#3123)
* Bump serde_json from 1.0.115 to 1.0.116 (#3124)
* Bump serde from 1.0.197 to 1.0.198 (#3122)
* Update docutils requirement from <0.21,>=0.17.1 to >=0.17.1,<0.22
(#3116)
Successfully merging this pull request may close these issues.

sig overlap and sig subtract should be upgraded to support more than JSON sigs