Refactored modules related to input-output #194

niksirbi · 2024-05-27T14:05:17Z

Description

What is this PR

Bug fix
Addition of a new feature
Other

Why is this PR needed?

The public functions in the load_poses.py module currently assume that users are always loading data from a file (or from a DeepLabCut-style pandas dataframe). However, there are some use-cases where the data are already in Python, in the form of numpy arrays, perhaps imported with custom loaders (this is not hypothetical, a potential user has already asked for it). There is a way to convert such data into a properly-formatted movement dataset, but this way is not easy to find and is not documented.

What does this PR do?

Adds a from_numpy() function that explicitly accepts position (+ optional confidence) data in the form of numpy arrays and returns a movement dataset. Under the hood it calls the ValidPosesDataset validator and the existing _from_valid_data() utility.

The addition of this function enabled me to slightly refactor the load_poses.py module such that from_numpy() is the single point-of-entry into a movement dataset, i.e. every other loading function first reads data into numpy arrays before calling the new function. This was already de facto the case, but it's much more explicit now. Moreover, this refactoring also enabled me to get rid of a redundant validation call for LightningPose data.
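To illustrate the single point-of-entry idea, here is a minimal, self-contained sketch. It is not movement's actual code: the names `from_numpy_sketch` and `from_rows_sketch`, the plain dict used in place of a dataset, the assumed array layout, and the NaN fill for missing confidence are all assumptions made for illustration.

```python
import numpy as np


def from_numpy_sketch(position_array, confidence_array=None):
    """Hypothetical stand-in for a single validating entry point.

    Assumes position has shape (n_frames, n_individuals, n_keypoints, n_space);
    this layout is an assumption of the sketch.
    """
    position = np.asarray(position_array, dtype=float)
    if position.ndim != 4:
        raise ValueError(f"expected a 4D position array, got {position.ndim}D")
    if confidence_array is None:
        # Confidence is optional: fill with NaN when it is not provided.
        confidence = np.full(position.shape[:-1], np.nan)
    else:
        confidence = np.asarray(confidence_array, dtype=float)
        if confidence.shape != position.shape[:-1]:
            raise ValueError("confidence must match position minus the space axis")
    # A plain dict stands in for the real dataset object.
    return {"position": position, "confidence": confidence}


def from_rows_sketch(rows):
    """Hypothetical 'file loader': parse into arrays, then delegate."""
    # rows: one (x, y, confidence) triple per frame, single individual and keypoint
    data = np.asarray(rows, dtype=float)
    position = data[:, :2].reshape(-1, 1, 1, 2)
    confidence = data[:, 2].reshape(-1, 1, 1)
    return from_numpy_sketch(position, confidence)


ds_a = from_numpy_sketch(np.zeros((5, 1, 3, 2)))
ds_b = from_rows_sketch([(0.0, 1.0, 0.9), (2.0, 3.0, 0.8)])
```

Because every loader in the sketch ends with a call to the same function, validation lives in exactly one place, which is the property that makes a second validation call redundant.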

Here's the schematic of the updated load_poses.py module. The previous version can be found here.

How has this PR been tested?

I added a simple unit test for the new function. The underlying ValidPosesDataset is already extensively tested, and so are all file loaders.

Is this a breaking change?

No.

Does this PR require an update to the documentation?

The API index has been updated accordingly. The new function's docstring also includes example usage.

Checklist:

The code has been tested locally
Tests have been added to cover all new functionality
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

EDIT 2024-05-31

Following @sfmig's review, the scope of this PR expanded, resulting in a more thorough refactoring of IO-related modules: load_poses.py, save_poses.py, and validators.py. This mostly involved renaming functions and editing docstrings, to make the whole thing more logical and internally consistent.

These are the names of the updated public functions:

Note that we renamed from_dlc_df to from_dlc_style_df (and likewise for save), because LightningPose also uses "DeepLabCut-style" dataframes.

We also decided to rename private functions such that it's clear what is being converted to what, e.g.: _ds_from_sleap_labels_file() instead of _load_from_sleap_labels_file(). There is one remaining inconsistency, namely the fact that public functions start with from_ while private functions start with _ds_from_ or _df_from_. That's because the way public functions are actually invoked is the following:

from movement.io import load_poses

ds = load_poses.from_file("file/path")

and load_poses.ds_from_file would be redundant. Perhaps there is scope for renaming load_poses to load_dataset (and save_poses to save_dataset accordingly), such that the syntax would be movement.io.load_dataset.from_file(). That could make more sense now, because "poses" is a bit ambiguous, while we've fully defined what a "dataset" is. I'll open an issue about that.
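The naming convention can be sketched as follows. This is a simplified illustration, not the module's real implementation: the helper bodies, the dict return values, the `source_software` argument, and the use of `SimpleNamespace` in place of an actual module are all assumptions.

```python
from types import SimpleNamespace


def _ds_from_sleap_file(file_path):
    # Private helper: the `_ds_from_` prefix states that a dataset is returned.
    return {"source_software": "SLEAP", "file_path": file_path}


def _ds_from_lp_or_dlc_file(file_path, source_software):
    # One helper serves both tools, since both use DeepLabCut-style files.
    return {"source_software": source_software, "file_path": file_path}


def from_file(file_path, source_software):
    # Public function: no `ds_` prefix, the module name supplies the context.
    if source_software == "SLEAP":
        return _ds_from_sleap_file(file_path)
    if source_software in ("DeepLabCut", "LightningPose"):
        return _ds_from_lp_or_dlc_file(file_path, source_software)
    raise ValueError(f"Unsupported source software: {source_software}")


# A namespace stands in for the module so the call reads as it would in use.
load_poses = SimpleNamespace(from_file=from_file)
ds = load_poses.from_file("file/path", source_software="SLEAP")
```

Read aloud, `load_poses.from_file(...)` already says what is loaded and from where, whereas the private helpers are called without the module prefix and so carry the `_ds_` or `_df_` hint in their own names.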

Here's the updated diagram for movement's I/O functionalities.

codecov · 2024-05-27T14:07:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.68%. Comparing base (426003c) to head (f99d7d8).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #194      +/-   ##
==========================================
- Coverage   99.68%   99.68%   -0.01%     
==========================================
  Files          11       11              
  Lines         638      634       -4     
==========================================
- Hits          636      632       -4     
  Misses          2        2

sfmig

Nice refactoring ✨ !
Just some suggestions on function names and docstrings. Maybe it's not a lot to add to this PR but also happy to have it separately.

movement/io/load_poses.py

tests/test_unit/test_load_poses.py

niksirbi · 2024-05-30T09:32:03Z

Thanks for the review @sfmig, I like all your suggestions and will implement them here. The scope of this PR will increase to "refactoring load_poses module" and I will edit the PR title and description accordingly.

sonarqubecloud · 2024-05-31T14:50:03Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

niksirbi · 2024-05-31T15:11:37Z

@sfmig I've updated this PR's description and title. I think there is no need to go line-by-line through the diff again, just let me know whether you agree with the changes as I've described them in the updated PR description.

sfmig · 2024-06-03T14:26:14Z

Looks fantastic @niksirbi 🚀
And also great recap of the renaming bits, thanks!
Will merge now

niksirbi requested a review from sfmig May 28, 2024 07:09

niksirbi marked this pull request as ready for review May 28, 2024 07:09

sfmig approved these changes May 29, 2024

niksirbi added 13 commits May 31, 2024 14:56

added from_numpy() function to the load_poses module

c9b3746

unit test new function

1eefbf0

use from_numpy() function in other loaders as well

134512c

add Examples section to docstring and render in API index

553faba

add Examples section to docstring and render in API index

6b066da

confidence array is optional

c782ab5

None is the default for confidence

8d8b901

rename private functions

d319823

renamef from_dlc_df to from_dlc_style_df

f8c44d1

harmonise docstrings in load_poses

c1f5e7d

harmonised function names and docstrings in save_poses

e36c50f

harmonised docstrings in validators

17ec603

split Input/Output section of API index into modules

332864b

niksirbi force-pushed the load-from-numpy branch from 3188df4 to 332864b Compare May 31, 2024 13:57

renamed _from_lp_or_dlc_file to _ds_from_lp_or_dlc_file

f99d7d8

niksirbi changed the title ~~added from_numpy() function to the load_poses module~~ Refactored modules related to input-output May 31, 2024

niksirbi mentioned this pull request May 31, 2024

Consider renaming load_poses and save_poses to load_dataset and save_dataset #199

Open

sfmig added this pull request to the merge queue Jun 3, 2024

Merged via the queue into main with commit 8e20498 Jun 3, 2024
27 checks passed

niksirbi mentioned this pull request Jun 4, 2024

Add from_mmp_file function #198

Draft

7 tasks

niksirbi mentioned this pull request Jun 14, 2024

Make _from_valid_data public #223

Closed

niksirbi deleted the load-from-numpy branch June 15, 2024 09:33
