Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement stats calculations for new calculated_stats_ container #468

Closed
wants to merge 31 commits into from

Conversation

TinyMarsh
Copy link
Contributor

@TinyMarsh TinyMarsh commented Jul 3, 2024

This PR adds much of the functionality to calculate the stats specified in channels_. A lot of the logic for this is copied from the existing code, but modified to use the calculated_stats_ vector in place of the existing series object. As a reminder, this is necessary because the calculated stats are being stored in a flat vector (calculated_stats_) which represents a multidimensional array of factors that the user wants to group by, as opposed to the series object. A result of this is the introduction of the calculate_index method, and the get_channel_index method - both necessary to find the correct index of the factor bin, and "channel" stat in the calculated_stats_ vector, respectively.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly there, I think. I do think it would be worth adding a test for the calculate_index function though and I've made a few other small suggestions.

The Windows CI runner is failing because of some type narrowing warning again 😞

src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Show resolved Hide resolved
Comment on lines 48 to 49
std::vector<std::string> channels_;
std::unordered_map<std::string, int> channel_index_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not be completely understanding what these data structures do... But am I right in thinking channel_index_ contains indexes into channels_? If so, could you just make channels_ a std::map<std::string, std::string> (I'm guessing it needs to be ordered...) and drop channels_index_?

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Show resolved Hide resolved
src/HealthGPS/analysis_module.cpp Outdated Show resolved Hide resolved
@TinyMarsh TinyMarsh requested a review from alexdewar July 17, 2024 23:34
Copy link
Contributor

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I'd probs have a go at #470 before merging this, just so we can be confident that the indexing works, but it's up to you. I don't think it needs lots of test cases -- just a few basic ones.

@jamesturner246 jamesturner246 self-assigned this Sep 26, 2024
@jamesturner246 jamesturner246 removed their assignment Oct 4, 2024
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

}
}

auto ses_module = build_ses_noise_module(repository, config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no matching function for call to 'build_ses_noise_module' [clang-diagnostic-error]

        auto ses_module = build_ses_noise_module(repository, config);
                          ^
Additional context

src/HealthGPS/ses_noise_module.h:44: candidate function not viable: no known conversion from 'std::shared_ptrhgps::ModelInput' to 'const ModelInput' for 2nd argument

std::unique_ptr<SESNoiseModule> build_ses_noise_module(Repository &repository,
                                ^

Copy link

codecov bot commented Nov 15, 2024

Codecov Report

Attention: Patch coverage is 80.55556% with 28 lines in your changes missing coverage. Please review.

Project coverage is 51.21%. Comparing base (a2f766a) to head (667200c).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
src/HealthGPS/analysis_module.cpp 68.18% 28 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #468      +/-   ##
==========================================
+ Coverage   48.47%   51.21%   +2.73%     
==========================================
  Files         164      165       +1     
  Lines        8284     8370      +86     
  Branches     1065     1070       +5     
==========================================
+ Hits         4016     4287     +271     
+ Misses       4259     4074     -185     
  Partials        9        9              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@TinyMarsh
Copy link
Contributor Author

@jamesturner246 @alexdewar I'm going to close this PR without merging. The reason is that this body of work is at a point where it's not possible to make iterative progress to the new way of calculating population statistics (essentially by this I mean running the publish_result_message method) without breaking what is there currently.

The feedback I have received in this PR has been useful and has been implemented in the extend_analysis branch, which I will keep. Further work on the new stats calculation stuff can be merged into that branch. Then, when it is complete, we can merge the extend_analysis branch into main in one big merge which should implement the new stuff without breaking anything.

I'm aware that our engagement with HealthGPS has essentially ended, but I really think it's worth finishing this since so much work has gone into it and it's quite close to completion now - it shouldn't take a huge amount of effort to get this finished.

@TinyMarsh TinyMarsh closed this Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants