Improve Merge_Models.pl script #431

Open
abensonca opened this issue Jul 10, 2023 · 0 comments
@abensonca
Collaborator

This script currently contains a lot of logic that controls how it combines different datasets. It would be better to have this logic encoded in the output file itself. For example, every dataset could have a mergeable attribute that specifies how it should be merged (e.g. append, sum, average, etc.), and the script would just apply that method. It could then simply walk the group hierarchy of the files and merge as needed. Groups could have a similar attribute to specify what to do with their own attributes.
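To make the idea concrete, here is a minimal sketch of an attribute-driven merge. It is written in Python purely for illustration (the actual script is Perl/PDL), and it models each file as nested dictionaries rather than reading real output files. The method names, the dictionary layout, and the helpers `dataset`, `group`, and `merge_nodes` are all assumptions made for this example; group attributes are simply copied from the first file, which is a placeholder for whatever group-level rule is eventually chosen.

```python
# Hypothetical merge methods, keyed by the value of a "mergeable" attribute.
def _append(parts):
    # Concatenate the datasets from all files, in file order.
    return [x for part in parts for x in part]

def _sum(parts):
    # Element-wise sum across files (datasets assumed to have equal length).
    return [sum(col) for col in zip(*parts)]

def _average(parts):
    # Element-wise mean across files (datasets assumed to have equal length).
    return [sum(col) / len(col) for col in zip(*parts)]

MERGE_METHODS = {"append": _append, "sum": _sum, "average": _average}

def merge_nodes(nodes):
    """Merge corresponding nodes (groups or datasets) taken from several files."""
    first = nodes[0]
    if "children" in first:
        # Group: recurse into each child, using the first file's layout.
        return {
            "attrs": dict(first["attrs"]),  # placeholder rule: copy from first file
            "children": {
                name: merge_nodes([n["children"][name] for n in nodes])
                for name in first["children"]
            },
        }
    # Dataset: look up the method named by its own attribute and apply it.
    method = first["attrs"]["mergeable"]
    return {
        "attrs": dict(first["attrs"]),
        "data": MERGE_METHODS[method]([n["data"] for n in nodes]),
    }

# Small helpers to build the in-memory stand-ins for files (illustrative only).
def dataset(method, data):
    return {"attrs": {"mergeable": method}, "data": data}

def group(**children):
    return {"attrs": {}, "children": children}

file_a = group(Outputs=group(mass=dataset("append", [1.0, 2.0]),
                             count=dataset("sum", [1, 2])))
file_b = group(Outputs=group(mass=dataset("append", [3.0]),
                             count=dataset("sum", [10, 20])))

merged = merge_nodes([file_a, file_b])
outputs = merged["children"]["Outputs"]["children"]
print(outputs["mass"]["data"])   # [1.0, 2.0, 3.0]
print(outputs["count"]["data"])  # [11, 22]
```

With this structure the merge code needs no knowledge of individual dataset names; supporting a new kind of merge would only mean adding one entry to the method table.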

Also, the script is slow on large files. This may be due to the slowness of append (or glue) in PDL: if each call copies the data accumulated so far, the total cost grows roughly quadratically with the number of files. It might be better to read the datasets from all files first (or at least get their sizes), then create a merged dataset of the full size and just fill in the entries.
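The two-pass approach could look something like the sketch below. Again this is Python for illustration only, with plain lists standing in for the datasets read from each file; the function name `merge_preallocated` is hypothetical. Python lists do not suffer from the same append cost, so this shows the access pattern rather than a benchmark. In PDL the equivalent would presumably be to allocate the merged array once and assign into slices of it.

```python
def merge_preallocated(chunks):
    """Concatenate chunks by allocating the result once and filling it in."""
    # First pass: only the sizes are needed to know the final length.
    sizes = [len(chunk) for chunk in chunks]
    merged = [0.0] * sum(sizes)

    # Second pass: copy each chunk into its slot, so every element is
    # written exactly once and the merged array is never reallocated.
    offset = 0
    for chunk, size in zip(chunks, sizes):
        merged[offset:offset + size] = chunk
        offset += size
    return merged

print(merge_preallocated([[1.0, 2.0], [3.0], [], [4.0, 5.0]]))
# [1.0, 2.0, 3.0, 4.0, 5.0]
```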
