chunk key parsing speedup #1

Conversation

@agoodm agoodm commented May 14, 2024

Hey @ayushnag ,

I took a quick look at your code and made a few small optimizations. This should give you a 2-3x speedup.

@ayushnag (Owner) commented:

Thanks @agoodm! I actually just replaced the ast.literal_eval() segment with np.fromstring(chunk_tag.attrib["chunkPositionInArray"][1:-1], dtype=int, sep=','), which has also shown performance improvements. I am now getting similar performance numbers to your method, but I have only tested on files with <= 300 chunks per variable. I will keep this in mind when testing with more files/chunks.
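For context, a minimal sketch of the change described above, using a hypothetical attribute value (the surrounding DMR++ parsing code is not shown in this thread):

```python
import ast
import numpy as np

# Hypothetical value of chunk_tag.attrib["chunkPositionInArray"] (not from the PR)
chunk_position = "[0,512,1024]"

# Previous approach: evaluate the bracketed list literal
offsets_list = ast.literal_eval(chunk_position)  # -> [0, 512, 1024]

# Replacement described above: strip the brackets and let NumPy parse
# the comma-separated integers directly into an integer array
offsets_arr = np.fromstring(chunk_position[1:-1], dtype=int, sep=",")  # -> array([0, 512, 1024])
```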

@ayushnag (Owner) commented:

Actually, there is a ~10 ms improvement using this instead of the np method (testing on just one file), which might scale. However, as Tom mentioned here, we can have less memory pressure with NumPy 2.0. I will go ahead and merge this for now, and we can revisit later.
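As a rough illustration of the kind of micro-benchmark being compared here (the exact code under test is not shown in this thread), a plain-Python parse of the same attribute can be timed against the NumPy version; the sample string and timing harness below are assumptions, not taken from the PR:

```python
import timeit
import numpy as np

# Hypothetical chunk position attribute value (not from the PR itself)
chunk_position = "[0,512,1024]"

def parse_numpy(s: str) -> np.ndarray:
    # NumPy-based parse discussed above
    return np.fromstring(s[1:-1], dtype=int, sep=",")

def parse_python(s: str) -> tuple:
    # Plain-Python alternative: split the comma-separated digits and convert
    return tuple(int(x) for x in s[1:-1].split(","))

# Rough timing comparison; for short strings like this, the pure-Python path
# often avoids NumPy call overhead and can come out ahead
print(timeit.timeit(lambda: parse_numpy(chunk_position), number=10_000))
print(timeit.timeit(lambda: parse_python(chunk_position), number=10_000))
```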

@ayushnag merged commit fc8b0d8 into ayushnag:dmr-adapter on May 14, 2024