Refactor DataFrameNormalizer to improve performance #1964

Open
wants to merge 6 commits into
base: master
from

Conversation

G-D-Petrov
Collaborator

Reference Issues/PRs

Fixes #1963

What does this implement or fix?

This fix reduces the number of calls to make_block, which improves the performance of the post-processing step when multiple columns of the same type sit next to each other.

Note: there is no improvement when adjacent columns are of different types.
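
To illustrate the idea, here is a minimal sketch of grouping neighbouring columns by type so that one block is built per run rather than one per column. It is not the code from this PR: the column tuples and the group_adjacent_by_dtype helper are hypothetical, and make_block itself is not called.

```python
from itertools import groupby

# Hypothetical column descriptions as (name, dtype name) pairs.
# These are illustrative only, not ArcticDB's real structures.
columns = [
    ("a", "float64"),
    ("b", "float64"),
    ("c", "float64"),
    ("d", "int64"),
    ("e", "float64"),
]


def group_adjacent_by_dtype(cols):
    """Return one (dtype, [names]) run per stretch of neighbouring same-dtype columns."""
    return [
        (dtype, [name for name, _ in run])
        for dtype, run in groupby(cols, key=lambda col: col[1])
    ]


runs = group_adjacent_by_dtype(columns)

# One block would be built per run instead of one per column.
print(f"{len(columns)} columns -> {len(runs)} blocks")
for dtype, names in runs:
    print(dtype, names)
```

Because only adjacent columns are merged, column "e" ends up in its own run even though it shares a type with "a" to "c". When every column differs in type from its neighbour, the number of runs equals the number of columns, which matches the note above.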

Any other comments?

Before the fix, the code from the repro took:

8050444 function calls (7700439 primitive calls) in 4.323 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    2.822    2.822 _store.py:1831(_post_process_dataframe)

After the fix, it took:

1679935 function calls (1503062 primitive calls) in 2.043 seconds
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    0.594    0.594 _store.py:1831(_post_process_dataframe)
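
For a quick comparison, the snippet below computes the before/after ratios from the two profiler summaries above. The dictionary keys are names chosen for this sketch; the numbers are copied from the output as posted.

```python
# Figures copied from the two profiler summaries above.
# "post_process_s" is the cumtime reported for _post_process_dataframe.
before = {"calls": 8050444, "total_s": 4.323, "post_process_s": 2.822}
after = {"calls": 1679935, "total_s": 2.043, "post_process_s": 0.594}

# Ratio of before to after for each metric (higher means a bigger reduction).
ratios = {key: before[key] / after[key] for key in before}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.2f}x")
```

On these figures, _post_process_dataframe is about 4.75 times faster and the whole repro about 2.1 times faster.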

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

Development

Successfully merging this pull request may close these issues.

Arcticdb reads can be slow when reading many columns
2 participants