Improve the kernels (take.rs, take_chunks.rs, take_compact.rs, concat.rs, filter.rs) and reduce memory usage:
Use low-level operations as much as possible.
Before building a StringColumn, we first scan iter and calculate the space it requires. This reduces the memory usage of StringColumn and avoids the resize and grow operations of Vec.
When the output_schema of a hash join includes a StringColumn, we add string_items_buf to avoid frequent memory allocation in the kernels.
ctx.get_function_context() and ctx.get_settings() have some overhead, so we call them as few times as possible when building pipelines.
For concat, merging two DataBlocks previously required num_rows Vec pushes; now it needs only a single copy_nonoverlapping.
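The pre-sizing and bulk-copy ideas above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SimpleStringColumn`, `from_rows`, and `concat` are hypothetical names, and the layout (one contiguous byte buffer plus row offsets) is an assumption made for the example.

```rust
use std::ptr;

/// Hypothetical, simplified stand-in for a string column (NOT Databend's
/// actual StringColumn): all row bytes are stored contiguously in `data`,
/// and `offsets[i]..offsets[i + 1]` is the byte range of row `i`.
#[derive(Debug, PartialEq)]
struct SimpleStringColumn {
    data: Vec<u8>,
    offsets: Vec<u64>, // always num_rows + 1 entries, offsets[0] == 0
}

impl SimpleStringColumn {
    /// Two-pass build: the first pass only sums the byte lengths, so the
    /// buffers are allocated once and never resized during the second pass.
    fn from_rows(rows: &[&[u8]]) -> Self {
        let total_bytes: usize = rows.iter().map(|r| r.len()).sum();
        let mut data = Vec::with_capacity(total_bytes);
        let mut offsets = Vec::with_capacity(rows.len() + 1);
        offsets.push(0);
        for row in rows {
            data.extend_from_slice(row);
            offsets.push(data.len() as u64);
        }
        Self { data, offsets }
    }

    /// Concatenate two columns with one bulk copy per input instead of
    /// pushing row by row.
    fn concat(a: &Self, b: &Self) -> Self {
        let total_bytes = a.data.len() + b.data.len();
        let mut data: Vec<u8> = Vec::with_capacity(total_bytes);
        // SAFETY: `data` is a fresh allocation with capacity `total_bytes`,
        // so it cannot overlap `a.data` or `b.data`; both copies stay within
        // that capacity, and `set_len` runs only after every byte is written.
        unsafe {
            let dst = data.as_mut_ptr();
            ptr::copy_nonoverlapping(a.data.as_ptr(), dst, a.data.len());
            ptr::copy_nonoverlapping(b.data.as_ptr(), dst.add(a.data.len()), b.data.len());
            data.set_len(total_bytes);
        }
        // Offsets of `b` are shifted by the number of bytes in `a`.
        let base = a.data.len() as u64;
        let mut offsets = Vec::with_capacity(a.offsets.len() + b.offsets.len() - 1);
        offsets.extend_from_slice(&a.offsets);
        offsets.extend(b.offsets[1..].iter().map(|o| o + base));
        Self { data, offsets }
    }

    /// Number of rows in the column.
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Bytes of row `i`.
    fn row(&self, i: usize) -> &[u8] {
        &self.data[self.offsets[i] as usize..self.offsets[i + 1] as usize]
    }
}
```

The trade-off in this sketch is an extra read-only pass over the input in exchange for a single allocation; whether that pays off in practice depends on row sizes and is what the benchmark results are meant to show.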
Labels: ci-benchmark (Benchmark: run all test), ci-cloud (Build docker image for cloud test), pr-refactor (this PR changes the code base without new features or bugfix)
I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/
Summary
Summary about this PR
The ci-benchmark TPC-H standalone results: