feat: implement StringColumn using StringViewArray #16610
src/query/storages/stage/src/read/row_based/formats/csv/block_builder.rs
need to finish the datatype matches.
Rerun the TPCH tests in a large warehouse (with CI and bendsql). This PR is large enough and did not introduce breaking changes; let's merge it and improve later.
* feat: implement StringColumn using StringViewArray
* fix
* convert binaryview between arrow1 and arrow2
* fix
* fix
* fix
* fix
* fix
* fix some issue
* fix view slice bug
* fix view slice bug
* fix
* support native read write
* fix
* fix
* fix tests
* add with_data_type
* add with_data_type
* fix gen_random_uuid commit row
* move record batch to block
* remove unused dep
* fix lint
* fix commit row
* fix commit row
* fix size
* fix size
* add NewBinaryColumnBuilder and NewStringColumnBulder
* fix incorrect serialize_size
* fix incorrect serialize_size
* lint
* lint
* fix tests
* use binary state
* use binary state
* update tests
* update tests
* update tests
* fix native view encoding
* fix
* [ci skip] updata kernel concat for view types
* [ci skip] improve kernels for view types
* [ci skip] only string type use string view type
* [ci skip] only string type use string view type
* fix tests
* [ci skip] fix tests
* [ci skip] fix
* fix
* use NewStringColumnBuilder
* rename NewString -> String
* fmt
* [ci skip] update tests
* optimize take
* add bench
* fix tests
* update
* improve compare
* implement compare using string view prefix
* fix
* fix
* fix
* fix-length
* disable spill
* [ci skip] add put_and_commit
* [ci skip] update
* update test
* lint
* [ci skip] add maybe gc
* fix endiness
* fix endiness
* fix
* update string compare
* update

---------

Co-authored-by: sundy-li <[email protected]>
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Use StringViewArray to replace StringArray in the in-memory column format.
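For readers unfamiliar with view arrays, here is a minimal sketch of the idea, loosely modeled on the Arrow string view layout (a 16-byte view per value: a 4-byte length plus 12 bytes that hold either the whole short string or a prefix, buffer index, and offset). The names `View`, `ViewColumn`, `push`, `get`, `filter`, and `compare` are hypothetical and chosen for illustration only; this is not the `StringColumn` code in this PR, and it assumes a single shared data buffer to stay small.

```rust
use std::cmp::Ordering;
use std::sync::Arc;

/// Strings of at most this many bytes are stored inline in the view.
const MAX_INLINE: usize = 12;

/// Hypothetical 16-byte view: a 4-byte length followed by 12 payload bytes.
#[derive(Clone, Copy, Debug)]
pub struct View {
    len: u32,
    // Short string: the bytes themselves, zero-padded.
    // Long string: 4-byte prefix, 4-byte buffer index, 4-byte offset.
    payload: [u8; 12],
}

/// Illustrative column (an assumption, not Databend's actual type).
#[derive(Default)]
pub struct ViewColumn {
    views: Vec<View>,
    buffer: Arc<Vec<u8>>, // one shared data buffer keeps the sketch small
}

impl ViewColumn {
    pub fn push(&mut self, s: &str) {
        let bytes = s.as_bytes();
        let mut payload = [0u8; 12];
        if bytes.len() <= MAX_INLINE {
            payload[..bytes.len()].copy_from_slice(bytes);
        } else {
            let offset = self.buffer.len() as u32;
            payload[..4].copy_from_slice(&bytes[..4]);
            // payload[4..8] stays zero: buffer index 0 in this sketch.
            payload[8..].copy_from_slice(&offset.to_le_bytes());
            Arc::make_mut(&mut self.buffer).extend_from_slice(bytes);
        }
        self.views.push(View { len: bytes.len() as u32, payload });
    }

    pub fn get(&self, i: usize) -> &str {
        let v = &self.views[i];
        let len = v.len as usize;
        let bytes = if len <= MAX_INLINE {
            &v.payload[..len]
        } else {
            let p = &v.payload;
            let offset = u32::from_le_bytes([p[8], p[9], p[10], p[11]]) as usize;
            &self.buffer[offset..offset + len]
        };
        std::str::from_utf8(bytes).expect("only valid UTF-8 is pushed")
    }

    /// Filtering copies 16-byte views only; the data buffer is shared, not rewritten.
    pub fn filter(&self, keep: &[bool]) -> ViewColumn {
        let views = self
            .views
            .iter()
            .zip(keep)
            .filter(|(_, k)| **k)
            .map(|(v, _)| *v)
            .collect();
        ViewColumn { views, buffer: Arc::clone(&self.buffer) }
    }

    /// Compare the zero-padded 4-byte prefixes first; read the full
    /// strings only when the prefixes are equal.
    pub fn compare(&self, i: usize, j: usize) -> Ordering {
        let (a, b) = (&self.views[i], &self.views[j]);
        match a.payload[..4].cmp(&b.payload[..4]) {
            Ordering::Equal => self.get(i).as_bytes().cmp(self.get(j).as_bytes()),
            other => other,
        }
    }
}
```

In this sketch, `filter` never touches string bytes, which is one plausible reason view-based kernels can be cheaper than offset-based ones: an offsets-plus-data layout has to copy the selected bytes into a new contiguous buffer. The trade-off is that a filtered column can keep a large buffer alive for a few surviving rows.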
Performance of string view in kernels:
filter benchmark scripts:
compact
compact
Performance of comparison should be improved in another PR.
Tests
Type of change
depends on arrow-udf/arrow-udf#78