enhance: binlog primary key turn off dict encoding #34358
@shaoting-huang E2e jenkins job failed, comment |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #34358 +/- ##
==========================================
+ Coverage 80.76% 84.43% +3.67%
==========================================
Files 1161 891 -270
Lines 141015 116725 -24290
==========================================
- Hits 113892 98559 -15333
+ Misses 22750 13801 -8949
+ Partials 4373 4365 -8
/run-cpu-e2e |
/lgtm |
Signed-off-by: shaoting-huang <[email protected]> add ut Signed-off-by: shaoting-huang <[email protected]>
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: czs007, shaoting-huang The full list of commands accepted by this bot can be found here. The pull request process is described here
issue: #34357
Go Parquet uses dictionary encoding by default and falls back to plain encoding once the dictionary grows past the dictionary page size limit. Users can specify a custom fallback encoding by passing
parquet.WithEncoding(ENCODING_METHOD)
in the writer properties. However, Go Parquet still falls back to plain encoding rather than the encoding the user provides. Therefore, this patch only turns off dictionary encoding for the primary key.
In a benchmark with 5 million auto-ID primary keys, the parquet file shrinks from 13.93 MB to 8.36 MB when dictionary encoding is turned off, reducing primary key storage space by about 40%.