perf: improve nexmark q16 scaling up performance #14985
Comments
Some notable phenomena:

1. Mem table spill frequency. 4X spills its mem table more often than 1X. This seems unreasonable because 4X has 4 times as much memory as 1X. The actual memory utilization is about 31GB versus 7.1GB, a ratio even larger than 4.
2. Different compaction patterns
3. Skewness

   4X:

   1X:

4.
Try running the query:
without 2-phase aggregation
with 2-phase aggregation
RW Session Variable:
RW

4X:

1X:

Flink

4X:

1X:

At both 1X and 4X, RW is not as good as Flink. It is possible that increasing

We remark that q16 asks for very little memory, and there are no data block miss ops, i.e. no cache misses. Therefore, it is odd that RW cannot outperform
However, the most recent evaluation for the original q16 seems to be much better.

RW

Flink

However, RW's scaling ratio is still inferior.
Thanks to #15696, we are good enough right now, so let's close this for now. We can re-open it later if needed.
nightly-20240127
all default configuration
4X:
1X:
RW 4X: 145K
RW 1X: 102K
4X/1X Ratio: 1.42
Flink: http://metabase.risingwave-cloud.xyz/question/9549-nexmark-rw-vs-flink-avg-source-throughput-all-testbeds?rw_tag=nightly-20240127&flink_tag=v1.16.0&flink_label=flink-medium-1tm-test-20230104,flink-4x-medium-1tm-test-20240104&flink_metrics=avg-job-throughput-per-second
4X/1X Ratio: 2.36.
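The ratios above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, `K` is read as thousands, and the "efficiency" figure assumes that 4X means four times the resources of 1X, so that perfect scaling would give a ratio of 4. The Flink throughput values are not listed here, so only its ratio is used.

```python
# Throughput figures quoted above for RW on nightly-20240127 ("K" read as thousands).
RW_4X = 145_000
RW_1X = 102_000

# Only the ratio is quoted for Flink, not the underlying throughputs.
FLINK_RATIO = 2.36

# Assumption: "4X" means four times the resources of "1X".
RESOURCE_FACTOR = 4


def scaling_ratio(throughput_4x: float, throughput_1x: float) -> float:
    """Throughput gained when moving from the 1X to the 4X testbed."""
    return throughput_4x / throughput_1x


def scaling_efficiency(ratio: float, resource_factor: int = RESOURCE_FACTOR) -> float:
    """Fraction of ideal linear scaling that was actually achieved."""
    return ratio / resource_factor


rw_ratio = scaling_ratio(RW_4X, RW_1X)

print(f"RW    ratio {rw_ratio:.2f}, efficiency {scaling_efficiency(rw_ratio):.1%}")
print(f"Flink ratio {FLINK_RATIO:.2f}, efficiency {scaling_efficiency(FLINK_RATIO):.1%}")
```

Under that assumption, both systems are well below linear scaling, and the gap between the two ratios is what this issue tracks.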
RW:
One critical choice is whether to enable partial aggregation for q16.

We know that `date_time` is monotonically increasing, and since it is passed through `to_char` at the granularity of `DD`, the value of `to_char(date_time, 'YYYY-MM-DD')` is the same for every input record during a given period of time.

For the `channel` column, we can refer to the data generator:

In essence, with 50% probability `channel` is chosen from 4 hot channel candidates, and with the other 50% probability it is chosen from 10000 non-hot channel candidates.

By default, `RW_FORCE_TWO_PHASE_AGG` and `RW_FORCE_SPLIT_DISTINCT_AGG` are both set to false.

Query plan:
Dist query plan:
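Independently of the plans, the `channel` distribution described above can be simulated to get a feel for how skewed the group keys are. This is a rough sketch, not the actual generator code: the channel labels and function names are made up, and a uniform choice inside each candidate set is an assumption.

```python
import random
from collections import Counter

HOT_CHANNELS = 4           # number of hot candidates, from the description above
NON_HOT_CHANNELS = 10_000  # number of non-hot candidates, from the description above


def sample_channel(rng: random.Random) -> str:
    """Draw one hypothetical channel label with the 50/50 hot/non-hot split."""
    if rng.random() < 0.5:
        # Assumption: uniform choice among the hot candidates.
        return f"hot-{rng.randrange(HOT_CHANNELS)}"
    # Assumption: uniform choice among the non-hot candidates.
    return f"cold-{rng.randrange(NON_HOT_CHANNELS)}"


def skew_summary(num_records: int, seed: int = 0) -> tuple[float, float, int]:
    """Return (share of hot records, share of the busiest key, distinct keys)."""
    rng = random.Random(seed)
    counts = Counter(sample_channel(rng) for _ in range(num_records))
    hot = sum(n for channel, n in counts.items() if channel.startswith("hot-"))
    busiest = counts.most_common(1)[0][1]
    return hot / num_records, busiest / num_records, len(counts)


hot_share, busiest_share, distinct_keys = skew_summary(100_000)
print(f"hot share: {hot_share:.3f}")
print(f"busiest key share: {busiest_share:.3f}")
print(f"distinct keys: {distinct_keys}")
```

Because the day component of the group key is constant for long stretches, each hot channel is expected to carry roughly one eighth of all records under these assumptions, so a plain hash shuffle on the group key concentrates about half of the input on at most 4 keys.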
If we set both `RW_FORCE_TWO_PHASE_AGG` and `RW_FORCE_SPLIT_DISTINCT_AGG` to true,

Query plan:
Dist query plan:
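To illustrate why partial aggregation matters for a skewed key like `channel`, here is a toy model of single-phase versus two-phase counting. It is not RisingWave's implementation: it only models `count(*)`, it does not cover what `RW_FORCE_SPLIT_DISTINCT_AGG` does, and the worker assignment (round-robin arrival, CRC32 hash shuffle) and all function names are assumptions made for the sketch.

```python
import zlib
from collections import Counter


def owner(key: str, parallelism: int) -> int:
    """Deterministic stand-in for a hash shuffle on the group key."""
    return zlib.crc32(key.encode()) % parallelism


def single_phase(records: list[str], parallelism: int) -> tuple[Counter, Counter]:
    """Shuffle every record by key, then count on the owning worker."""
    result, load = Counter(), Counter()
    for key in records:
        load[owner(key, parallelism)] += 1  # one shuffled row per input record
        result[key] += 1
    return result, load


def two_phase(records: list[str], parallelism: int) -> tuple[Counter, Counter]:
    """Count locally first, then shuffle and merge the partial counts."""
    # Phase 1: each record is pre-aggregated on the worker where it arrives
    # (round-robin arrival is an assumption of this sketch).
    partials = [Counter() for _ in range(parallelism)]
    for i, key in enumerate(records):
        partials[i % parallelism][key] += 1

    # Phase 2: only one row per (upstream worker, key) crosses the shuffle.
    result, load = Counter(), Counter()
    for partial in partials:
        for key, count in partial.items():
            load[owner(key, parallelism)] += 1
            result[key] += count
    return result, load


# Hypothetical input: half the records hit 4 hot keys, the rest are unique keys.
records = [f"hot-{(i // 2) % 4}" if i % 2 == 0 else f"cold-{i}" for i in range(8000)]

one_result, one_load = single_phase(records, parallelism=8)
two_result, two_load = two_phase(records, parallelism=8)

print("same answer:", one_result == two_result)
print("rows shuffled, single-phase:", sum(one_load.values()))
print("rows shuffled, two-phase:", sum(two_load.values()))
print("busiest worker, single-phase:", max(one_load.values()))
print("busiest worker, two-phase:", max(two_load.values()))
```

In this model the hot keys collapse into a handful of partial rows before the shuffle, so the final aggregation no longer sees the skew. The price is an extra local aggregation step, which is part of the trade-off when deciding whether to force two-phase aggregation for q16.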
Flink:
Query plan: