
Imbalanced kafka source actors' throughput when running nexmark benchmark #5214

Open
Tracked by #7289
lmatz opened this issue Sep 8, 2022 · 21 comments
Assignees
Labels
component/connector component/streaming Stream processing related issue. type/enhancement Improvements to existing implementation. type/perf

Comments

@lmatz
Contributor

lmatz commented Sep 8, 2022

This is a simple Nexmark q10 query with parallelism 12 on a single compute node with 16 CPUs.

The number of records in each Kafka partition:
[screenshot: record count per Kafka partition]

The throughput of each source connector:
[screenshot: throughput of each source connector]

If the source connectors' speeds do not match each other,
and the job uses event time with watermarks,
then the job will probably throw away more events than usual,
because one partition may make the watermark progress quickly
and thus other partitions' events get discarded more aggressively.

Some random ideas:
throttle some source actors by considering both the current throughput and the value of the event timestamp (a rough sketch follows below).
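A rough sketch of that idea in Rust (all names and the threshold are hypothetical; this is not actual RisingWave code): track the newest event timestamp emitted per partition and pause any partition that runs too far ahead in event time of the slowest one, so a fast partition cannot push the watermark past events that slower partitions have yet to emit.

```rust
/// Hypothetical per-partition progress; not actual RisingWave code.
struct PartitionProgress {
    /// Largest event timestamp (in ms) emitted by this partition so far.
    max_event_ts: i64,
}

/// Decide whether partition `i` should be throttled: if its event time runs
/// more than `max_skew_ms` ahead of the slowest partition, pausing it keeps
/// the watermark from discarding the slower partitions' events.
fn should_throttle(partitions: &[PartitionProgress], i: usize, max_skew_ms: i64) -> bool {
    let slowest = partitions
        .iter()
        .map(|p| p.max_event_ts)
        .min()
        .expect("at least one partition");
    partitions[i].max_event_ts - slowest > max_skew_ms
}
```

The current throughput could additionally feed into the choice of `max_skew_ms`, e.g. tightening the skew bound when one actor's rows/s is far above the others.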

@github-actions github-actions bot added this to the release-0.1.13 milestone Sep 8, 2022
@lmatz lmatz removed this from the release-0.1.13 milestone Sep 8, 2022
@lmatz lmatz added type/enhancement Improvements to existing implementation. component/streaming Stream processing related issue. component/connector type/perf labels Sep 8, 2022
@BugenZhao
Member

Can the throughput be synchronized automatically through backpressure? Note that the current bound of the connector message buffer, 512 chunks, is too large.

const CONNECTOR_MESSAGE_BUFFER_SIZE: usize = 512;
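For context, the bounded buffer is what turns downstream slowness into backpressure: once it is full, the reader's send awaits until the consumer drains a slot, so a bound as large as 512 chunks lets a fast partition run far ahead before it is slowed down. A minimal sketch of that mechanism, assuming a tokio bounded mpsc channel (illustrative only, not the actual connector code):

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    // With a bound of 512, the reader can buffer 512 chunks before feeling
    // any backpressure; a smaller bound couples it more tightly to the
    // downstream consumption speed.
    let (tx, mut rx) = mpsc::channel::<Vec<u8>>(512);

    tokio::spawn(async move {
        loop {
            let chunk = vec![0u8; 1024]; // stand-in for a fetched message chunk
            // `send` completes only when a slot is free, which is how
            // backpressure propagates back to the reader.
            if tx.send(chunk).await.is_err() {
                break; // receiver dropped
            }
        }
    });

    // Downstream consumer; in the real pipeline this is the source executor.
    while let Some(_chunk) = rx.recv().await {
        break; // keep the example finite
    }
}
```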

@lmatz
Contributor Author

lmatz commented Sep 8, 2022

Some more context
The plan is:

StreamMaterialize { columns: [auction, bidder, price, date_time, date, time, _row_id(hidden)], pk_columns: [_row_id] }
   StreamExchange { dist: HashShard(_row_id) }
     StreamProject { exprs: [Field(bid, 0:Int32), Field(bid, 1:Int32), Field(bid, 2:Int32), Field(bid, 5:Int32), ToChar(Field(bid, 5:Int32), 'YYYY-MM-DD':Varchar), ToChar(Field(bid, 5:Int32), 'HH:MI':Varchar), _row_id] }
       StreamSource { source: "nexmark_source", columns: [event_type, person, auction, bid, _row_id] }

But both StreamMaterialize and StreamExchange do nothing in this case, as their data-chunk processing code is commented out.

Every actor (with its corresponding source) should therefore run on its own.

@fuyufjh
Member

fuyufjh commented Sep 8, 2022

throttle some source actors by considering both the current throughput and the value of the event timestamp.

Should it be applied to NexMark source only or all kinds of sources?

@lmatz
Contributor Author

lmatz commented Sep 8, 2022

throttle some source actors by considering both the current throughput and the value of the event timestamp.

Should it be applied to NexMark source only or all kinds of sources?

Currently this does not block benchmarking with Nexmark sources, since the total throughput aggregated over all source actors is normal.
I feel it should be applied to all kinds of sources as long as the job involves watermarks.

@lmatz
Contributor Author

lmatz commented Sep 9, 2022

Another example:
[screenshot: per-actor source throughput from another run]

Whether an actor's throughput is high or low, it stays quite stable.

@github-actions
Contributor

github-actions bot commented Nov 9, 2022

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

@BugenZhao
Member

Will the situation be better after #6170? I'd like to try it out.

@BugenZhao
Member

The throughputs are at least quite balanced when 3 nodes are running on the same host. 🤔

[screenshot: balanced per-actor source throughput with 3 nodes on one host]

@fuyufjh fuyufjh added this to the release-0.1.14 milestone Nov 9, 2022
@lmatz
Contributor Author

lmatz commented Nov 9, 2022

The setting I tested is a single-node setting, so there should be no remote exchange, only local exchange.
I am not sure whether #6170 should be expected to solve this.

@BugenZhao
Member

The setting I tested is a single-node setting

That's weird. 😕

@lmatz
Contributor Author

lmatz commented Nov 9, 2022

The setting I tested is a single-node setting

That's weird. 😕

No worries, let me track this issue manually once we have the performance dashboard.

So let's push this to the release where the performance dashboard is in place.

@lmatz lmatz modified the milestones: release-0.1.14, release-0.1.15 Nov 21, 2022
@lmatz lmatz removed this from the release-0.1.15 milestone Dec 19, 2022
mergify bot pushed a commit that referenced this issue Dec 26, 2022
Add a configurable parameter to limit the source throughput in terms of bytes.

This parameter may or may not be intended for public users at the moment (so it does not go into the docs or release notes this time).

It is convenient for testing resource utilization/performance at a fixed input throughput (https://redpanda.com/blog/redpanda-vs-kafka-performance-benchmark#:~:text=3.2%20under%20workloads%20(-,up%20to%201GB/sec,-)%20that%20are%20common).
Otherwise, people would have to generate events on the fly into Kafka/Redpanda at exactly the desired throughput, which may not be achievable and depends on many other factors.
Instead, we can pre-generate a large amount of data and then limit the source throughput to a stable rate.

Also, it may be a workaround for #5214.

Approved-By: tabVersion
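For illustration, a byte-based source limit can be modeled as a simple token bucket. The sketch below is hypothetical and not the code added by this commit, but it shows the intended behavior: refill a byte budget over time and sleep whenever a chunk would overdraw it.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token-bucket limiter in bytes per second (illustrative only).
struct ByteRateLimiter {
    bytes_per_sec: f64,
    available: f64,
    last_refill: Instant,
}

impl ByteRateLimiter {
    fn new(bytes_per_sec: f64) -> Self {
        Self {
            bytes_per_sec,
            available: bytes_per_sec,
            last_refill: Instant::now(),
        }
    }

    /// Returns how long the caller should sleep before emitting `chunk_bytes`.
    fn acquire(&mut self, chunk_bytes: f64) -> Duration {
        let now = Instant::now();
        let refill = now.duration_since(self.last_refill).as_secs_f64() * self.bytes_per_sec;
        // Cap the burst at one second's worth of budget.
        self.available = (self.available + refill).min(self.bytes_per_sec);
        self.last_refill = now;
        self.available -= chunk_bytes;
        if self.available >= 0.0 {
            Duration::ZERO
        } else {
            // The budget went negative: wait until the refill covers the debt.
            Duration::from_secs_f64(-self.available / self.bytes_per_sec)
        }
    }
}
```

A source reader would call `acquire` with each chunk's size and sleep for the returned duration, which keeps the long-run rate at the configured bytes/sec regardless of how fast pre-generated data can be fetched.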
@BugenZhao
Member

Related: #8451 (comment)

@lmatz lmatz assigned BugenZhao and unassigned lmatz Mar 22, 2023
@lmatz
Contributor Author

lmatz commented Mar 23, 2023

[screenshot: imbalanced per-actor source throughput in a recent benchmark run]

This seems pretty easy to reproduce; it has happened almost every day recently.

@BugenZhao
Member

Per some discussions with @liurenjie1024 and @hzxa21, I tend to think this is an issue of Kafka (server or client, both are possible) when the workload is heavy. 😟

  • When testing with Nexmark datagen sources, I cannot observe any imbalance. That makes sense: if the actors do not interfere with each other, there is no natural way for their performance to become unbalanced.
  • There is also no Tokio I/O involved in this case, since there is no remote exchange, and rdkafka is driven by its own async poll interface.

I'll keep investigating if I get a new idea, but I guess we need some metrics from the Kafka cluster to make things clearer. 🤔

@lmatz
Contributor Author

lmatz commented Mar 27, 2023

The EBS bandwidth of the Kafka machine is likely limited.
[screenshots: Kafka machine disk/EBS bandwidth, peaking around 298 MB/s]

The maximum bandwidth of gp2 and the default bandwidth of gp3 are both smaller than the 298 MB/s shown in the figure.

As for why the throughput does not go up during the second half: probably there were some extra I/O credits at the beginning, which then have to be paid back. This is just a guess.
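For reference, assuming the standard EBS figures at the time: gp2 tops out at 250 MiB/s (≈ 262 MB/s) and gp3 defaults to 125 MiB/s (≈ 131 MB/s, provisionable up to 1,000 MiB/s), so a sustained 298 MB/s would exceed both unless extra gp3 throughput had been provisioned.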

@BugenZhao
Member

Yes. I also suspected that a bottleneck between Kafka and the source leads to this issue. But there is still the remaining question of why the performance across partitions becomes imbalanced when disk throughput is constrained. 😄 I guess it's caused by Kafka's internal implementation, which needs further investigation.

@lmatz
Contributor Author

lmatz commented Mar 27, 2023

But the weird thing is that, looking at the history of q2, it is less severe than for q0, even though q2 does more filtering than q0.

The good news is that q2 can achieve 1M rows/s when this phenomenon does not happen.

Also, the Kafka client may matter, since Flink does not seem to suffer from the same problem: it uses the native Java client, while rdkafka is an independent implementation.

@BugenZhao
Member

But the weird thing is that, looking at the history of q2, it is less severe than for q0, even though q2 does more filtering than q0.

I took a quick look and found that this happens randomly for all queries that reach a throughput of around 800k rows/s, like q0, q1, q3, q10, and q22. All the imbalances occur when the bytes per second is under 300 MB/s.

@shanicky
Contributor

Can we get the split distribution of each actor? Maybe we can see that information in the logs.

@BugenZhao
Member

Can we get the split distribution of each actor? Maybe we can see that information in the logs.

The splits are assigned equally. See the legends of "Source Throughput Per Partition". 🤔
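For reference, equal assignment here just means the Kafka partitions are spread evenly over the source actors, e.g. round-robin. A minimal sketch of such a policy (hypothetical names, not the actual split-assignment code):

```rust
use std::collections::HashMap;

/// Assign `num_partitions` Kafka partitions to `actor_ids` round-robin, so
/// every actor receives an (almost) equal number of splits.
fn assign_splits(actor_ids: &[u32], num_partitions: u32) -> HashMap<u32, Vec<u32>> {
    let mut assignment: HashMap<u32, Vec<u32>> = HashMap::new();
    for partition in 0..num_partitions {
        let actor = actor_ids[(partition as usize) % actor_ids.len()];
        assignment.entry(actor).or_default().push(partition);
    }
    assignment
}
```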

@BugenZhao BugenZhao changed the title from "Imbalanced source actors' throughput may cause trouble when there is watermark" to "Imbalanced kafka source actors' throughput when running nexmark benchmark" on Jun 13, 2024