The producer performance of pulsar-client v3.5.0 is much lower than that of v2.10.2 #229
Comments
It can be reproduced in my env (Python 3.8 and Ubuntu 20.04 WSL2), but the gap is not as large as in your env.
3.5.0:
2.10.2:
It's also weird that
I rewrote the test script so that it is not affected by the pending queue (send_async would block if the queue were full) and to reduce the test time. I tested various client versions with Python 3.8 on Ubuntu 20.04 WSL, three times each, against a local Pulsar 4.0.0 standalone and the same topic.
from pulsar import Client, CompressionType, Result
import os
import time

def send_callback(i, result, msg_id):
    if result != Result.Ok:
        print(f'{i} failed: {result}')

if __name__ == "__main__":
    client = Client(service_url='pulsar://localhost:6650',
                    io_threads=4)
    msg = os.urandom(100).hex().encode()
    producer = client.create_producer(
        'test-topic',
        compression_type=CompressionType.LZ4,
        batching_enabled=True,
        batching_max_messages=1000,  # batch size will be always 1000
        batching_max_allowed_size_in_bytes=10485760,
        batching_max_publish_delay_ms=10,
        max_pending_messages=0,  # avoid send_async being blocked due to full queue
        block_if_queue_full=True)
    t1 = time.time()
    for i in range(0, 200000):
        producer.send_async(msg, lambda result, msg_id, i=i: send_callback(i, result, msg_id))
    producer.flush()
    t2 = time.time()
    print(f'send_async: {round(t2 - t1, 3)} s')
    client.close()
P.S. 3.0.0 is not tested because it has a deadlock bug. As we can see, 3.1.0 actually has better performance than 2.10.2. However, there are significant performance regressions from 3.1.0 -> 3.2.0, 3.3.0 -> 3.4.0, and 3.4.0 -> 3.5.0.
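When the same script is run against several installed client versions, it helps to label each result with the version that actually produced it. The following is a minimal sketch, not part of the original test script: the helper names `installed_version` and `format_result` are hypothetical, and it assumes the client was installed under the distribution name `pulsar-client`.

```python
from importlib import metadata


def installed_version(dist_name="pulsar-client"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def format_result(version, seconds, count):
    """Format one benchmark run as a single comparable line (hypothetical helper)."""
    return f"pulsar-client {version}: {seconds:.3f} s ({count / seconds:.0f} msg/s)"
```

In the script above, one could print `format_result(installed_version(), t2 - t1, 200000)` instead of the bare elapsed time, so that output from different virtual environments can be pasted side by side.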
I disabled the compression and the test results are:
Then I increased the batch delay to avoid being affected by the batch timer (
NOTE:
When using pulsar-client v3.5.0 and pulsar-client v2.10.2 to send the same batch of data (about 100MB), 3.5.0 takes about 3.5 times longer than 2.10.2.
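As a quick sanity check, the "about 3.5 times" figure can be recomputed from the line_profiler totals quoted later in this report (about 127 s for 3.5.0 and about 35.6 s for 2.10.2). The variable names are only illustrative.

```python
# Totals reported by line_profiler in this issue, in seconds.
t_v350 = 127.0   # pulsar-client v3.5.0
t_v2102 = 35.6   # pulsar-client v2.10.2

ratio = t_v350 / t_v2102
print(f"3.5.0 / 2.10.2 = {ratio:.2f}x")
```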
Core Code
3.5.0
2.10.2
The above are the running-time statistics from the line_profiler tool. Most of the time is spent in pro.send_async(line, callback=send_callback), which accounts for more than 97%. pulsar-client v3.5.0 takes about 127 s, and pulsar-client v2.10.2 takes about 35.6 s.
Reproduce
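One way to check where the time goes without line_profiler is to time the send_async loop and the final flush separately. This is a minimal sketch under stated assumptions, not the code from this report: `time_send_async` and `StubProducer` are hypothetical names, and the function only assumes that the producer exposes `send_async(content, callback)` and `flush()`, as `pulsar.Producer` does. The stub lets the sketch run without a broker.

```python
import time


def _ignore_result(result, msg_id):
    """No-op send callback so the timing covers only the client itself."""


def time_send_async(producer, payload, count):
    """Return (enqueue_seconds, total_seconds) for `count` async sends.

    enqueue_seconds covers only the send_async calls;
    total_seconds also includes the final flush().
    """
    start = time.perf_counter()
    for _ in range(count):
        producer.send_async(payload, _ignore_result)
    enqueued = time.perf_counter()
    producer.flush()
    done = time.perf_counter()
    return enqueued - start, done - start


class StubProducer:
    """Stand-in for a real producer (assumption: same two methods)."""

    def __init__(self):
        self.pending = []
        self.flushed = 0

    def send_async(self, content, callback):
        self.pending.append((content, callback))

    def flush(self):
        # Acknowledge everything that was queued, then empty the queue.
        for _content, callback in self.pending:
            callback("Ok", None)
        self.flushed += len(self.pending)
        self.pending.clear()
```

With a real client, passing the producer returned by `client.create_producer(...)` in place of `StubProducer()` gives the two durations for the installed version. If the enqueue time dominates the total, that would agree with the line_profiler result above.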
Demo
Result