Synchronized access to MessageService restricts performance #296

magnusbaeck · 2024-08-16T09:14:12Z

Description

The ProducerController synchronizes all calls to the MessageService. This serializes all publish operations and thus has severe performance implications when the rate of incoming requests reaches a certain point. In our case our requests usually take about 35 ms but during our peak hours publishing a single message can take several seconds. This increases the total execution time for our ETOS executions significantly.

Additional Context

The problematic synchronization was added in #219 in an attempt to address a problem with corrupt HTTP responses without addressing the root cause of the poor implementation of MessageServiceRMQImpl. My code review comment in said PR (#219 (comment)) explains the problem but was never addressed.

The following graphs illustrate what we're seeing every night. Note how the request latency shoots up by several orders of magnitude without being constrained by thread pools, CPU, or memory. (The CPU usage on the right y axis of the top-left graph is in millicores.)

This serialization also effectively disables the attempt at a connection pool in RabbitMqProperties. Which perhaps is just as well since that class doesn't appear to be very thread safe and that its giveMeRandomChannel method doesn't return a random channel as advertised but actually just returns the first open channel in the list of channels.

Logs

No response

Expected Behavior

I expect Eiffel REMReM Publish to perform well under reasonable loads and that it its performance will be contrained by CPU, memory, thread pool size, and other tunable parameters.

Steps To Reproduce

No response

The version of this project/repo, if applicable

master

The version/edition of the Eiffel Protocol used, if applicable

N/A

z-sztrom · 2024-09-03T14:03:18Z

Confirmed, the synchronization sequence limits publish service to publish one event in a time, i.e. an event is published only when publishing of a previous one has been finished.

z-sztrom · 2024-11-01T14:26:46Z

I tried to compare current implementation of publish service with one without the synchronized section and I didn't observe any performance difference. Publishing took roughly the same time: non-synchronized version 1% faster.

I tested in Docker environment and I suspect that MB was a bottleneck. I'll continue testing.

@magnusbaeck, can you perform testing if I provide you non-synchronized version?

z-sztrom · 2024-11-07T14:09:43Z

I did testing using real machines (not Docker containers) and I found that publishing without synchronized section is cca 15-20% faster (publishing up to cca 100 events).

magnusbaeck · 2024-11-07T14:16:40Z

Sorry, I was away last week and have had to spend some time this week on getting back on track. Yes, I can do some load testing.

100 events/s sounds very low. How many concurrent client connections did you run?

z-sztrom · 2024-11-08T07:22:54Z

No, it's a misunderstanding. Version without synchronization is 15-20% faster when publishing up to 100 events. Then the spped advantage drops down gradually and when publishing cca 10000 events, performance of both, synchronized and non-synchronized versions, becomes the same.

Overall performance I achieved on my setup was cca 3000events/sec.

z-sztrom · 2024-11-12T11:10:03Z

As you stated in your comment (#219 (comment)), the problem arises from inside of MessageServiceRMQImpl. The implementation has never been prepared to work in parallel/concurrent environment.

Originally, I had an impression that the implementation is located in semantics library. But, fortunately, it isn't. So that could be altered as part of publish service.

magnusbaeck added the bug Something is wrong and needs fixing. label Aug 16, 2024

e-backmark-ericsson assigned z-sztrom Sep 25, 2024

z-sztrom linked a pull request Nov 14, 2024 that will close this issue

Events can be published in parallel #302

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Synchronized access to MessageService restricts performance #296

Synchronized access to MessageService restricts performance #296

magnusbaeck commented Aug 16, 2024

z-sztrom commented Sep 3, 2024

z-sztrom commented Nov 1, 2024

z-sztrom commented Nov 7, 2024

magnusbaeck commented Nov 7, 2024

z-sztrom commented Nov 8, 2024

z-sztrom commented Nov 12, 2024 •

edited

Loading

Synchronized access to MessageService restricts performance #296

Synchronized access to MessageService restricts performance #296

Comments

magnusbaeck commented Aug 16, 2024

Description

Additional Context

Logs

Expected Behavior

Steps To Reproduce

The version of this project/repo, if applicable

The version/edition of the Eiffel Protocol used, if applicable

z-sztrom commented Sep 3, 2024

z-sztrom commented Nov 1, 2024

z-sztrom commented Nov 7, 2024

magnusbaeck commented Nov 7, 2024

z-sztrom commented Nov 8, 2024

z-sztrom commented Nov 12, 2024 • edited Loading

z-sztrom commented Nov 12, 2024 •

edited

Loading