Use proper threading to encourage work completion of AMQP subscribers in a predictable manner. #95
Underlying Issues and Justification
Currently, Event Source's AMQP behaviour is characterized by three properties:
However, a problem can arise when multiple consumers are allowed to perform work simultaneously, without coordination, in a multi-threaded environment: the system can switch away from a thread while a consumer is part-way through its work, and there is no guarantee it will return to that message. Usually this isn't a problem for event_source under low load, but it becomes a problem when:
Under these circumstances, because workers are not protected from interruption and AMQP subscribers have no coordination around when their work may be interrupted, a worker can be suspended while processing a work-intensive task, with no guarantee it will ever be resumed.
This can result in:
The Fix
This can be fixed by marking the unit of work performed by an Event Source worker as atomic - so that it cannot be interrupted.
However, certain aspects of this approach must be taken into account in order not to cripple performance:
In this case, the solution this offers is a Ruby Monitor, synchronized only around the portion of the AMQP subscriber where work is actually being performed.
This ticket is tracked as: https://www.pivotaltracker.com/story/show/186036844
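As a rough illustration of the monitor approach described above, the sketch below wraps only the work portion of a subscriber callback in a Monitor. It is not the code from this change: `SketchSubscriber`, `WORK_MONITOR`, `on_message`, and `handle` are hypothetical names, and the payload is assumed to be a plain string.

```ruby
require "monitor"

# Hypothetical subscriber; class and method names are illustrative only
# and are not taken from Event Source.
class SketchSubscriber
  # A single monitor shared by every subscriber instance in the process.
  WORK_MONITOR = Monitor.new

  attr_reader :processed

  def initialize
    @processed = []
  end

  # Assumed to be called once per delivered message.
  def on_message(payload)
    # Cheap preparation stays outside the synchronized section.
    prepared = payload.to_s.strip

    # Only the actual work is serialized, keeping the locked region small.
    WORK_MONITOR.synchronize do
      handle(prepared)
    end
  end

  private

  # Stand-in for the real, work-intensive processing.
  def handle(prepared)
    @processed << prepared.upcase
  end
end

# Usage: several threads deliver messages to the same subscriber.
subscriber = SketchSubscriber.new
threads = 3.times.map do |i|
  Thread.new { subscriber.on_message(" event-#{i} ") }
end
threads.each(&:join)
```

Note that a Monitor guarantees that only one thread at a time runs the synchronized block; other threads that reach the same monitor wait until it is released. Ruby's Monitor is also reentrant, so a thread that already holds it can enter the block again without deadlocking.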
Caveats
Please note that although this fix introduces a monitor that can be reused later, it does not attempt to manage or constrain the behaviour of the HTTP worker portion of Event Source. I was less certain of how that might behave in isolation, and would rather exercise caution and handle that issue in a separate submission.