[Option 2] Fix race between handle_replica_event and start_pid_remotely #2
There appears to be a race condition that causes the `Swarm.Tracker` process to die. When this happens, the monitors that were acquired by `Swarm.Tracker` are lost and the process `:down` events will fail to be captured.

The race occurs when `handle_replica_event` attempts to find the remote process in the local registry. If it is not found, but is then registered in the meantime by `start_pid_remotely`, an error will occur when `handle_replica_event` tries to register the process itself:

swarm/lib/swarm/tracker/tracker.ex
Lines 1004 to 1008 in 7dd08d6
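
For readers without the tracker source at hand, the interleaving described above can be sketched in isolation. This is not the code from `tracker.ex`; it is a simplified illustration in which the registry is a plain map and `RaceSketch`, `lookup/2`, and `register/3` are hypothetical names invented for the example. The three steps are run sequentially so the outcome is deterministic.

```elixir
defmodule RaceSketch do
  # Hypothetical stand-in for the local registry: a map of name => pid.
  def lookup(registry, name), do: Map.fetch(registry, name)

  # Strict registration: refuses to overwrite an existing entry.
  def register(registry, name, pid) do
    case Map.fetch(registry, name) do
      {:ok, existing} -> {:error, {:already_registered, existing}}
      :error -> {:ok, Map.put(registry, name, pid)}
    end
  end
end

pid = self()
registry = %{}

# Step 1: the replica-event path looks the name up and finds nothing.
:error = RaceSketch.lookup(registry, :worker)

# Step 2: in the meantime, the remote-start path registers the same name.
{:ok, registry} = RaceSketch.register(registry, :worker, pid)

# Step 3: the replica-event path acts on its stale lookup and registers again.
result = RaceSketch.register(registry, :worker, pid)
IO.inspect(result, label: "second registration")
```

The second registration fails because the lookup in step 1 is already stale by the time step 3 runs. How that failure surfaces inside `Swarm.Tracker` is not modeled here; the sketch only shows why a check followed by a separate register is unsafe when two code paths can register the same name.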

This solution removes the call to `add_registration` from `start_pid_remotely` to eliminate the potential race condition. However, this reintroduces issue bitwalker#45, which was fixed by bitwalker#46. Option 1 may be preferable.