This repository has been archived by the owner on Jul 1, 2023. It is now read-only.

Add test that stresses races during transport shutdown #250

Open
wants to merge 6 commits into
base: main


Contributor

@lw lw commented Dec 11, 2020

Summary:
This test does two things: it defers functions to the loop while the context is shutting down, and it creates new objects (connections and listeners) before and/or just after the context is closed. Both operations interfere with a correct shutdown and are tricky to handle.

Transports, however, don't offer a way to defer functions directly, so we achieve this by attaching a read callback to a connection and then closing the connection, which causes the callback to be called immediately.
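The trick can be sketched with a toy model. This is not the TensorPipe API: `ToyLoop`, `ToyConnection`, `deferViaClosedRead` and `runDeferDemo` are hypothetical names, and the sketch assumes that closing a connection flushes its pending read callbacks with an error from inside the loop.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded loop: runs deferred tasks in FIFO order.
class ToyLoop {
 public:
  void defer(std::function<void()> fn) { tasks_.push_back(std::move(fn)); }

  // Runs tasks until the queue is empty, including tasks queued by other tasks.
  void drain() {
    while (!tasks_.empty()) {
      auto fn = std::move(tasks_.front());
      tasks_.pop_front();
      fn();
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Hypothetical connection whose only public operations are read() and close().
class ToyConnection {
 public:
  using ReadCallback = std::function<void(bool /*error*/)>;

  explicit ToyConnection(ToyLoop& loop) : loop_(loop) {}

  void read(ReadCallback cb) {
    if (closed_) {
      // A read on a closed connection fails, but the callback still runs in the loop.
      loop_.defer([cb = std::move(cb)]() { cb(true); });
      return;
    }
    pending_.push_back(std::move(cb));
  }

  void close() {
    if (closed_) {
      return;
    }
    closed_ = true;
    // Assumption: every pending read is flushed with an error from the loop.
    for (auto& cb : pending_) {
      loop_.defer([cb = std::move(cb)]() { cb(true); });
    }
    pending_.clear();
  }

 private:
  ToyLoop& loop_;
  bool closed_ = false;
  std::vector<ReadCallback> pending_;
};

// Defers `fn` to the loop using nothing but read() and close().
void deferViaClosedRead(ToyConnection& conn, std::function<void()> fn) {
  conn.read([fn = std::move(fn)](bool error) {
    assert(error);  // the read can only complete because of the close
    fn();
  });
  conn.close();
}

// Records the order in which things happen around the close.
std::vector<std::string> runDeferDemo() {
  std::vector<std::string> log;
  ToyLoop loop;
  ToyConnection conn(loop);
  deferViaClosedRead(conn, [&log]() { log.push_back("ran in loop"); });
  log.push_back("after close()");
  loop.drain();
  return log;
}
```

In this sketch the callback runs only when the loop gets to it, after `close()` has already returned. The real transports run their loops on other threads, so the ordering there is not deterministic as it is here.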

Transports also don't allow controlling when something that is deferred to the loop will actually run. To "work around" this, we simply jam the context by creating new connections over and over at a very high rate.
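The shape of such a stress run can be illustrated with a deterministic, single-threaded sketch. All names here (`SketchContext`, `SketchConnection`, `jamThenClose`) are hypothetical and not part of TensorPipe, and the sketch assumes that both connection setup and closing are processed as tasks in the context's loop.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

enum class SketchState { kPending, kReady, kAborted };

struct SketchConnection {
  SketchState state = SketchState::kPending;
};

// Hypothetical context: creation and closing only enqueue work for the loop.
class SketchContext {
 public:
  std::shared_ptr<SketchConnection> connect() {
    auto conn = std::make_shared<SketchConnection>();
    // The setup runs later, and must cope with the context having closed meanwhile.
    tasks_.push_back([this, conn]() {
      conn->state = closed_ ? SketchState::kAborted : SketchState::kReady;
    });
    return conn;
  }

  // Closing is deferred too, so it queues behind any backlog of earlier work.
  void close() {
    tasks_.push_back([this]() { closed_ = true; });
  }

  // Runs every queued task in FIFO order.
  void join() {
    while (!tasks_.empty()) {
      auto fn = std::move(tasks_.front());
      tasks_.pop_front();
      fn();
    }
  }

  std::size_t backlog() const { return tasks_.size(); }

 private:
  std::deque<std::function<void()>> tasks_;
  bool closed_ = false;
};

struct StressResult {
  std::size_t ready = 0;
  std::size_t aborted = 0;
  std::size_t pending = 0;
  std::size_t backlogAtClose = 0;
};

// Jams the context with `before` connections, closes it, then creates `after` more.
StressResult jamThenClose(std::size_t before, std::size_t after) {
  SketchContext ctx;
  std::vector<std::shared_ptr<SketchConnection>> conns;
  for (std::size_t i = 0; i < before; ++i) {
    conns.push_back(ctx.connect());
  }
  StressResult result;
  result.backlogAtClose = ctx.backlog();
  ctx.close();
  for (std::size_t i = 0; i < after; ++i) {
    conns.push_back(ctx.connect());
  }
  ctx.join();
  for (const auto& conn : conns) {
    switch (conn->state) {
      case SketchState::kReady:
        ++result.ready;
        break;
      case SketchState::kAborted:
        ++result.aborted;
        break;
      case SketchState::kPending:
        ++result.pending;
        break;
    }
  }
  return result;
}
```

A correct shutdown in this model leaves no connection pending: everything queued ahead of the close becomes ready, and everything queued behind it is aborted. The real test cannot rely on such a clean split, since its loops run concurrently with the thread creating the connections.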

This test proved its worth by finding many issues in transport shutdown.

Differential Revision: D25495683

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D25495683

lw added a commit to lw/tensorpipe that referenced this pull request Dec 12, 2020
Summary:
Pull Request resolved: pytorch#250

Differential Revision: D25495683

fbshipit-source-id: 9cfe593c27681952a32bced2e7afa073c5668980
Differential Revision: D25495685

fbshipit-source-id: 8ef974338cf363fc69189356e886d787547693ee
Differential Revision: D25495682

fbshipit-source-id: 00e62ba390eb2e7d88ff92392d8fa645b0d7a1f3
…ers initing

Differential Revision: D25495684

fbshipit-source-id: e074c2f45c046b791da23b9eeb256495b316db6b
Differential Revision: D25495681

fbshipit-source-id: c941fb1bb1a1d49a12c750c0c7b874b09dc6a5b2
Summary: And thus retire the ClosingEmitter/Receiver from transports.

Differential Revision: D25495914

fbshipit-source-id: 0075c2db8fa804e6584c23b9519bfc4d373ac307
Summary:
Pull Request resolved: pytorch#250

Differential Revision: D25495683

fbshipit-source-id: 6cb4a01385bd23811457a99afab2915425d5ecd8
2 participants