-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ECO-4914] Fix websocket reconnection can get stuck in disconnected/connecting cycle #1855
Conversation
WalkthroughThe changes include the addition of a test case that verifies the WebSocket connection's behavior during connectivity failures and subsequent recovery. This is achieved by simulating a disconnection with an unroutable host address. Additionally, the Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- test/realtime/transports.test.js (1 hunks)
Additional comments not posted (1)
test/realtime/transports.test.js (1)
160-213
: Review of New Test Case:ws_can_reconnect_after_ws_connectivity_fail
This test case is designed to verify the WebSocket connection's behavior during connectivity failures and subsequent recovery. Here are some observations and suggestions:
Simulation of Disconnection: The test effectively simulates a disconnection by setting
Defaults.wsConnectivityUrl
to an unroutable address. This is a good use of the test environment to mimic real-world scenarios.Short Timeouts: The use of short timeouts (
webSocketSlowTimeout
,realtimeRequestTimeout
,disconnectedRetryTimeout
) is appropriate for a test environment as it allows the test to execute quickly and efficiently.Blocking Initial Connection: The method
tryATransport
is stubbed to prevent the initial connection attempt, which is a clever way to simulate the connectivity issue without external dependencies.Restoration of Original Settings: The test ensures that original settings are restored after the test execution, which is crucial to avoid side effects on other tests. This includes restoring
Defaults.wsConnectivityUrl
,realtime.options.realtimeHost
, andconnection.connectionManager.wsHosts
.Reconnection Logic: The test checks that the connection can be successfully re-established, which is the core functionality being tested. This is done through a series of asynchronous callbacks, ensuring that each step is completed before moving on to the next.
Error Handling: The test includes error handling in the final callback of the
async.series
, which is essential to catch and report errors during test execution.Suggestions:
- Enhance Readability: Consider breaking down the test into smaller functions or adding more comments to clarify each step of the test. This can improve maintainability and readability.
- Increase Robustness: Add assertions to check intermediate states or specific error messages to ensure that the test is behaving as expected during each phase of the simulation.
Overall, the test is well-constructed and achieves its objective of testing the reconnection logic under simulated network failure conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/common/lib/transport/connectionmanager.ts (2 hunks)
Additional comments not posted (2)
src/common/lib/transport/connectionmanager.ts (2)
1494-1495
: Resetting connection state properties at the start ofconnectWs
method.The changes to reset
wsCheckResult
andabandonedWebSocket
at the beginning of theconnectWs
method are crucial for ensuring that stale state does not affect new connection attempts. This is a positive change that aligns with the PR's objective to fix the reconnection loop issue.
Line range hint
1-1495
: Well-structured and documentedConnectionManager
class.The
ConnectionManager
class is well-organized with clear separation of concerns and comprehensive documentation. The use of modern JavaScript features and helper functions enhances readability and maintainability. This structure supports the robust handling of connection states and transitions, crucial for the functionality of the WebSocket-based communication system.
61757f9
to
d65b03d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ycle Fixes a regression introduced in ably-js v2 in the "no upgrade" PR [1]. Under certain network conditions (internet down) a race condition could occur in the connection manager. Specifically, websocket connectivity check in the `startWebSocketSlowTimer` function needed to set the `wsCheckResult` flag to `false` before the timer was cleared by other code branches. If that happened, it then caused an endless loop of websocket connection retries, even after the network conditions stabilized. A websocket transport would successfully open but then be immediately disposed due to the `wsCheckResult` flag still being set to `false`, without any mechanism to reset it in this scenario. Upon closer look, it makes sense that the websocket connection health flags (`wsCheckResult` and `abandonedWebSocket`) should be reset when attempting a new websocket connection. The previous solution did not explicitly reset these flags in the `connectWs` function, leading to the described race condition and endless loop. Thus, `wsCheckResult` and `abandonedWebSocket` are now reset to their neutral values upon entering the `connectWs` function. Resolves #1844 [1] #1645
d65b03d
to
60288b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- src/common/lib/transport/connectionmanager.ts (3 hunks)
- test/common/modules/private_api_recorder.js (4 hunks)
- test/realtime/transports.test.js (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- src/common/lib/transport/connectionmanager.ts
- test/realtime/transports.test.js
Additional comments not posted (4)
test/common/modules/private_api_recorder.js (4)
71-71
: Approved: Addition of WebSocket slow timeout option.The addition of
pass.clientOption.webSocketSlowTimeout
aligns with the PR's objectives to enhance WebSocket management. This should help in better handling slow or lagging WebSocket connections.
101-101
: Approved: Addition of realtime host option.The inclusion of
read.realtime.options.realtimeHost
is a positive step towards enhancing real-time communication capabilities and connection reliability.
128-128
: Approved: Addition of WebSocket connectivity URL option.The addition of
write.Defaults.wsConnectivityUrl
is crucial for specifying endpoints for WebSocket connectivity checks, aligning well with the PR's focus on improving WebSocket reconnection behavior.
140-140
: Approved: Addition of WebSocket hosts management option.The addition of
write.connectionManager.wsHosts
enhances the management of WebSocket hosts, which is crucial for handling complex connectivity scenarios effectively.
This fixes a regression introduced in ably-js v2 in this PR.
Under certain network conditions (internet down) a race condition could
occur in the connection manager. Specifically, websocket connectivity
check in the
startWebSocketSlowTimer
function needed to set thewsCheckResult
flag tofalse
before the timer was cleared by othercode branches. If that happened, it then caused an endless loop of
websocket connection retries, even after the network conditions
stabilized. A websocket transport would successfully open but then be
immediately disposed due to the
wsCheckResult
flag still being set tofalse
, without any mechanism to reset it in this scenario.Upon closer look, it makes sense that the websocket connection health
flags (
wsCheckResult
andabandonedWebSocket
) should be reset whenattempting a new websocket connection. The previous solution did not
explicitly reset these flags in the
connectWs
function, leading to thedescribed race condition and endless loop. Thus,
wsCheckResult
andabandonedWebSocket
are now reset to their neutral values upon enteringthe
connectWs
function.The websocket reconnection test that was added now checks such condition. You can see it fail on all envrionments on a library version without the aforementioned fix: node, browsers
Resolves #1844
Summary by CodeRabbit