Skip to content

Commit

Permalink
Fix a bad race condition that causes intermittent socket receive
Browse files Browse the repository at this point in the history
failures.

The issue is caused by the code checking socket status *after* checking
if bytes are available. Bytes may have become available between the
last byte available check and the socket status check. This causes us to
incorrectly ignore the newly available bytes and return 0 to the caller,
which in any blocking/timeout scenario tells the caller no more bytes
will ever be available.

This was introduced in commit 89b9f10
See also
#151
as a fix for
#135
  • Loading branch information
aggieNick02 committed Oct 10, 2024
1 parent 9000b5a commit 364d054
Showing 1 changed file with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions adafruit_wiznet5k/adafruit_wiznet5k_socketpool.py
Original file line number Diff line number Diff line change
Expand Up @@ -601,7 +601,14 @@ def recv_into( # pylint: disable=unused-argument
self._buffer = self._buffer[bytes_to_read:]
# explicitly recheck num_to_read to avoid extra checks
continue

# We need to read the socket status here before seeing if any bytes are available.
# Why? Because otherwise we have a bad race condition in the if/elif/... logic below
# that can cause recv_into to fail when the other side closes the connection after
# sending bytes. The problem is that initially we can have num_avail=0 while the
# socket state is not in CLOSED or CLOSED_WAIT. Then after the if but before the elif
# that checks the socket state, bytes arrive and the other end closes the connection.
# So now bytes are available but we see the CLOSED/CLOSED_WAIT state and ignore them.
status_before_getting_available = self._status
num_avail = self._available()
if num_avail > 0:
last_read_time = ticks_ms()
Expand All @@ -620,7 +627,9 @@ def recv_into( # pylint: disable=unused-argument
elif num_read > 0:
# We got a message, but there are no more bytes to read, so we can stop.
break
elif self._status in (
# See note where we set status_before_getting_available for why we can't just check
# _status here
elif status_before_getting_available in (
wiznet5k.adafruit_wiznet5k.SNSR_SOCK_CLOSED,
wiznet5k.adafruit_wiznet5k.SNSR_SOCK_CLOSE_WAIT,
):
Expand Down

0 comments on commit 364d054

Please sign in to comment.