lock on windows platform using multiple workers #74
Conversation
Hello, I'm not sure I got the scope of this.
Hello,
This list contains tasks to work on, but the event_loop is blocked in
The issue is that granian fails to respond to 3 simultaneous requests when more than one worker is used (on the Windows platform).
This test case reproduces the problem and fails.
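For reference, a client-side check along these lines could look like the sketch below; this is a hypothetical illustration, not the attached test case, and the address, port and worker count are assumptions.

```python
# Hypothetical repro sketch (not the attached test case): fire 3 requests at a
# granian instance assumed to listen on 127.0.0.1:8000 with 2 workers.
# On Windows with more than one worker, some of these calls hang until timeout.
from concurrent.futures import ThreadPoolExecutor
import httpx

def get(_):
    return httpx.get('http://127.0.0.1:8000/', timeout=5).status_code

if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=3) as pool:
        print(list(pool.map(get, range(3))))
```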
Ok, thank you for the detailed explanation. So the actual theme here is either to:
I need to dig quite a bit into the socket implementation on Windows, which is not exactly my cup of tea.
If you need any assistance with the socket implementation on Windows, I can try to help you. Before starting to fix something, we first need to figure out what we need to fix:
runtime blocked in
Understood, but I still don't get why there is a data race on Windows. The socket created in the main process is set to be inheritable, thus the sub-processes (the workers that actually accept connections) should not have any data races, as the file descriptor is shared across all of them. Also, the Rust code which sets the socket to non-blocking should fail if that call cannot succeed.
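For concreteness, the setup being described boils down to roughly the sketch below; this is a simplified illustration with a made-up port and worker count, not granian's actual code.

```python
# Simplified sketch of the shared-socket setup under discussion (illustrative only):
# the parent creates the listener, marks it inheritable, and every worker process
# calls accept() on the very same underlying handle in non-blocking mode.
import socket
from multiprocessing import Process
from select import select

def worker(sock, worker_id):
    sock.setblocking(False)                      # mirrors the Rust set_nonblocking(true)
    while True:
        readable, _, _ = select([sock], [], [])  # wait until the listener looks ready
        try:
            conn, addr = sock.accept()           # another worker may grab it first
        except BlockingIOError:
            continue                             # expected on POSIX; the question is Windows
        with conn:
            conn.sendall(b'worker %d\n' % worker_id)

if __name__ == '__main__':
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(('127.0.0.1', 8000))
    sock.listen()
    sock.set_inheritable(True)                   # descriptor shared with the children
    workers = [Process(target=worker, args=(sock, i)) for i in range(2)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
```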
Just realized that we can check whether synchronisation fixes the problem or not:

```diff
--- shared_socket.py~	2023-04-17 13:27:57.000000000 +0400
+++ shared_socket.py	2023-04-17 15:38:43.000000000 +0400
@@ -1,7 +1,7 @@
 import asyncio
 import socket
 from contextlib import contextmanager
-from multiprocessing import Process
+from multiprocessing import Process, Lock
 from concurrent.futures import ThreadPoolExecutor
 from select import select
 import httpx
@@ -14,8 +14,8 @@
     def __init__(self):
         self.procs = []
 
-    def spawn(self, func, *args):
-        proc = Process(target=func, args=args)
+    def spawn(self, func, *args, **kwargs):
+        proc = Process(target=func, args=args, kwargs=kwargs)
         proc.start()
         self.procs.append(proc)
@@ -54,7 +54,7 @@
             executor.submit(do_response, conn)
 
-def acceptor_nonblocking(sock, worker_id):
+def acceptor_nonblocking(sock, worker_id, lock=None):
     sock.setblocking(False)
     conns = [sock]
@@ -65,11 +65,16 @@
                 # we still need try...except because other
                 # processes will steal our connections
                 try:
+                    if lock:
+                        lock.acquire()
                     conn, addr = sock.accept()
                     print(worker_id, 'Connected by', addr)
                     conns.append(conn)
                 except BlockingIOError:
                     pass
+                finally:
+                    if lock:
+                        lock.release()
             else:
                 with conn:
                     data = conn.recv(1024)
@@ -99,8 +104,9 @@
     sock.set_inheritable(True)
 
     with process_spawner() as process:
+        lock = Lock()
         for worker_id in range(2):
-            process.spawn(acceptor_nonblocking, sock, worker_id)
+            process.spawn(acceptor_nonblocking, sock, worker_id, lock=lock)
 
     for iteration in range(100):
         res = test(port)
```

Yes, it fixes the blocking on the Windows platform.
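If a lock of this kind were ever applied inside an async worker, the blocking acquire would have to stay off the event-loop thread. A rough sketch of that idea, purely illustrative and not granian's code:

```python
# Illustrative only: serialize accept() across worker processes with a
# multiprocessing.Lock while keeping the event loop responsive, by running
# the blocking acquire in the default thread-pool executor.
import asyncio

async def accept_loop(sock, lock, handle):
    loop = asyncio.get_running_loop()
    while True:
        await loop.run_in_executor(None, lock.acquire)  # blocking wait happens off-loop
        try:
            conn, addr = await loop.sock_accept(sock)   # sock must be non-blocking
        finally:
            lock.release()                              # let the other workers accept again
        asyncio.ensure_future(handle(conn, addr))
```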
@izmmisha sorry for the super-late reply. I understand the fix on the sample code, but:
So let's say, at the moment, given the information I have, I would be more prone to lock Windows to just 1 worker.
granian uses a shared socket.
hyper makes it a non-blocking socket.
The internals of the WS2_32 accept, in pseudo code:
"A critical section object provides synchronization similar to that provided by a mutex object, except that a critical section can be used only by the threads of a single process. "
https://learn.microsoft.com/en-us/windows/win32/sync/critical-section-objects
The scenario:
1. Two processes are waiting to accept connections.
2. Three requests come in simultaneously.
3. The first two requests are accepted without problems, and the workers start processing them simultaneously.
4. Since we are async, as soon as possible we go to accept the rest of the requests.
5. Now two workers call accept: both pass through select and find that an accept should be possible.
6. But when both of them go on to the actual accept, only one obtains the connection; the second one gets stuck blocking until a new connection arrives.

As a result, the event_loop is blocked, and the already accepted request will not be answered until the event_loop is unblocked.
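To make that last point concrete, here is a tiny self-contained illustration (a toy example, not granian code) of how a single synchronous blocking call inside the loop delays a response that is already ready to be sent:

```python
# Toy demonstration: one blocking call on the event-loop thread stalls every
# other task, including a response that only needs a trivial amount of work.
import asyncio
import time

async def respond(t0):
    # the "already accepted" request: all it has to do is answer
    print(f'responded after {time.monotonic() - t0:.1f}s')

async def blocking_accept():
    # stands in for an accept() that blocks instead of raising
    # BlockingIOError / WSAEWOULDBLOCK
    time.sleep(3)

async def main():
    t0 = time.monotonic()
    asyncio.ensure_future(respond(t0))  # queued, would normally run almost instantly
    await blocking_accept()             # the loop is stuck here for 3 seconds
    await asyncio.sleep(0)              # give the queued task a chance to run

asyncio.run(main())                     # prints "responded after 3.0s" instead of ~0.0s
```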
callstack:
I'm not familiar with Rust and tokio, so I'm still looking for the last piece of direct evidence of where the request task sits in tokio's queues.
But indirect confirmation of this hypothesis exists: