
End-to-end support for concurrent async models #2066

Open · wants to merge 7 commits into base: main

Commits on Nov 25, 2024

  1. cog build: validate python version is new enough to support concurrency

    We require Python >=3.11 to support asyncio.TaskGroup.
    philandstuff committed Nov 25, 2024
    SHA: b89abad

Commits on Nov 26, 2024

  1. End-to-end support for concurrent async models

    This builds on the work in #2057 and wires it up end-to-end.
    
    We can now support async models with a max concurrency configured, and submit
    multiple predictions concurrently to them.
    
    We only support Python 3.11+ for async models; this is so that we can use
    asyncio.TaskGroup to keep track of multiple predictions in flight and ensure
    they all complete when shutting down.
    
    The cog HTTP server was already async, but at one point it called wait() on a
    concurrent.futures.Future, which blocked the event loop and therefore prevented
    concurrent prediction requests (when not using prefer-async, which is how the
    tests run). I have updated this code to await asyncio.wrap_future(fut)
    instead, which does not block the event loop. As part of this I have updated
    the training endpoints to also be asynchronous.
    
    We now have three places in the code which keep track of how many predictions
    are in flight: PredictionRunner, Worker and _ChildWorker all do their own
    bookkeeping. I'm not sure this is the best design but it works.
    
    The code is now an uneasy mix of threaded and asyncio code.  This is evident in
    the usage of threading.Lock, which wouldn't be needed if we were 100% async (and
    I'm not sure if it's actually needed currently; I just added it to be safe).
    philandstuff committed Nov 26, 2024
    SHA: 572fb3f
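
The event-loop fix described in the commit message above can be illustrated with a minimal sketch; handler, slow_work, and the thread pool are hypothetical names, not cog's actual code:

```python
import asyncio
import concurrent.futures
import time

pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)

def slow_work():
    time.sleep(0.1)  # stand-in for blocking prediction work
    return "done"

async def handler():
    fut = pool.submit(slow_work)
    # fut.result() here would block the whole event loop; wrapping the
    # concurrent.futures.Future makes it awaitable without blocking.
    return await asyncio.wrap_future(fut)

async def main():
    # Both handlers make progress concurrently because the loop never blocks.
    return await asyncio.gather(handler(), handler())

print(asyncio.run(main()))  # ['done', 'done']
```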
  2. Fix typing of CogConcurrencyConfig

    The use of `Optional` allowed `None` as a valid value. This has been
    changed to `NotRequired`, which allows the field to be omitted but
    requires it to be an integer when present.
    aron committed Nov 26, 2024
    SHA: 8460d15
  3. SHA: d19b3b7
  4. Re-word internal use of id to tag

    Inside the worker we track predictions by tag, not by external prediction
    ID; this commit updates the variable names to reflect this.
    aron committed Nov 26, 2024
    SHA: 41adaa9
  5. fix test failure

    The `for tag in done_tags:` loop was rebinding the existing `tag`
    variable and breaking things.
    philandstuff committed Nov 26, 2024
    SHA: 897e28e
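
The bug fixed in the commit above comes from Python's loop scoping; a minimal illustration (these names are illustrative, not the worker's actual variables):

```python
tag = "current-prediction"
done_tags = ["a", "b"]

# A for-loop does not create a new scope, so this rebinds the outer `tag`:
for tag in done_tags:
    pass
print(tag)  # "b", not "current-prediction"

# The fix: use a distinct loop variable.
tag = "current-prediction"
for done_tag in done_tags:
    pass
print(tag)  # "current-prediction"
```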
  6. SHA: 5f25c22