
End-to-end support for concurrent async models #2066

Open · wants to merge 7 commits into base: main

Commits on Nov 25, 2024

  1. cog build: validate python version is new enough to support concurrency

    We require Python >=3.11 to support asyncio.TaskGroup.
    philandstuff committed Nov 25, 2024
    SHA: b89abad

Commits on Nov 26, 2024

  1. End-to-end support for concurrent async models

    This builds on the work in #2057 and wires it up end-to-end.
    
    We can now support async models with a max concurrency configured, and submit
    multiple predictions concurrently to them.
    
    We only support Python 3.11+ for async models; this is so that we can use
    asyncio.TaskGroup to keep track of multiple predictions in flight and ensure
    they all complete when shutting down.
    
    The cog HTTP server was already async, but at one point it called wait() on a
    concurrent.futures.Future, which blocked the event loop and therefore prevented
    concurrent prediction requests (when not using prefer-async, which is how the
    tests run). I have updated this code to await asyncio.wrap_future(fut)
    instead, which does not block the event loop. As part of this I have updated
    the training endpoints to also be asynchronous.
    
    We now have three places in the code which keep track of how many predictions
    are in flight: PredictionRunner, Worker and _ChildWorker all do their own
    bookkeeping. I'm not sure this is the best design but it works.
    
    The code is now an uneasy mix of threaded and asyncio code.  This is evident in
    the usage of threading.Lock, which wouldn't be needed if we were 100% async (and
    I'm not sure if it's actually needed currently; I just added it to be safe).
    philandstuff committed Nov 26, 2024
    SHA: 572fb3f
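
The event-loop fix described in the commit message above can be illustrated with a minimal sketch; handler, slow_work, and the thread pool are hypothetical names, not cog's actual code:

```python
import asyncio
import concurrent.futures
import time

pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)

def slow_work():
    time.sleep(0.1)  # stand-in for blocking prediction work
    return "done"

async def handler():
    fut = pool.submit(slow_work)
    # fut.result() here would block the whole event loop; wrapping the
    # concurrent.futures.Future makes it awaitable without blocking.
    return await asyncio.wrap_future(fut)

async def main():
    # Both handlers make progress concurrently because the loop never blocks.
    return await asyncio.gather(handler(), handler())

print(asyncio.run(main()))  # ['done', 'done']
```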
  2. Fix typing of CogConcurrencyConfig

    The use of `Optional` allowed `None` as a valid value. This has been
    changed to `NotRequired`, which allows the field to be omitted but
    requires it to be an integer when present.
    aron committed Nov 26, 2024
    SHA: 8460d15
  3. SHA: d19b3b7
  4. Re-word internal use of id to tag

    Inside the worker we track predictions by tag, not by external prediction
    ID; this commit updates the variable names to reflect this.
    aron committed Nov 26, 2024
    SHA: 41adaa9
  5. fix test failure

    The `for tag in done_tags:` loop was rebinding the existing `tag`
    variable and breaking things.
    philandstuff committed Nov 26, 2024
    SHA: 897e28e
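
The bug fixed in the commit above comes from Python's loop scoping; a minimal illustration (these names are illustrative, not the worker's actual variables):

```python
tag = "current-prediction"
done_tags = ["a", "b"]

# A for-loop does not create a new scope, so this rebinds the outer `tag`:
for tag in done_tags:
    pass
print(tag)  # "b", not "current-prediction"

# The fix: use a distinct loop variable.
tag = "current-prediction"
for done_tag in done_tags:
    pass
print(tag)  # "current-prediction"
```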
  6. SHA: 5f25c22