Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download inputs off the event consumer loop #2092

Merged
merged 2 commits into from
Dec 20, 2024

Conversation

philandstuff
Copy link
Contributor

@philandstuff philandstuff commented Dec 20, 2024

Currently, we download inputs (via URLPath's convert() method) in
_prepare_payload, in the main Worker event consumer loop. This is bad, because
downloading inputs is a blocking operation and we cannot process other events
while downloads are in progress.

This is a substantial restructuring:

  • the event consumer loop now only processes events from the child; prediction
    requests and cancel requests from the http server are now processed directly.
  • there are new concurrent.future.ThreadPoolExecutor instances for preparing
    predictions and downloading inputs.
    • as part of this, this commit also supports downloading inputs concurrently
      rather than serially. I have hardcoded a maximum of 8 concurrent downloads
      based on the ThreadPoolExecutor size. I picked this number arbitrarily.
  • when the web server receives a request to start a prediction, we submit a task
    on the prediction start executor, which in turn submits downloads on the input
    download executor. Once all inputs are downloaded, we send the prediction
    request to the child.
  • cancel requests are handled similarly, but as there is no input download this
    is much simpler.
  • all calls to self._events.send() now happen within the http server async event
    loop, not the Worker ThreadPoolExecutor. (This is maybe not strictly correct,
    as we're calling blocking i/o from an async event loop, but it's a local
    Connection object and it's a tiny amount of data)

@philandstuff philandstuff changed the title WIP: download input events off the event consumer loop WIP: download input off the event consumer loop Dec 20, 2024
Currently, we download inputs (via URLPath's convert() method) in
_prepare_payload, in the main Worker event consumer loop.  This is bad, because
downloading inputs is a blocking operation and we cannot process other events
while downloads are in progress.

This is a substantial restructuring:

- the event consumer loop now only processes events from the child; prediction
  requests and cancel requests from the http server are now processed directly.
- there are new concurrent.future.ThreadPoolExecutor instances for preparing
  predictions and downloading inputs.
  - as part of this, this commit also supports downloading inputs concurrently
    rather than serially.  I have hardcoded a maximum of 8 concurrent downloads
    based on the ThreadPoolExecutor size.  I picked this number arbitrarily.
- when the web server receives a request to start a prediction, we submit a task
  on the prediction start executor, which in turn submits downloads on the input
  download executor. Once all inputs are downloaded, we send the prediction
  request to the child.
- cancel requests are handled similarly, but as there is no input download this
  is much simpler.
- all calls to self._events.send() now happen within the http server async event
  loop, not the Worker ThreadPoolExecutor.  (This is maybe not strictly correct,
  as we're calling blocking i/o from an async event loop, but it's a local
  Connection object and it's a tiny amount of data)
@philandstuff philandstuff force-pushed the download-inputs-off-event-loop branch from c1b7f72 to f2ba468 Compare December 20, 2024 15:22
@philandstuff philandstuff marked this pull request as ready for review December 20, 2024 15:23
@philandstuff philandstuff changed the title WIP: download input off the event consumer loop Download inputs off the event consumer loop Dec 20, 2024
@philandstuff philandstuff force-pushed the download-inputs-off-event-loop branch from 84acf65 to ca860df Compare December 20, 2024 16:01
@philandstuff philandstuff force-pushed the download-inputs-off-event-loop branch from ca860df to f7809b4 Compare December 20, 2024 16:05
@philandstuff philandstuff enabled auto-merge (squash) December 20, 2024 16:08
@philandstuff philandstuff enabled auto-merge (rebase) December 20, 2024 16:08
@philandstuff philandstuff merged commit 40da4b6 into main Dec 20, 2024
19 checks passed
@philandstuff philandstuff deleted the download-inputs-off-event-loop branch December 20, 2024 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants