Interrupt ongoing requests at end of test #37

Open
dagrayvid opened this issue Mar 18, 2024 · 0 comments

Currently, at the end of the test duration, the main process waits for the user processes to finish all active requests. This can produce misleading results when load test concurrency exceeds the maximum batch size that the runtime can handle for a given model. In those cases, the server-side throughput looks lower than it should, because the time spent finishing the last few pending requests is included even though the server-side resources are not fully utilized during that tail.
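To illustrate the effect, here is a rough sketch with made-up numbers. It assumes throughput is averaged over the whole window, including the drain period after the nominal test end; the durations and token rates are hypothetical and not measurements from any real run.

```python
def average_throughput(phases):
    """Average tokens/second over a list of (duration_s, tokens_per_s) phases."""
    total_tokens = sum(duration * rate for duration, rate in phases)
    total_time = sum(duration for duration, _ in phases)
    return total_tokens / total_time


# Hypothetical numbers, for illustration only.
steady = (60.0, 1000.0)  # full-concurrency phase: 60 s at 1000 tokens/s
tail = (10.0, 200.0)     # drain phase: 10 s at 200 tokens/s, few requests left

without_tail = average_throughput([steady])
with_tail = average_throughput([steady, tail])

print(f"steady state only: {without_tail:.1f} tokens/s")
print(f"including drain:   {with_tail:.1f} tokens/s")
```

Under these assumptions, the drain phase pulls the average below the steady-state rate, even though the server behaved the same while it was fully loaded.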

Some potential solutions:

  • In the streaming case, user processes can check whether the test is over between each token.
  • The main process can communicate the expected end time to the user processes, which can then set a timeout on their HTTP requests based on the test's end time.
  • Keep the existing test behavior and, in the results processing code, filter out the results for requests that ended after the test end time.
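A minimal sketch of what the three options could look like, one helper per option. All names here (`RequestResult`, `consume_stream`, `remaining_timeout`, `filter_results`) are hypothetical and not taken from the existing code. It assumes timestamps are epoch seconds and that the test end time is already known to each user process.

```python
import time
from typing import Callable, Iterable, List, NamedTuple


class RequestResult(NamedTuple):
    """Hypothetical per-request record; times are epoch seconds."""
    start_time: float
    end_time: float
    output_tokens: int


def consume_stream(tokens: Iterable[str], test_end_time: float,
                   clock: Callable[[], float] = time.time) -> List[str]:
    """Option 1: stop reading a streamed response once the test is over."""
    received = []
    for token in tokens:
        if clock() >= test_end_time:
            break  # test ended; abandon the rest of the stream
        received.append(token)
    return received


def remaining_timeout(test_end_time: float, now: float,
                      minimum: float = 0.001) -> float:
    """Option 2: per-request timeout derived from the test end time.

    A small positive floor is used on the assumption that the HTTP
    client rejects non-positive timeouts.
    """
    return max(test_end_time - now, minimum)


def filter_results(results: Iterable[RequestResult],
                   test_end_time: float) -> List[RequestResult]:
    """Option 3: drop results for requests that ended after the test end."""
    return [r for r in results if r.end_time <= test_end_time]
```

The first two options stop work early on the client side. The third leaves the load pattern unchanged and only affects reporting, so the server still spends time draining requests after the test ends.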