Vegeta-break is a command-line tool for discovering the maximum requests per second a service can handle while staying below a specific latency. It also outputs detailed latency curve files that can be loaded into HdrHistogram Plotter to better understand how the application behaves under stress. The tool does this by ramping up the number of requests per second sent to the application while ensuring that latency stays under a maximum.
Usage: ./vegeta-break [OPTIONS] url
-body-file string
a file to be read and used as the body of each request
-climb-multiple float
How many times more requests to send after a success. Must be greater than 1.0 (default 2)
-duration duration
Duration for each latency test (default 1m0s)
-keep-alive
whether or not to use http keep alive connections (default true)
-max-connections int
Max open idle connections per target host (default 10000)
-max-timeout duration
Max time to wait before a response (default 3s)
-method string
the http request method (default "GET")
-percentile float
The percentile that latency is measured at (default 99.9)
-rps int
Starting requests per second (default 20)
-rps-accuracy float
How close the output should be to the correct rps. 100 is exact rps. 95 would be within 5% (default 100)
-sla duration
Max acceptable latency (default 500ms)
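The search described at the top can be sketched in Go as follows. This is an illustration under assumptions, not the tool's actual code: `findMaxRPS` and `passes` are hypothetical names, and the bisection used to reach `-rps-accuracy` is a guess at how the narrowing might work. Here `passes(rps)` stands in for one `-duration` test whose `-percentile` latency stayed under `-sla`.

```go
package main

import "fmt"

// findMaxRPS is a hypothetical helper, not taken from vegeta-break.
// start, climb and accuracy mirror -rps, -climb-multiple and -rps-accuracy.
// It assumes passes eventually returns false as the rate grows.
func findMaxRPS(start int, climb, accuracy float64, passes func(rps int) bool) int {
	lo, hi := 0, start // lo: last passing rate, hi: rate under test

	// Climb phase: multiply the rate after every success.
	for passes(hi) {
		lo = hi
		next := int(float64(hi) * climb)
		if next <= hi {
			next = hi + 1 // guard against a multiple too small to move
		}
		hi = next
	}

	// Narrow phase (assumed): bisect between the last pass and the first
	// failure until lo is within accuracy percent of hi.
	for hi-lo > 1 && float64(lo) < float64(hi)*accuracy/100 {
		mid := lo + (hi-lo)/2
		if passes(mid) {
			lo = mid
		} else {
			hi = mid
		}
	}
	return lo
}

func main() {
	// A fake service that meets its SLA up to 730 requests per second.
	fake := func(rps int) bool { return rps <= 730 }
	fmt.Println(findMaxRPS(20, 2, 100, fake))
}
```

With an accuracy of 100 the sketch keeps bisecting until the passing and failing rates are adjacent; a lower accuracy stops earlier and so runs fewer tests.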
The work of Gil Tene is a large inspiration for vegeta-break. In his talk How NOT to Measure Latency, Gil Tene shows just how bad most latency measuring tools are. Other resources that cover this issue in shorter article form are Everything You Know About Latency Is Wrong and Your Load Generator is Probably Lying to You. I highly suggest you watch the talk, or at least read the articles. They show just how wrongly we measure and interpret latency. I hope that this tool can help people better compare website changes based on latency and requests per second.
The core of this project is a tool called vegeta. Vegeta is a load testing tool that gets around the latency calculation issues that other tools have by requesting pages asynchronously. If vegeta does not have enough workers, it will spin up a new worker to ensure that requests are sent on time. Thus, for vegeta, 50 requests per second means a request will be sent out every 20ms no matter the current latency and state of other requests.
Lastly, HTTP Load Testing with Vegeta (and a dash of Python) is the basis of how I decided to test webpages. The author of this post actually made a script called vegeta-break. I took that script and expanded upon it to build the tool found here.
As a general rule of thumb, take the percentage of users, X%, that you want to see a given latency, and then add one or two 9s onto the end of it. For example, if you want 99% of users to see a latency of less than 1 second, then set percentile to 99.9 or 99.99 and sla to 1s.