add benchmark tooling for rama-cli and profile rama #340
Current request throughput is pretty sloppy: some benchmarks that came in measure at <400 req/sec. That's embarrassingly slow.
For now we can just benchmark:

…

as these scenarios are probably going to be the best ones supported for v0.2.
I forgot that in the current http-backend we require a Mutex for the http client, e.g.:

```rust
#[derive(Debug)]
// TODO: once we have hyper as `rama_core` we can drop this mutex,
// as there is no inherent reason for `sender` to be mutable...
pub(super) enum SendRequest<Body> {
    Http1(Mutex<hyper::client::conn::http1::SendRequest<Body>>),
    Http2(Mutex<hyper::client::conn::http2::SendRequest<Body>>),
}
```

I bet this already explains a good part of why it is so slow. For sure there is other stuff that can be improved; I haven't profiled yet. But let's first get the fork+embed work of hyper going and done, so that we can start from a benchmark without this mutex still in place, as it will then no longer be required.
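To make the cost concrete, here is a minimal, self-contained sketch of that pattern (all names are hypothetical stand-ins, not rama's or hyper's actual types): a sender whose send method takes `&mut self`, as hyper's `SendRequest::send_request` does, shared behind a `&self` client API via a `Mutex`, so every request on a connection serializes on that lock:

```rust
use std::sync::Mutex;

// Hypothetical stand-in for hyper's h1/h2 `SendRequest`, whose
// `send_request` method requires `&mut self`.
struct Sender;

impl Sender {
    fn send_request(&mut self, _req: &str) {
        // pretend this drives the underlying connection
    }
}

// Exposing such a sender behind a shared `&self` client API forces
// interior mutability, so every request contends on the lock:
struct Client {
    sender: Mutex<Sender>,
}

impl Client {
    fn request(&self, req: &str) {
        // even multiplexed h2 traffic is serialized through this one Mutex
        self.sender.lock().unwrap().send_request(req);
    }
}

fn main() {
    let client = Client { sender: Mutex::new(Sender) };
    client.request("GET /");
}
```

Under concurrent load, h2 multiplexing in particular gains nothing here, since all in-flight requests funnel through the single lock.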
To test the theory I ran against a rama-based http server. And yeah, it works a lot faster... still not as fast as I would hope, but this is better. We can circle back to this issue after the hyper migration has happened.
Started doing some profiling. It seems to have less to do with the Mutex than expected (we no longer use one for h2, only for h1). Connection pooling is going to have to be done in 0.3 for sure, and done decently. After that is done we can also see what we can improve around the TLS usage.
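For illustration, a minimal sketch of the kind of connection pooling this implies (all names hypothetical, not rama's planned API): idle connections are checked out per request and checked back in afterwards, so the TCP+TLS handshake is not paid on every request:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical pool of idle connections of type `C`.
struct Pool<C> {
    idle: Mutex<VecDeque<C>>,
}

impl<C> Pool<C> {
    fn new() -> Self {
        Self { idle: Mutex::new(VecDeque::new()) }
    }

    // Reuse an idle connection if one exists, otherwise dial a new one.
    fn checkout(&self, dial: impl FnOnce() -> C) -> C {
        self.idle.lock().unwrap().pop_front().unwrap_or_else(dial)
    }

    // Return a still-healthy connection for the next request to reuse.
    fn checkin(&self, conn: C) {
        self.idle.lock().unwrap().push_back(conn);
    }
}

fn main() {
    let pool: Pool<u32> = Pool::new();
    let conn = pool.checkout(|| 1); // pool empty: "dials" a new connection
    pool.checkin(conn);
    let reused = pool.checkout(|| 2); // reuses the idle connection instead
    assert_eq!(reused, 1);
}
```

A real pool would also need per-host keying, idle timeouts, and liveness checks before reuse; this only shows the checkout/checkin shape.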
No special tooling required; our bench setup using … covers this. What can still be added are some benchmarks of a full rama stack (e.g. https traffic from a client, over a proxy, to a server). These benchmarks give a nice overview of the allocations as well as the performance. Once such full-picture benchmarks are added I think we can close this.
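As a sketch of what such a full-picture benchmark could look like (assuming a divan-based harness, since the tool name is elided above; the bench name and `roundtrip_over_proxy` are made-up placeholders):

```rust
// Hypothetical benches/full_stack.rs; assumes `divan` as a dev-dependency.

// divan's AllocProfiler reports allocation counts alongside timings,
// giving the "allocations as well as performance" overview.
#[global_allocator]
static ALLOC: divan::AllocProfiler = divan::AllocProfiler::system();

fn main() {
    divan::main();
}

#[divan::bench]
fn https_client_over_proxy_to_server() {
    // Placeholder: a real bench would start an in-process server and
    // proxy once, then drive one https request per iteration.
    roundtrip_over_proxy();
}

// Stub so the sketch compiles; the real body would do actual I/O.
fn roundtrip_over_proxy() {}
```

With divan this file would live under `benches/` and be registered in Cargo.toml with `harness = false`, after which `cargo bench` runs it.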