Tracking issue for core: improve performance of IO backends #684
Comments
Some great stuff here! ❤️ I also made some observations when re-designing the […]
I am also gaining more context daily about the rest of the codebase, but it is great to have somewhere to mark down observations or open questions around IO performance 👍 So I'll be sure to keep up with this thread. Note: for reference, I do believe that #570 is indeed fixed on […]
Yup, I had a look at your PR a few days ago. If you come up with any other improvement, feel free to add it.
I think we should evaluate the efficiency, or just the general use, of vectored IO operations with the […]
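For context, vectored IO submits several non-contiguous buffers in a single syscall (readv/writev and friends). A minimal sketch using std's write_vectored, with a made-up file path and page contents, just to show the shape:

```rust
use std::io::{IoSlice, Write};

fn main() -> std::io::Result<()> {
    // Two separate "pages" submitted with a single writev syscall.
    let page_a = vec![0xAAu8; 4096];
    let page_b = vec![0xBBu8; 4096];
    let mut file = std::fs::File::create("/tmp/vectored_demo.bin")?;

    let bufs = [IoSlice::new(&page_a), IoSlice::new(&page_b)];
    // write_vectored may write fewer bytes than requested;
    // a real backend has to loop until everything is flushed.
    let n = file.write_vectored(&bufs)?;
    println!("wrote {n} bytes in one writev call");
    Ok(())
}
```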
This is a tracking issue for changes to the IO backends to improve their performance. Additions, suggestions and improvements welcome!
Rationale
Motivation and my current thoughts
I have been testing both the unix and the io_uring backend against each other, and the unix backend always has lower latency. Of course, my testing is pretty limited, since it was only using the testing database file with the current benches. I got the same results for both backends on a SATA SSD and on NVMe, so I'm fairly sure the io_uring calls carry more overhead than the potential performance gains. Fortunately, I think there is a lot of room for improvement, including registering the files with the ring, registering buffers (this is pretty complex with the current Limbo codebase, but I'm working on something), turning the remaining syscalls into io_uring opcodes, improving O_DIRECT alignment sizes, and much more. Many of these changes should also translate into better performance for the other backends.
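To make the first two ideas concrete, here is a minimal sketch of file and buffer registration using the io-uring and libc crates. This is illustrative only, not Limbo's actual backend code: the file name is a placeholder, error handling is minimal, and buffer lifetime management is glossed over.

```rust
use std::os::unix::io::AsRawFd;

use io_uring::{opcode, types, IoUring};

fn main() -> std::io::Result<()> {
    let file = std::fs::File::open("test.db")?; // placeholder path
    let mut ring = IoUring::new(8)?;

    // Register the file once; later SQEs refer to it by index via
    // types::Fixed, skipping the per-op fd lookup in the kernel.
    ring.submitter().register_files(&[file.as_raw_fd()])?;

    // Register a buffer once so ReadFixed/WriteFixed avoid pinning
    // user pages on every operation.
    let mut buf = vec![0u8; 4096];
    let iov = libc::iovec {
        iov_base: buf.as_mut_ptr() as *mut _,
        iov_len: buf.len(),
    };
    // Safety: the buffer must stay alive and in place while registered.
    unsafe { ring.submitter().register_buffers(&[iov])? };

    let read = opcode::ReadFixed::new(
        types::Fixed(0), // index into the registered files
        buf.as_mut_ptr(),
        buf.len() as u32,
        0, // index into the registered buffers
    )
    .offset(0)
    .build()
    .user_data(1);

    // Safety: the entry and the buffer it points to must outlive submission.
    unsafe { ring.submission().push(&read).expect("submission queue full") };
    ring.submit_and_wait(1)?;

    let cqe = ring.completion().next().expect("missing completion");
    assert!(cqe.result() >= 0, "read failed: {}", cqe.result());
    Ok(())
}
```

Registration trades a one-time setup cost for less per-operation kernel work, which is exactly the kind of fixed overhead that can make io_uring lose to plain pread/pwrite on small benchmarks.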
However, before turning to implementation, better observability and benchmarking are needed; I will open an issue for each in a moment and edit this. Right now, the Limbo CLI uses log with env_logger, and core has criterion with pprof-rs, but the set of benches and the current log points are not very exhaustive. Since Limbo is in heavy development this is fine, but I think improving the situation now would benefit even the development process, as there is frequently a need to debug some behavior or to compare the performance of implementations.
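For reference, the existing criterion + pprof-rs combination is wired up roughly like this (a sketch: bench_read_page and its body are placeholders, and it assumes pprof is built with its criterion and flamegraph features enabled):

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use pprof::criterion::{Output, PProfProfiler};

fn bench_read_page(c: &mut Criterion) {
    c.bench_function("read_page", |b| {
        // Placeholder body: a real bench would exercise an IO path.
        b.iter(|| std::hint::black_box(1 + 1))
    });
}

criterion_group! {
    name = benches;
    config = Criterion::default()
        .with_profiler(PProfProfiler::new(100, Output::Flamegraph(None)));
    targets = bench_read_page
}
criterion_main!(benches);
```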
For IO testing, we also need tests that stress high concurrency. I will have new hardware in a few days, where I can work on this better than on my personal system, which is tuned for responsiveness.
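As a rough illustration of the kind of stress I mean, a crude multi-threaded random-read harness could look like the following; the file path, thread count, and offset scheme are all arbitrary choices, not a proposed benchmark:

```rust
use std::os::unix::fs::FileExt;
use std::sync::Arc;

fn main() -> std::io::Result<()> {
    // Placeholder database file; clamped so the modulo below never sees zero.
    let file = Arc::new(std::fs::File::open("test.db")?);
    let len = file.metadata()?.len().max(4096);

    let handles: Vec<_> = (0..16u64)
        .map(|t| {
            let file = Arc::clone(&file);
            std::thread::spawn(move || {
                let mut buf = [0u8; 4096];
                for i in 0..1_000u64 {
                    // Cheap pseudo-random, 4 KiB-aligned offset.
                    let off = ((t * 7919 + i).wrapping_mul(2654435761) % len) & !4095;
                    file.read_at(&mut buf, off).expect("read_at failed");
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    Ok(())
}
```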
The performance improvements don't have to be limited to the IO backends; it's just that I have mostly worked on those and understand them most deeply. If anyone wants to cover other parts of the system, I welcome additions to this issue and will rename it accordingly.
Steps
An initial list of steps, in rough order
Issues that depend on this issue or would be affected by it
no_std
#442 will hopefully come about as a side effect of this: since it involves custom allocation and I would like to use the allocator_api, we could just use alloc and core everywhere instead of std.
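To illustrate, here is a nightly-only sketch of what allocator-generic code over just alloc and core could look like; PageBuf and make_page are hypothetical names, not existing Limbo types:

```rust
#![feature(allocator_api)] // nightly only
extern crate alloc;

use alloc::alloc::Global;
use alloc::vec::Vec;
use core::alloc::Allocator;

// Hypothetical page buffer that is generic over its allocator and
// depends only on `alloc` + `core`, never `std`.
struct PageBuf<A: Allocator> {
    data: Vec<u8, A>,
}

fn make_page<A: Allocator>(allocator: A, len: usize) -> PageBuf<A> {
    let mut data = Vec::with_capacity_in(len, allocator);
    data.resize(len, 0);
    PageBuf { data }
}

fn main() {
    // `Global` here, but a custom arena/pool allocator slots in the same way.
    let page = make_page(Global, 4096);
    assert_eq!(page.data.len(), 4096);
}
```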