Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BufRing: the buf_ring provided buffer pool #245

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

FrankReh
Copy link
Collaborator

Adds new types: bufring::{BufRing, Builder}, bufgroup:{BufX, Bgid, Bid}

(BufX was chosen as the name of the buffer type of this pool
simply to make it short and easy to grep for.)

bufgroup may be better named provbuf, or provided_buffers, down the
road, especially if we intent to support the other, older, much less
efficient provided buffer mechanism. But the liburing team
specifically recommends the buf_ring implementation for ease of use
and performance reasons so we may not want to incur the overhead of
supporting both forms.

bufring could become a trait down the road, where different BufRing
implementations can be provided. But to avoid the extra decision
making that comes with designing traits, and to keep the code
relatively easy to follow, these are all concrate types.

Adds register and unregister functionality to BufRing and to the
tokio_uring driver.

Adds experimental recv, recv_provbuf methods to TcpStream.

recv_provbuf is a purposefully ugly name. It will be replaced by
something ... maybe a recv builder sometimme soon.

Whatever is chosen would then be copied for Udp, Unix and for all
the other operations that optionally take a provided buffer pool id.

Adds 'net' unit tests. All are admittedly simple ping/pong tests where
only the clients' received lengths are checked, not the actual data.

Adds a tests/common area.

Adds a test case that uses two std::threads, where each thread runs its
own tokio_uring runtime and its own buf_ring provided buffer pool.

The two-thread case is made long, with many clients, sending many
large messages to the server and getting them back, in order to see
gross performance impacts when changing things. It takes 3s on my
machine. Before going into mainline, the numbers would be changed so
it took no more than the other unit tests, so about 10ms.

Some TODOs left to cleanup. Primarily how to build more meaningful errors when a register fails.

The buffer allocation is made as a single allocation, along with the
ring.

The buffer group id, bgid, also sometimes called the provided buffer
group id, is manually selected through the builder api. There is no
mechanism to pick one automatically. That could be added later but is
not really necessary for this feature to be useful.

This first implementation is without traits and without public
interfaces that would let a user create a different kind of buf_ring or
a different kind of provided buffers pool.

There's a question to the liburing team outstanding about how to
interpret an unexpected cqe result of res=0 and flags=4.

Adds new types: bufring::{BufRing, Builder}, bufgroup:{BufX, Bgid, Bid}

    (BufX was chosen as the name of the buffer type of this pool
    simply to make it short and easy to grep for.)

    bufgroup may be better named provbuf, or provided_buffers, down the
    road, especially if we intent to support the other, older, much less
    efficient provided buffer mechanism. But the liburing team
    specifically recommends the buf_ring implementation for ease of use
    and performance reasons so we may not want to incur the overhead of
    supporting both forms.

    bufring could become a trait down the road, where different BufRing
    implementations can be provided. But to avoid the extra decision
    making that comes with designing traits, and to keep the code
    relatively easy to follow, these are all concrate types.

Adds register and unregister functionality to BufRing and to the
tokio_uring driver.

Adds experimental recv, recv_provbuf methods to TcpStream.

    recv_provbuf is a purposefully ugly name. It will be replaced by
    something ... maybe a recv builder sometimme soon.

    Whatever is chosen would then be copied for Udp, Unix and for all
    the other operations that optionally take a provided buffer pool id.

Adds 'net' unit tests. All are admittedly simple ping/pong tests where
only the clients' received lengths are checked, not the actual data.

Adds a tests/common area.

Adds a test case that uses two std::threads, where each thread runs its
own tokio_uring runtime and its own buf_ring provided buffer pool.

    The two-thread case is made long, with many clients, sending many
    large messages to the server and getting them back, in order to see
    gross performance impacts when changing things. It takes 3s on my
    machine. Before going into mainline, the numbers would be changed so
    it took no more than the other unit tests, so about 10ms.

Many TODOs left to cleanup. Primarily Safety rationalizations.

The buffer allocation is made as a single allocation, along with the
ring.

The buffer group id, bgid, also sometimes called the provided buffer
group id, is manually selected through the builder api. There is no
mechanism to pick one automatically. That could be added later but is
not really necessary for this feature to be useful.

This first implementation is without traits and without public
interfaces that would let a user create a different kind of buf_ring or
a different kind of `provided buffers` pool.

There's a question to the liburing team outstanding about how to
interpret an unexpected cqe result of res=0 and flags=4.
@FrankReh FrankReh mentioned this pull request Feb 26, 2023
@FrankReh FrankReh mentioned this pull request Feb 28, 2023
4 tasks
@ollie-etl
Copy link
Contributor

@FrankReh Do you an interest in rebasing this? I'd love to see this merged


let align: usize = page_size; // alignment must be at least the page size
let align = align.next_power_of_two();
let layout = std::alloc::Layout::from_size_align(tot_size, align).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must the ring and buffers be in contiguous memory? If for instance, I wished to allocate a 1GB aligned buffer, to get tansparent hugepages to kick in, this would push the total allocation to 2GB, which would be enormously wasteful.

@FrankReh
Copy link
Collaborator Author

@FrankReh Do you an interest in rebasing this? I'd love to see this merged

Thanks for considering the PR. Currently I'm not able to rebase it. Feel free to adapt it or even close it so someone wanting to could start over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants