
Feature: reserve thread for kernel reads that miss the cache #132

Open
JasonWoof opened this issue Sep 12, 2020 · 2 comments

Comments

@JasonWoof

Thanks for making s3backer! It seems unreasonably cool to me.

I set up a new s3backer drive/mount with these commands:

s3backer --blockSize=4k --size=240g --listBlocks --blockCacheFile=/media/s3backer/cache --blockCacheMaxDirty=500 --blockCacheThreads=4 --blockCacheRecoverDirtyBlocks --debug --debug-http --vhost --baseURL=https://us-east-1.linodeobjects.com/ bucket -- /media/s3backer/s3block
time mkfs.ext4 -E nodiscard -m 0.001 /media/s3backer/s3block/file
mount -o loop -o discard /media/s3backer/s3block/file /media/s3backer/s3files

That should give me a write cache of around 2 MB, right?
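For reference, here is the arithmetic behind that estimate, assuming `--blockCacheMaxDirty` bounds the number of dirty blocks and each block is `--blockSize` bytes (my reading of the options, not a statement from the docs):

```shell
# Rough upper bound on dirty (not-yet-uploaded) data in the block cache:
# blockSize x blockCacheMaxDirty, using the values from the command above.
block_size_kb=4          # --blockSize=4k
max_dirty_blocks=500     # --blockCacheMaxDirty=500
echo "$(( block_size_kb * max_dirty_blocks )) kB (~2 MB)"
# prints: 2000 kB (~2 MB)
```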

Then I used rsync to copy about 200 MB of local data onto the s3backer drive. Predictably, it went really fast for a while, then started blocking periodically, presumably waiting for space in the s3backer write cache.

While that was running there was a large lag (5-10 seconds) when reading from the s3backer drive (e.g. catting a small text file).

I was not particularly surprised by this, since I did overwhelm the cache. If I use s3backer for serious things I will have a much larger disk cache.

But what surprised me more, and is a problem I don't see how to solve, is this: after rsync finished, but before my bucket storage stopped rising (so presumably while the s3backer write cache was still being uploaded to S3), there was still a large lag when catting small text files from the s3backer drive. During this period (after rsync completed, while S3 storage was going up and reads were laggy), a lot more than 2 MB was uploaded to S3. So maybe there's something I don't understand: am I wrong that my command-line arguments above specify a ~2 MB write cache? Is the kernel caching writes too? Something else?
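One way to check the "is the kernel caching writes too?" theory is to watch the kernel's own dirty-page counters while the copy is in flight. This is generic Linux page-cache inspection, not anything s3backer-specific:

```shell
# Watch kernel dirty/writeback page totals while rsync runs.
# Large Dirty/Writeback values mean the kernel page cache is buffering
# writes on top of s3backer's own ~2 MB dirty-block cache.
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Force the kernel to flush its buffered writes down to s3backer;
# if this takes a long time, the kernel was holding a lot of dirty data.
sync
```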

I'm hoping there's a way that reads from s3backer can be reasonably performant while there are a number of dirty blocks on their way to S3.

I wonder if this could be solved by having s3backer reserve one thread for downloading from S3 immediately when (and only when) the kernel is requesting a read that cannot be fulfilled by the cache.

The kernel only makes one read at a time, right? If so, it would only take a single thread to handle these reads ASAP. If the kernel can request multiple reads at once, I'd love the ability to set the maximum number of write threads separately from read threads (or a minimum number of threads reserved for reading, or some such).

@archiecobbs (Owner)

First of all, it's definitely true that the kernel caches a bunch of data. Exactly what and how much is a little murky to me, so I can't say definitively that this explains everything. Note you've got the kernel buffer cache, and then a separate cache related to the filesystem (e.g., journaling or whatever).

Otherwise, I'm not sure exactly what is happening in your scenario, because I'm not an expert on how FUSE works internally. All of the request threads come from FUSE via its own private kernel interaction, and I'm not sure how that translates into the request thread(s) that s3backer ends up seeing.

One would hope that different I/O operations would each get their own threads, but the kernel caches and filesystem create plenty of wiggle room.

What I do believe is that s3backer should be as multi-threaded as possible given whatever kernel request threads come into it from FUSE. E.g., if there are two request threads, one reading and one writing, and they aren't conflicting (i.e., they touch disjoint blocks), then s3backer should allow them to proceed in parallel.

Maybe a good question for the FUSE mailing list?

@archiecobbs (Owner)

You might want to try the new NBD support with the --nbd flag in the current master branch. NBD supports more concurrency than FUSE.
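For anyone trying this, a rough sketch of what an NBD-based setup might look like; the device name and exact invocation here are assumptions on my part, so check the current README for the real syntax:

```shell
# Sketch only -- /dev/nbd0 and the option spelling are assumptions,
# not taken from the s3backer docs. Requires the kernel NBD driver.
modprobe nbd

# Attach the bucket as a network block device instead of a FUSE file:
s3backer --nbd --blockSize=4k --size=240g \
    --blockCacheFile=/media/s3backer/cache \
    bucket /dev/nbd0

# Then mount the block device directly, with no FUSE layer in the path:
mount /dev/nbd0 /media/s3backer/s3files
```

The appeal is that the NBD kernel driver can keep many requests in flight, so a slow cache-missing read is less likely to queue behind writes.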
