
Feature: reserve thread for kernel reads that miss the cache #132

Open
JasonWoof opened this issue Sep 12, 2020 · 2 comments

Comments

@JasonWoof

Thanks for making s3backer! It seems unreasonably cool to me.

I set up a new s3backer drive/mount with these commands:

s3backer --blockSize=4k --size=240g --listBlocks --blockCacheFile=/media/s3backer/cache --blockCacheMaxDirty=500 --blockCacheThreads=4 --blockCacheRecoverDirtyBlocks --debug --debug-http --vhost --baseURL=https://us-east-1.linodeobjects.com/ bucket -- /media/s3backer/s3block
time mkfs.ext4 -E nodiscard -m 0.001 /media/s3backer/s3block/file
mount -o loop -o discard /media/s3backer/s3block/file /media/s3backer/s3files

That should give me a write cache of around 2 MB, right?
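For reference, here is the arithmetic behind that estimate, assuming `--blockCacheMaxDirty` bounds the number of dirty blocks and each block is `--blockSize` bytes (my reading of the options, not a statement from the docs):

```shell
# Rough upper bound on dirty (not-yet-uploaded) data in the block cache:
# blockSize x blockCacheMaxDirty, using the values from the command above.
block_size_kb=4          # --blockSize=4k
max_dirty_blocks=500     # --blockCacheMaxDirty=500
echo "$(( block_size_kb * max_dirty_blocks )) kB (~2 MB)"
# prints: 2000 kB (~2 MB)
```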

Then I used rsync to copy about 200 MB of local data onto the s3backer drive. Predictably, it went really fast for a while, then started blocking periodically, presumably waiting for space in the s3backer write cache.

While that was running there was a large lag (5-10 seconds) when reading from the s3backer drive (e.g. catting a small text file).

I was not particularly surprised by this, since I did overwhelm the cache. If I use s3backer for serious things I will have a much larger disk cache.

But what surprised me more, and is a problem I don't see how to solve, is this: after rsync finished, but before my bucket storage stopped rising (so presumably while the s3backer write cache was still being uploaded to S3), there was still a large lag when catting small text files from the s3backer drive. During this period (after rsync completed, while S3 storage was going up and reads were laggy), a lot more than 2 MB was uploaded to S3. So maybe there's something I don't understand: am I wrong that my command-line arguments above specify a ~2 MB write cache? Is the kernel caching writes too? Something else?
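One way to check the "is the kernel caching writes too?" theory is to watch the kernel's own dirty-page counters while the copy is in flight. This is generic Linux page-cache inspection, not anything s3backer-specific:

```shell
# Watch kernel dirty/writeback page totals while rsync runs.
# Large Dirty/Writeback values mean the kernel page cache is buffering
# writes on top of s3backer's own ~2 MB dirty-block cache.
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Force the kernel to flush its buffered writes down to s3backer;
# if this takes a long time, the kernel was holding a lot of dirty data.
sync
```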

I'm hoping there's a way that reads from s3backer can be reasonably performant while there are a number of dirty blocks on their way to S3.

I wonder if this could be solved by having s3backer reserve one thread for downloading from S3 immediately when (and only when) the kernel is requesting a read that cannot be fulfilled by the cache.

The kernel only makes one read at a time, right? If so, it would only take a single thread to handle these reads ASAP. If the kernel can request multiple reads at once, I'd love the ability to set the maximum number of write threads separately from read threads (or a minimum number of threads reserved for reading, or some such).

@archiecobbs (Owner)

First of all, it's definitely true that the kernel caches a bunch of data. Exactly what and how much is a little murky to me, so I can't say definitively that this explains everything. Note you've got the kernel buffer cache, and then a separate cache related to the filesystem (e.g., journaling or whatever).

Otherwise, I'm not sure exactly what is happening in your scenario, because I'm not an expert on how FUSE works internally. All of the request threads come from FUSE via its own private kernel interaction, and I'm not sure how that translates into the request thread(s) that s3backer ends up seeing.

One would hope that different I/O operations would each get their own threads, but the kernel caches and filesystem create plenty of wiggle room.

What I do believe is that s3backer should be as multi-threaded as possible given whatever kernel request threads come into it from FUSE. E.g., if there are two request threads, one reading and one writing, and they aren't conflicting (i.e., they touch disjoint blocks), then s3backer should allow them to proceed in parallel.

Maybe a good question for the FUSE mailing list?

@archiecobbs (Owner)

You might want to try the new NBD support with the --nbd flag in the current master branch. NBD supports more concurrency than FUSE.
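For anyone trying this, a rough sketch of what an NBD-based setup might look like; the device name and exact invocation here are assumptions on my part, so check the current README for the real syntax:

```shell
# Sketch only -- /dev/nbd0 and the option spelling are assumptions,
# not taken from the s3backer docs. Requires the kernel NBD driver.
modprobe nbd

# Attach the bucket as a network block device instead of a FUSE file:
s3backer --nbd --blockSize=4k --size=240g \
    --blockCacheFile=/media/s3backer/cache \
    bucket /dev/nbd0

# Then mount the block device directly, with no FUSE layer in the path:
mount /dev/nbd0 /media/s3backer/s3files
```

The appeal is that the NBD kernel driver can keep many requests in flight, so a slow cache-missing read is less likely to queue behind writes.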
