
[enhancement] zero_cache should write a full block when writing into all-zero block, thus avoiding read-modify-write cycle #201

Open
xmrk-btc opened this issue Jan 15, 2023 · 3 comments


@xmrk-btc

When zero_cache performs a write_block_part into an all-zero block, it passes the request down to the block cache, which may not have the full block and therefore has to read it from remote storage. That read is unnecessary: zero_cache already knows the block is all zeroes, so it can reconstruct the whole block itself and avoid both the read and its latency.
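
As a rough sketch of what I mean (stand-in names and types here, not s3backer's actual internals), zero_cache could handle the partial write itself whenever it knows the block is all zeroes:

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Stand-in types: the real s3backer layered store API differs */
    struct store {
        int (*write_block)(struct store *s, uint64_t num, const void *buf);
        int (*write_block_part)(struct store *s, uint64_t num,
            unsigned off, unsigned len, const void *src);
    };

    struct zero_cache_private {
        size_t block_size;
        struct store *inner;                        /* next layer down */
    };

    /* Hypothetical helper: true if the zero cache knows this block is all zeroes */
    extern int block_is_known_zero(struct zero_cache_private *priv, uint64_t num);

    static int
    zero_cache_write_block_part(struct zero_cache_private *priv,
        uint64_t block_num, unsigned off, unsigned len, const void *src)
    {
        if (block_is_known_zero(priv, block_num)) {
            void *buf;
            int r;

            /* Zeroed buffer + the written part = the complete block */
            if ((buf = calloc(1, priv->block_size)) == NULL)
                return errno;
            memcpy((char *)buf + off, src, len);

            /* One full-block write to the next layer; no read required */
            r = (*priv->inner->write_block)(priv->inner, block_num, buf);
            free(buf);
            return r;
        }

        /* Not known-zero: fall through to the normal partial-write path */
        return (*priv->inner->write_block_part)(priv->inner, block_num, off, len, src);
    }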

A little test: zero the first 200 blocks (s3backer's block size is 64k):
$ dd if=/dev/zero of=mount/file bs=64k count=200

Write into the middle of the 2nd block and watch the log:

$ dd if=/dev/urandom of=mount/file bs=1k count=1 seek=100
$ journalctl -f -t s3backer
Jan 15 16:13:12 ubuntu s3backer[3679793]: GET https://objects-us-east-1.dream.io/test3/00000001
Jan 15 16:13:12 ubuntu s3backer[3679793]: rec'd 404 response: GET https://objects-us-east-1.dream.io/test3/00000001
Jan 15 16:13:12 ubuntu s3backer[3679793]: PUT https://objects-us-east-1.dream.io/test3/00000001
Jan 15 16:13:13 ubuntu s3backer[3679793]: success: PUT https://objects-us-east-1.dream.io/test3/00000001

And I want to get rid of that GET request. (Sometimes the GET does not appear, probably because I ran various tests on the same bucket, so there may be stale data that got DELETEd and is still cached in block_cache.)

The real-world motivation here is latency: when resilvering ZFS after implementing sub-block hole punching, the resilver was still slow (around 500 kB/s) without much disk activity, and I saw the http_zero_blocks_read stat incrementing by 1 or 2 per second. So I suspect the latency of the read-modify-write cycle was killing performance.

@archiecobbs
Owner

Good idea! Should be fixed in e0d21a9.

Regarding locking, this is not a problem. The race condition would be between two filesystem threads accessing the same s3backer block at the same time, and that is now absorbed at the top layer; see d949846.
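
For illustration, here's a rough sketch of that sort of top-layer, per-block serialization (hypothetical code, not what d949846 actually does): a thread "checks out" a block number before operating on it, and any other thread touching the same block waits until it's checked back in.

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_INFLIGHT 128            /* arbitrary cap for this sketch */

    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
    static uint64_t inflight[MAX_INFLIGHT];
    static int num_inflight;

    static bool
    block_is_inflight(uint64_t block_num)
    {
        for (int i = 0; i < num_inflight; i++) {
            if (inflight[i] == block_num)
                return true;
        }
        return false;
    }

    /* Acquire exclusive access to one block; waits if another thread has it */
    void
    block_checkout(uint64_t block_num)
    {
        pthread_mutex_lock(&mutex);
        while (block_is_inflight(block_num) || num_inflight == MAX_INFLIGHT)
            pthread_cond_wait(&cond, &mutex);
        inflight[num_inflight++] = block_num;
        pthread_mutex_unlock(&mutex);
    }

    /* Release the block and wake any waiters */
    void
    block_checkin(uint64_t block_num)
    {
        pthread_mutex_lock(&mutex);
        for (int i = 0; i < num_inflight; i++) {
            if (inflight[i] == block_num) {
                inflight[i] = inflight[--num_inflight];
                break;
            }
        }
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&mutex);
    }

With something like this at the top layer, the zero cache's check-then-write sequence on a block can't interleave with another thread's write to the same block.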

@archiecobbs
Owner

archiecobbs commented Jan 16, 2023

On second thought, I don't think this idea is actually safe.

Instead, the safer (and simpler) fix is just to eliminate the function zero_cache_write_block_part(). Then the function block_part_write_block_part() will perform a read-modify-write, but the read will be intercepted by the zero cache and therefore won't generate any network traffic.
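
To make that concrete, here's a rough sketch (stand-in names and types, not the actual s3backer code) of the resulting write path; the key point is that the read half of the read-modify-write is satisfied by the zero cache in memory:

    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE 65536            /* 64k, as in the test above */

    /* Stand-in store type; the real s3backer API differs */
    struct store {
        int (*read_block)(struct store *s, uint64_t num, void *buf);
        int (*write_block)(struct store *s, uint64_t num, const void *buf);
    };

    static int
    block_part_write_block_part(struct store *s3b, uint64_t block_num,
        unsigned off, unsigned len, const void *src)
    {
        char buf[BLOCK_SIZE];
        int r;

        /* Read: for a known-zero block this is intercepted by the zero
         * cache, which just zero-fills the buffer -- no GET is issued */
        if ((r = (*s3b->read_block)(s3b, block_num, buf)) != 0)
            return r;

        /* Modify the requested byte range */
        memcpy(buf + off, src, len);

        /* Write the complete block back down */
        return (*s3b->write_block)(s3b, block_num, buf);
    }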

Updated in bb65ec2.

@archiecobbs
Owner

Thinking about this even more, I'm still a little unsure. My second patch, while safe from race conditions, might also bring back the write amplification problem.

This needs more thought, e.g., consolidating the block cache and the zero cache into a single entity.
