Commit a9c8ea9

megari authored and kevmw committed
block/blklogwrites: Fix a bug when logging "write zeroes" operations.
There is a bug in the blklogwrites driver pertaining to logging "write zeroes" operations, causing log corruption. This can be easily observed by setting detect-zeroes to something other than "off" for the driver.

The issue is caused by a concurrency bug pertaining to the fact that "write zeroes" operations have to be logged in two parts: first the log entry metadata, then the zeroed-out region. While the log entry metadata is being written by bdrv_co_pwritev(), another operation may begin in the meanwhile and modify the state of the blklogwrites driver. This is as intended by the coroutine-driven I/O model in QEMU, of course.

Unfortunately, this specific scenario is mishandled. A short example:

1. Initially, in the current operation (#1), the current log sector number in the driver state is only incremented by the number of sectors taken by the log entry metadata, after which the log entry metadata is written. The current operation yields.
2. Another operation (#2) may start while the log entry metadata is being written. It uses the current log position as the start offset for its log entry. This is in the sector right after the operation #1 log entry metadata, which is bad!
3. After bdrv_co_pwritev() returns (#1), the current log sector number is reread from the driver state in order to find out the start offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the offset will be the sector right after the (misplaced) operation #2 log entry, which means that the zeroed-out region begins at the wrong offset.
4. As a result of the above, the log is corrupt.

Fix this by only reading the driver metadata once, computing the offsets and sizes in one go (including the optional zeroed-out region) and setting the log sector number to the appropriate value for the next operation in line.
Signed-off-by: Ari Sundholm <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
1 parent 5bab95d commit a9c8ea9
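
The interleaving described in the commit message can be replayed with plain sector arithmetic. The following is only an illustrative sketch, not QEMU code: `struct log_state`, `struct placement`, `simulate_old()` and `simulate_fixed()` are hypothetical names, sizes are given directly in sectors, and operation #2 is assumed to be an ordinary write with no zeroed-out region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver state (not QEMU's struct). */
struct log_state {
    uint64_t cur_log_sector;
};

/* Start sectors of the three writes in the commit message's example. */
struct placement {
    uint64_t op1_meta;   /* operation #1 log entry metadata */
    uint64_t op2_entry;  /* operation #2 log entry */
    uint64_t op1_zeroes; /* operation #1 zeroed-out region */
};

/* Old scheme: the log position is advanced and reread around each write. */
static struct placement simulate_old(uint64_t start, uint64_t meta_sectors,
                                     uint64_t zero_sectors,
                                     uint64_t op2_sectors)
{
    struct log_state s = { start };
    struct placement p;

    /* Step 1: #1 advances past its metadata only, then yields in the write. */
    p.op1_meta = s.cur_log_sector;
    s.cur_log_sector += meta_sectors;

    /* Step 2: #2 starts meanwhile and takes the current position. */
    p.op2_entry = s.cur_log_sector;
    s.cur_log_sector += op2_sectors;

    /* Step 3: #1 resumes and rereads the position for its zeroed region. */
    p.op1_zeroes = s.cur_log_sector;
    s.cur_log_sector += zero_sectors;

    return p;
}

/* Fixed scheme: #1 reserves metadata and zeroed region before any I/O. */
static struct placement simulate_fixed(uint64_t start, uint64_t meta_sectors,
                                       uint64_t zero_sectors,
                                       uint64_t op2_sectors)
{
    struct log_state s = { start };
    struct placement p;

    p.op1_meta = s.cur_log_sector;
    p.op1_zeroes = p.op1_meta + meta_sectors;
    s.cur_log_sector += meta_sectors + zero_sectors;

    /* The reservation for #1 ends exactly at the new log position. */
    assert(p.op1_zeroes + zero_sectors == s.cur_log_sector);

    /* #2 starts while #1 is still writing, but now lands after #1's entry. */
    p.op2_entry = s.cur_log_sector;
    s.cur_log_sector += op2_sectors;

    return p;
}
```

With a start sector of 10, one metadata sector, eight zeroed sectors and a one-sector operation #2, the old scheme places #2 at sector 11 (inside the range that should hold the zeroed region of #1) and the zeroed region at sector 12. The fixed scheme places the zeroed region at sector 11 and #2 at sector 19.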

1 file changed: +26 -9 lines changed

block/blklogwrites.c

Lines changed: 26 additions & 9 deletions
@@ -328,22 +328,39 @@ static void coroutine_fn GRAPH_RDLOCK
 blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
 {
     BDRVBlkLogWritesState *s = lr->bs->opaque;
-    uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
 
-    s->nr_entries++;
-    s->cur_log_sector +=
-            ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
+    /*
+     * Determine the offsets and sizes of different parts of the entry, and
+     * update the state of the driver.
+     *
+     * This needs to be done in one go, before any actual I/O is done, as the
+     * log entry may have to be written in two parts, and the state of the
+     * driver may be modified by other driver operations while waiting for the
+     * I/O to complete.
+     */
+    const uint64_t entry_start_sector = s->cur_log_sector;
+    const uint64_t entry_offset = entry_start_sector << s->sectorbits;
+    const uint64_t qiov_aligned_size = ROUND_UP(lr->qiov->size, s->sectorsize);
+    const uint64_t entry_aligned_size = qiov_aligned_size +
+        ROUND_UP(lr->zero_size, s->sectorsize);
+    const uint64_t entry_nr_sectors = entry_aligned_size >> s->sectorbits;
 
-    lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
+    s->nr_entries++;
+    s->cur_log_sector += entry_nr_sectors;
+
+    /*
+     * Write the log entry. Note that if this is a "write zeroes" operation,
+     * only the entry header is written here, with the zeroing being done
+     * separately below.
+     */
+    lr->log_ret = bdrv_co_pwritev(s->log_file, entry_offset, lr->qiov->size,
                                   lr->qiov, 0);
 
     /* Logging for the "write zeroes" operation */
     if (lr->log_ret == 0 && lr->zero_size) {
-        cur_log_offset = s->cur_log_sector << s->sectorbits;
-        s->cur_log_sector +=
-            ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
+        const uint64_t zeroes_offset = entry_offset + qiov_aligned_size;
 
-        lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
+        lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, zeroes_offset,
                                             lr->zero_size, 0);
     }
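
The offset computation introduced by the patch can be tried out in isolation. The block below is a minimal sketch under stated assumptions: `struct entry_layout`, `compute_layout()` and `ROUND_UP_POW2` are hypothetical names defined locally for illustration (QEMU's own `ROUND_UP` macro and driver structures are not reproduced), and the sector size is assumed to be a power of two equal to `1 << sectorbits`.

```c
#include <assert.h>
#include <stdint.h>

/* Local helper, assumed valid only for power-of-two alignments. */
#define ROUND_UP_POW2(n, d) \
    (((uint64_t)(n) + (uint64_t)(d) - 1) & ~((uint64_t)(d) - 1))

/* Hypothetical result type: where one log entry goes and what follows it. */
struct entry_layout {
    uint64_t entry_offset;  /* byte offset of the entry metadata */
    uint64_t zeroes_offset; /* byte offset of the zeroed-out region */
    uint64_t next_sector;   /* log position for the next operation */
};

/*
 * Compute every offset from a single read of the current log sector,
 * following the same arithmetic as the patched function.
 */
static struct entry_layout compute_layout(uint64_t cur_log_sector,
                                          unsigned sectorbits,
                                          uint64_t qiov_size,
                                          uint64_t zero_size)
{
    assert(sectorbits < 64);

    const uint64_t sectorsize = UINT64_C(1) << sectorbits;
    const uint64_t qiov_aligned = ROUND_UP_POW2(qiov_size, sectorsize);
    const uint64_t entry_aligned =
        qiov_aligned + ROUND_UP_POW2(zero_size, sectorsize);
    struct entry_layout l;

    l.entry_offset = cur_log_sector << sectorbits;
    l.zeroes_offset = l.entry_offset + qiov_aligned;
    l.next_sector = cur_log_sector + (entry_aligned >> sectorbits);

    return l;
}
```

For example, with 512-byte sectors (`sectorbits` of 9), a current log sector of 10, a 64-byte metadata buffer and a 4096-byte zeroed region, the metadata starts at byte 5120, the zeroed region at byte 5632, and the next operation begins at sector 19. When `zero_size` is 0, `zeroes_offset` is still computed but would simply go unused, as in the patch.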
