Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memcpy_to_pmem_nocache does not persist unaligned bytes #105

Open
pcworld opened this issue Jun 23, 2021 · 4 comments
Open

memcpy_to_pmem_nocache does not persist unaligned bytes #105

pcworld opened this issue Jun 23, 2021 · 4 comments

Comments

@pcworld
Copy link
Contributor

pcworld commented Jun 23, 2021

NOVA's function memcpy_to_pmem_nocache is commonly used in NOVA, and is defined as follows:

linux-nova/fs/nova/nova.h

Lines 259 to 267 in 587e252

static inline int memcpy_to_pmem_nocache(void *dst, const void *src,
unsigned int size)
{
int ret;
ret = __copy_from_user_inatomic_nocache(dst, src, size);
return ret;
}

The naming seems to imply that this function guarantees the cache to be bypassed and thus the data to be persisted without need of explicit flushing. And indeed, this assumption is followed in at least one place of NOVA, as shown later.

Following the definition of __copy_from_user_inatomic_nocache (on x86_64) to __copy_user_nocache, we find the following documentation:

/*
* copy_user_nocache - Uncached memory copy with exception handling
* This will force destination out of cache for more performance.
*
* Note: Cached memory copy is used when destination or size is not
* naturally aligned. That is:
* - Require 8-byte alignment when size is 8 bytes or larger.
* - Require 4-byte alignment when size is 4 bytes.
*/
ENTRY(__copy_user_nocache)

Following the assembly code, we find that:

  • If the size is < 8 bytes, size must be 0 or 4 bytes and aligned at 4 bytes to ensure that everything is copied using non-temporal instructions.
  • If the size is >= 8 bytes, the destination start must be aligned at 8 bytes (or else the ALIGN_DESTINATION macro will perform cached copy) and the size must be a multiple of 4 bytes to ensure that everything is copied using non-temporal instructions.

Some places in NOVA use memcpy_to_pmem_nocache but still perform explicit flushing, e.g. through the nova_update_entry_csum function, which implicitly flushes the buffer. These cases could be considered performance bugs, but do not affect crash consistency.

However, at least one place exists in which use of memcpy_to_pmem_nocache currently leads to a violation of crash consistency guarantees (I have not thoroughly looked for other cases so there might be more):
nova_cow_file_write calls do_nova_cow_file_write which calls memcpy_to_pmem_nocache to copy file contents to persistent memory and does not perform further flushing.

Consider this shell script:

mount -t NOVA -o init /dev/pmem0 /mnt
echo HelloWorld > /mnt/myfile
sync

The string HelloWor (8 bytes) is copied using movnti, however the rest of the bytes l, d, \n are copied using a temporal mov instruction. Thus, the last three bytes are not guaranteed to be persisted. If the system crashes after the sync system call succeeded but the last three bytes have not yet been persisted, recovering the file system and reading /mnt/myfile will only yield HelloWor\0\0\0 as the file content, which violates the guarantees of the sync system call.

Suggested fix

In NOVA's memcpy_to_pmem_nocache, after the call to __copy_from_user_inatomic_nocache: If the function arguments do not fulfill the alignment requirements outlined above, explicitly call nova_flush_buffer with the bytes at the beginning and/or the end of the buffer.

@Andiry
Copy link
Contributor

Andiry commented Jun 29, 2021

Thanks for the findings. Care to send a PR?

@pcworld
Copy link
Contributor Author

pcworld commented Oct 20, 2021

I'm short on time, but there is the following workaround (it performs unnecessary flushes, but is at least correct hopefully):

--- a/fs/nova/nova.h
+++ b/fs/nova/nova.h
@@ -263,6 +263,9 @@ static inline int memcpy_to_pmem_nocache(void *dst, const void *src,
 
        ret = __copy_from_user_inatomic_nocache(dst, src, size);
 
+       nova_flush_buffer(dst, size, 1);
+
        return ret;
 }

A better fix would be to only call nova_flush_buffer on beginning/end where alignment does not match. The code for that should be simple, but needs to be carefully written and tested to be completely sure it works in all cases, which is why I haven't submitted a PR yet.

@Andiry
Copy link
Contributor

Andiry commented Dec 7, 2021

I think the proposed fix is too heavy. I think for most of the time we do meet the alignment requirements, right? Perhaps performing nova_flush_buffer here with a check of alignments?

@pcworld
Copy link
Contributor Author

pcworld commented Dec 7, 2021

I think for most of the time we do meet the alignment requirements, right?

I have also observed other cases where this bug hits: Symlink creation (target name), and also renaming files to long file names (longer than cache line size) or creating new files with long names.

I have found that PMFS already specifically handles this issue: https://github.com/linux-pmfs/pmfs/blob/28aaf606ff076080a72174e9f5a7be13979a8b97/fs/pmfs/xip.c#L81-L84
Perhaps pmfs_flush_edge_cachelines can just be reused in NOVA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants