`read_exact_from` and `write_all_to` from `GuestMemory` don't have the correct semantics #174

sboeuf · 2021-09-27T08:03:25Z

When using the GuestMemory implementation of read_exact_from and write_all_to, I was expecting they would handle the retry, but they simply return an Error saying they couldn't read or write the entire buffer. That's the wrong behavior IMO, especially because of the naming that suggest they should behave exactly as the Rust traits Read and Write.

The proper solution would be to call underlying functions read_exact_from and write_all_to from the GuestRegionMmap implementation after we found the regions affected by the range, leading to a proper retry when not all data has been read or written.

/cc @andreeaflorescu @alexandruag @jiangliu

The text was updated successfully, but these errors were encountered:

Both read_exact_from() and write_all_to() functions from the GuestMemory trait implementation in vm-memory are buggy. They should retry until they wrote or read the amount of data that was expected, but instead they simply return an error when this happens. This causes the migration to fail when trying to send important amount of data through the migration socket, due to large memory regions. This should be eventually fixed in vm-memory, and here is the link to follow up on the issue: rust-vmm/vm-memory#174 Signed-off-by: Sebastien Boeuf <[email protected]>

bonzini · 2021-09-27T08:14:07Z

I agree, they should retry as long as read_from/write_to return Ok.

However, there's the problem of returning the number of bytes processed before an IOError. It would be best to change IOError to { error: io::Error, processed: usize } and implementing From<io::Error> for guest_memory::Error. That change is an API breakage, but somewhat unlikely to cause problems.

andreeaflorescu · 2021-09-27T08:34:10Z

Is this issue related to #171?

sboeuf · 2021-09-27T08:42:33Z

@andreeaflorescu

Is this issue related to #171?

I don't think so.

Both read_exact_from() and write_all_to() functions from the GuestMemory trait implementation in vm-memory are buggy. They should retry until they wrote or read the amount of data that was expected, but instead they simply return an error when this happens. This causes the migration to fail when trying to send important amount of data through the migration socket, due to large memory regions. This should be eventually fixed in vm-memory, and here is the link to follow up on the issue: rust-vmm/vm-memory#174 Signed-off-by: Sebastien Boeuf <[email protected]>

alindima · 2021-09-27T13:35:43Z

That's the wrong behavior IMO, especially because of the naming that suggest they should behave exactly as the Rust traits Read and Write.

The rust trait documentation says that ErrorKind::Interrupted errors are handled with a retry. Any other IOError is not and results in returning an Error from the function. Moreover, it says that the contents of the buffer are undefined in this case.
See: https://doc.rust-lang.org/std/io/trait.Read.html#errors-3
So I'm not sure that this would be the right semantic for read_exact_from and write_all_to.
I don't think we should be hiding unrecoverable errors, because this call could block indefinitely

I agree, they should retry as long as read_from/write_to return Ok.

I agree as well, but this is already true. read_exact_from calls read_from only once, but read_from calls read in a try_access, which performs it in a loop as long as there isn't an error or until we're done reading. This is also the root-cause of this issue: #171

bonzini · 2021-09-27T14:22:30Z

I don't think we should be hiding unrecoverable errors, because this call could block indefinitely

Indeed, absolutely not. So I'm not sure what can/should be changed, except possibly adjusting the error type.

bonzini · 2021-09-27T14:29:46Z

Looking at the cloud-hypervisor commit, that seems like cloud-hypervisor is setting sockets to non-blocking mode? If so, that is certainly something that read_exact cannot handle, and neither should read_exact_from. cloud-hypervisor instead must handle WouldBlock correctly.

sboeuf mentioned this issue Sep 27, 2021

vmm: Fix migration when writing/reading big chunks of data cloud-hypervisor/cloud-hypervisor#3157

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`read_exact_from` and `write_all_to` from `GuestMemory` don't have the correct semantics #174

`read_exact_from` and `write_all_to` from `GuestMemory` don't have the correct semantics #174

sboeuf commented Sep 27, 2021

bonzini commented Sep 27, 2021 •

edited

Loading

andreeaflorescu commented Sep 27, 2021 •

edited

Loading

sboeuf commented Sep 27, 2021

alindima commented Sep 27, 2021

bonzini commented Sep 27, 2021

bonzini commented Sep 27, 2021

read_exact_from and write_all_to from GuestMemory don't have the correct semantics #174

read_exact_from and write_all_to from GuestMemory don't have the correct semantics #174

Comments

sboeuf commented Sep 27, 2021

bonzini commented Sep 27, 2021 • edited Loading

andreeaflorescu commented Sep 27, 2021 • edited Loading

sboeuf commented Sep 27, 2021

alindima commented Sep 27, 2021

bonzini commented Sep 27, 2021

bonzini commented Sep 27, 2021

`read_exact_from` and `write_all_to` from `GuestMemory` don't have the correct semantics #174

`read_exact_from` and `write_all_to` from `GuestMemory` don't have the correct semantics #174

bonzini commented Sep 27, 2021 •

edited

Loading

andreeaflorescu commented Sep 27, 2021 •

edited

Loading