-
Notifications
You must be signed in to change notification settings - Fork 347
Makes the reference types sound via a DST #1532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The current implementation of `ArrayRef` and its cousins has them as sized types, which turns out to be a critical and unsound mistake. This commit is large, but its heart is small: change the `ArrayRef` implementation to be unsized by appending empty slices to the end of them.
It's easier inside the library to have ArrayParts be public and have LayoutRef be a type alias. Outside of the library, though, having the ArrayParts be a public interface exposes implementation details. If we ever wanted to switch to a trait object or another approach for DSTs, having ArrayParts be public would make that harder.
The doc tests now contain a check that ensures that mem::swap can't create unsound behavior, so it's critical we continue to monitor that going forward.
/// A possibly-unsized container for array parts. | ||
/// | ||
/// This type only exists to enable holding the array parts in a single | ||
/// type, which needs to be sized inside of `ArrayBase` and unsized inside | ||
/// of the reference types. | ||
#[derive(Debug)] | ||
struct ArrayParts<A, D, T: ?Sized> | ||
{ | ||
/// A non-null pointer into the buffer held by `data`; may point anywhere | ||
/// in its range. If `S: Data`, this pointer must be aligned. | ||
ptr: NonNull<A>, | ||
/// The lengths of the axes. | ||
dim: D, | ||
/// The element count stride per axis. To be parsed as `isize`. | ||
strides: D, | ||
_dst_control: T, | ||
} | ||
|
||
type ArrayPartsSized<A, D> = ArrayParts<A, D, [usize; 0]>; | ||
type ArrayPartsUnsized<A, D> = ArrayParts<A, D, [usize]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the first major review point: defining a possibly-sized type to hold the array parts.
Thanks to the use of an explicitly 0-length array, the ArrayPartsSized
type is no larger than just the ptr
, dim
, and strides
packaged together.
#[derive(Debug)] | ||
pub struct LayoutRef<A, D> | ||
#[repr(transparent)] | ||
pub struct LayoutRef<A, D>(ArrayPartsUnsized<A, D>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the second major review point: changing LayoutRef
(which is the "lowest" type in the reference chain) to be a DST. We do this by having it wrap the "unsized" array parts.
The #[repr(transparent)]
gives us the guarantee that the LayoutRef
will have the same layout as ArrayPartsUnsized
, which will be critical for the third major review point.
unsafe { &*(&self.layout as *const LayoutRef<S::Elem, D>).cast::<ArrayRef<S::Elem, D>>() } | ||
let parts: &ArrayPartsUnsized<S::Elem, D> = &self.parts; | ||
let ptr = (parts as *const ArrayPartsUnsized<S::Elem, D>) as *const ArrayRef<S::Elem, D>; | ||
unsafe { &*ptr } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The third, and final, major review point. This particular set of lines is the dereference from ArrayBase
to ArrayRef
; but the core idea is repeated many times across this file for all of the Deref
, DerefMut
, AsRef
, and AsMut
implementations.
Our first line is now an unsizing coercion - going from Sized
to a DST. The next line converts a reference to a pointer, then converts a pointer to ArrayPartsUnsized
to ArrayRef
. Since ArrayRef
is transparent to LayoutRef
which is transparent to ArrayPartsUnsized
, this is valid. Finally, we reborrow the pointer.
/// Tests that a mem::swap can't compile by putting it into a doctest | ||
/// | ||
/// ```compile_fail | ||
/// let mut x = Array1::from_vec(vec![0, 1, 2]); | ||
/// { | ||
/// let mut y = Array1::from_vec(vec![4, 5, 6]); | ||
/// let x_ref = x.as_layout_ref_mut(); | ||
/// let y_ref = y.as_layout_ref_mut(); | ||
/// core::mem::swap(x_ref, y_ref); | ||
/// } | ||
/// ``` | ||
#[allow(dead_code)] | ||
fn test_no_swap_via_doctests() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And one bonus section: a doctest to assert that the original use-after-free scenario no longer compiles.
The current implementation of
ArrayRef
and its cousins has them as sized types, which turns out to be a critical and unsound mistake. This PR is large, but its heart is small: change theArrayRef
implementation to be unsized.The approach this PR takes is to package the core array "metadata" - the pointer, dim, and strides - into a struct that can either be sized or unsized. This is done by appending a generic "
_dst_control
" field. For the "sized" version of the metadata, that field is a 0-length array. For the "unsized" version of the metadata, that sized field is a struct. This core type is private, so users cannot construct any other variants other than these two.We then put the sized version into the
ArrayBase
types, put the unsized version into the reference types, and perform an unsizing coercion to convert from one to the other. Because Rust has no (safe, supported) "resizing" coercion, this switch is irreversible. Sized types cannot be recovered from the unsized reference types.Although this PR contains considerable code changes, most are renames and minor changes from altering the internal structure of the array types. To make this PR easier to review, I'll give a list of major changes in the order I'd review them. Then, in comments below, I will highlight those changes in the same order:
ArrayParts
definitionLayoutRef
a newtypeCloses #1531.
An alternative method would be to use a trait object to create the DST. I'd originally used this approach, but was worried that embedding a vtable into the heart of the library's type could harm performance. One way around this would be to cast the trait object's fat pointer to a thin pointer, discarding the vtable. However, I could not find any guarantees in the Rust type layout documentation nor in the Rustonomicon on DSTs about the layout of DSTs. Relying on behavior that not even the Rustonomicon will guarantee gives me the heebie-jeebies, and the current method appears equally viable.