In honor of the second birthday of stable Rust, here is a small contribution to Rust documentation.
Many concepts in Rust come in matching sets or have some pleasing symmetry. This is a summary or "cheat sheet" for some of these.
Some of these tables may look intimidating, but they reflect the reality of systems programming with fine-grained control over memory. If you're just getting started with Rust, don't panic, and feel free to set this article aside! It's intended as a reference, and as a guide to organizing these things in your head once you vaguely know what they are.
Contributions are welcome! Just open a pull request.

| | Can `Copy`? | Can mutate through? |
|---|---|---|
| `&T` | yes | no |
| `&mut T` | no | yes |

This demonstrates that neither `&T` nor `&mut T` is a subtype of the other, in the Liskov sense.
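As a minimal sketch of the two rows above (the function names `sum_twice` and `increment` are made up for illustration): a shared reference can be duplicated freely but only read through, while a unique reference can be written through but not duplicated.

```rust
/// Reads through a shared reference. `&T` is `Copy`, so `r` can be
/// duplicated as often as we like.
fn sum_twice(r: &i32) -> i32 {
    let a = r; // copies the reference, not the value
    let b = r; // `r` is still usable here
    *a + *b
}

/// Writes through a unique reference. `&mut T` is not `Copy`.
fn increment(r: &mut i32) {
    *r += 1;
}

fn main() {
    let mut x = 20;
    increment(&mut x);
    assert_eq!(x, 21);
    assert_eq!(sum_twice(&x), 42);
}
```

Trying to assign through the `&i32`, or to use a `&mut i32` after moving it into a second binding, is rejected at compile time.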
Ownership controls when a value is destroyed. A value can have either a unique owner, or a number of references which collectively share ownership. The latter case usually involves reference counting.
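A small sketch of shared ownership through reference counting, using `Rc` from the standard library. The helper `count_owners` is hypothetical and exists only to make the counts observable.

```rust
use std::rc::Rc;

/// Returns the strong count while a second owner is alive,
/// and again after that owner has been dropped.
fn count_owners() -> (usize, usize) {
    let first = Rc::new(String::from("shared"));
    let second = Rc::clone(&first); // bumps the count, does not copy the String
    let while_shared = Rc::strong_count(&first);
    drop(second); // the String survives: `first` still owns it
    let after_drop = Rc::strong_count(&first);
    (while_shared, after_drop)
}

fn main() {
    assert_eq!(count_owners(), (2, 1));
}
```

The value is destroyed only when the last `Rc` pointing at it goes away.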
Interior mutability refers to any wrapper type `Wrapper` such that we can go from `&Wrapper<T>` to `&mut T`, or at least have some of the capabilities of `&mut T`.
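For instance, `Cell<T>` and `RefCell<T>` from `std::cell` both allow mutation through a shared reference. The sketch below uses two illustrative helpers, `bump` and `record`, which are not part of any library.

```rust
use std::cell::{Cell, RefCell};

/// `Cell` offers get/set through `&Cell<T>`, with no borrow of the contents.
fn bump(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    counter.get()
}

/// `RefCell` hands out a real `&mut T`, checked at run time.
/// `borrow_mut` panics if the value is already borrowed.
fn record(log: &RefCell<Vec<String>>, entry: &str) -> usize {
    let mut entries = log.borrow_mut();
    entries.push(entry.to_string());
    entries.len()
}

fn main() {
    let counter = Cell::new(0);
    bump(&counter);
    assert_eq!(bump(&counter), 2);

    let log = RefCell::new(Vec::new());
    record(&log, "first");
    assert_eq!(record(&log, "second"), 2);
    assert_eq!(log.borrow().len(), 2);
}
```

Note that neither `counter` nor `log` is declared `mut`.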
The column headings here refer (more or less) to the point in time at which Rust's safety invariants are checked. Note that no unsafety can occur due to sharing the thread-unsafe structures between threads. The compiler will simply reject your code, through the magic of the `Sync` trait.

| | Static | Dynamic | Dynamic, thread-safe |
|---|---|---|---|
| Direct ownership | `T` | | |
| Ownership via heap | `Box<T>` | | |
| Shared ownership | | `Rc<T>` | `Arc<T>` |
| Get, set, compare & swap, etc. | `&mut T` | `Cell<T>` | `AtomicFoo` |
| Borrow immutably | `&T` | | |
| Borrow mutably, or single reader | | | `Mutex<T>` |
| Borrow mutably, or multiple readers | `&mut T` | `RefCell<T>` | `RwLock<T>` |
| Borrow mutably, unsafe | `static mut` | `UnsafeCell<T>` | `UnsafeCell<T>` |

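To illustrate the "Dynamic, thread-safe" column, here is a sketch that combines `Arc<T>` (shared ownership) with `Mutex<T>` (mutable borrow, one holder at a time). The function `parallel_count` and its parameters are invented for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each add 1 to a shared total `per_thread` times.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total); // each thread gets its own owner
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1; // lock is released at end of statement
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

fn main() {
    assert_eq!(parallel_count(4, 1000), 4000);
}
```

Swapping `Arc` for `Rc`, or `Mutex` for `RefCell`, makes `thread::spawn` fail to compile, which is the compile-time rejection described above.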

| | Unsigned integer | Signed integer | Floating-point |
|---|---|---|---|
| 8 bits | `u8` | `i8` | |
| 16 bits | `u16` | `i16` | |
| 32 bits | `u32` | `i32` | `f32` |
| 64 bits | `u64` | `i64` | `f64` |
| 128 bits | `u128` α | `i128` α | |
| Pointer-sized | `usize` | `isize` | |

α Nightly-only, as of rustc 1.18.
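The widths in this table can be checked with `std::mem::size_of`. In the sketch below, `bits` is a hypothetical helper, and the 128-bit types are left out because of the footnote above.

```rust
use std::mem::size_of;

/// Width of `T` in bits, computed from its size in bytes.
fn bits<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    assert_eq!(bits::<u8>(), 8);
    assert_eq!(bits::<i16>(), 16);
    assert_eq!(bits::<f32>(), 32);
    assert_eq!(bits::<i64>(), 64);
    assert_eq!(bits::<f64>(), 64);
    // "Pointer-sized" means exactly that: same width as a thin pointer.
    assert_eq!(bits::<usize>(), bits::<*const u8>());
    assert_eq!(bits::<isize>(), bits::<usize>());
    println!("usize is {} bits on this target", bits::<usize>());
}
```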

| Format | Borrow β | Borrow substr? | Mutate | Copy on write | Owned, in heap |
|---|---|---|---|---|---|
| Any bytes | `&[u8]` | yes | `&mut [u8]` | `Cow<[u8]>` | `Vec<u8>` |
| UTF-8 | `&str` | yes | `&mut str` α | `Cow<str>` | `String` |
| Platform-dependent | `&OsStr` | no | | `Cow<OsStr>` | `OsString` |
| Filesystem path | `&Path` | no | | `Cow<Path>` | `PathBuf` |
| `NUL`-terminated, safe | `&CStr` | no | | `Cow<CStr>` | `CString` |
| `NUL`-terminated, raw | `*const c_char` γ | yes δ | `*mut c_char` γ | | `*mut c_char` γ ε |

α Nearly useless, because most mutations could change the encoded length of a UTF-8 code point. One exception is ASCII-only case conversion.

β In most cases, you can borrow static memory (e.g. a string literal) with a type like `&'static str`.

γ With raw pointers, you are on your own regarding ownership / borrowing semantics. Any good C library will document its expectations.

δ You can slice off the front of a `NUL`-terminated string, but not the end.

ε On the general principle that if you own something you can mutate it. But you could use `*const c_char` instead.
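A few cells of the string table, sketched with standard-library types only. The helper names (`first_word`, `dashes_to_underscores`, `shout`, `c_len`) are made up for illustration.

```rust
use std::borrow::Cow;
use std::ffi::{CStr, CString};

/// Borrow a substring: possible for `&str` because a slice carries its own length.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// Copy on write: allocates only when a replacement is actually needed.
fn dashes_to_underscores(s: &str) -> Cow<'_, str> {
    if s.contains('-') {
        Cow::Owned(s.replace('-', "_"))
    } else {
        Cow::Borrowed(s)
    }
}

/// One of the few useful things to do with `&mut str`: ASCII-only case conversion.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

/// Owned `CString` versus borrowed `&CStr`; the length excludes the trailing NUL.
fn c_len(s: &str) -> usize {
    let owned = CString::new(s).expect("input must not contain an interior NUL");
    let borrowed: &CStr = owned.as_c_str();
    borrowed.to_bytes().len()
}

fn main() {
    assert_eq!(first_word("hello world"), "hello");
    assert!(matches!(dashes_to_underscores("a-b"), Cow::Owned(_)));
    assert!(matches!(dashes_to_underscores("ab"), Cow::Borrowed(_)));

    let mut s = String::from("abc");
    shout(&mut s);
    assert_eq!(s, "ABC");

    assert_eq!(c_len("hi"), 2);
}
```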
The captures or free variables of a closure are the variables used in a lambda expression that are not defined in the lambda body or its argument list. They come from the environment surrounding the lambda expression.
Rust infers which trait(s) a closure can implement from how the captures are used. You can force values to be moved into a closure by prefixing the `move` keyword.
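A sketch of that inference, assuming three hypothetical helpers (`call_fn`, `call_fn_mut`, `call_fn_once`) that each demand one of the closure traits:

```rust
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // may be called any number of times
}

fn call_fn_mut<F: FnMut() -> u32>(mut f: F) -> u32 {
    f();
    f() // may be called repeatedly, but needs unique access
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f() // may be called only once
}

fn main() {
    let name = String::from("rust");

    // Only reads `name`, so the closure implements `Fn`.
    assert_eq!(call_fn(|| name.len()), 8);

    // Mutates `hits`, so the closure implements `FnMut` but not `Fn`.
    let mut hits = 0;
    let last = call_fn_mut(|| {
        hits += 1;
        hits
    });
    assert_eq!(last, 2);
    assert_eq!(hits, 2);

    // Moves `name` out of its captures, so the closure implements only `FnOnce`.
    let moved = call_fn_once(move || name);
    assert_eq!(moved, "rust");
}
```

Passing the second closure to `call_fn`, or the third to `call_fn_mut`, is a compile error.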
Each lambda and `fn` has its own unique, un-nameable type (Voldemort type?). This enables static dispatch and inlining. Each of these un-nameable types can be coerced to the appropriate `fn` / `Fn` / `FnMut` / `FnOnce`.
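The sketch below shows the three ways of accepting such a value. All function names are illustrative. It assumes a recent compiler: trait objects are spelled `dyn Fn(...)` there, and a lambda coerces to a plain `fn` pointer only when it has no captures.

```rust
fn double(x: i32) -> i32 {
    x * 2
}

/// Static dispatch: monomorphized per caller type, so calls can be inlined.
fn apply_static<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

/// Function pointer: accepts any `fn` item or non-capturing lambda.
fn apply_pointer(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

/// Trait object: dynamic dispatch through a vtable.
fn apply_dyn(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn main() {
    assert_eq!(apply_pointer(double, 4), 8);
    assert_eq!(apply_pointer(|x| x + 1, 1), 2); // no captures, so this coerces

    let offset = 10;
    let add_offset = move |x: i32| x + offset; // captures `offset`
    assert_eq!(apply_dyn(&add_offset, 1), 11);
    assert_eq!(apply_static(add_offset, 1), 11);
    assert_eq!(apply_static(double, 4), 8);
}
```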

| | Is a | Can mutate captures? | Can move out of captures? |
|---|---|---|---|
| `fn(A) -> B` | type | no captures | no captures |
| `Fn(A) -> B` | trait | no | no |
| `FnMut(A) -> B` | trait | yes | no |
| `FnOnce(A) -> B` | trait | yes | yes |
| `FnBox(A) -> B` α | trait | yes | yes |

α Nightly-only, as of rustc 1.18.
This table describes the sizes of some common types.
A word is a pointer or a pointer-sized integer.
The first size is the size of the value itself: the stuff that ends up on the stack if you put it in a `let` variable. The second size is the size of any owned data in the heap.
We assume `T`, `A`, `B`, `C` are `Sized`.

| Type | Value size | Contents | Heap size |
|---|---|---|---|
| `bool` | 1 byte | 0 or 1 | |
| `()` | empty! | | |
| `(A, B, C)`, `struct` | sum of `A`, `B`, `C` + pad / align | values of type `A`, `B`, `C` | anything owned by `A`, `B`, or `C` |
| `enum` | size of tag + max of variants + pad / align | tag + one variant | anything owned by variant |
| `[T; n]` | n × size of `T` | n elements of type `T` | anything owned by `T` |
| `&T`, `&mut T`, `*const T`, `*mut T` | 1 word | pointer | |
| `Box<T>` | 1 word | pointer | size of `T` |
| `Option<T>` | 1 word + size of `T` + pad / align (but see below) | tag + optionally `T` | anything owned by `T`, if `Some` |
| `Option<&T>`, `Option<&mut T>` | 1 word β | pointer or `NULL` | |
| `Option<Box<T>>` | 1 word β | pointer or `NULL` | size of `T`, if `Some` |
| `[T]`, `str` | dynamic size | elements or codepoints | |
| `&[T]` | 2 words | pointer, length (in elements) | |
| `&str` | 2 words | pointer, length (in bytes) | |
| `Box<[T]>` | 2 words | pointer, length (in elements) | length × size of `T` |
| `Box<str>` | 2 words | pointer, length (in bytes) | length (bytes) |
| `Vec<T>` | 3 words | pointer, length, capacity | capacity × size of `T` |
| `String` | 3 words | pointer, length, capacity | capacity (bytes) |
| `Trait` | dynamic size | fields of concrete type | anything owned by fields |
| `&Trait` | 2 words | pointer to concrete value, pointer to vtable | |
| `Box<Trait>` | 2 words | pointer to concrete value, pointer to vtable | size of concrete value |
| Specific `fn` used as a value α | empty! | | |
| Specific lambda | depends on captures, but known statically | captures | anything owned by captures |
| `fn(A) -> B`, `unsafe fn(A) -> B`, `extern fn(A) -> B` | 1 word α | pointer to code | |
| `PhantomData<T>` | empty! | | |
| `Rc<T>`, `Arc<T>` | 1 word | pointer | 2 words + size of `T` + pad / align |
| `Cell<T>` | size of `T` | `T` | anything owned by `T` |
| `AtomicT` | size of `T` | `T` | |
| `RefCell<T>` | 1 word + size of `T` + pad / align | borrow flag, `T` | anything owned by `T` |
| `Mutex<T>`, `RwLock<T>` | 2 words + size of `T` + pad / align | poison flag, pointer to OS mutex, `T` | anything owned by `T` + OS mutex |

α These are function pointers. Technically, they can have a different size from a data pointer, but this does not happen on common architectures.

β This optimization actually applies to any `Option`-shaped enum which contains, somewhere, a field which cannot be 0.
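Several of the "Value size" entries can be checked with `std::mem::size_of`. This is a sketch, not a guarantee: the layouts of `Vec<T>`, `String` and closures are not formally specified, so those assertions describe what current compilers do on common targets. `in_words` and `double` are hypothetical helpers.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::mem::{size_of, size_of_val};

/// Size of `T` measured in machine words (pointer-sized units).
fn in_words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // Thin pointers, and the NULL-pointer optimization for Option.
    assert_eq!(in_words::<&u64>(), 1);
    assert_eq!(in_words::<Box<u64>>(), 1);
    assert_eq!(in_words::<Option<&u64>>(), 1);
    assert_eq!(in_words::<Option<Box<u64>>>(), 1);

    // Fat pointers: slices, strings, trait objects.
    assert_eq!(in_words::<&[u64]>(), 2);
    assert_eq!(in_words::<&str>(), 2);
    assert_eq!(in_words::<&dyn std::fmt::Debug>(), 2);

    // Pointer, length, capacity.
    assert_eq!(in_words::<Vec<u64>>(), 3);
    assert_eq!(in_words::<String>(), 3);

    // Empty values.
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<PhantomData<u64>>(), 0);
    assert_eq!(size_of_val(&double), 0); // a specific fn used as a value

    // A function pointer is one word on common architectures.
    assert_eq!(in_words::<fn(i32) -> i32>(), 1);

    // Cell adds nothing; a lambda is as big as its captures.
    assert_eq!(size_of::<Cell<u64>>(), size_of::<u64>());
    let n = 7u64;
    let capture_one = move || n;
    assert_eq!(size_of_val(&capture_one), size_of::<u64>());
}
```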