Update drop-related stuff based on improvements to the language

drop
Alexis Beingessner 7 years ago
parent 680e284b0a
commit 6cbff7eaa6

@ -19,6 +19,7 @@
* [Higher-Rank Trait Bounds](hrtb.md) * [Higher-Rank Trait Bounds](hrtb.md)
* [Subtyping and Variance](subtyping.md) * [Subtyping and Variance](subtyping.md)
* [Drop Check](dropck.md) * [Drop Check](dropck.md)
* [Drop Check Escape Patch](dropck-eyepatch.md)
* [PhantomData](phantom-data.md) * [PhantomData](phantom-data.md)
* [Splitting Borrows](borrow-splitting.md) * [Splitting Borrows](borrow-splitting.md)
* [Type Conversions](conversions.md) * [Type Conversions](conversions.md)

@ -17,36 +17,36 @@ boilerplate" to drop children. If a struct has no special logic for being
dropped other than dropping its children, then it means `Drop` doesn't need to dropped other than dropping its children, then it means `Drop` doesn't need to
be implemented at all! be implemented at all!
**There is no stable way to prevent this behavior in Rust 1.0.** If this behaviour is unacceptable, it can be suppressed by placing each field
you don't want to drop in a `union`. The standard library provides the
[`mem::ManuallyDrop`][ManuallyDrop] wrapper type as a convenience for doing this.
Note that taking `&mut self` means that even if you could suppress recursive
Drop, Rust will prevent you from e.g. moving fields out of self. For most types,
this is totally fine.
For instance, a custom implementation of `Box` might write `Drop` like this:
Consider a custom implementation of `Box`, which might write `Drop` like this:
```rust ```rust
#![feature(unique, allocator_api)] #![feature(unique, allocator_api)]
use std::heap::{Heap, Alloc, Layout}; use std::heap::{Heap, Alloc, Layout};
use std::mem; use std::mem;
use std::ptr::{drop_in_place, Unique}; use std::ptr::drop_in_place;
struct Box<T>{ ptr: Unique<T> } struct Box<T>{ ptr: *mut T }
impl<T> Drop for Box<T> { impl<T> Drop for Box<T> {
fn drop(&mut self) { fn drop(&mut self) {
unsafe { unsafe {
drop_in_place(self.ptr.as_ptr()); drop_in_place(self.ptr);
Heap.dealloc(self.ptr.as_ptr() as *mut u8, Layout::new::<T>()) Heap.dealloc(self.ptr as *mut u8, Layout::new::<T>())
} }
} }
} }
# fn main() {} # fn main() {}
``` ```
and this works fine because when Rust goes to drop the `ptr` field it just sees This works fine because when Rust goes to drop the `ptr` field it just sees
a [Unique] that has no actual `Drop` implementation. Similarly nothing can a `*mut T` that has no actual `Drop` implementation. Similarly nothing can
use-after-free the `ptr` because when drop exits, it becomes inaccessible. use-after-free the `ptr` because when drop exits, it becomes inaccessible.
However this wouldn't work: However this wouldn't work:
@ -55,16 +55,16 @@ However this wouldn't work:
#![feature(allocator_api, unique)] #![feature(allocator_api, unique)]
use std::heap::{Heap, Alloc, Layout}; use std::heap::{Heap, Alloc, Layout};
use std::ptr::{drop_in_place, Unique}; use std::ptr::drop_in_place;
use std::mem; use std::mem;
struct Box<T>{ ptr: Unique<T> } struct Box<T>{ ptr: *mut T }
impl<T> Drop for Box<T> { impl<T> Drop for Box<T> {
fn drop(&mut self) { fn drop(&mut self) {
unsafe { unsafe {
drop_in_place(self.ptr.as_ptr()); drop_in_place(self.ptr);
Heap.dealloc(self.ptr.as_ptr() as *mut u8, Layout::new::<T>()); Heap.dealloc(self.ptr as *mut u8, Layout::new::<T>());
} }
} }
} }
@ -74,17 +74,17 @@ struct SuperBox<T> { my_box: Box<T> }
impl<T> Drop for SuperBox<T> { impl<T> Drop for SuperBox<T> {
fn drop(&mut self) { fn drop(&mut self) {
unsafe { unsafe {
// Hyper-optimized: deallocate the box's contents for it // """Hyper-optimized""": deallocate the box's contents for it
// without `drop`ing the contents // without `drop`ing the contents
Heap.dealloc(self.my_box.ptr.as_ptr() as *mut u8, Layout::new::<T>()); Heap.dealloc(self.my_box.ptr as *mut u8, Layout::new::<T>());
} }
} }
} }
# fn main() {} # fn main() {}
``` ```
After we deallocate the `box`'s ptr in SuperBox's destructor, Rust will After we deallocate `my_box`'s ptr in SuperBox's destructor, Rust will
happily proceed to tell the box to Drop itself and everything will blow up with happily proceed to tell `my_box` to Drop itself and everything will blow up with
use-after-frees and double-frees. use-after-frees and double-frees.
Note that the recursive drop behavior applies to all structs and enums Note that the recursive drop behavior applies to all structs and enums
@ -98,9 +98,10 @@ struct Boxy<T> {
} }
``` ```
will have its data1 and data2's fields destructors whenever it "would" be will have its `data1` and `data2` fields' destructors run whenever it "would" be
dropped, even though it itself doesn't implement Drop. We say that such a type dropped, even though it itself doesn't implement Drop. We say that such a type
*needs Drop*, even though it is not itself Drop. *needs Drop*, even though it is not itself Drop. This property can be checked
for with the [`mem::needs_drop()`][needs_drop] function.
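As a minimal sketch of that check (the `Boxy` and `Plain` definitions here are hypothetical stand-ins written for this example, not the chapter's exact types), `mem::needs_drop` can be asked about a type that has no `Drop` impl of its own but owns fields that do:

```rust
use std::mem::needs_drop;

// Hypothetical stand-in for the `Boxy` struct discussed above.
#[allow(dead_code)]
struct Boxy<T> {
    data1: Box<T>,
    data2: Box<T>,
    info: u32,
}

// Hypothetical type made only of plain data and a raw pointer:
// there is no destructor to run for any of its fields.
#[allow(dead_code)]
struct Plain {
    count: u32,
    ptr: *mut u8,
}

// `Boxy` does not implement `Drop`, but its `Box` fields do.
fn boxy_needs_drop() -> bool {
    needs_drop::<Boxy<u8>>()
}

fn plain_needs_drop() -> bool {
    needs_drop::<Plain>()
}

fn main() {
    println!("Boxy<u8> needs drop: {}", boxy_needs_drop());
    println!("Plain needs drop:    {}", plain_needs_drop());
}
```

Note that the standard library documents `needs_drop` as a hint that is allowed to conservatively return `true`, so only a `false` answer should be treated as a guarantee.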
Similarly, Similarly,
@ -115,27 +116,27 @@ will have its inner Box field dropped if and only if an instance stores the
Next variant. Next variant.
In general this works really nicely because you don't need to worry about In general this works really nicely because you don't need to worry about
adding/removing drops when you refactor your data layout. Still there's adding/removing drops when you refactor your data layout. But there's
certainly many valid usecases for needing to do trickier things with certainly many valid usecases for needing to do trickier things with
destructors. destructors.
The classic safe solution to overriding recursive drop and allowing moving out The classic safe solution to preventing recursive drop and allowing moving out
of Self during `drop` is to use an Option: of Self during `drop` is to use an Option:
```rust ```rust
#![feature(allocator_api, unique)] #![feature(allocator_api, unique)]
use std::heap::{Alloc, Heap, Layout}; use std::heap::{Alloc, Heap, Layout};
use std::ptr::{drop_in_place, Unique}; use std::ptr::drop_in_place;
use std::mem; use std::mem;
struct Box<T>{ ptr: Unique<T> } struct Box<T>{ ptr: *mut T }
impl<T> Drop for Box<T> { impl<T> Drop for Box<T> {
fn drop(&mut self) { fn drop(&mut self) {
unsafe { unsafe {
drop_in_place(self.ptr.as_ptr()); drop_in_place(self.ptr);
Heap.dealloc(self.ptr.as_ptr() as *mut u8, Layout::new::<T>()); Heap.dealloc(self.ptr as *mut u8, Layout::new::<T>());
} }
} }
} }
@ -149,7 +150,7 @@ impl<T> Drop for SuperBox<T> {
// without `drop`ing the contents. Need to set the `box` // without `drop`ing the contents. Need to set the `box`
// field as `None` to prevent Rust from trying to Drop it. // field as `None` to prevent Rust from trying to Drop it.
let my_box = self.my_box.take().unwrap(); let my_box = self.my_box.take().unwrap();
Heap.dealloc(my_box.ptr.as_ptr() as *mut u8, Layout::new::<T>()); Heap.dealloc(my_box.ptr as *mut u8, Layout::new::<T>());
mem::forget(my_box); mem::forget(my_box);
} }
} }
@ -165,7 +166,10 @@ deinitializing the field. Not that it will prevent you from producing any other
arbitrarily invalid state in there. arbitrarily invalid state in there.
On balance this is an ok choice. Certainly what you should reach for by default. On balance this is an ok choice. Certainly what you should reach for by default.
However, in the future we expect there to be a first-class way to announce that
a field shouldn't be automatically dropped.
[Unique]: phantom-data.html Should using Option be unacceptable, [`ManuallyDrop`][ManuallyDrop] is always
available.
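A minimal sketch of what that looks like, assuming a hypothetical `Noisy` helper that counts destructor calls (neither `Noisy` nor `Holder` comes from the chapter): a field wrapped in `ManuallyDrop` is skipped by recursive drop unless we explicitly drop it ourselves.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

// Hypothetical helper: bumps a shared counter every time it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

struct Holder<'a> {
    _auto: Noisy<'a>,                // dropped recursively, as usual
    manual: ManuallyDrop<Noisy<'a>>, // skipped by recursive drop
}

// Returns how many `Noisy` destructors ran for one `Holder`.
fn count_drops(drop_manually: bool) -> u32 {
    let counter = Cell::new(0);
    let mut holder = Holder {
        _auto: Noisy(&counter),
        manual: ManuallyDrop::new(Noisy(&counter)),
    };
    if drop_manually {
        // Safety: `manual` is never touched again after this call.
        unsafe { ManuallyDrop::drop(&mut holder.manual) };
    }
    drop(holder);
    counter.get()
}

fn main() {
    println!("without manual drop: {}", count_drops(false));
    println!("with manual drop:    {}", count_drops(true));
}
```

Unlike the `Option` approach, nothing here records whether the field is still initialized, which is why `ManuallyDrop::drop` is `unsafe`.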
[ManuallyDrop]: https://doc.rust-lang.org/std/mem/union.ManuallyDrop.html
[needs_drop]: https://doc.rust-lang.org/nightly/std/mem/fn.needs_drop.html

@ -1,7 +1,6 @@
# Drop Flags # Drop Flags
The examples in the previous section introduce an interesting problem for Rust. We've seen that it's possible to conditionally initialize, deinitialize, and
We have seen that it's possible to conditionally initialize, deinitialize, and
reinitialize locations of memory totally safely. For Copy types, this isn't reinitialize locations of memory totally safely. For Copy types, this isn't
particularly notable since they're just a random pile of bits. However types particularly notable since they're just a random pile of bits. However types
with destructors are a different story: Rust needs to know whether to call a with destructors are a different story: Rust needs to know whether to call a
@ -79,5 +78,8 @@ if condition {
} }
``` ```
The drop flags are tracked on the stack and no longer stashed in types that At Rust 1.0, these flags were stored in the actual values that needed to
implement drop. be tracked. This was a big mess and you had to worry about it in unsafe code.
As of Rust 1.13, these flags are stored separately on the stack, so you no
longer need to worry about them.
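The observable effect of drop flags can be sketched as follows, assuming a hypothetical `Noisy` type that counts its own destructor calls. Whether or not the value is moved out on a runtime condition, its destructor runs exactly once:

```rust
use std::cell::Cell;

// Hypothetical helper: bumps a shared counter every time it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns (drops seen before the scope ended, drops seen after it ended).
fn observe_drops(move_out: bool) -> (u32, u32) {
    let counter = Cell::new(0);
    let before_scope_end;
    {
        let x = Noisy(&counter);
        if move_out {
            drop(x); // `x` is moved out and destroyed right here
        }
        before_scope_end = counter.get();
        // At this closing brace, whether `x` still owes a destructor call
        // depends on the runtime value of `move_out`.
    }
    (before_scope_end, counter.get())
}

fn main() {
    println!("moved out: {:?}", observe_drops(true));
    println!("kept:      {:?}", observe_drops(false));
}
```

The total is 1 in both cases; only the point at which the destructor runs differs.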

@ -0,0 +1,434 @@
# Drop Check: The Escape Patch
In spite of everything stated in the previous section, this code compiles:
```rust
fn main() {
let (day, inspector);
day = Box::new(0);
inspector = vec![&*day];
println!("{:?}", inspector);
}
```
Here instead of storing a reference in our own custom type, we use a Vec, which
is of course generic and implements Drop. Surprisingly, the compiler thinks it's
fine that the Vec stores a reference that lives exactly as long as it.
What happened?
Well, there's an unsafe escape hatch, and the standard library uses it for its
collections and owning pointer types such as Box, Rc, Vec, and BTreeMap. With
this escape hatch, we can tell the compiler that we *promise* that it's safe
for a given generic argument to dangle.
Before we proceed, I must emphasize that this is an incredibly obscure feature.
As in, many people who work on the Rust standard library and rustc itself don't
even know that this exists. So in all likelihood, no one will notice if you
don't use this feature.
**In other words: please don't use the unstable feature we're about to describe!**
This feature primarily exists because some tricky parts of rustc itself use it.
We document it here primarily for the purposes of maintaining the rust-lang
codebase itself.
So let's say we want to write this modified inspector code:
```rust,ignore
struct Inspector<'a, T: 'a> { data: &'a T }
impl<'a, T> Drop for Inspector<'a, T> {
fn drop(&mut self) {
println!("I was only one day from retirement!");
}
}
fn main() {
let (data, inspector);
data = Box::new(0u8);
inspector = Inspector { data: &*data };
}
```
This Inspector is perfectly safe, because it doesn't actually access its
generic data in its destructor. Sadly, the code still doesn't compile:
```text
error: `*data` does not live long enough
--> src/main.rs:13:1
|
12 | inspector = Inspector { data: &*data };
| ----- borrow occurs here
13 | }
| ^ `*data` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
```
This is because as far as Rust is concerned you *could have* accessed it, and
Rust refuses to inspect your drop implementation to be sure.
With [the eyepatch RFC][eyepatch], we can *partially blind* dropck, by hiding one of our
generic parameters from it. (...by covering it with a patch. Get it? ...Eyepatch?)
The patch we apply is as follows:
```rust
#![feature(generic_param_attrs, dropck_eyepatch)]
struct Inspector<'a, T: 'a> { data: &'a T }
// Changes here:
unsafe impl<#[may_dangle] 'a, T> Drop for Inspector<'a, T> {
fn drop(&mut self) {
println!("I was only one day from retirement!");
}
}
fn main() {
let (data, inspector);
data = Box::new(0u8);
inspector = Inspector { data: &*data };
}
```
...and it compiles and runs!
There are two changes:
* We add `#[may_dangle]` to one of our generic parameters (here, the lifetime `'a`)
* We add `unsafe` to the impl block (which is required by may_dangle to emphasize the risks)
Note also that `#[may_dangle]` requires both the `generic_param_attrs` and the
`dropck_eyepatch` features.
The may_dangle attribute tells the dropck to ignore `'a` in its analysis. Since this was
the only reason the Inspector was considered unsound (T is just a u8), our code compiles.
`#[may_dangle]` may also be applied to type parameters. For instance, if we change
`data` to just `T` (so `T = &'a u8`), then we need to blind dropck from `T`:
```rust
#![feature(generic_param_attrs, dropck_eyepatch)]
struct Inspector<T> { data: T }
// Changes here:
unsafe impl<#[may_dangle] T> Drop for Inspector<T> {
fn drop(&mut self) {
println!("I was only one day from retirement!");
}
}
fn main() {
let (data, inspector);
data = Box::new(0u8);
inspector = Inspector { data: &*data };
}
```
# When The Eyepatch is (Un)Sound
The general rule for when it's safe to apply the dropck eyepatch to a type parameter
`T` is that the destructor must only do things to values of type `T` that could be
done with *all* types. Basically: we can move (or copy) the values around, take
references to them, get their size/align, and drop them. Just to be clear
on why these are fine:
* Moving and Copying is just bitwise, and it's perfectly safe to copy the bits
representing a dangling pointer.
* Static size/align computation (as with `size_of`) doesn't involve actually
looking at instances of the type, so dangling doesn't matter.
* Dynamic size/align computation (as with `size_of_val`) is also fine, because
it only looks at the trait object's vtable. This vtable is statically
allocated, and can be found without looking at the actual instance's data.
* Dropping a pointer is a noop, so it doesn't matter if they're actually
dangling.
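The points above can be sketched on stable Rust with a raw pointer standing in for the dangling value (raw pointers are an assumption of this sketch, chosen because they may dangle without any unstable features; the function names are hypothetical):

```rust
use std::mem::{needs_drop, size_of};

// Returns a raw pointer to memory that has already been freed.
fn make_dangling() -> *const u8 {
    let boxed = Box::new(7u8);
    let ptr: *const u8 = &*boxed;
    drop(boxed); // frees the allocation; `ptr` now dangles
    ptr
}

// Copies and moves the pointer around without ever dereferencing it.
fn survives_move(ptr: *const u8) -> bool {
    let copy = ptr;         // bitwise copy of the pointer
    let moved = vec![copy]; // move the copy into a container
    moved[0] == ptr         // compares addresses only
}

fn main() {
    let p = make_dangling();
    // Moving and copying only touch the pointer's own bits.
    assert!(survives_move(p));
    // Static size computation never looks at an instance.
    println!("pointer size: {}", size_of::<*const u8>());
    // Dropping a raw pointer runs no code at all.
    println!("needs drop:   {}", needs_drop::<*const u8>());
}
```

None of these operations dereference the pointer, which is the whole point: they stay valid no matter what the pointee has become.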
In theory, a function that's generic over all `T` (like `mem::replace`) must also
follow these rules, but in a world with specialization that isn't necessarily true.
For instance any totally-generic function may specialize on `T: Display` to print
the values when possible (please file 100 bugs if `mem::replace` ever does this).
Also note that the following closure isn't actually generic over all values of
type `T`; its body knows the exact type of `T` and therefore can dereference
any dangling pointers `T` might contain:
```rust,ignore
impl<T, F> where F: Fn(T) { ... }
```
All `Vec<T>` and friends do in their destructors is traverse themselves using their
own structure, drop all of the `T`'s they contain, and free themselves. This is
why it's sound for them to apply the eyepatch to their parameters.
# When The Eyepatch Needs Help
Applying the eyepatch correctly isn't sufficient to get sound drop checking.
To see why, consider this example:
```rust
#![feature(generic_param_attrs, dropck_eyepatch)]
use std::fmt::Debug;
struct Inspector<T: Debug> { data: T }
// Doesn't use eyepatch, but clearly looks at its payload. This is fine.
// Dropck will correctly require that this strictly outlives its payload.
impl<T: Debug> Drop for Inspector<T> {
fn drop(&mut self) {
println!("I was only {:?} days from retirement!", self.data);
}
}
// Our own custom implementation of Box.
struct MyBox<T> {
data: *mut T,
}
// This is uninteresting
impl<T> MyBox<T> {
fn new(t: T) -> MyBox<T> {
MyBox { data: Box::into_raw(Box::new(t)) }
}
}
// The stdlib's Box impl uses may_dangle, so it should be fine for us!
// (This is true... almost)
unsafe impl<#[may_dangle] T> Drop for MyBox<T> {
fn drop(&mut self) {
unsafe { Box::from_raw(self.data); }
}
}
fn inspect() {
let (data, inspector);
// We store this in a std box to avoid distractions
data = Box::new(7u8);
// This time we store an Inspector in our custom Box type
inspector = MyBox::new(Inspector { data: &*data });
// !!! If this compiles, the Inspector will read the dangling data here !!!
}
```
**This compiles, and will perform a use-after-free.**
Something has gone wrong. Just to check, let's replace our use of MyBox with
std's Box.
```rust,ignore
fn inspect() {
let (data, inspector);
data = Box::new(7u8);
// This time we use std's Box type
inspector = Box::new(Inspector { data: &*data });
// !!! If this compiles, the Inspector will read the dangling data here !!!
}
```
```text
error[E0597]: `*data` does not live long enough
--> src/main.rs:45:1
|
42 | inspector = Box::new(Inspector { data: &*data });
| ----- borrow occurs here
...
45 | }
| ^ `*data` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
```
Somehow, dropck now notices that we're doing something bad, and catches us.
Here's the problem: dropck doesn't know that MyBox will drop an Inspector, and
it really needs to know that to perform a proper analysis. To understand the
analysis it's trying to perform, let's step back to the "generic but no
destructor" case:
```rust,ignore
// Our own custom implementation of a Box, which doesn't actually box.
struct MyFakeBox<T> {
data: T,
}
// This is uninteresting
impl<T> MyFakeBox<T> {
fn new(t: T) -> MyFakeBox<T> {
MyFakeBox { data: t }
}
}
fn inspect() {
let (data, inspector);
data = Box::new(7u8);
// This time we store an Inspector in our custom FakeBox type
inspector = MyFakeBox::new(Inspector { data: &*data });
// !!! If this compiles, the Inspector will read the dangling data here !!!
}
```
```text
error[E0597]: `*data` does not live long enough
--> src/main.rs:37:1
|
34 | inspector = MyFakeBox::new(Inspector { data: &*data });
| ----- borrow occurs here
...
37 | }
| ^ `*data` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
```
Ok, without a destructor the compiler also performs the right analysis, even
though it should let MyFakeBox contain strictly equal lifetimes when possible.
How does it know that an Inspector will be dropped when a MyFakeBox will be?
Quite simply: it looks at MyFakeBox's fields.
Without an explicit destructor, the compiler is the one providing the destructor
implementation, and so it knows exactly what will be dropped: the fields.
It turns out that this is also the exact analysis that is applied to our MyBox
type. The *problem* is that MyBox stores a `*mut T`, and the compiler knows
dropping a `*mut T` is a noop. So it decides no Inspectors are involved in the
destruction of a MyBox, and lets this code compile (which would be correct if
that conclusion were true).
# Fixing Your Eyepatches
The solution to this problem is fairly simple: if the compiler is going to check
our fields for what we drop, let's add some more fields!
In particular, we will use the PhantomData type to simulate a stored T.
(We'll discuss PhantomData more in the next section. For now, take it for
granted.)
```rust,ignore
use std::marker::PhantomData;
// Our own custom implementation of Box.
struct MyBox<T> {
data: *mut T,
_boo: PhantomData<T>, // Tell the compiler we drop a T!
}
// This is still uninteresting
impl<T> MyBox<T> {
fn new(t: T) -> MyBox<T> {
MyBox {
data: Box::into_raw(Box::new(t)),
_boo: PhantomData,
}
}
}
// Completely unchanged! We got this part right!
unsafe impl<#[may_dangle] T> Drop for MyBox<T> {
fn drop(&mut self) {
unsafe { Box::from_raw(self.data); }
}
}
fn inspect() {
let (data, inspector);
data = Box::new(7u8);
// Back to our MyBox type
inspector = MyBox::new(Inspector { data: &*data });
// !!! If this compiles, the Inspector will read the dangling data here !!!
}
```
```text
Compiling playground v0.0.1 (file:///playground)
error[E0597]: `*data` does not live long enough
--> src/main.rs:50:1
|
47 | inspector = MyBox::new(Inspector { data: &*data });
| ----- borrow occurs here
...
50 | }
| ^ `*data` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
```
Hurray! It worked!
And just to check that we can still store dangling things when it's sound:
```rust,ignore
fn inspect() {
let (data, non_inspector);
data = Box::new(7u8);
// Our custom box type, but no inspector.
non_inspector = MyBox::new(&*data);
}
```
Compiles fine! Great! 🎉
# Dropck Eyepatch Summary (TL;DR)
When a generic type provides a destructor, the compiler will conservatively
disallow any of the type parameters living exactly as long as that type.
With the dropck eyepatch, we can tell it to ignore certain type parameters
which the destructor only does "trivial" things with. Which is to say, `MyType<T>`
doesn't do anything that `Vec<T>` wouldn't do with `T`.
However we then also become responsible for telling dropck about all the types
*related* to T that we drop. It knows we will drop anything in our fields, but
things like raw pointers "trick" it, as dropping a raw pointer does nothing.
To solve this, you should include a `PhantomData` field that stores each of the
types related to T that you may Drop.
**Note that this includes any associated items that you may drop in the destructor.**
For instance, if you have a destructor for `MyType<I: IntoIterator>` that calls
`into_iter()`, you should probably include `PhantomData<I::IntoIter>`.
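A rough sketch of the shape such a type might take, written for stable Rust and therefore *without* `#[may_dangle]` (the field names and the `Noisy` helper are hypothetical). Without the eyepatch dropck is already fully conservative, so the marker only becomes load-bearing once the attribute is applied:

```rust
use std::cell::Cell;
use std::marker::PhantomData;

// Hypothetical helper: bumps a shared counter every time it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Hypothetical container: owns an `I` behind a raw pointer and consumes it
// with `into_iter()` when dropped.
struct MyType<I: IntoIterator> {
    raw: *mut I,
    // Records that dropping `MyType<I>` may drop an `I` and an `I::IntoIter`.
    _owns: PhantomData<(I, I::IntoIter)>,
}

impl<I: IntoIterator> MyType<I> {
    fn new(iterable: I) -> Self {
        MyType {
            raw: Box::into_raw(Box::new(iterable)),
            _owns: PhantomData,
        }
    }
}

impl<I: IntoIterator> Drop for MyType<I> {
    fn drop(&mut self) {
        // Safety: `raw` came from `Box::into_raw` in `new` and is reclaimed once.
        let iterable = unsafe { *Box::from_raw(self.raw) };
        for item in iterable {
            drop(item);
        }
    }
}

// Returns how many items were destroyed when one `MyType` was dropped.
fn items_dropped() -> u32 {
    let counter = Cell::new(0);
    {
        let _owner = MyType::new(vec![Noisy(&counter), Noisy(&counter)]);
    }
    counter.get()
}

fn main() {
    println!("items dropped: {}", items_dropped());
}
```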
Yes, this is a big hassle and easy to get wrong. Please don't use the eyepatch.
[eyepatch]: https://github.com/rust-lang/rfcs/blob/master/text/1327-dropck-param-eyepatch.md

@ -1,12 +1,8 @@
# Drop Check # Drop Check
We have seen how lifetimes provide us some fairly simple rules for ensuring When looking at the *outlives* relationship in previous sections, we never
that we never read dangling references. However up to this point we have only ever considered the case where two values have the *exact* same lifetime. We made
interacted with the *outlives* relationship in an inclusive manner. That is, this clear by desugaring each let statement into its own scope:
when we talked about `'a: 'b`, it was ok for `'a` to live *exactly* as long as
`'b`. At first glance, this seems to be a meaningless distinction. Nothing ever
gets dropped at the same time as another, right? This is why we used the
following desugaring of `let` statements:
```rust,ignore ```rust,ignore
let x; let x;
@ -22,268 +18,152 @@ let y;
} }
``` ```
Each creates its own scope, clearly establishing that one drops before the But what if we write the following let statement?
other. However, what if we do the following?
```rust,ignore ```rust,ignore
let (x, y) = (vec![], vec![]); let (x, y) = (vec![], vec![]);
``` ```
Does either value strictly outlive the other? The answer is in fact *no*, Does either value strictly outlive the other? The answer is in fact *no*,
neither value strictly outlives the other. Of course, one of x or y will be neither value strictly outlives the other. At least, as far as the type system
dropped before the other, but the actual order is not specified. Tuples aren't is concerned.
special in this regard; composite structures just don't guarantee their
destruction order as of Rust 1.0.
We *could* specify this for the fields of built-in composites like tuples and In actual execution, Rust guarantees that `x` will be dropped before `y`.
structs. However, what about something like Vec? Vec has to manually drop its This is because they are stored in a composite value (a tuple), and composite
elements via pure-library code. In general, anything that implements Drop has values have their fields destroyed [in declaration order][drop-order].
a chance to fiddle with its innards during its final death knell. Therefore
the compiler can't sufficiently reason about the actual destruction order
of the contents of any type that implements Drop.
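The field-order rule for a composite value can be observed directly. This is a minimal sketch, assuming a hypothetical `Named` type that logs its name on drop, and it keeps the tuple as a single value rather than destructuring it into separate bindings:

```rust
use std::cell::RefCell;

// Hypothetical helper: appends its name to a shared log when dropped.
struct Named<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Named<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the two tuple fields were destroyed.
fn tuple_drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // One composite value holding both fields.
        let _pair = (
            Named { name: "first", log: &log },
            Named { name: "second", log: &log },
        );
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", tuple_drop_order());
}
```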
So why do we care? We care because if the type system isn't careful, it could So why do we care if the compiler considers `x` and `y` to live for the same
accidentally make dangling pointers. Consider the following simple program: amount of time? Well, there's a special trick the compiler can do with equal
lifetimes: it can let us hold onto dangling pointers during destruction! But
we must be careful where we allow this, because any mistake can lead to a
use-after-free.
Consider the following simple program:
```rust ```rust
struct Inspector<'a>(&'a u8); struct Inspector<'a>(&'a u8);
fn main() { fn main() {
let (inspector, days); let (days, inspector);
days = Box::new(1); days = Box::new(1);
inspector = Inspector(&days); inspector = Inspector(&days);
} }
``` ```
This program is totally sound and compiles today. The fact that `days` does This program is perfectly sound, and even compiles today! The fact that `days`
not *strictly* outlive `inspector` doesn't matter. As long as the `inspector` is dropped, and therefore freed, while `inspector` holds a pointer into it doesn't
is alive, so is days. matter because `inspector` will *also* be destroyed before any code gets a chance
to dereference that dangling pointer.
However if we add a destructor, the program will no longer compile! Just to make it clear that something special is happening here, this code
(which should behave identically at runtime) *doesn't* compile:
```rust,ignore ```rust,ignore
struct Inspector<'a>(&'a u8); struct Inspector<'a>(&'a u8);
impl<'a> Drop for Inspector<'a> {
fn drop(&mut self) {
println!("I was only {} days from retirement!", self.0);
}
}
fn main() { fn main() {
let (inspector, days); let inspector;
let days;
days = Box::new(1); days = Box::new(1);
inspector = Inspector(&days); inspector = Inspector(&days);
// Let's say `days` happens to get dropped first.
// Then when Inspector is dropped, it will try to read free'd memory!
} }
``` ```
```text ```text
error: `days` does not live long enough error: `days` does not live long enough
--> <anon>:15:1 --> src/main.rs:8:1
| |
12 | inspector = Inspector(&days); 7 | inspector = Inspector(&days);
| ---- borrow occurs here | ---- borrow occurs here
... 8 | }
15 | }
| ^ `days` dropped here while still borrowed | ^ `days` dropped here while still borrowed
| |
= note: values in a scope are dropped in the opposite order they are created = note: values in a scope are dropped in the opposite order they are created
error: aborting due to previous error
``` ```
Implementing Drop lets the Inspector execute some arbitrary code during its The fact that `inspector` and `days` are stored in the same composite is letting
death. This means it can potentially observe that types that are supposed to the compiler apply this special trick.
live as long as it does actually were destroyed first.
Interestingly, only generic types need to worry about this. If they aren't
generic, then the only lifetimes they can harbor are `'static`, which will truly
live *forever*. This is why this problem is referred to as *sound generic drop*.
Sound generic drop is enforced by the *drop checker*. As of this writing, some
of the finer details of how the drop checker validates types is totally up in
the air. However The Big Rule is the subtlety that we have focused on this whole
section:
**For a generic type to soundly implement drop, its generics arguments must
strictly outlive it.**
Obeying this rule is (usually) necessary to satisfy the borrow Now the *really* interesting part is that if we add a destructor to Inspector,
checker; obeying it is sufficient but not necessary to be the program will *also* stop compiling!
sound. That is, if your type obeys this rule then it's definitely
sound to drop.
The reason that it is not always necessary to satisfy the above rule
is that some Drop implementations will not access borrowed data even
though their type gives them the capability for such access.
For example, this variant of the above `Inspector` example will never
access borrowed data:
```rust,ignore ```rust,ignore
struct Inspector<'a>(&'a u8, &'static str); struct Inspector<'a>(&'a u8);
impl<'a> Drop for Inspector<'a> { impl<'a> Drop for Inspector<'a> {
fn drop(&mut self) { fn drop(&mut self) {
println!("Inspector(_, {}) knows when *not* to inspect.", self.1); println!("I was only {} days from retirement!", self.0);
}
}
fn main() {
let (inspector, days);
days = Box::new(1);
inspector = Inspector(&days, "gadget");
// Let's say `days` happens to get dropped first.
// Even when Inspector is dropped, its destructor will not access the
// borrowed `days`.
}
```
Likewise, this variant will also never access borrowed data:
```rust,ignore
use std::fmt;
struct Inspector<T: fmt::Display>(T, &'static str);
impl<T: fmt::Display> Drop for Inspector<T> {
fn drop(&mut self) {
println!("Inspector(_, {}) knows when *not* to inspect.", self.1);
} }
} }
fn main() { fn main() {
let (inspector, days): (Inspector<&u8>, Box<u8>); let (days, inspector);
days = Box::new(1); days = Box::new(1);
inspector = Inspector(&days, "gadget"); inspector = Inspector(&days);
// Let's say `days` happens to get dropped first. // When Inspector is dropped here, it will try to read free'd memory!
// Even when Inspector is dropped, its destructor will not access the
// borrowed `days`.
}
```
However, *both* of the above variants are rejected by the borrow
checker during the analysis of `fn main`, saying that `days` does not
live long enough.
The reason is that the borrow checking analysis of `main` does not
know about the internals of each Inspector's Drop implementation. As
far as the borrow checker knows while it is analyzing `main`, the body
of an inspector's destructor might access that borrowed data.
Therefore, the drop checker forces all borrowed data in a value to
strictly outlive that value.
# An Escape Hatch
The precise rules that govern drop checking may be less restrictive in
the future.
The current analysis is deliberately conservative and trivial; it forces all
borrowed data in a value to outlive that value, which is certainly sound.
Future versions of the language may make the analysis more precise, to
reduce the number of cases where sound code is rejected as unsafe.
This would help address cases such as the two Inspectors above that
know not to inspect during destruction.
In the meantime, there is an unstable attribute that one can use to
assert (unsafely) that a generic type's destructor is *guaranteed* to
not access any expired data, even if its type gives it the capability
to do so.
That attribute is called `may_dangle` and was introduced in [RFC 1327]
(https://github.com/rust-lang/rfcs/blob/master/text/1327-dropck-param-eyepatch.md).
To deploy it on the Inspector example from above, we would write:
```rust,ignore
struct Inspector<'a>(&'a u8, &'static str);
unsafe impl<#[may_dangle] 'a> Drop for Inspector<'a> {
fn drop(&mut self) {
println!("Inspector(_, {}) knows when *not* to inspect.", self.1);
}
} }
``` ```
Use of this attribute requires the `Drop` impl to be marked `unsafe` because the ```text
compiler is not checking the implicit assertion that no potentially expired data error: `days` does not live long enough
(e.g. `self.0` above) is accessed. --> <anon>:15:1
|
The attribute can be applied to any number of lifetime and type parameters. In 12 | inspector = Inspector(&days);
the following example, we assert that we access no data behind a reference of | ---- borrow occurs here
lifetime `'b` and that the only uses of `T` will be moves or drops, but omit ...
the attribute from `'a` and `U`, because we do access data with that lifetime 15 | }
and that type: | ^ `days` dropped here while still borrowed
|
```rust,ignore = note: values in a scope are dropped in the opposite order they are created
use std::fmt::Display;
struct Inspector<'a, 'b, T, U: Display>(&'a u8, &'b u8, T, U);
unsafe impl<'a, #[may_dangle] 'b, #[may_dangle] T, U: Display> Drop for Inspector<'a, 'b, T, U> { error: aborting due to previous error
fn drop(&mut self) {
println!("Inspector({}, _, _, {})", self.0, self.3);
}
}
``` ```
It is sometimes obvious that no such access can occur, like the case above. Implementing Drop lets the Inspector execute arbitrary code during its
However, when dealing with a generic type parameter, such access can death, which means it can dereference any dangling pointer that it contains.
occur indirectly. Examples of such indirect access are: If we allowed this program to compile, it would perform a use-after-free.
* invoking a callback, This is the *sound generic drop* issue. We call it that because it only applies
* via a trait method call. to destructors of generic types. That is, if `Inspector` weren't generic, it
couldn't store any lifetime other than `'static`, and dangling pointers would
never be a concern. The enforcement of sound generic drop is handled by the
*drop check*, which is more commonly known as *dropck*.
(Future changes to the language, such as impl specialization, may add It turns out that getting dropck's design exactly right has been very difficult.
other avenues for such indirect access.) This is because, as we'll see, we don't want it to give up completely on generic
destructors. In particular: what if we knew `Inspector` *didn't* or even *couldn't*
dereference the pointer in its destructor? Wouldn't it be nice if that meant our
code compiled again?
Here is an example of invoking a callback: Since Rust 1.0, and as of Rust 1.18, there have been two changes to how dropck
works due to soundness issues. At least one more is planned, as the latest
version was intended to be a temporary hack.
```rust,ignore * [non-parametric dropck](https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md)
struct Inspector<T>(T, &'static str, Box<for <'r> fn(&'r T) -> String>); * [dropck eyepatch](https://github.com/rust-lang/rfcs/blob/master/text/1327-dropck-param-eyepatch.md)
impl<T> Drop for Inspector<T> { Old versions of this document were based on the original 1.0 design, which
fn drop(&mut self) { was unsafe by default, and therefore expected unsafe Rust programmers to manually
// The `self.2` call could access a borrow e.g. if `T` is `&'a _`. opt out of it.
println!("Inspector({}, {}) unwittingly inspects expired data.",
(self.2)(&self.0), self.1);
}
}
```
Here is an example of a trait method call: The good news is that, unlike the 1.0 design, the 1.18 design is *safe by default*:
you can completely ignore that dropck exists, and nothing bad can be done with
```rust,ignore your types. But sometimes you will be able to leverage your knowledge of dropck
use std::fmt; to make transiently dangling references work, and that's nice.
struct Inspector<T: fmt::Display>(T, &'static str); So here's all you should need to know about dropck these days:
impl<T: fmt::Display> Drop for Inspector<T> { * If a type isn't generic, then it cannot contain borrows that expire, and
fn drop(&mut self) { is therefore uninteresting.
// There is a hidden call to `<T as Display>::fmt` below, which * If a type is generic, and doesn't have a destructor, its generic arguments
// could access a borrow e.g. if `T` is `&'a _` must live *at least* as long as it.
println!("Inspector({}, {}) unwittingly inspects expired data.", * If a type is generic, and *does* have a destructor, its generic arguments
self.0, self.1); must live *strictly* longer than it.
}
}
```
And of course, all of these accesses could be further hidden within Or to put it another way: if you want to be able to store references to things
some other method invoked by the destructor, rather than being written that live **exactly** as long as yourself, you can't have a destructor.
directly within it.
In all of the above cases where the `&'a u8` is accessed in the
destructor, adding the `#[may_dangle]`
attribute makes the type vulnerable to misuse that the borrower
checker will not catch, inviting havoc. It is better to avoid adding
the attribute.
# Is that all about drop checker?
It turns out that when writing unsafe code, we generally don't need to
worry at all about doing the right thing for the drop checker. However there
is one special case that you need to worry about, which we will look at in
the next section.
[drop-order]: https://github.com/rust-lang/rfcs/blob/master/text/1857-stabilize-drop-order.md
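
As a rough sketch of the last two rules on stable Rust: the types `Plain` and `Loud` and both function names below are made up for illustration and do not appear in this chapter. `Plain` has no destructor, so it may hold a borrow of something declared in the same `let` and dropped just before it. `Loud` has a destructor that reads the borrow, so the borrowed value must strictly outlive it.

```rust
// Hypothetical types, for illustration only.

// Generic over 'a, no destructor: the borrow only needs to live
// at least as long as the value that holds it.
struct Plain<'a>(&'a u8);

// Generic over 'a, with a destructor: the borrow must strictly outlive it.
struct Loud<'a>(&'a u8);

impl<'a> Drop for Loud<'a> {
    fn drop(&mut self) {
        // This read is exactly what dropck has to assume might happen.
        println!("Loud({}) is being dropped", self.0);
    }
}

fn same_lifetime_without_destructor() -> u8 {
    // Both bindings share one `let`; `days` is dropped before `plain`.
    // That is accepted because `Plain` runs no code when it dies.
    let (plain, days);
    days = Box::new(1u8);
    plain = Plain(&days);
    *plain.0
}

fn strictly_longer_with_destructor() -> u8 {
    // `days` is declared first, so it is dropped after `loud`.
    // Declaring them as `let (loud, days);` instead is rejected by dropck.
    let days = Box::new(2u8);
    let loud = Loud(&days);
    *loud.0
}

fn main() {
    println!("{}", same_lifetime_without_destructor());
    println!("{}", strictly_longer_with_destructor());
}
```

Declaring the borrowed value first is usually the simplest way to satisfy the "strictly longer" rule without reaching for the eyepatch.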

@ -21,8 +21,13 @@ correct variance and drop checking.
We do this using `PhantomData`, which is a special marker type. `PhantomData` We do this using `PhantomData`, which is a special marker type. `PhantomData`
consumes no space, but simulates a field of the given type for the purpose of consumes no space, but simulates a field of the given type for the purpose of
static analysis. This was deemed to be less error-prone than explicitly telling static analysis. This was deemed to be less error-prone than explicitly telling
the type-system the kind of variance that you want, while also providing other the type system the kind of variance that you want, while also being useful
useful such as the information needed by drop check. for secondary concerns like deriving Send and Sync.
When using the *drop check eyepatch*, PhantomData also becomes important for
telling the compiler about all types that you drop that it can't see. See
[the previous section][dropck-eyepatch] for details. This can be ignored if you
don't know what the eyepatch is.
Iter logically contains a bunch of `&'a T`s, so this is exactly what we tell Iter logically contains a bunch of `&'a T`s, so this is exactly what we tell
the PhantomData to simulate: the PhantomData to simulate:
@ -40,65 +45,39 @@ struct Iter<'a, T: 'a> {
and that's it. The lifetime will be bounded, and your iterator will be variant and that's it. The lifetime will be bounded, and your iterator will be variant
over `'a` and `T`. Everything Just Works. over `'a` and `T`. Everything Just Works.
Another important example is Vec, which is (approximately) defined as follows: Here's a more extreme example based on HashMap which stores a single opaque
allocation which is used for multiple arrays of different types:
```
struct Vec<T> {
data: *const T, // *const for variance!
len: usize,
cap: usize,
}
```
Unlike the previous example, it *appears* that everything is exactly as we
want. Every generic argument to Vec shows up in at least one field.
Good to go!
Nope.
The drop checker will generously determine that `Vec<T>` does not own any values
of type T. This will in turn make it conclude that it doesn't need to worry
about Vec dropping any T's in its destructor for determining drop check
soundness. This will in turn allow people to create unsoundness using
Vec's destructor.
In order to tell dropck that we *do* own values of type T, and therefore may
drop some T's when *we* drop, we must add an extra PhantomData saying exactly
that:
``` ```
use std::marker; use std::marker;
struct Vec<T> { struct HashMap<K, V> {
data: *const T, // *const for covariance! ptr: *mut u8,
len: usize, // The pointer actually stores keys and values
cap: usize, // (and hashes, but those aren't generic)
_marker: marker::PhantomData<T>, _marker: marker::PhantomData<(K, V)>,
} }
``` ```
Raw pointers that own an allocation is such a pervasive pattern that the
standard library made a utility for itself called `Unique<T>` which:
* wraps a `*const T` for variance
* includes a `PhantomData<T>`
* auto-derives `Send`/`Sync` as if T was contained
* marks the pointer as `NonZero` for the null-pointer optimization
## Table of `PhantomData` patterns ## Table of `PhantomData` patterns
Heres a table of all the wonderful ways `PhantomData` could be used: Here's a table of the most common ways `PhantomData` is used:
| Phantom type | `'a` | `T` | | Phantom type | `'a` | `T` |
|-----------------------------|-----------|---------------------------| |-----------------------------|-----------|---------------------------|
| `PhantomData<T>` | - | variant (with drop check) | | `PhantomData<T>` | - | variant (and drop check T)|
| `PhantomData<&'a T>` | variant | variant | | `PhantomData<&'a T>` | variant | variant |
| `PhantomData<&'a mut T>` | variant | invariant | | `PhantomData<&'a mut T>` | variant | invariant |
| `PhantomData<*const T>` | - | variant | | `PhantomData<*const T>` | - | variant |
| `PhantomData<*mut T>` | - | invariant | | `PhantomData<*mut T>` | - | invariant |
| `PhantomData<fn(T)>` | - | contravariant (*) | | `PhantomData<fn(T)>` | - | contravariant |
| `PhantomData<fn() -> T>` | - | variant | | `PhantomData<fn() -> T>` | - | variant |
| `PhantomData<fn(T) -> T>` | - | invariant | | `PhantomData<fn(T) -> T>` | - | invariant |
| `PhantomData<Cell<&'a ()>>` | invariant | - | | `PhantomData<Cell<&'a ()>>` | invariant | - |
(*) If contravariance gets scrapped, this would be invariant.
[dropck-eyepatch]: dropck-eyepatch.html
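
A minimal sketch of both uses described above, on stable Rust. `RawTable` is a hypothetical stand-in for the HashMap-style example, and this `Iter` is a simplified borrowing iterator that does not handle zero-sized `T`. The point is only that the phantom field costs no space while tying the type to `K`, `V`, or `'a`.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical: one untyped allocation that logically stores keys and values.
struct RawTable<K, V> {
    ptr: *mut u8,
    _marker: PhantomData<(K, V)>,
}

// Raw pointers carry no lifetime, so the phantom `&'a T` bounds the iterator.
struct Iter<'a, T: 'a> {
    ptr: *const T,
    end: *const T,
    _marker: PhantomData<&'a T>,
}

impl<'a, T> Iter<'a, T> {
    fn new(slice: &'a [T]) -> Self {
        let range = slice.as_ptr_range();
        Iter { ptr: range.start, end: range.end, _marker: PhantomData }
    }
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        if self.ptr == self.end {
            None
        } else {
            // Sound for non-zero-sized T: `ptr` stays inside the borrowed slice.
            unsafe {
                let item = &*self.ptr;
                self.ptr = self.ptr.add(1);
                Some(item)
            }
        }
    }
}

fn main() {
    // The marker adds nothing to the size of either type.
    println!("RawTable: {} bytes", size_of::<RawTable<String, u64>>());
    println!("Iter: {} bytes", size_of::<Iter<'static, u32>>());

    let data = [1u32, 2, 3];
    for x in Iter::new(&data[..]) {
        println!("{}", x);
    }
}
```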

@ -1,15 +1,15 @@
# Allocating Memory # Allocating Memory
Using Unique throws a wrench in an important feature of Vec (and indeed all of Using Shared throws a wrench in an important feature of Vec (and indeed all of
the std collections): an empty Vec doesn't actually allocate at all. So if we the std collections): an empty Vec doesn't actually allocate at all. So if we
can't allocate, but also can't put a null pointer in `ptr`, what do we do in can't allocate, but also can't put a null pointer in `ptr`, what do we do in
`Vec::new`? Well, we just put some other garbage in there! `Vec::new`? Well, we just put some other garbage in there!
This is perfectly fine because we already have `cap == 0` as our sentinel for no This is perfectly fine because we already have `cap == 0` as our sentinel for no
allocation. We don't even need to handle it specially in almost any code because allocation. We don't need to handle it specially in almost any code because
we usually need to check if `cap > len` or `len > 0` anyway. The recommended we usually need to check if `cap > len` or `len > 0` anyway. The recommended
Rust value to put here is `mem::align_of::<T>()`. Unique provides a convenience Rust value to put here is `mem::align_of::<T>()`. Shared provides a convenience
for this: `Unique::empty()`. There are quite a few places where we'll for this: `Shared::empty()`. There are quite a few places where we'll
want to use `empty` because there's no real allocation to talk about but want to use `empty` because there's no real allocation to talk about but
`null` would make the compiler do bad things. `null` would make the compiler do bad things.
@ -23,7 +23,7 @@ use std::mem;
impl<T> Vec<T> { impl<T> Vec<T> {
fn new() -> Self { fn new() -> Self {
assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs"); assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
Vec { ptr: Unique::empty(), len: 0, cap: 0 } Vec { ptr: Shared::empty(), len: 0, cap: 0 }
} }
} }
``` ```
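
For readers on stable Rust, here is a minimal sketch of the same idea. It assumes `std::ptr::NonNull`, the stabilized successor of `Shared`, whose `dangling()` plays the role of `empty()`. `MyVec` and `is_allocated` are hypothetical names used only for this example.

```rust
use std::mem;
use std::ptr::NonNull;

// Hypothetical stand-in for the chapter's Vec, using the stable NonNull.
struct MyVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

impl<T> MyVec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        // Non-null and well-aligned, but points at no allocation.
        MyVec { ptr: NonNull::dangling(), len: 0, cap: 0 }
    }

    // `cap == 0` is the sentinel for "nothing allocated yet".
    fn is_allocated(&self) -> bool {
        self.cap != 0
    }

    fn len(&self) -> usize {
        self.len
    }
}

fn main() {
    let v: MyVec<u64> = MyVec::new();
    println!("allocated: {}, len: {}", v.is_allocated(), v.len());
    println!("placeholder pointer: {:p}", v.ptr.as_ptr());
}
```

The placeholder pointer must never be dereferenced or freed; every code path that touches memory has to check `cap` (or `len`) first.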
@ -202,7 +202,7 @@ fn grow(&mut self) {
// If allocate or reallocate fail, we'll get `null` back // If allocate or reallocate fail, we'll get `null` back
if ptr.is_null() { oom(); } if ptr.is_null() { oom(); }
self.ptr = Unique::new(ptr as *mut _); self.ptr = Shared::new(ptr as *mut _);
self.cap = new_cap; self.cap = new_cap;
} }
} }

@ -4,14 +4,14 @@
#![feature(unique)] #![feature(shared)]
#![feature(allocator_api)] #![feature(allocator_api)]
use std::ptr::{Unique, self}; use std::ptr::{Shared, self};
use std::mem; use std::mem;
use std::ops::{Deref, DerefMut}; use std::ops::{Deref, DerefMut};
use std::marker::PhantomData; use std::marker::PhantomData;
use std::heap::{Alloc, Layout, Heap}; use std::heap::{Alloc, Layout, Heap};
struct RawVec<T> { struct RawVec<T> {
ptr: Unique<T>, ptr: Shared<T>,
cap: usize, cap: usize,
} }
@ -20,8 +20,8 @@ impl<T> RawVec<T> {
// !0 is usize::MAX. This branch should be stripped at compile time. // !0 is usize::MAX. This branch should be stripped at compile time.
let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 }; let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };
// Unique::empty() doubles as "unallocated" and "zero-sized allocation" // Shared::empty() doubles as "unallocated" and "zero-sized allocation"
RawVec { ptr: Unique::empty(), cap: cap } RawVec { ptr: Shared::empty(), cap: cap }
} }
fn grow(&mut self) { fn grow(&mut self) {
@ -49,7 +49,7 @@ impl<T> RawVec<T> {
Err(err) => Heap.oom(err), Err(err) => Heap.oom(err),
}; };
self.ptr = Unique::new_unchecked(ptr as *mut _); self.ptr = Shared::new_unchecked(ptr as *mut _);
self.cap = new_cap; self.cap = new_cap;
} }
} }

@ -44,7 +44,7 @@ So we're going to use the following struct:
```rust,ignore ```rust,ignore
struct IntoIter<T> { struct IntoIter<T> {
buf: Unique<T>, buf: Shared<T>,
cap: usize, cap: usize,
start: *const T, start: *const T,
end: *const T, end: *const T,

@ -15,68 +15,41 @@ pub struct Vec<T> {
# fn main() {} # fn main() {}
``` ```
And indeed this would compile. Unfortunately, it would be incorrect. First, the And indeed this would compile and work correctly. However it comes with a semantic
compiler will give us too strict variance. So a `&Vec<&'static str>` limitation and a missed optimization opportunity.
couldn't be used where an `&Vec<&'a str>` was expected. More importantly, it
will give incorrect ownership information to the drop checker, as it will
conservatively assume we don't own any values of type `T`. See [the chapter
on ownership and lifetimes][ownership] for all the details on variance and
drop check.
As we saw in the ownership chapter, we should use `Unique<T>` in place of In terms of semantics, this implementation of Vec would be [invariant over T][variance].
`*mut T` when we have a raw pointer to an allocation we own. Unique is unstable, So a `&Vec<&'static str>` couldn't be used where an `&Vec<&'a str>` was expected.
so we'd like to not use it if possible, though.
As a recap, Unique is a wrapper around a raw pointer that declares that: In terms of optimization, this implementation of Vec wouldn't be eligible for the
*null pointer optimization*, meaning `Option<Vec<T>>` would take up more space
than `Vec<T>`.
* We are variant over `T` These are fairly common problems because the raw pointer types in Rust aren't
* We may own a value of type `T` (for drop check) very well optimized for this use-case. They're more tuned to make it easier to
* We are Send/Sync if `T` is Send/Sync express C APIs. This is why the standard library provides a pointer type that
* Our pointer is never null (so `Option<Vec<T>>` is null-pointer-optimized) better matches the semantics pure-Rust abstractions want: `Shared<T>`.
We can implement all of the above requirements except for the last Compared to `*mut T`, `Shared<T>` provides two benefits:
one in stable Rust:
```rust * Variant over `T` (dangerous in general, but desirable for collections)
use std::marker::PhantomData; * Null-pointer optimizes (so `Option<Shared<T>>` is pointer-sized)
use std::ops::Deref;
use std::mem;
struct Unique<T> {
ptr: *const T, // *const for variance
_marker: PhantomData<T>, // For the drop checker
}
// Deriving Send and Sync is safe because we are the Unique owners
// of this data. It's like Unique<T> is "just" T.
unsafe impl<T: Send> Send for Unique<T> {}
unsafe impl<T: Sync> Sync for Unique<T> {}
impl<T> Unique<T> {
pub fn new(ptr: *mut T) -> Self {
Unique { ptr: ptr, _marker: PhantomData }
}
pub fn as_ptr(&self) -> *mut T { We could get the variance requirement ourselves using `*const T` and casts, but
self.ptr as *mut T the API for expressing that a value is non-zero is unstable, and that isn't expected
} to change any time soon.
}
# fn main() {}
```
Unfortunately the mechanism for stating that your value is non-zero is Shared should be stabilized in some form very soon, so we're just going to use
unstable and unlikely to be stabilized soon. As such we're just going to that.
take the hit and use std's Unique:
```rust ```rust
#![feature(unique)] #![feature(shared)]
use std::ptr::{Unique, self}; use std::ptr::Shared;
pub struct Vec<T> { pub struct Vec<T> {
ptr: Unique<T>, ptr: Shared<T>,
cap: usize, cap: usize,
len: usize, len: usize,
} }
@ -84,11 +57,14 @@ pub struct Vec<T> {
# fn main() {} # fn main() {}
``` ```
If you don't care about the null-pointer optimization, then you can use the If you don't care about the null-pointer optimization, then you can use `*const T`.
stable code. However we will be designing the rest of the code around enabling For most code, using `*mut T` would also be perfectly reasonable.
this optimization. It should be noted that `Unique::new` is unsafe to call, because However this chapter is focused on providing an implementation that matches the
putting `null` inside of it is Undefined Behavior. Our stable Unique doesn't quality of the one in the standard library, so we will be designing the rest of
need `new` to be unsafe because it doesn't make any interesting guarantees about the code around using Shared.
its contents.
Lastly, it should be noted that `Shared::new` is unsafe to call, because
putting `null` inside of it is Undefined Behavior. Code that doesn't use Shared
has no such concern.
[ownership]: ownership.html [variance]: variance.html
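
Both benefits can be observed on stable Rust. The sketch below assumes `std::ptr::NonNull`, the type `Shared` was later stabilized as. `RawPtrVec`, `NonNullVec`, and `shorten` are hypothetical names for this illustration.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Layout from the first attempt: a plain raw pointer.
struct RawPtrVec<T> {
    ptr: *mut T,
    cap: usize,
    len: usize,
}

// Same layout, but with a non-null, covariant pointer.
struct NonNullVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

// Compiles only because NonNullVec<T> is covariant over T.
// The same function written for RawPtrVec is rejected, since *mut T is invariant.
fn shorten<'a>(v: NonNullVec<&'static str>) -> NonNullVec<&'a str> {
    v
}

fn main() {
    // Null is never a valid NonNull, so Option can use it to encode None.
    println!("NonNullVec:         {}", size_of::<NonNullVec<u8>>());
    println!("Option<NonNullVec>: {}", size_of::<Option<NonNullVec<u8>>>());
    println!("RawPtrVec:          {}", size_of::<RawPtrVec<u8>>());
    println!("Option<RawPtrVec>:  {}", size_of::<Option<RawPtrVec<u8>>>());

    let v = NonNullVec::<&'static str> { ptr: NonNull::dangling(), cap: 0, len: 0 };
    let _shorter = shorten(v);
}
```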

@ -11,14 +11,14 @@ allocating, growing, and freeing:
```rust,ignore ```rust,ignore
struct RawVec<T> { struct RawVec<T> {
ptr: Unique<T>, ptr: Shared<T>,
cap: usize, cap: usize,
} }
impl<T> RawVec<T> { impl<T> RawVec<T> {
fn new() -> Self { fn new() -> Self {
assert!(mem::size_of::<T>() != 0, "TODO: implement ZST support"); assert!(mem::size_of::<T>() != 0, "TODO: implement ZST support");
RawVec { ptr: Unique::empty(), cap: 0 } RawVec { ptr: Shared::empty(), cap: 0 }
} }
// unchanged from Vec // unchanged from Vec
@ -42,7 +42,7 @@ impl<T> RawVec<T> {
// If allocate or reallocate fail, we'll get `null` back // If allocate or reallocate fail, we'll get `null` back
if ptr.is_null() { oom() } if ptr.is_null() { oom() }
self.ptr = Unique::new(ptr as *mut _); self.ptr = Shared::new(ptr as *mut _);
self.cap = new_cap; self.cap = new_cap;
} }
} }

@ -19,7 +19,7 @@ RawValIter and RawVec respectively. How mysteriously convenient.
## Allocating Zero-Sized Types ## Allocating Zero-Sized Types
So if the allocator API doesn't support zero-sized allocations, what on earth So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? `Unique::empty()` of course! Almost every operation do we store as our allocation? `Shared::empty()` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and to be considered to store or load them. This actually extends to `ptr::read` and
`ptr::write`: they won't actually look at the pointer at all. As such we never need `ptr::write`: they won't actually look at the pointer at all. As such we never need
@ -38,8 +38,8 @@ impl<T> RawVec<T> {
// !0 is usize::MAX. This branch should be stripped at compile time. // !0 is usize::MAX. This branch should be stripped at compile time.
let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 }; let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };
// Unique::empty() doubles as "unallocated" and "zero-sized allocation" // Shared::empty() doubles as "unallocated" and "zero-sized allocation"
RawVec { ptr: Unique::empty(), cap: cap } RawVec { ptr: Shared::empty(), cap: cap }
} }
fn grow(&mut self) { fn grow(&mut self) {
@ -67,7 +67,7 @@ impl<T> RawVec<T> {
// If allocate or reallocate fail, we'll get `null` back // If allocate or reallocate fail, we'll get `null` back
if ptr.is_null() { oom() } if ptr.is_null() { oom() }
self.ptr = Unique::new(ptr as *mut _); self.ptr = Shared::new(ptr as *mut _);
self.cap = new_cap; self.cap = new_cap;
} }
} }
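
The ZST claims above can be checked in isolation. This is a small sketch assuming the stable `std::ptr::NonNull` (the later name of `Shared`); `Token`, `initial_cap`, and `zst_round_trip` are hypothetical names.

```rust
use std::mem;
use std::ptr::{self, NonNull};

// Hypothetical zero-sized type: exactly one value, no state.
#[derive(Debug, PartialEq)]
struct Token;

// Mirrors the `cap` computation in `RawVec::new`: ZSTs get "infinite" capacity.
fn initial_cap<T>() -> usize {
    if mem::size_of::<T>() == 0 { usize::MAX } else { 0 }
}

// Writing and reading a ZST through a dangling pointer touches no memory.
fn zst_round_trip() -> Token {
    let p: NonNull<Token> = NonNull::dangling();
    unsafe {
        ptr::write(p.as_ptr(), Token);
        ptr::read(p.as_ptr())
    }
}

fn main() {
    println!("cap for (): {}", initial_cap::<()>());
    println!("cap for u32: {}", initial_cap::<u32>());
    println!("{:?}", zst_round_trip());
}
```

With the capacity pinned at `usize::MAX`, the growth path is never taken for ZSTs, so the allocator is never asked for a zero-sized allocation.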
