frob emphasis

pull/10/head
Alexis Beingessner 9 years ago committed by Manish Goregaokar
parent 85a6d02a45
commit 42582a28ed

@ -9,15 +9,15 @@ nitty-gritty details of the language. You want to know those weird corner-cases.
You want to know what the heck `unsafe` really means, and how to properly use You want to know what the heck `unsafe` really means, and how to properly use
it. This is the book for you. it. This is the book for you.
To be clear, this book goes into *serious* detail. We're going to dig into To be clear, this book goes into serious detail. We're going to dig into
exception-safety and pointer aliasing. We're going to talk about memory exception-safety and pointer aliasing. We're going to talk about memory
models. We're even going to do some type-theory. This is stuff that you models. We're even going to do some type-theory. This is stuff that you
absolutely *don't* need to know to write fast and safe Rust programs. absolutely don't need to know to write fast and safe Rust programs.
You could probably close this book *right now* and still have a productive You could probably close this book *right now* and still have a productive
and happy career in Rust. and happy career in Rust.
However if you intend to write unsafe code -- or just *really* want to dig into However if you intend to write unsafe code -- or just really want to dig into
the guts of the language -- this book contains *invaluable* information. the guts of the language -- this book contains invaluable information.
Unlike TRPL we will be assuming considerable prior knowledge. In particular, you Unlike TRPL we will be assuming considerable prior knowledge. In particular, you
should be comfortable with basic systems programming and basic Rust. If you should be comfortable with basic systems programming and basic Rust. If you

@ -17,7 +17,7 @@ face.
The C11 memory model is fundamentally about trying to bridge the gap between the The C11 memory model is fundamentally about trying to bridge the gap between the
semantics we want, the optimizations compilers want, and the inconsistent chaos semantics we want, the optimizations compilers want, and the inconsistent chaos
our hardware wants. *We* would like to just write programs and have them do our hardware wants. *We* would like to just write programs and have them do
exactly what we said but, you know, *fast*. Wouldn't that be great? exactly what we said but, you know, fast. Wouldn't that be great?
@ -35,20 +35,20 @@ y = 3;
x = 2; x = 2;
``` ```
The compiler may conclude that it would *really* be best if your program did The compiler may conclude that it would be best if your program did
```rust,ignore ```rust,ignore
x = 2; x = 2;
y = 3; y = 3;
``` ```
This has inverted the order of events *and* completely eliminated one event. This has inverted the order of events and completely eliminated one event.
From a single-threaded perspective this is completely unobservable: after all From a single-threaded perspective this is completely unobservable: after all
the statements have executed we are in exactly the same state. But if our the statements have executed we are in exactly the same state. But if our
program is multi-threaded, we may have been relying on `x` to *actually* be program is multi-threaded, we may have been relying on `x` to actually be
assigned to 1 before `y` was assigned. We would *really* like the compiler to be assigned to 1 before `y` was assigned. We would like the compiler to be
able to make these kinds of optimizations, because they can seriously improve able to make these kinds of optimizations, because they can seriously improve
performance. On the other hand, we'd really like to be able to depend on our performance. On the other hand, we'd also like to be able to depend on our
program *doing the thing we said*. program *doing the thing we said*.
@ -57,15 +57,15 @@ program *doing the thing we said*.
# Hardware Reordering # Hardware Reordering
On the other hand, even if the compiler totally understood what we wanted and On the other hand, even if the compiler totally understood what we wanted and
respected our wishes, our *hardware* might instead get us in trouble. Trouble respected our wishes, our hardware might instead get us in trouble. Trouble
comes from CPUs in the form of memory hierarchies. There is indeed a global comes from CPUs in the form of memory hierarchies. There is indeed a global
shared memory space somewhere in your hardware, but from the perspective of each shared memory space somewhere in your hardware, but from the perspective of each
CPU core it is *so very far away* and *so very slow*. Each CPU would rather work CPU core it is *so very far away* and *so very slow*. Each CPU would rather work
with its local cache of the data and only go through all the *anguish* of with its local cache of the data and only go through all the anguish of
talking to shared memory *only* when it doesn't actually have that memory in talking to shared memory only when it doesn't actually have that memory in
cache. cache.
After all, that's the whole *point* of the cache, right? If every read from the After all, that's the whole point of the cache, right? If every read from the
cache had to run back to shared memory to double check that it hadn't changed, cache had to run back to shared memory to double check that it hadn't changed,
what would the point be? The end result is that the hardware doesn't guarantee what would the point be? The end result is that the hardware doesn't guarantee
that events that occur in the same order on *one* thread, occur in the same that events that occur in the same order on *one* thread, occur in the same
@ -99,13 +99,13 @@ provides weak ordering guarantees. This has two consequences for concurrent
programming: programming:
* Asking for stronger guarantees on strongly-ordered hardware may be cheap or * Asking for stronger guarantees on strongly-ordered hardware may be cheap or
even *free* because they already provide strong guarantees unconditionally. even free because they already provide strong guarantees unconditionally.
Weaker guarantees may only yield performance wins on weakly-ordered hardware. Weaker guarantees may only yield performance wins on weakly-ordered hardware.
* Asking for guarantees that are *too* weak on strongly-ordered hardware is * Asking for guarantees that are too weak on strongly-ordered hardware is
more likely to *happen* to work, even though your program is strictly more likely to *happen* to work, even though your program is strictly
incorrect. If possible, concurrent algorithms should be tested on weakly- incorrect. If possible, concurrent algorithms should be tested on
ordered hardware. weakly-ordered hardware.
@ -115,10 +115,10 @@ programming:
The C11 memory model attempts to bridge the gap by allowing us to talk about the The C11 memory model attempts to bridge the gap by allowing us to talk about the
*causality* of our program. Generally, this is by establishing a *happens *causality* of our program. Generally, this is by establishing a *happens
before* relationships between parts of the program and the threads that are before* relationship between parts of the program and the threads that are
running them. This gives the hardware and compiler room to optimize the program running them. This gives the hardware and compiler room to optimize the program
more aggressively where a strict happens-before relationship isn't established, more aggressively where a strict happens-before relationship isn't established,
but forces them to be more careful where one *is* established. The way we but forces them to be more careful where one is established. The way we
communicate these relationships is through *data accesses* and *atomic communicate these relationships is through *data accesses* and *atomic
accesses*. accesses*.
@ -130,8 +130,10 @@ propagate the changes made in data accesses to other threads as lazily and
inconsistently as it wants. Most critically, data accesses are how data races inconsistently as it wants. Most critically, data accesses are how data races
happen. Data accesses are very friendly to the hardware and compiler, but as happen. Data accesses are very friendly to the hardware and compiler, but as
we've seen they offer *awful* semantics to try to write synchronized code with. we've seen they offer *awful* semantics to try to write synchronized code with.
Actually, that's too weak. *It is literally impossible to write correct Actually, that's too weak.
synchronized code using only data accesses*.
**It is literally impossible to write correct synchronized code using only data
accesses.**
Atomic accesses are how we tell the hardware and compiler that our program is Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with an *ordering* that multi-threaded. Each atomic access can be marked with an *ordering* that
@ -141,7 +143,10 @@ they *can't* do. For the compiler, this largely revolves around re-ordering of
instructions. For the hardware, this largely revolves around how writes are instructions. For the hardware, this largely revolves around how writes are
propagated to other threads. The set of orderings Rust exposes is: propagated to other threads. The set of orderings Rust exposes is:
* Sequentially Consistent (SeqCst) Release Acquire Relaxed * Sequentially Consistent (SeqCst)
* Release
* Acquire
* Relaxed
(Note: We explicitly do not expose the C11 *consume* ordering) (Note: We explicitly do not expose the C11 *consume* ordering)
@ -154,13 +159,13 @@ synchronize"
Sequentially Consistent is the most powerful of all, implying the restrictions Sequentially Consistent is the most powerful of all, implying the restrictions
of all other orderings. Intuitively, a sequentially consistent operation of all other orderings. Intuitively, a sequentially consistent operation
*cannot* be reordered: all accesses on one thread that happen before and after a cannot be reordered: all accesses on one thread that happen before and after a
SeqCst access *stay* before and after it. A data-race-free program that uses SeqCst access stay before and after it. A data-race-free program that uses
only sequentially consistent atomics and data accesses has the very nice only sequentially consistent atomics and data accesses has the very nice
property that there is a single global execution of the program's instructions property that there is a single global execution of the program's instructions
that all threads agree on. This execution is also particularly nice to reason that all threads agree on. This execution is also particularly nice to reason
about: it's just an interleaving of each thread's individual executions. This about: it's just an interleaving of each thread's individual executions. This
*does not* hold if you start using the weaker atomic orderings. does not hold if you start using the weaker atomic orderings.
The relative developer-friendliness of sequential consistency doesn't come for The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms sequential consistency involves free. Even on strongly-ordered platforms sequential consistency involves
@ -170,8 +175,8 @@ In practice, sequential consistency is rarely necessary for program correctness.
However sequential consistency is definitely the right choice if you're not However sequential consistency is definitely the right choice if you're not
confident about the other memory orders. Having your program run a bit slower confident about the other memory orders. Having your program run a bit slower
than it needs to is certainly better than it running incorrectly! It's also than it needs to is certainly better than it running incorrectly! It's also
*mechanically* trivial to downgrade atomic operations to have a weaker mechanically trivial to downgrade atomic operations to have a weaker
consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're done! Of consistency later on. Just change `SeqCst` to `Relaxed` and you're done! Of
course, proving that this transformation is *correct* is a whole other matter. course, proving that this transformation is *correct* is a whole other matter.
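
A minimal sketch of what the "single global execution" property buys you, using the classic store-buffering test. The function name `store_buffering_once` is made up for illustration. Each thread stores to one atomic and then loads the other; with `SeqCst` on every access, at least one thread must see the other's store. Mechanically swapping `SeqCst` for `Relaxed` here would still compile, but the both-read-zero outcome would then be permitted by the memory model.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: runs the store-buffering test once and returns
/// what each thread read from the *other* thread's variable.
fn store_buffering_once() -> (usize, usize) {
    let x = Arc::new(AtomicUsize::new(0));
    let y = Arc::new(AtomicUsize::new(0));

    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let t1 = thread::spawn(move || {
        x1.store(1, Ordering::SeqCst);
        y1.load(Ordering::SeqCst)
    });

    let (x2, y2) = (Arc::clone(&x), Arc::clone(&y));
    let t2 = thread::spawn(move || {
        y2.store(1, Ordering::SeqCst);
        x2.load(Ordering::SeqCst)
    });

    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    for _ in 0..100 {
        let (r1, r2) = store_buffering_once();
        // In any interleaving of the four SeqCst operations, one of the
        // stores comes first, so the other thread's later load sees it.
        assert!(r1 == 1 || r2 == 1);
    }
}
```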
@ -183,15 +188,15 @@ Acquire and Release are largely intended to be paired. Their names hint at their
use case: they're perfectly suited for acquiring and releasing locks, and use case: they're perfectly suited for acquiring and releasing locks, and
ensuring that critical sections don't overlap. ensuring that critical sections don't overlap.
Intuitively, an acquire access ensures that every access after it *stays* after Intuitively, an acquire access ensures that every access after it stays after
it. However operations that occur before an acquire are free to be reordered to it. However operations that occur before an acquire are free to be reordered to
occur after it. Similarly, a release access ensures that every access before it occur after it. Similarly, a release access ensures that every access before it
*stays* before it. However operations that occur after a release are free to be stays before it. However operations that occur after a release are free to be
reordered to occur before it. reordered to occur before it.
When thread A releases a location in memory and then thread B subsequently When thread A releases a location in memory and then thread B subsequently
acquires *the same* location in memory, causality is established. Every write acquires *the same* location in memory, causality is established. Every write
that happened *before* A's release will be observed by B *after* its acquisition. that happened before A's release will be observed by B after its acquisition.
However no causality is established with any other threads. Similarly, no However no causality is established with any other threads. Similarly, no
causality is established if A and B access *different* locations in memory. causality is established if A and B access *different* locations in memory.
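
The release/acquire pairing described in this hunk can be sketched as a one-shot hand-off. This is an illustrative example, not code from the book; `publish_and_read`, `data`, and `ready` are invented names. The producer writes a payload and then releases a flag; the consumer acquires the same flag and only then reads the payload.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: hands `value` from a producer thread to the caller.
fn publish_and_read(value: usize) -> usize {
    let data = Arc::new(AtomicUsize::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        // Payload write; Relaxed on its own gives no ordering guarantee.
        d.store(value, Ordering::Relaxed);
        // Release: every access before this store stays before it.
        r.store(true, Ordering::Release);
    });

    // Acquire on the *same* location: once we read `true`, everything the
    // producer wrote before its release is visible to this thread.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}

fn main() {
    assert_eq!(publish_and_read(42), 42);
}
```

If the consumer spun on a different atomic than the one the producer released, no causality would be established and the final load could legitimately return 0.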
@ -230,7 +235,7 @@ weakly-ordered platforms.
# Relaxed # Relaxed
Relaxed accesses are the absolute weakest. They can be freely re-ordered and Relaxed accesses are the absolute weakest. They can be freely re-ordered and
provide no happens-before relationship. Still, relaxed operations *are* still provide no happens-before relationship. Still, relaxed operations are still
atomic. That is, they don't count as data accesses and any read-modify-write atomic. That is, they don't count as data accesses and any read-modify-write
operations done to them occur atomically. Relaxed operations are appropriate for operations done to them occur atomically. Relaxed operations are appropriate for
things that you definitely want to happen, but don't particularly otherwise care things that you definitely want to happen, but don't particularly otherwise care

@ -2,7 +2,7 @@
The mutual exclusion property of mutable references can be very limiting when The mutual exclusion property of mutable references can be very limiting when
working with a composite structure. The borrow checker understands some basic working with a composite structure. The borrow checker understands some basic
stuff, but will fall over pretty easily. It *does* understand structs stuff, but will fall over pretty easily. It does understand structs
sufficiently to know that it's possible to borrow disjoint fields of a struct sufficiently to know that it's possible to borrow disjoint fields of a struct
simultaneously. So this works today: simultaneously. So this works today:
@ -50,7 +50,7 @@ to the same value.
In order to "teach" borrowck that what we're doing is ok, we need to drop down In order to "teach" borrowck that what we're doing is ok, we need to drop down
to unsafe code. For instance, mutable slices expose a `split_at_mut` function to unsafe code. For instance, mutable slices expose a `split_at_mut` function
that consumes the slice and returns *two* mutable slices. One for everything to that consumes the slice and returns two mutable slices. One for everything to
the left of the index, and one for everything to the right. Intuitively we know the left of the index, and one for everything to the right. Intuitively we know
this is safe because the slices don't overlap, and therefore don't alias. However this is safe because the slices don't overlap, and therefore don't alias. However
the implementation requires some unsafety: the implementation requires some unsafety:
@ -93,10 +93,10 @@ completely incompatible with this API, as it would produce multiple mutable
references to the same object! references to the same object!
However it actually *does* work, exactly because iterators are one-shot objects. However it actually *does* work, exactly because iterators are one-shot objects.
Everything an IterMut yields will be yielded *at most* once, so we don't Everything an IterMut yields will be yielded at most once, so we don't
*actually* ever yield multiple mutable references to the same piece of data. actually ever yield multiple mutable references to the same piece of data.
Perhaps surprisingly, mutable iterators *don't* require unsafe code to be Perhaps surprisingly, mutable iterators don't require unsafe code to be
implemented for many types! implemented for many types!
For instance here's a singly linked list: For instance here's a singly linked list:

@ -1,13 +1,13 @@
% Casts % Casts
Casts are a superset of coercions: every coercion can be explicitly Casts are a superset of coercions: every coercion can be explicitly
invoked via a cast. However some conversions *require* a cast. invoked via a cast. However some conversions require a cast.
While coercions are pervasive and largely harmless, these "true casts" While coercions are pervasive and largely harmless, these "true casts"
are rare and potentially dangerous. As such, casts must be explicitly invoked are rare and potentially dangerous. As such, casts must be explicitly invoked
using the `as` keyword: `expr as Type`. using the `as` keyword: `expr as Type`.
True casts generally revolve around raw pointers and the primitive numeric True casts generally revolve around raw pointers and the primitive numeric
types. Even though they're dangerous, these casts are *infallible* at runtime. types. Even though they're dangerous, these casts are infallible at runtime.
If a cast triggers some subtle corner case no indication will be given that If a cast triggers some subtle corner case no indication will be given that
this occurred. The cast will simply succeed. That said, casts must be valid this occurred. The cast will simply succeed. That said, casts must be valid
at the type level, or else they will be prevented statically. For instance, at the type level, or else they will be prevented statically. For instance,

@ -80,7 +80,7 @@ loop {
// because it relies on actual values. // because it relies on actual values.
if true { if true {
// But it does understand that it will only be taken once because // But it does understand that it will only be taken once because
// we *do* unconditionally break out of it. Therefore `x` doesn't // we unconditionally break out of it. Therefore `x` doesn't
// need to be marked as mutable. // need to be marked as mutable.
x = 0; x = 0;
break; break;

@ -2,12 +2,12 @@
Rust as a language doesn't *really* have an opinion on how to do concurrency or Rust as a language doesn't *really* have an opinion on how to do concurrency or
parallelism. The standard library exposes OS threads and blocking sys-calls parallelism. The standard library exposes OS threads and blocking sys-calls
because *everyone* has those, and they're uniform enough that you can provide because everyone has those, and they're uniform enough that you can provide
an abstraction over them in a relatively uncontroversial way. Message passing, an abstraction over them in a relatively uncontroversial way. Message passing,
green threads, and async APIs are all diverse enough that any abstraction over green threads, and async APIs are all diverse enough that any abstraction over
them tends to involve trade-offs that we weren't willing to commit to for 1.0. them tends to involve trade-offs that we weren't willing to commit to for 1.0.
However the way Rust models concurrency makes it relatively easy to design your own However the way Rust models concurrency makes it relatively easy to design your own
concurrency paradigm as a library and have *everyone else's* code Just Work concurrency paradigm as a library and have everyone else's code Just Work
with yours. Just require the right lifetimes and Send and Sync where appropriate with yours. Just require the right lifetimes and Send and Sync where appropriate
and you're off to the races. Or rather, off to the... not... having... races. and you're off to the races. Or rather, off to the... not... having... races.
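
As a rough sketch of "require the right lifetimes and Send and Sync", here is a tiny library-style helper built on OS threads. `parallel_map` is a hypothetical name, not a standard library API; the point is only that the bounds in the `where` clause are what let arbitrary caller code plug in safely.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: applies `job` to each item on its own thread and
/// collects the results in input order.
fn parallel_map<T, R, F>(items: Vec<T>, job: F) -> Vec<R>
where
    T: Send + 'static,                      // items move into other threads
    R: Send + 'static,                      // results move back out
    F: Fn(T) -> R + Send + Sync + 'static,  // one closure shared by all threads
{
    let job = Arc::new(job);
    let handles: Vec<_> = items
        .into_iter()
        .map(|item| {
            let job = Arc::clone(&job);
            thread::spawn(move || (*job)(item))
        })
        .collect();

    // Joining in spawn order keeps results aligned with the inputs.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let doubled = parallel_map(vec![1, 2, 3], |x| x * 2);
    assert_eq!(doubled, vec![2, 4, 6]);
}
```

A caller that tried to pass a closure capturing an `Rc` would be rejected at compile time, because `Rc` is neither `Send` nor `Sync`.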

@ -37,14 +37,14 @@ blindly memcopied to somewhere else in memory. This means pure on-the-stack-but-
still-movable intrusive linked lists are simply not happening in Rust (safely). still-movable intrusive linked lists are simply not happening in Rust (safely).
Assignment and copy constructors similarly don't exist because move semantics Assignment and copy constructors similarly don't exist because move semantics
are the *only* semantics in Rust. At most `x = y` just moves the bits of y into are the only semantics in Rust. At most `x = y` just moves the bits of y into
the x variable. Rust *does* provide two facilities for providing C++'s copy- the x variable. Rust does provide two facilities for providing C++'s copy-
oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a copy oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a copy
constructor, but it's never implicitly invoked. You have to explicitly call constructor, but it's never implicitly invoked. You have to explicitly call
`clone` on an element you want to be cloned. Copy is a special case of Clone `clone` on an element you want to be cloned. Copy is a special case of Clone
where the implementation is just "copy the bits". Copy types *are* implicitly where the implementation is just "copy the bits". Copy types *are* implicitly
cloned whenever they're moved, but because of the definition of Copy this just cloned whenever they're moved, but because of the definition of Copy this just
means *not* treating the old copy as uninitialized -- a no-op. means not treating the old copy as uninitialized -- a no-op.
While Rust provides a `Default` trait for specifying the moral equivalent of a While Rust provides a `Default` trait for specifying the moral equivalent of a
default constructor, it's incredibly rare for this trait to be used. This is default constructor, it's incredibly rare for this trait to be used. This is

@ -8,7 +8,7 @@ a different type. Because Rust encourages encoding important properties in the
type system, these problems are incredibly pervasive. As such, Rust type system, these problems are incredibly pervasive. As such, Rust
consequently gives you several ways to solve them. consequently gives you several ways to solve them.
First we'll look at the ways that *Safe Rust* gives you to reinterpret values. First we'll look at the ways that Safe Rust gives you to reinterpret values.
The most trivial way to do this is to just destructure a value into its The most trivial way to do this is to just destructure a value into its
constituent parts and then build a new type out of them. e.g. constituent parts and then build a new type out of them. e.g.

@ -1,5 +1,5 @@
% Data Representation in Rust % Data Representation in Rust
Low-level programming cares a lot about data layout. It's a big deal. It also pervasively Low-level programming cares a lot about data layout. It's a big deal. It also
influences the rest of the language, so we're going to start by digging into how data is pervasively influences the rest of the language, so we're going to start by
represented in Rust. digging into how data is represented in Rust.

@ -7,16 +7,19 @@ What the language *does* provide is full-blown automatic destructors through the
fn drop(&mut self); fn drop(&mut self);
``` ```
This method gives the type time to somehow finish what it was doing. **After This method gives the type time to somehow finish what it was doing.
`drop` is run, Rust will recursively try to drop all of the fields of `self`**.
**After `drop` is run, Rust will recursively try to drop all of the fields
of `self`.**
This is a convenience feature so that you don't have to write "destructor This is a convenience feature so that you don't have to write "destructor
boilerplate" to drop children. If a struct has no special logic for being boilerplate" to drop children. If a struct has no special logic for being
dropped other than dropping its children, then it means `Drop` doesn't need to dropped other than dropping its children, then it means `Drop` doesn't need to
be implemented at all! be implemented at all!
**There is no stable way to prevent this behaviour in Rust 1.0. **There is no stable way to prevent this behaviour in Rust 1.0.**
Note that taking `&mut self` means that even if you *could* suppress recursive Note that taking `&mut self` means that even if you could suppress recursive
Drop, Rust will prevent you from e.g. moving fields out of self. For most types, Drop, Rust will prevent you from e.g. moving fields out of self. For most types,
this is totally fine. this is totally fine.
@ -90,7 +93,7 @@ After we deallocate the `box`'s ptr in SuperBox's destructor, Rust will
happily proceed to tell the box to Drop itself and everything will blow up with happily proceed to tell the box to Drop itself and everything will blow up with
use-after-frees and double-frees. use-after-frees and double-frees.
Note that the recursive drop behaviour applies to *all* structs and enums Note that the recursive drop behaviour applies to all structs and enums
regardless of whether they implement Drop. Therefore something like regardless of whether they implement Drop. Therefore something like
```rust ```rust
@ -114,7 +117,7 @@ enum Link {
} }
``` ```
will have its inner Box field dropped *if and only if* an instance stores the will have its inner Box field dropped if and only if an instance stores the
Next variant. Next variant.
In general this works really nicely because you don't need to worry about In general this works really nicely because you don't need to worry about
@ -165,7 +168,7 @@ impl<T> Drop for SuperBox<T> {
``` ```
However this has fairly odd semantics: you're saying that a field that *should* However this has fairly odd semantics: you're saying that a field that *should*
always be Some may be None, just because that happens in the destructor. Of always be Some *may* be None, just because that happens in the destructor. Of
course this conversely makes a lot of sense: you can call arbitrary methods on course this conversely makes a lot of sense: you can call arbitrary methods on
self during the destructor, and this should prevent you from ever doing so after self during the destructor, and this should prevent you from ever doing so after
deinitializing the field. Not that it will prevent you from producing any other deinitializing the field. Not that it will prevent you from producing any other

@ -10,7 +10,7 @@ How can it do this with conditional initialization?
Note that this is not a problem that all assignments need worry about. In Note that this is not a problem that all assignments need worry about. In
particular, assigning through a dereference unconditionally drops, and assigning particular, assigning through a dereference unconditionally drops, and assigning
in a `let` unconditionally *doesn't* drop: in a `let` unconditionally doesn't drop:
``` ```
let mut x = Box::new(0); // let makes a fresh variable, so never need to drop let mut x = Box::new(0); // let makes a fresh variable, so never need to drop
@ -23,11 +23,11 @@ one of its subfields.
It turns out that Rust actually tracks whether a type should be dropped or not It turns out that Rust actually tracks whether a type should be dropped or not
*at runtime*. As a variable becomes initialized and uninitialized, a *drop flag* *at runtime*. As a variable becomes initialized and uninitialized, a *drop flag*
for that variable is toggled. When a variable *might* need to be dropped, this for that variable is toggled. When a variable might need to be dropped, this
flag is evaluated to determine if it *should* be dropped. flag is evaluated to determine if it should be dropped.
Of course, it is *often* the case that a value's initialization state can be Of course, it is often the case that a value's initialization state can be
*statically* known at every point in the program. If this is the case, then the statically known at every point in the program. If this is the case, then the
compiler can theoretically generate more efficient code! For instance, straight- compiler can theoretically generate more efficient code! For instance, straight-
line code has such *static drop semantics*: line code has such *static drop semantics*:
@ -40,8 +40,8 @@ y = x; // y was init; Drop y, overwrite it, and make x uninit!
// x goes out of scope; x was uninit; do nothing. // x goes out of scope; x was uninit; do nothing.
``` ```
And even branched code where all branches have the same behaviour with respect Similarly, branched code where all branches have the same behaviour with respect
to initialization: to initialization has static drop semantics:
```rust ```rust
# let condition = true; # let condition = true;
@ -65,7 +65,7 @@ if condition {
x = Box::new(0); // x was uninit; just overwrite. x = Box::new(0); // x was uninit; just overwrite.
println!("{}", x); println!("{}", x);
} }
// x goes out of scope; x *might* be uninit; // x goes out of scope; x might be uninit;
// check the flag! // check the flag!
``` ```
@ -81,7 +81,7 @@ if condition {
As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a hidden As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a hidden
field of any type that implements Drop. Rust sets the drop flag by overwriting field of any type that implements Drop. Rust sets the drop flag by overwriting
the *entire* value with a particular bit pattern. This is pretty obviously Not the entire value with a particular bit pattern. This is pretty obviously Not
The Fastest and causes a bunch of trouble with optimizing code. It's legacy from The Fastest and causes a bunch of trouble with optimizing code. It's legacy from
a time when you could do much more complex conditional initialization. a time when you could do much more complex conditional initialization.
@ -92,4 +92,4 @@ as it requires fairly substantial changes to the compiler.
Regardless, Rust programs don't need to worry about uninitialized values on Regardless, Rust programs don't need to worry about uninitialized values on
the stack for correctness. Although they might care for performance. Thankfully, the stack for correctness. Although they might care for performance. Thankfully,
Rust makes it easy to take control here! Uninitialized values are there, and Rust makes it easy to take control here! Uninitialized values are there, and
you can work with them in Safe Rust, but you're *never* in danger. you can work with them in Safe Rust, but you're never in danger.
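
The observable effect of conditional initialization can be checked in Safe Rust with a small sketch. `Noisy` and `drops_observed` are invented for illustration, and the example says nothing about how the compiler stores its drop flags; it only shows that the destructor runs exactly when the variable was actually initialized.

```rust
use std::cell::Cell;

/// Hypothetical type: bumps a shared counter when it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns how many times a `Noisy` destructor ran.
fn drops_observed(condition: bool) -> u32 {
    let drops = Cell::new(0);
    {
        let x; // declared, but not yet initialized
        if condition {
            x = Noisy(&drops); // x was uninit; just overwrite
            let _ = &x; // touch x so the assignment is not flagged as unused
        }
        // x goes out of scope here; it might be uninit, so whether the
        // destructor runs depends on the path taken at runtime.
    }
    drops.get()
}

fn main() {
    assert_eq!(drops_observed(true), 1);
    assert_eq!(drops_observed(false), 0);
}
```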

@ -100,11 +100,11 @@ fn main() {
<anon>:15 } <anon>:15 }
``` ```
Implementing Drop lets the Inspector execute some arbitrary code *during* its Implementing Drop lets the Inspector execute some arbitrary code during its
death. This means it can potentially observe that types that are supposed to death. This means it can potentially observe that types that are supposed to
live as long as it does actually were destroyed first. live as long as it does actually were destroyed first.
Interestingly, only *generic* types need to worry about this. If they aren't Interestingly, only generic types need to worry about this. If they aren't
generic, then the only lifetimes they can harbor are `'static`, which will truly generic, then the only lifetimes they can harbor are `'static`, which will truly
live *forever*. This is why this problem is referred to as *sound generic drop*. live *forever*. This is why this problem is referred to as *sound generic drop*.
Sound generic drop is enforced by the *drop checker*. As of this writing, some Sound generic drop is enforced by the *drop checker*. As of this writing, some
@ -116,12 +116,12 @@ section:
strictly outlive it.** strictly outlive it.**
This rule is sufficient but not necessary to satisfy the drop checker. That is, This rule is sufficient but not necessary to satisfy the drop checker. That is,
if your type obeys this rule then it's *definitely* sound to drop. However if your type obeys this rule then it's definitely sound to drop. However
there are special cases where you can fail to satisfy this, but still there are special cases where you can fail to satisfy this, but still
successfully pass the borrow checker. These are the precise rules that are successfully pass the borrow checker. These are the precise rules that are
currently up in the air. currently up in the air.
It turns out that when writing unsafe code, we generally don't need to It turns out that when writing unsafe code, we generally don't need to
worry at all about doing the right thing for the drop checker. However there worry at all about doing the right thing for the drop checker. However there
is *one* special case that you need to worry about, which we will look at in is one special case that you need to worry about, which we will look at in
the next section. the next section.

@ -1,8 +1,8 @@
% Exception Safety % Exception Safety
Although programs should use unwinding sparingly, there's *a lot* of code that Although programs should use unwinding sparingly, there's a lot of code that
*can* panic. If you unwrap a None, index out of bounds, or divide by 0, your *can* panic. If you unwrap a None, index out of bounds, or divide by 0, your
program *will* panic. On debug builds, *every* arithmetic operation can panic program will panic. On debug builds, every arithmetic operation can panic
if it overflows. Unless you are very careful and tightly control what code runs, if it overflows. Unless you are very careful and tightly control what code runs,
pretty much everything can unwind, and you need to be ready for it. pretty much everything can unwind, and you need to be ready for it.
@ -22,7 +22,7 @@ unsound states must be careful that a panic does not cause that state to be
used. Generally this means ensuring that only non-panicking code is run while used. Generally this means ensuring that only non-panicking code is run while
these states exist, or making a guard that cleans up the state in the case of these states exist, or making a guard that cleans up the state in the case of
a panic. This does not necessarily mean that the state a panic witnesses is a a panic. This does not necessarily mean that the state a panic witnesses is a
fully *coherent* state. We need only guarantee that it's a *safe* state. fully coherent state. We need only guarantee that it's a *safe* state.
Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe. Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe.
It controls all the code that runs, and most of that code can't panic. However It controls all the code that runs, and most of that code can't panic. However
@ -58,17 +58,16 @@ impl<T: Clone> Vec<T> {
We bypass `push` in order to avoid redundant capacity and `len` checks on the We bypass `push` in order to avoid redundant capacity and `len` checks on the
Vec that we definitely know has capacity. The logic is totally correct, except Vec that we definitely know has capacity. The logic is totally correct, except
there's a subtle problem with our code: it's not exception-safe! `set_len`, there's a subtle problem with our code: it's not exception-safe! `set_len`,
`offset`, and `write` are all fine, but *clone* is the panic bomb we over- `offset`, and `write` are all fine; `clone` is the panic bomb we over-looked.
looked.
Clone is completely out of our control, and is totally free to panic. If it Clone is completely out of our control, and is totally free to panic. If it
does, our function will exit early with the length of the Vec set too large. If does, our function will exit early with the length of the Vec set too large. If
the Vec is looked at or dropped, uninitialized memory will be read! the Vec is looked at or dropped, uninitialized memory will be read!
The fix in this case is fairly simple. If we want to guarantee that the values The fix in this case is fairly simple. If we want to guarantee that the values
we *did* clone are dropped we can set the len *in* the loop. If we just want to we *did* clone are dropped, we can set the `len` every loop iteration. If we
guarantee that uninitialized memory can't be observed, we can set the len just want to guarantee that uninitialized memory can't be observed, we can set
*after* the loop. the `len` after the loop.
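
One way the first fix might look is sketched below, under some assumptions: `push_all_cloned` is a hypothetical free function over the standard `Vec`, not the `push_all` method from the hunk, and it updates `len` on every iteration so that a panicking `clone` can never leave uninitialized slots inside the Vec's length.

```rust
use std::ptr;

/// Appends clones of `to_push` to `dst`, keeping `len` in step with the
/// number of initialized elements at all times.
pub fn push_all_cloned<T: Clone>(dst: &mut Vec<T>, to_push: &[T]) {
    // After this call, capacity >= dst.len() + to_push.len().
    dst.reserve(to_push.len());

    for x in to_push {
        // `clone` is the only untrusted call. If it panics here, `len`
        // still covers initialized elements only.
        let cloned = x.clone();
        unsafe {
            let len = dst.len();
            // In bounds because of the `reserve` above.
            ptr::write(dst.as_mut_ptr().add(len), cloned);
            dst.set_len(len + 1);
        }
    }
}
```

The trade-off is one extra store to `len` per element. In exchange, the values that were successfully cloned before a panic get dropped instead of leaked.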
@ -89,7 +88,7 @@ bubble_up(heap, index):
A literal transcription of this code to Rust is totally fine, but has an annoying A literal transcription of this code to Rust is totally fine, but has an annoying
performance characteristic: the `self` element is swapped over and over again performance characteristic: the `self` element is swapped over and over again
uselessly. We would *rather* have the following: uselessly. We would rather have the following:
```text ```text
bubble_up(heap, index): bubble_up(heap, index):
@ -128,7 +127,7 @@ actually touched the state of the heap yet. Once we do start messing with the
heap, we're working with only data and functions that we trust, so there's no heap, we're working with only data and functions that we trust, so there's no
concern of panics. concern of panics.
Perhaps you're not happy with this design. Surely, it's cheating! And we have Perhaps you're not happy with this design. Surely it's cheating! And we have
to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's
intermix untrusted and unsafe code *for reals*. intermix untrusted and unsafe code *for reals*.

@ -48,7 +48,7 @@ a variable position based on its alignment][dst-issue].**
# Zero Sized Types (ZSTs) # Zero Sized Types (ZSTs)
Rust actually allows types to be specified that occupy *no* space: Rust actually allows types to be specified that occupy no space:
```rust ```rust
struct Foo; // No fields = no size struct Foo; // No fields = no size
@ -124,7 +124,7 @@ let res: Result<u32, Void> = Ok(0);
let Ok(num) = res; let Ok(num) = res;
``` ```
But neither of these tricks work today, so all Void types get you today is But neither of these tricks work today, so all Void types get you is
the ability to be confident that certain situations are statically impossible. the ability to be confident that certain situations are statically impossible.
One final subtle detail about empty types is that raw pointers to them are One final subtle detail about empty types is that raw pointers to them are

@ -55,7 +55,7 @@ fn main() {
How on earth are we supposed to express the lifetimes on `F`'s trait bound? We How on earth are we supposed to express the lifetimes on `F`'s trait bound? We
need to provide some lifetime there, but the lifetime we care about can't be need to provide some lifetime there, but the lifetime we care about can't be
named until we enter the body of `call`! Also, that isn't some fixed lifetime; named until we enter the body of `call`! Also, that isn't some fixed lifetime;
call works with *any* lifetime `&self` happens to have at that point. `call` works with *any* lifetime `&self` happens to have at that point.
This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we
desugar this is as follows: desugar this is as follows:

@ -21,21 +21,21 @@ uselessly, holding on to its precious resources until the program terminates (at
which point all those resources would have been reclaimed by the OS anyway). which point all those resources would have been reclaimed by the OS anyway).
We may consider a more restricted form of leak: failing to drop a value that is We may consider a more restricted form of leak: failing to drop a value that is
unreachable. Rust also doesn't prevent this. In fact Rust has a *function for unreachable. Rust also doesn't prevent this. In fact Rust *has a function for
doing this*: `mem::forget`. This function consumes the value it is passed *and doing this*: `mem::forget`. This function consumes the value it is passed *and
then doesn't run its destructor*. then doesn't run its destructor*.
In the past `mem::forget` was marked as unsafe as a sort of lint against using In the past `mem::forget` was marked as unsafe as a sort of lint against using
it, since failing to call a destructor is generally not a well-behaved thing to it, since failing to call a destructor is generally not a well-behaved thing to
do (though useful for some special unsafe code). However this was generally do (though useful for some special unsafe code). However this was generally
determined to be an untenable stance to take: there are *many* ways to fail to determined to be an untenable stance to take: there are many ways to fail to
call a destructor in safe code. The most famous example is creating a cycle of call a destructor in safe code. The most famous example is creating a cycle of
reference-counted pointers using interior mutability. reference-counted pointers using interior mutability.
It is reasonable for safe code to assume that destructor leaks do not happen, as It is reasonable for safe code to assume that destructor leaks do not happen, as
any program that leaks destructors is probably wrong. However *unsafe* code any program that leaks destructors is probably wrong. However *unsafe* code
cannot rely on destructors to be run to be *safe*. For most types this doesn't cannot rely on destructors to be run in order to be safe. For most types this
matter: if you leak the destructor then the type is *by definition* doesn't matter: if you leak the destructor then the type is by definition
inaccessible, so it doesn't matter, right? For instance, if you leak a `Box<u8>` inaccessible, so it doesn't matter, right? For instance, if you leak a `Box<u8>`
then you waste some memory but that's hardly going to violate memory-safety. then you waste some memory but that's hardly going to violate memory-safety.
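
The difference between dropping and forgetting can be observed directly. The sketch below is only an illustration: `Noisy` and `count_drops` are made-up names, and the shared counter exists purely so the destructor has a visible side effect.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

/// A value whose destructor bumps a shared counter.
pub struct Noisy {
    drops: Rc<Cell<usize>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns how many times the destructor ran.
pub fn count_drops(forget: bool) -> usize {
    let drops = Rc::new(Cell::new(0));
    let value = Noisy { drops: Rc::clone(&drops) };
    if forget {
        // Entirely safe code: the value is consumed and its destructor
        // is never run.
        mem::forget(value);
    } else {
        drop(value);
    }
    drops.get()
}
```

Note that the forgotten `Noisy` also leaks its clone of the `Rc`, so the counter's allocation is never freed either. That is a small taste of leaks causing more leaks.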
@ -64,7 +64,7 @@ uninitialized data! We could backshift all the elements in the Vec every time we
remove a value, but this would have pretty catastrophic performance remove a value, but this would have pretty catastrophic performance
consequences. consequences.
Instead, we would like Drain to *fix* the Vec's backing storage when it is Instead, we would like Drain to fix the Vec's backing storage when it is
dropped. It should run itself to completion, backshift any elements that weren't dropped. It should run itself to completion, backshift any elements that weren't
removed (drain supports subranges), and then fix Vec's `len`. It's even removed (drain supports subranges), and then fix Vec's `len`. It's even
unwinding-safe! Easy! unwinding-safe! Easy!
@ -97,13 +97,13 @@ consistent state gives us Undefined Behaviour in safe code (making the API
unsound). unsound).
So what can we do? Well, we can pick a trivially consistent state: set the Vec's So what can we do? Well, we can pick a trivially consistent state: set the Vec's
len to be 0 when we *start* the iteration, and fix it up if necessary in the len to be 0 when we start the iteration, and fix it up if necessary in the
destructor. That way, if everything executes like normal we get the desired destructor. That way, if everything executes like normal we get the desired
behaviour with minimal overhead. But if someone has the *audacity* to behaviour with minimal overhead. But if someone has the *audacity* to
mem::forget us in the middle of the iteration, all that does is *leak even more* mem::forget us in the middle of the iteration, all that does is *leak even more*
(and possibly leave the Vec in an *unexpected* but consistent state). Since (and possibly leave the Vec in an unexpected but otherwise consistent state).
we've accepted that mem::forget is safe, this is definitely safe. We call leaks Since we've accepted that mem::forget is safe, this is definitely safe. We call
causing more leaks a *leak amplification*. leaks causing more leaks a *leak amplification*.
@ -167,16 +167,16 @@ impl<T> Drop for Rc<T> {
} }
``` ```
This code contains an implicit and subtle assumption: ref_count can fit in a This code contains an implicit and subtle assumption: `ref_count` can fit in a
`usize`, because there can't be more than `usize::MAX` Rcs in memory. However `usize`, because there can't be more than `usize::MAX` Rcs in memory. However
this itself assumes that the ref_count accurately reflects the number of Rcs this itself assumes that the `ref_count` accurately reflects the number of Rcs
in memory, which we know is false with mem::forget. Using mem::forget we can in memory, which we know is false with `mem::forget`. Using `mem::forget` we can
overflow the ref_count, and then get it down to 0 with outstanding Rcs. Then we overflow the `ref_count`, and then get it down to 0 with outstanding Rcs. Then
can happily use-after-free the inner data. Bad Bad Not Good. we can happily use-after-free the inner data. Bad Bad Not Good.
This can be solved by *saturating* the ref_count, which is sound because This can be solved by just checking the `ref_count` and doing *something*. The
decreasing the refcount by `n` still requires `n` Rcs simultaneously living standard library's stance is to just abort, because your program has become
in memory. horribly degenerate. Also *oh my gosh* it's such a ridiculous corner case.
@ -237,7 +237,7 @@ In principle, this totally works! Rust's ownership system perfectly ensures it!
let mut data = Box::new(0); let mut data = Box::new(0);
{ {
let guard = thread::scoped(|| { let guard = thread::scoped(|| {
// This is at best a data race. At worst, it's *also* a use-after-free. // This is at best a data race. At worst, it's also a use-after-free.
*data += 1; *data += 1;
}); });
// Because the guard is forgotten, expiring the loan without blocking this // Because the guard is forgotten, expiring the loan without blocking this

@ -18,7 +18,7 @@ fn main() {
``` ```
One might expect it to compile. We call `mutate_and_share`, which mutably borrows One might expect it to compile. We call `mutate_and_share`, which mutably borrows
`foo` *temporarily*, but then returns *only* a shared reference. Therefore we `foo` temporarily, but then returns only a shared reference. Therefore we
would expect `foo.share()` to succeed as `foo` shouldn't be mutably borrowed. would expect `foo.share()` to succeed as `foo` shouldn't be mutably borrowed.
However when we try to compile it: However when we try to compile it:
@ -69,7 +69,7 @@ due to the lifetime of `loan` and mutate_and_share's signature. Then when we
try to call `share`, and it sees we're trying to alias that `&'c mut foo` and try to call `share`, and it sees we're trying to alias that `&'c mut foo` and
blows up in our face! blows up in our face!
This program is clearly correct according to the reference semantics we *actually* This program is clearly correct according to the reference semantics we actually
care about, but the lifetime system is too coarse-grained to handle that. care about, but the lifetime system is too coarse-grained to handle that.

@ -6,11 +6,11 @@ and anything that contains a reference, is tagged with a lifetime specifying
the scope it's valid for. the scope it's valid for.
Within a function body, Rust generally doesn't let you explicitly name the Within a function body, Rust generally doesn't let you explicitly name the
lifetimes involved. This is because it's generally not really *necessary* lifetimes involved. This is because it's generally not really necessary
to talk about lifetimes in a local context; Rust has all the information and to talk about lifetimes in a local context; Rust has all the information and
can work out everything as optimally as possible. Many anonymous scopes and can work out everything as optimally as possible. Many anonymous scopes and
temporaries that you would otherwise have to write are often introduced to temporaries that you would otherwise have to write are often introduced to
make your code *just work*. make your code Just Work.
However once you cross the function boundary, you need to start talking about However once you cross the function boundary, you need to start talking about
lifetimes. Lifetimes are denoted with an apostrophe: `'a`, `'static`. To dip lifetimes. Lifetimes are denoted with an apostrophe: `'a`, `'static`. To dip
@ -42,7 +42,7 @@ likely desugar to the following:
'a: { 'a: {
let x: i32 = 0; let x: i32 = 0;
'b: { 'b: {
// lifetime used is 'b because that's *good enough*. // lifetime used is 'b because that's good enough.
let y: &'b i32 = &'b x; let y: &'b i32 = &'b x;
'c: { 'c: {
// ditto on 'c // ditto on 'c
@ -107,8 +107,9 @@ fn as_str<'a>(data: &'a u32) -> &'a str {
This signature of `as_str` takes a reference to a u32 with *some* lifetime, and This signature of `as_str` takes a reference to a u32 with *some* lifetime, and
promises that it can produce a reference to a str that can live *just as long*. promises that it can produce a reference to a str that can live *just as long*.
Already we can see why this signature might be trouble. That basically implies Already we can see why this signature might be trouble. That basically implies
that we're going to *find* a str somewhere in the scope the scope the reference that we're going to find a str somewhere in the scope the reference
to the u32 originated in, or somewhere *even* earlier. That's a *bit* of a big ask. to the u32 originated in, or somewhere *even earlier*. That's a bit of a big
ask.
We then proceed to compute the string `s`, and return a reference to it. Since We then proceed to compute the string `s`, and return a reference to it. Since
the contract of our function says the reference must outlive `'a`, that's the the contract of our function says the reference must outlive `'a`, that's the
@ -135,7 +136,7 @@ fn main() {
'd: { 'd: {
// An anonymous scope is introduced because the borrow does not // An anonymous scope is introduced because the borrow does not
// need to last for the whole scope x is valid for. The return // need to last for the whole scope x is valid for. The return
// of as_str must find a str somewhere *before* this function // of as_str must find a str somewhere before this function
// call. Obviously not happening. // call. Obviously not happening.
println!("{}", as_str::<'d>(&'d x)); println!("{}", as_str::<'d>(&'d x));
} }
@ -195,21 +196,21 @@ println!("{}", x);
The problem here is is bit more subtle and interesting. We want Rust to The problem here is is bit more subtle and interesting. We want Rust to
reject this program for the following reason: We have a live shared reference `x` reject this program for the following reason: We have a live shared reference `x`
to a descendent of `data` when try to take a *mutable* reference to `data` to a descendent of `data` when we try to take a mutable reference to `data`
when we call `push`. This would create an aliased mutable reference, which would to `push`. This would create an aliased mutable reference, which would
violate the *second* rule of references. violate the *second* rule of references.
However this is *not at all* how Rust reasons that this program is bad. Rust However this is *not at all* how Rust reasons that this program is bad. Rust
doesn't understand that `x` is a reference to a subpath of `data`. It doesn't doesn't understand that `x` is a reference to a subpath of `data`. It doesn't
understand Vec at all. What it *does* see is that `x` has to live for `'b` to understand Vec at all. What it *does* see is that `x` has to live for `'b` to
be printed. The signature of `Index::index` subsequently demands that the be printed. The signature of `Index::index` subsequently demands that the
reference we take to *data* has to survive for `'b`. When we try to call `push`, reference we take to `data` has to survive for `'b`. When we try to call `push`,
it then sees us try to make an `&'c mut data`. Rust knows that `'c` is contained it then sees us try to make an `&'c mut data`. Rust knows that `'c` is contained
within `'b`, and rejects our program because the `&'b data` must still be live! within `'b`, and rejects our program because the `&'b data` must still be live!
Here we see that the lifetime system is *much* more coarse than the reference Here we see that the lifetime system is much more coarse than the reference
semantics we're actually interested in preserving. For the most part, *that's semantics we're actually interested in preserving. For the most part, *that's
totally ok*, because it keeps us from spending all day explaining our program totally ok*, because it keeps us from spending all day explaining our program
to the compiler. However it does mean that several programs that are *totally* to the compiler. However it does mean that several programs that are totally
correct with respect to Rust's *true* semantics are rejected because lifetimes correct with respect to Rust's *true* semantics are rejected because lifetimes
are too dumb. are too dumb.

@ -29,7 +29,7 @@ Rust, you will never have to worry about type-safety or memory-safety. You will
never endure a null or dangling pointer, or any of that Undefined Behaviour never endure a null or dangling pointer, or any of that Undefined Behaviour
nonsense. nonsense.
*That's totally awesome*. *That's totally awesome.*
The standard library also gives you enough utilities out-of-the-box that you'll The standard library also gives you enough utilities out-of-the-box that you'll
be able to write awesome high-performance applications and libraries in pure be able to write awesome high-performance applications and libraries in pure
@ -41,7 +41,7 @@ low-level abstraction not exposed by the standard library. Maybe you're
need to do something the type-system doesn't understand and just *frob some dang need to do something the type-system doesn't understand and just *frob some dang
bits*. Maybe you need Unsafe Rust. bits*. Maybe you need Unsafe Rust.
Unsafe Rust is exactly like Safe Rust with *all* the same rules and semantics. Unsafe Rust is exactly like Safe Rust with all the same rules and semantics.
However Unsafe Rust lets you do some *extra* things that are Definitely Not Safe. However Unsafe Rust lets you do some *extra* things that are Definitely Not Safe.
The only things that are different in Unsafe Rust are that you can: The only things that are different in Unsafe Rust are that you can:

@ -12,7 +12,7 @@ language?
Regardless of your feelings on GC, it is pretty clearly a *massive* boon to Regardless of your feelings on GC, it is pretty clearly a *massive* boon to
making code safe. You never have to worry about things going away *too soon* making code safe. You never have to worry about things going away *too soon*
(although whether you still *wanted* to be pointing at that thing is a different (although whether you still wanted to be pointing at that thing is a different
issue...). This is a pervasive problem that C and C++ programs need to deal issue...). This is a pervasive problem that C and C++ programs need to deal
with. Consider this simple mistake that all of us who have used a non-GC'd with. Consider this simple mistake that all of us who have used a non-GC'd
language have made at one point: language have made at one point:

@ -14,11 +14,11 @@ struct Iter<'a, T: 'a> {
However because `'a` is unused within the struct's body, it's *unbounded*. However because `'a` is unused within the struct's body, it's *unbounded*.
Because of the troubles this has historically caused, unbounded lifetimes and Because of the troubles this has historically caused, unbounded lifetimes and
types are *illegal* in struct definitions. Therefore we must somehow refer types are *forbidden* in struct definitions. Therefore we must somehow refer
to these types in the body. Correctly doing this is necessary to have to these types in the body. Correctly doing this is necessary to have
correct variance and drop checking. correct variance and drop checking.
We do this using *PhantomData*, which is a special marker type. PhantomData We do this using `PhantomData`, which is a special marker type. `PhantomData`
consumes no space, but simulates a field of the given type for the purpose of consumes no space, but simulates a field of the given type for the purpose of
static analysis. This was deemed to be less error-prone than explicitly telling static analysis. This was deemed to be less error-prone than explicitly telling
the type-system the kind of variance that you want, while also providing other the type-system the kind of variance that you want, while also providing other
@ -57,7 +57,7 @@ Good to go!
Nope. Nope.
The drop checker will generously determine that Vec<T> does not own any values The drop checker will generously determine that Vec<T> does not own any values
of type T. This will in turn make it conclude that it does *not* need to worry of type T. This will in turn make it conclude that it doesn't need to worry
about Vec dropping any T's in its destructor for determining drop check about Vec dropping any T's in its destructor for determining drop check
soundness. This will in turn allow people to create unsoundness using soundness. This will in turn allow people to create unsoundness using
Vec's destructor. Vec's destructor.
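
For reference, here is a minimal sketch of the `Iter` case from the earlier hunk, with a `PhantomData<&'a T>` field tying the otherwise unused `'a` to the struct. The `new` constructor and the `Iterator` impl are assumptions added so the example runs, and it assumes `T` is not zero-sized. A Vec-like type that owns its elements would carry a `PhantomData<T>` instead.

```rust
use std::marker::PhantomData;

pub struct Iter<'a, T: 'a> {
    ptr: *const T,
    end: *const T,
    // Zero-sized: acts like a `&'a T` field for static analysis only.
    _marker: PhantomData<&'a T>,
}

impl<'a, T> Iter<'a, T> {
    /// Assumes `T` is not zero-sized, otherwise `ptr == end` immediately.
    pub fn new(slice: &'a [T]) -> Self {
        let ptr = slice.as_ptr();
        Iter {
            ptr,
            // One past the last element, which is a valid pointer to form.
            end: unsafe { ptr.add(slice.len()) },
            _marker: PhantomData,
        }
    }
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        if self.ptr == self.end {
            None
        } else {
            unsafe {
                // In bounds: `ptr` is strictly before `end`.
                let item = &*self.ptr;
                self.ptr = self.ptr.add(1);
                Some(item)
            }
        }
    }
}
```

Because of the marker, the borrow checker treats an `Iter<'a, T>` as borrowing the slice for `'a`, even though the struct only stores raw pointers.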

@ -20,7 +20,7 @@ standard library's Mutex type. A Mutex will poison itself if one of its
MutexGuards (the thing it returns when a lock is obtained) is dropped during a MutexGuards (the thing it returns when a lock is obtained) is dropped during a
panic. Any future attempts to lock the Mutex will return an `Err` or panic. panic. Any future attempts to lock the Mutex will return an `Err` or panic.
Mutex poisons not for *true* safety in the sense that Rust normally cares about. It Mutex poisons not for true safety in the sense that Rust normally cares about. It
poisons as a safety-guard against blindly using the data that comes out of a Mutex poisons as a safety-guard against blindly using the data that comes out of a Mutex
that has witnessed a panic while locked. The data in such a Mutex was likely in the that has witnessed a panic while locked. The data in such a Mutex was likely in the
middle of being modified, and as such may be in an inconsistent or incomplete state. middle of being modified, and as such may be in an inconsistent or incomplete state.
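
A small sketch of what poisoning looks like from the outside follows. The function name `poison_and_recover` is invented for this example. It panics in a worker thread while the lock is held, then shows that the data is still reachable if the caller explicitly opts in.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns whether the Mutex ended up poisoned, plus the value recovered
/// from it anyway.
pub fn poison_and_recover() -> (bool, i32) {
    let data = Arc::new(Mutex::new(0));

    let worker = {
        let data = Arc::clone(&data);
        thread::spawn(move || {
            let mut guard = data.lock().unwrap();
            *guard += 1;
            // The guard is dropped during unwinding, which poisons the Mutex.
            panic!("simulated failure while holding the lock");
        })
    };
    // `join` returns Err because the thread panicked. We ignore it here.
    let _ = worker.join();

    let was_poisoned = data.is_poisoned();
    let value = match data.lock() {
        Ok(guard) => *guard,
        // The error still carries the guard: the caller decides whether the
        // possibly half-updated data is usable.
        Err(poisoned) => *poisoned.into_inner(),
    };
    (was_poisoned, value)
}
```

Running this prints the worker's panic message to stderr, and the calling thread carries on normally.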
