nits and realigning

pull/10/head
Alexis Beingessner 10 years ago committed by Manish Goregaokar
parent b6cf288741
commit 4b9d71becf

@ -34,6 +34,6 @@ Due to the nature of advanced Rust programming, we will be spending a lot of tim
talking about *safety* and *guarantees*. In particular, a significant portion of
the book will be dedicated to correctly writing and understanding Unsafe Rust.

[trpl]: ../book/
[The stack and heap]: ../book/the-stack-and-the-heap.html
[Basic Rust]: ../book/syntax-and-semantics.html

@ -1,6 +1,6 @@
% Implementing Arc and Mutex

Knowing the theory is all fine and good, but the *best* way to understand
something is to use it. To better understand atomics and interior mutability,
we'll be implementing versions of the standard library's Arc and Mutex types.
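Before building our own, it may help to see the real types in use. A minimal sketch (the `parallel_count` helper is illustrative) that shares a counter across threads with the standard library's `Arc` and `Mutex`:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawn `n` threads that each bump a shared counter once.
fn parallel_count(n: usize) -> usize {
    // Arc gives shared ownership across threads; Mutex gives
    // mutually exclusive access to the value inside it.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // Lock, mutate, and unlock when the guard drops.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *counter.lock().unwrap();
    result
}

fn main() {
    assert_eq!(parallel_count(4), 4);
}
```

Our implementations will have to reproduce exactly this behaviour: shared ownership, and exclusive access to the protected data.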

@ -2,21 +2,22 @@
Rust pretty blatantly just inherits C11's memory model for atomics. This is not
due to this model being particularly excellent or easy to understand. Indeed,
this model is quite complex and known to have [several flaws][C11-busted].
Rather, it is a pragmatic concession to the fact that *everyone* is pretty bad
at modeling atomics. At the very least, we can benefit from existing tooling and
research around C.

Trying to fully explain the model in this book is fairly hopeless. It's defined
in terms of madness-inducing causality graphs that require a full book to
properly understand in a practical way. If you want all the nitty-gritty
details, you should check out [C's specification (Section 7.17)][C11-model].
Still, we'll try to cover the basics and some of the problems Rust developers
face.

The C11 memory model is fundamentally about trying to bridge the gap between the
semantics we want, the optimizations compilers want, and the inconsistent chaos
our hardware wants. *We* would like to just write programs and have them do
exactly what we said but, you know, *fast*. Wouldn't that be great?
@ -41,13 +42,14 @@ x = 2;
y = 3;
```

This has inverted the order of events *and* completely eliminated one event.
From a single-threaded perspective this is completely unobservable: after all
the statements have executed we are in exactly the same state. But if our
program is multi-threaded, we may have been relying on `x` to *actually* be
assigned to 1 before `y` was assigned. We would *really* like the compiler to be
able to make these kinds of optimizations, because they can seriously improve
performance. On the other hand, we'd really like to be able to depend on our
program *doing the thing we said*.
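The "unobservable on a single thread" claim can be spot-checked: the program as written and the reordered version end in identical states. A sketch (the two functions are illustrative, not literal compiler output):

```rust
// The program as written: x = 1; x = 2; y = 3.
#[allow(unused_assignments)]
fn as_written() -> (i32, i32) {
    let mut x = 0;
    let mut y = 0;
    x = 1; // dead store: nothing reads it before the next assignment
    x = 2;
    y = 3;
    (x, y)
}

// What the optimizer might emit instead: y = 3; x = 2.
#[allow(unused_assignments)]
fn as_optimized() -> (i32, i32) {
    let mut x = 0;
    let mut y = 0;
    y = 3;
    x = 2;
    (x, y)
}

fn main() {
    // Single-threaded, the final states are indistinguishable.
    assert_eq!(as_written(), as_optimized());
    assert_eq!(as_written(), (2, 3));
}
```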
@ -55,19 +57,20 @@ we'd really like to be able to depend on our program *doing the thing we said*.
# Hardware Reordering

On the other hand, even if the compiler totally understood what we wanted and
respected our wishes, our *hardware* might instead get us in trouble. Trouble
comes from CPUs in the form of memory hierarchies. There is indeed a global
shared memory space somewhere in your hardware, but from the perspective of each
CPU core it is *so very far away* and *so very slow*. Each CPU would rather work
with its local cache of the data and go through all the *anguish* of talking to
shared memory *only* when it doesn't actually have that memory in cache.

After all, that's the whole *point* of the cache, right? If every read from the
cache had to run back to shared memory to double check that it hadn't changed,
what would the point be? The end result is that the hardware doesn't guarantee
that events that occur in the same order on *one* thread, occur in the same
order on *another* thread. To guarantee this, we must issue special instructions
to the CPU telling it to be a bit less smart.

For instance, say we convince the compiler to emit this logic:
@ -82,27 +85,27 @@ x = 1; y *= 2;
Ideally this program has 2 possible final states:

* `y = 3`: (thread 2 did the check before thread 1 completed)
* `y = 6`: (thread 2 did the check after thread 1 completed)

However there's a third potential state that the hardware enables:

* `y = 2`: (thread 2 saw `x = 2`, but not `y = 3`, and then overwrote `y = 3`)

It's worth noting that different kinds of CPU provide different guarantees. It
is common to separate hardware into two categories: strongly-ordered and
weakly-ordered. Most notably x86/64 provides strong ordering guarantees, while
ARM provides weak ordering guarantees. This has two consequences for concurrent
programming:

* Asking for stronger guarantees on strongly-ordered hardware may be cheap or
  even *free* because they already provide strong guarantees unconditionally.
  Weaker guarantees may only yield performance wins on weakly-ordered hardware.

* Asking for guarantees that are *too* weak on strongly-ordered hardware is
  more likely to *happen* to work, even though your program is strictly
  incorrect. If possible, concurrent algorithms should be tested on
  weakly-ordered hardware.
@ -110,58 +113,54 @@ concurrent programming:
# Data Accesses

The C11 memory model attempts to bridge the gap by allowing us to talk about the
*causality* of our program. Generally, this is by establishing a *happens
before* relationship between parts of the program and the threads that are
running them. This gives the hardware and compiler room to optimize the program
more aggressively where a strict happens-before relationship isn't established,
but forces them to be more careful where one *is* established. The way we
communicate these relationships is through *data accesses* and *atomic
accesses*.
Data accesses are the bread-and-butter of the programming world. They are
fundamentally unsynchronized and compilers are free to aggressively optimize
them. In particular, data accesses are free to be reordered by the compiler on
the assumption that the program is single-threaded. The hardware is also free to
propagate the changes made in data accesses to other threads as lazily and
inconsistently as it wants. Most critically, data accesses are how data races
happen. Data accesses are very friendly to the hardware and compiler, but as
we've seen they offer *awful* semantics to try to write synchronized code with.
Actually, that's too weak. *It is literally impossible to write correct
synchronized code using only data accesses*.
Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with an *ordering* that
specifies what kind of relationship it establishes with other accesses. In
practice, this boils down to telling the compiler and hardware certain things
they *can't* do. For the compiler, this largely revolves around re-ordering of
instructions. For the hardware, this largely revolves around how writes are
propagated to other threads. The set of orderings Rust exposes are:

* Sequentially Consistent (SeqCst)
* Release
* Acquire
* Relaxed

(Note: We explicitly do not expose the C11 *consume* ordering)

TODO: negative reasoning vs positive reasoning?
TODO: "can't forget to synchronize"
# Sequentially Consistent

Sequentially Consistent is the most powerful of all, implying the restrictions
of all other orderings. Intuitively, a sequentially consistent operation
*cannot* be reordered: all accesses on one thread that happen before and after a
SeqCst access *stay* before and after it. A data-race-free program that uses
only sequentially consistent atomics and data accesses has the very nice
property that there is a single global execution of the program's instructions
that all threads agree on. This execution is also particularly nice to reason
about: it's just an interleaving of each thread's individual executions. This
*does not* hold if you start using the weaker atomic orderings.
The relative developer-friendliness of sequential consistency doesn't come for The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms sequential consistency involves free. Even on strongly-ordered platforms sequential consistency involves
@ -173,26 +172,26 @@ confident about the other memory orders. Having your program run a bit slower
than it needs to is certainly better than it running incorrectly! It's also
*mechanically* trivial to downgrade atomic operations to have a weaker
consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're done! Of
course, proving that this transformation is *correct* is a whole other matter.
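As a concrete starting point, here's the classic flag-and-data handoff written entirely with `SeqCst`; every ordering in it is a candidate for later weakening to `Release`/`Acquire`:

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn handoff() -> usize {
    let data = Arc::new(AtomicUsize::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        d.store(42, Ordering::SeqCst); // publish the data...
        r.store(true, Ordering::SeqCst); // ...then raise the flag
    });

    // Spin until the flag is visible. SeqCst guarantees the data
    // write is visible by the time the flag is.
    while !ready.load(Ordering::SeqCst) {}
    let value = data.load(Ordering::SeqCst);

    producer.join().unwrap();
    value
}

fn main() {
    assert_eq!(handoff(), 42);
}
```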
# Acquire-Release

Acquire and Release are largely intended to be paired. Their names hint at their
use case: they're perfectly suited for acquiring and releasing locks, and
ensuring that critical sections don't overlap.

Intuitively, an acquire access ensures that every access after it *stays* after
it. However operations that occur before an acquire are free to be reordered to
occur after it. Similarly, a release access ensures that every access before it
*stays* before it. However operations that occur after a release are free to be
reordered to occur before it.

When thread A releases a location in memory and then thread B subsequently
acquires *the same* location in memory, causality is established. Every write
that happened *before* A's release will be observed by B *after* its release.
However no causality is established with any other threads. Similarly, no
causality is established if A and B access *different* locations in memory.
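The lock intuition can be sketched as a toy test-and-set spinlock: `Acquire` when taking the lock, `Release` when dropping it, so nothing from the critical section leaks outside it. This is a sketch, not the real `Mutex`:

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn locked_increments(threads: usize, per_thread: usize) -> usize {
    let lock = Arc::new(AtomicBool::new(false));
    let count = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let (lock, count) = (Arc::clone(&lock), Arc::clone(&count));
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Acquire the lock: swap in `true` until we see `false`.
                    while lock.swap(true, Ordering::Acquire) {}
                    // Critical section: a read-modify-write that is only
                    // correct because the lock provides mutual exclusion.
                    let v = count.load(Ordering::Relaxed);
                    count.store(v + 1, Ordering::Relaxed);
                    // Release the lock: everything above stays above.
                    lock.store(false, Ordering::Release);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    count.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(locked_increments(4, 1000), 4000);
}
```

If the `Relaxed` accesses inside the critical section were safe on their own, the lock would be pointless; it's the Release-then-Acquire pairing on `lock` that makes each iteration's writes visible to the next lock holder.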

@ -1,12 +1,13 @@
% Casts

Casts are a superset of coercions: every coercion can be explicitly invoked via
a cast, but some conversions *require* a cast. These "true casts" are generally
regarded as dangerous or problematic actions. True casts revolve around raw
pointers and the primitive numeric types. True casts aren't checked.

Here's an exhaustive list of all the true casts. For brevity, we will use `*` to
denote either a `*const` or `*mut`, and `integer` to denote any integral
primitive:

* `*T as *U` where `T, U: Sized`
* `*T as *U` TODO: explain unsized situation
@ -37,19 +38,21 @@ expression, `e as U2` is not necessarily so (in fact it will only be valid if
For numeric casts, there are quite a few cases to consider:

* casting between two integers of the same size (e.g. i32 -> u32) is a no-op
* casting from a larger integer to a smaller integer (e.g. u32 -> u8) will
  truncate
* casting from a smaller integer to a larger integer (e.g. u8 -> u32) will
    * zero-extend if the source is unsigned
    * sign-extend if the source is signed
* casting from a float to an integer will round the float towards zero
    * **NOTE: currently this will cause Undefined Behaviour if the rounded
      value cannot be represented by the target integer type**. This includes
      Inf and NaN. This is a bug and will be fixed.
* casting from an integer to float will produce the floating point
  representation of the integer, rounded if necessary (rounding strategy
  unspecified)
* casting from an f32 to an f64 is perfect and lossless
* casting from an f64 to an f32 will produce the closest possible value
  (rounding strategy unspecified)
    * **NOTE: currently this will cause Undefined Behaviour if the value
      is finite but larger or smaller than the largest or smallest finite
      value representable by f32**. This is a bug and will be fixed.
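The well-defined cases above can be checked directly; the examples deliberately avoid the out-of-range float casts flagged as Undefined Behaviour:

```rust
fn main() {
    // Same size: a no-op on the bits.
    assert_eq!(-1i32 as u32, u32::MAX);

    // Larger -> smaller: truncation.
    assert_eq!(0x1234u32 as u8, 0x34);

    // Smaller -> larger: zero-extend (unsigned) or sign-extend (signed).
    assert_eq!(0xFFu8 as u32, 255);
    assert_eq!(-1i8 as i32, -1);

    // Float -> integer: rounds towards zero.
    assert_eq!(3.9f32 as i32, 3);
    assert_eq!(-3.9f32 as i32, -3);

    // f32 -> f64 is lossless (1.5 is exactly representable).
    assert_eq!(1.5f32 as f64, 1.5f64);
}
```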

@ -1,8 +1,8 @@
% Checked Uninitialized Memory

Like C, all stack variables in Rust are uninitialized until a value is
explicitly assigned to them. Unlike C, Rust statically prevents you from ever
reading them until you do:

```rust
fn main() {

@ -15,21 +15,21 @@ enum Bar {
    Y(bool),
}

struct Unit;

let foo = Foo { a: 0, b: 1, c: false };
let bar = Bar::X(0);
let empty = Unit;
```

That's it. Every other way you make an instance of a type is just calling a
totally vanilla function that does some stuff and eventually bottoms out to The
One True Constructor.

Unlike C++, Rust does not come with a slew of built-in kinds of constructor.
There are no Copy, Default, Assignment, Move, or whatever constructors. The
reasons for this are varied, but it largely boils down to Rust's philosophy of
*being explicit*.

Move constructors are meaningless in Rust because we don't enable types to
"care" about their location in memory. Every type must be ready for it to be
@ -37,9 +37,9 @@ blindly memcopied to somewhere else in memory. This means pure on-the-stack-but-
still-movable intrusive linked lists are simply not happening in Rust (safely).

Assignment and copy constructors similarly don't exist because move semantics
are the *only* semantics in Rust. At most `x = y` just moves the bits of y into
the x variable. Rust *does* provide two facilities for providing C++'s
copy-oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a
copy constructor, but it's never implicitly invoked. You have to explicitly call
`clone` on an element you want to be cloned. Copy is a special case of Clone
where the implementation is just "copy the bits". Copy types *are* implicitly
@ -53,3 +53,7 @@ only useful for generic programming. In concrete contexts, a type will provide a
static `new` method for any kind of "default" constructor. This has no relation
to `new` in other languages and has no special meaning. It's just a naming
convention.
TODO: talk about "placement new"?
[uninit]: uninitialized.html
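The explicitness of `Clone` versus `Copy` is easy to demonstrate: `clone` must be called by name, while a `Copy` type duplicates its bits on plain assignment (the types here are illustrative):

```rust
#[derive(Clone)]
struct Named(String);

#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let a = Named("hello".to_string());
    let b = a.clone(); // explicit: we *asked* for the copy
    // Without the clone, `let b = a;` would have *moved* `a`.
    assert_eq!(a.0, b.0);

    let p = Point { x: 1, y: 2 };
    let q = p; // implicit "copy the bits": `p` stays valid
    assert_eq!((p.x, p.y), (q.x, q.y));
}
```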

@ -1,13 +1,13 @@
% Type Conversions

At the end of the day, everything is just a pile of bits somewhere, and type
systems are just there to help us use those bits right. Needing to reinterpret
those piles of bits as different types is a common problem and Rust consequently
gives you several ways to do that.

First we'll look at the ways that *Safe Rust* gives you to reinterpret values.
The most trivial way to do this is to just destructure a value into its
constituent parts and then build a new type out of them. e.g.

```rust
struct Foo {
@ -26,6 +26,6 @@ fn reinterpret(foo: Foo) -> Bar {
}
```

But this is, at best, annoying to do. For common conversions, Rust provides
more ergonomic alternatives.
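The destructure-and-rebuild example above is truncated here, so a completed sketch of the idea (field layout assumed for illustration):

```rust
struct Foo {
    x: u32,
    y: u16,
}

struct Bar {
    a: u32,
    b: u16,
}

// Take the Foo apart and build a Bar out of the pieces:
// no unsafe code, no transmute, just moves.
fn reinterpret(foo: Foo) -> Bar {
    let Foo { x, y } = foo;
    Bar { a: x, b: y }
}

fn main() {
    let bar = reinterpret(Foo { x: 7, y: 3 });
    assert_eq!((bar.a, bar.b), (7, 3));
}
```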

@ -1,23 +1,24 @@
% Destructors

What the language *does* provide is full-blown automatic destructors through the
`Drop` trait, which provides the following method:

```rust
fn drop(&mut self);
```

This method gives the type time to somehow finish what it was doing. **After
`drop` is run, Rust will recursively try to drop all of the fields of `self`**.
This is a convenience feature so that you don't have to write "destructor
boilerplate" to drop children. If a struct has no special logic for being
dropped other than dropping its children, then it means `Drop` doesn't need to
be implemented at all!

**There is no stable way to prevent this behaviour in Rust 1.0.**

Note that taking `&mut self` means that even if you *could* suppress recursive
Drop, Rust will prevent you from e.g. moving fields out of self. For most types,
this is totally fine.

For instance, a custom implementation of `Box` might write `Drop` like this:
@ -34,9 +35,9 @@ impl<T> Drop for Box<T> {
}
```

and this works fine because when Rust goes to drop the `ptr` field it just sees
a *mut that has no actual `Drop` implementation. Similarly nothing can
use-after-free the `ptr` because the Box is immediately marked as uninitialized.

However this wouldn't work:
@ -93,11 +94,13 @@ enum Link {
}
```

will have its inner Box field dropped *if and only if* an instance stores the
Next variant.

In general this works really nicely because you don't need to worry about
adding/removing drops when you refactor your data layout. Still there are
certainly many valid use cases for needing to do trickier things with
destructors.

The classic safe solution to overriding recursive drop and allowing moving out
of Self during `drop` is to use an Option:
@ -128,11 +131,11 @@ impl<T> Drop for SuperBox<T> {
}
```

However this has fairly odd semantics: you're saying that a field that *should*
always be Some may be None, just because that happens in the destructor. Of
course this conversely makes a lot of sense: you can call arbitrary methods on
self during the destructor, and this should prevent you from ever doing so after
deinitializing the field. Not that it will prevent you from producing any other
arbitrarily invalid state in there.

On balance this is an ok choice. Certainly what you should reach for by default.
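For completeness, here is a self-contained sketch of that Option pattern: a `SuperBox` that merely wraps a plain `Box`, so the destructor stays entirely safe:

```rust
struct SuperBox<T> {
    my_box: Option<Box<T>>,
}

impl<T> Drop for SuperBox<T> {
    fn drop(&mut self) {
        // Take the box out, leaving None behind. The recursive drop of
        // `self.my_box` then sees a None and has nothing left to do.
        if let Some(b) = self.my_box.take() {
            // A real implementation would do its trickier work here.
            drop(b);
        }
    }
}

fn peek(sb: &SuperBox<i32>) -> i32 {
    **sb.my_box.as_ref().unwrap()
}

fn main() {
    let sb = SuperBox { my_box: Some(Box::new(42)) };
    assert_eq!(peek(&sb), 42);
    // `sb` drops here: the destructor takes the box and frees it exactly once.
}
```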

@ -3,42 +3,42 @@
The examples in the previous section introduce an interesting problem for Rust. The examples in the previous section introduce an interesting problem for Rust.
We have seen that's possible to conditionally initialize, deinitialize, and We have seen that's possible to conditionally initialize, deinitialize, and
*reinitialize* locations of memory totally safely. For Copy types, this isn't *reinitialize* locations of memory totally safely. For Copy types, this isn't
particularly notable since they're just a random pile of bits. However types with particularly notable since they're just a random pile of bits. However types
destructors are a different story: Rust needs to know whether to call a destructor with destructors are a different story: Rust needs to know whether to call a
whenever a variable is assigned to, or a variable goes out of scope. How can it destructor whenever a variable is assigned to, or a variable goes out of scope.
do this with conditional initialization? How can it do this with conditional initialization?
It turns out that Rust actually tracks whether a type should be dropped or not
*at runtime*. As a variable becomes initialized and uninitialized, a *drop flag*
for that variable is toggled. When a variable *might* need to be dropped, this
flag is evaluated to determine if it *should* be dropped.

Of course, it is *often* the case that a value's initialization state can be
*statically* known at every point in the program. If this is the case, then the
compiler can theoretically generate more efficient code! For instance,
straight-line code has such *static drop semantics*:

```rust
let mut x = Box::new(0); // x was uninit; just overwrite.
let mut y = x;           // y was uninit; just overwrite and make x uninit.
x = Box::new(0);         // x was uninit; just overwrite.
y = x;                   // y was init; Drop y, overwrite it, and make x uninit!
                         // y was init; Drop y!
                         // x was uninit; do nothing.
```

And even branched code where all branches have the same behaviour with respect
to initialization:

```rust
let mut x = Box::new(0); // x was uninit; just overwrite.
if condition {
    drop(x) // x gets moved out; make x uninit.
} else {
    println!("{}", x);
    drop(x) // x gets moved out; make x uninit.
}
x = Box::new(0); // x was uninit; just overwrite.
// x was init; Drop x!
```

@ -47,10 +47,10 @@ However code like this *requires* runtime information to correctly Drop:
```rust
let x;
if condition {
    x = Box::new(0); // x was uninit; just overwrite.
    println!("{}", x);
}
// x *might* be uninit; check the flag!
```

Of course, in this case it's trivial to retrieve static drop semantics:

@ -7,7 +7,7 @@ if it overflows. Unless you are very careful and tightly control what code runs,
pretty much everything can unwind, and you need to be ready for it.

Being ready for unwinding is often referred to as *exception safety*
in the broader programming world. In Rust, there are two levels of exception
safety that one may concern themselves with:

* In unsafe code, we *must* be exception safe to the point of not violating

@ -58,16 +58,17 @@ impl<T: Clone> Vec<T> {
We bypass `push` in order to avoid redundant capacity and `len` checks on the
Vec that we definitely know has capacity. The logic is totally correct, except
there's a subtle problem with our code: it's not exception-safe! `set_len`,
`offset`, and `write` are all fine, but *clone* is the panic bomb we
overlooked.

Clone is completely out of our control, and is totally free to panic. If it
does, our function will exit early with the length of the Vec set too large. If
the Vec is looked at or dropped, uninitialized memory will be read!

The fix in this case is fairly simple. If we want to guarantee that the values
we *did* clone are dropped we can set the len *in* the loop. If we just want to
guarantee that uninitialized memory can't be observed, we can set the len
*after* the loop.

@ -9,18 +9,19 @@ is not always the case, however.
# Dynamically Sized Types (DSTs)

Rust also supports types without a statically known size. On the surface, this
is a bit nonsensical: Rust *must* know the size of something in order to work
with it! DSTs are generally produced as views, or through type-erasure of types
that *do* have a known size. Due to their lack of a statically known size, these
types can only exist *behind* some kind of pointer. They consequently produce a
*fat* pointer consisting of the pointer and the information that *completes*
them.

For instance, the slice type, `[T]`, is some statically unknown number of
elements stored contiguously. `&[T]` consequently consists of a `(&T, usize)`
pair that specifies where the slice starts, and how many elements it contains.
Similarly, Trait Objects support interface-oriented type erasure through a
`(data_ptr, vtable_ptr)` pair.

Structs can actually store a single DST directly as their last field, but this
makes them a DST as well:

@ -55,33 +56,34 @@ struct Baz {
}
```

On their own, ZSTs are, for obvious reasons, pretty useless. However as with
many curious layout choices in Rust, their potential is realized in a generic
context.

Rust largely understands that any operation that produces or stores a ZST can be
reduced to a no-op. For instance, a `HashSet<T>` can be efficiently implemented
as a thin wrapper around `HashMap<T, ()>` because all the operations `HashMap`
normally does to store and retrieve keys will be completely stripped in
monomorphization.

Similarly `Result<(), ()>` and `Option<()>` are effectively just fancy `bool`s.

Safe code need not worry about ZSTs, but *unsafe* code must be careful about the
consequence of types with no size. In particular, pointer offsets are no-ops,
and standard allocators (including jemalloc, the one used by Rust) generally
consider passing in `0` as Undefined Behaviour.

# Empty Types

Rust also enables types to be declared that *cannot even be instantiated*. These
types can only be talked about at the type level, and never at the value level.

```rust
enum Foo { } // No variants = EMPTY
```

TODO: WHY?!

@ -1,46 +1,46 @@
% Leaking

Ownership-based resource management is intended to simplify composition. You
acquire resources when you create the object, and you release the resources when
it gets destroyed. Since destruction is handled for you, it means you can't
forget to release the resources, and it happens as soon as possible! Surely this
is perfect and all of our problems are solved.

Everything is terrible and we have new and exotic problems to try to solve.

Many people like to believe that Rust eliminates resource leaks, but this is
absolutely not the case, no matter how you look at it. In the strictest sense,
"leaking" is so abstract as to be unpreventable. It's quite trivial to
initialize a collection at the start of a program, fill it with tons of objects
with destructors, and then enter an infinite event loop that never refers to it.
The collection will sit around uselessly, holding on to its precious resources
until the program terminates (at which point all those resources would have been
reclaimed by the OS anyway).

We may consider a more restricted form of leak: failing to drop a value that is
unreachable. Rust also doesn't prevent this. In fact Rust has a *function for
doing this*: `mem::forget`. This function consumes the value it is passed *and
then doesn't run its destructor*.
In the past `mem::forget` was marked as unsafe as a sort of lint against using
it, since failing to call a destructor is generally not a well-behaved thing to
do (though useful for some special unsafe code). However this was generally
determined to be an untenable stance to take: there are *many* ways to fail to
call a destructor in safe code. The most famous example is creating a cycle of
reference-counted pointers using interior mutability.
It is reasonable for safe code to assume that destructor leaks do not happen, as
any program that leaks destructors is probably wrong. However *unsafe* code
cannot rely on destructors to be run to be *safe*. For most types this doesn't
matter: if you leak the destructor then the type is *by definition*
inaccessible, so it doesn't matter, right? For instance, if you leak a `Box<u8>`
then you waste some memory but that's hardly going to violate memory-safety.

However where we must be careful with destructor leaks is with *proxy* types.
These are types which manage access to a distinct object, but don't actually own
it. Proxy objects are quite rare. Proxy objects you'll need to care about are
even rarer. However we'll focus on three interesting examples in the standard
library:

* `vec::Drain`
* `Rc`

@ -58,7 +58,8 @@ after claiming ownership over all of its contents. It produces an iterator
Now, consider Drain in the middle of iteration: some values have been moved out,
and others haven't. This means that part of the Vec is now full of logically
uninitialized data! We could backshift all the elements in the Vec every time we
remove a value, but this would have pretty catastrophic performance
consequences.

Instead, we would like Drain to *fix* the Vec's backing storage when it is
dropped. It should run itself to completion, backshift any elements that weren't

@ -86,20 +87,20 @@ let mut vec = vec![Box::new(0); 4];
println!("{}", vec[0]);
```

This is pretty clearly Not Good. Unfortunately, we're kind of stuck between a
rock and a hard place: maintaining consistent state at every step has an
enormous cost (and would negate any benefits of the API). Failing to maintain
consistent state gives us Undefined Behaviour in safe code (making the API
unsound).

So what can we do? Well, we can pick a trivially consistent state: set the Vec's
len to be 0 when we *start* the iteration, and fix it up if necessary in the
destructor. That way, if everything executes like normal we get the desired
behaviour with minimal overhead. But if someone has the *audacity* to
mem::forget us in the middle of the iteration, all that does is *leak even more*
(and possibly leave the Vec in an *unexpected* but consistent state). Since
we've accepted that mem::forget is safe, this is definitely safe. We call leaks
causing more leaks a *leak amplification*.

@ -108,8 +109,8 @@ more leaks a *leak amplification*.
Rc is an interesting case because at first glance it doesn't appear to be a
proxy value at all. After all, it manages the data it points to, and dropping
all the Rcs for a value will drop that value. Leaking an Rc doesn't seem like it
would be particularly dangerous. It will leave the refcount permanently
incremented and prevent the data from being freed or dropped, but that seems
just like Box, right?

@ -8,19 +8,19 @@ Rust allows you to specify alternative data layout strategies from the default.
# repr(C)

This is the most important `repr`. It has fairly simple intent: do what C does.
The order, size, and alignment of fields is exactly what you would expect from C
or C++. Any type you expect to pass through an FFI boundary should have
`repr(C)`, as C is the lingua-franca of the programming world. This is also
necessary to soundly do more elaborate tricks with data layout such as
reinterpreting values as a different type.
However, the interaction with Rust's more exotic data layout features must be
kept in mind. Due to its dual purpose as "for FFI" and "for layout control",
`repr(C)` can be applied to types that will be nonsensical or problematic if
passed through the FFI boundary.

* ZSTs are still zero-sized, even though this is not a standard behaviour in
C, and is explicitly contrary to the behaviour of an empty type in C++, which
still consumes a byte of space.
* DSTs, tuples, and tagged unions are not a concept in C and as such are never

@ -30,8 +30,9 @@ the FFI boundary.
* This is equivalent to one of `repr(u*)` (see the next section) for enums. The
chosen size is the default enum size for the target platform's C ABI. Note that
enum representation in C is implementation defined, so this is really a "best
guess". In particular, this may be incorrect when the C code of interest is
compiled with certain flags.

@ -40,10 +41,11 @@ the FFI boundary.
These specify the size to make a C-like enum. If the discriminant overflows the
integer it has to fit in, it will be an error. You can manually ask Rust to
allow this by setting the overflowing element to explicitly be 0. However Rust
will not allow you to create an enum where two variants have the same
discriminant.

On non-C-like enums, this will inhibit certain optimizations like the
null-pointer optimization.
These reprs have no effect on a struct.

@ -53,15 +55,15 @@ These reprs have no effect on a struct.
# repr(packed)

`repr(packed)` forces Rust to strip any padding, and only align the type to a
byte. This may improve the memory footprint, but will likely have other negative
side-effects.

In particular, most architectures *strongly* prefer values to be aligned. This
may mean that unaligned loads are penalized (x86), or even fault (some ARM
chips). For simple cases like directly loading or storing a packed field, the
compiler might be able to paper over alignment issues with shifts and masks.
However if you take a reference to a packed field, it's unlikely that the
compiler will be able to emit code to avoid an unaligned load.

`repr(packed)` is not to be used lightly. Unless you have extreme requirements,
this should not be used.

@ -2,13 +2,11 @@
There are two kinds of reference:

* Shared reference: `&`
* Mutable reference: `&mut`

Which obey the following rules:

* A reference cannot outlive its referent
* A mutable reference cannot be aliased

To define aliasing, we must define the notion of *paths* and *liveness*.

@ -17,60 +15,66 @@ To define aliasing, we must define the notion of *paths* and *liveness*.
# Paths

If all Rust had were values, then every value would be uniquely owned by a
variable or composite structure. From this we naturally derive a *tree* of
ownership. The stack itself is the root of the tree, with every variable as one
of its direct children. Each variable's direct children would be their fields
(if any), and so on.

From this view, every value in Rust has a unique *path* in the tree of
ownership. References to a value can subsequently be interpreted as a path in
this tree. Of particular interest are *ancestors* and *descendants*: if `x` owns
`y`, then `x` is an *ancestor* of `y`, and `y` is a *descendant* of `x`. Note
that this is an inclusive relationship: `x` is a descendant and ancestor of
itself.

Tragically, plenty of data doesn't reside on the stack, and we must also
accommodate this. Globals and thread-locals are simple enough to model as
residing at the bottom of the stack (though we must be careful with mutable
globals). Data on the heap poses a different problem.

If all Rust had on the heap was data uniquely owned by a pointer on the stack,
then we could just treat that pointer as a struct that owns the value on the
heap. Box, Vec, String, and HashMap are examples of types which uniquely own
data on the heap.

Unfortunately, data on the heap is not *always* uniquely owned. Rc for instance
introduces a notion of *shared* ownership. Shared ownership means there is no
unique path. A value with no unique path limits what we can do with it. In
general, only shared references can be created to these values. However
mechanisms which ensure mutual exclusion may establish One True Owner
temporarily, establishing a unique path to that value (and therefore all its
children).

The most common way to establish such a path is through *interior mutability*,
in contrast to the *inherited mutability* that everything in Rust normally uses.
Cell, RefCell, Mutex, and RwLock are all examples of interior mutability types.
These types provide exclusive access through runtime restrictions. However it is
also possible to establish unique ownership without interior mutability. For
instance, if an Rc has refcount 1, then it is safe to mutate or move its
internals.

In order to correctly communicate to the type system that a variable or field of
a struct can have interior mutability, it must be wrapped in an UnsafeCell. This
does not in itself make it safe to perform interior mutability operations on
that value. You must still ensure yourself that mutual exclusion is upheld.

# Liveness

Note: Liveness is not the same thing as a *lifetime*, which will be explained
in detail in the next section of this chapter.
Roughly, a reference is *live* at some point in a program if it can be
dereferenced. Shared references are always live unless they are literally
unreachable (for instance, they reside in freed or leaked memory). Mutable
references can be reachable but *not* live through the process of *reborrowing*.

A mutable reference can be reborrowed to either a shared or mutable reference to
one of its descendants. A reborrowed reference will only be live again once all
reborrows derived from it expire. For instance, a mutable reference can be
reborrowed to point to a field of its referent:

```rust
let x = &mut (1, 2);
@ -110,18 +114,18 @@ to make such a borrow*, just that Rust isn't as smart as you want.
To simplify things, we can model variables as a fake type of reference: *owned*
references. Owned references have much the same semantics as mutable references:
they can be re-borrowed in a mutable or shared manner, which makes them no
longer live. Live owned references have the unique property that they can be
moved out of (though mutable references *can* be swapped out of). This power is
only given to *live* owned references because moving the referent would of
course invalidate all outstanding references prematurely.

As a local lint against inappropriate mutation, only variables that are marked
as `mut` can be borrowed mutably.

It is interesting to note that Box behaves exactly like an owned It is interesting to note that Box behaves exactly like an owned reference. It
reference. It can be moved out of, and Rust understands it sufficiently to can be moved out of, and Rust understands it sufficiently to reason about its
reason about its paths like a normal variable. paths like a normal variable.
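
To make this concrete, here's a small sketch of moving out of an owned
reference versus swapping out of a mutable one (the helper name
`take_and_reset` is ours, not anything from the standard library):

```rust
fn take_and_reset(r: &mut String) -> String {
    // A mutable reference can't be moved out of, but its referent can be
    // swapped out with `std::mem::replace`.
    std::mem::replace(r, String::new())
}

fn main() {
    let s = String::from("hello"); // a variable: an "owned reference"
    let moved = s;                 // a live owned reference may be moved out of
    // println!("{}", s);          // error: `s` is no longer live
    assert_eq!(moved, "hello");

    let mut t = String::from("world");
    let taken = take_and_reset(&mut t);
    assert_eq!(taken, "world"); // the old value escaped through the swap
    assert_eq!(t, "");          // t now holds the replacement value
}
```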

With liveness and paths defined, we can now properly define *aliasing*:

**A mutable reference is aliased if there exists another live reference to one
of its ancestors or descendants.**

(If you prefer, you may also say the two live references alias *each other*.
This has no semantic consequences, but is probably a more useful notion when
verifying the soundness of a construct.)

That's it. Super simple, right? Except for the fact that it took us two pages to
define all of the terms in that definition. You know: Super. Simple.

Actually it's a bit more complicated than that. In addition to references, Rust
has *raw pointers*: `*const T` and `*mut T`. Raw pointers have no inherent
ownership or aliasing semantics. As a result, Rust makes absolutely no effort to
track that they are used correctly, and they are wildly unsafe.

**It is an open question to what degree raw pointers have alias semantics.
However it is important for these definitions to be sound that the existence of
a raw pointer does not imply some kind of live path.**
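
As a rough illustration of that lack of tracking (the helper name `bump_twice`
is ours): two raw pointers to the same location compile without complaint, and
keeping the accesses sound is entirely on us.

```rust
fn bump_twice(x: &mut i32) -> i32 {
    let p = x as *mut i32;
    let q = p; // p and q alias the same location; Rust tracks nothing here
    unsafe {
        *p += 1;
        *q += 1; // no borrow checker involvement; soundness is on us
        *p
    }
}

fn main() {
    let mut x = 7;
    assert_eq!(bump_twice(&mut x), 9);
    assert_eq!(x, 9);
}
```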

% Send and Sync

Not everything obeys inherited mutability, though. Some types allow you to
multiply alias a location in memory while mutating it. Unless these types use
synchronization to manage this access, they are absolutely not thread safe. Rust
captures this through the `Send` and `Sync` traits.

* A type is Send if it is safe to send it to another thread.
* A type is Sync if it is safe to share between threads (`&T` is Send).
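
A quick sketch of the difference (the helper names here are ours): sending
means moving ownership to another thread, while sharing means several threads
observing the same value, which in practice goes through `Arc`.

```rust
use std::sync::Arc;
use std::thread;

// Vec<i32> is Send, so ownership can be *moved* to another thread.
fn sum_in_thread(v: Vec<i32>) -> i32 {
    thread::spawn(move || v.iter().sum()).join().unwrap()
}

// Sharing the same Vec between threads goes through Arc, which requires the
// payload to be Sync (so that a `&Vec<i32>` is safe to hand across threads).
fn len_in_thread(v: Arc<Vec<i32>>) -> usize {
    let shared = Arc::clone(&v);
    thread::spawn(move || shared.len()).join().unwrap()
}

fn main() {
    assert_eq!(sum_in_thread(vec![1, 2, 3]), 6);
    assert_eq!(len_in_thread(Arc::new(vec![1, 2, 3])), 3);
}
```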

Send and Sync are *very* fundamental to Rust's concurrency story. As such, a
substantial amount of special tooling exists to make them work right. First and
foremost, they're *unsafe traits*. This means that they are unsafe *to
implement*, and other unsafe code can *trust* that they are correctly
implemented. Since they're *marker traits* (they have no associated items like
methods), correctly implemented simply means that they have the intrinsic
properties an implementor should have. Incorrectly implementing Send or Sync can
cause Undefined Behaviour.

Send and Sync are also what Rust calls *opt-in builtin traits*. This means that,
unlike every other trait, they are *automatically* derived: if a type is
composed entirely of Send or Sync types, then it is Send or Sync. Almost all
primitives are Send and Sync, and as a consequence pretty much all types you'll
ever interact with are Send and Sync.

Major exceptions include:

* raw pointers are neither Send nor Sync (because they have no safety guards)
* `UnsafeCell` isn't Sync (and therefore `Cell` and `RefCell` aren't)
* `Rc` isn't Send or Sync (because the refcount is shared and unsynchronized)
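
A minimal sketch of the automatic derivation and of these exceptions (the type
and function names are ours, purely for illustration):

```rust
use std::rc::Rc;

// Composed entirely of Send + Sync fields, so automatically Send + Sync.
#[allow(dead_code)]
struct Point { x: f64, y: f64 }

// Contains an Rc, so it is *not* automatically Send or Sync.
#[allow(dead_code)]
struct NotThreadSafe { name: Rc<String> }

// Only compiles for types the compiler considers thread safe.
fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    assert_send_sync::<Point>();
    // assert_send_sync::<NotThreadSafe>(); // error: `Rc<String>` is not Send
}
```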

`Rc` and `UnsafeCell` are very fundamentally not thread-safe: they enable
unsynchronized shared mutable state. However raw pointers are, strictly
speaking, marked as thread-unsafe as more of a *lint*. Doing anything useful
with a raw pointer requires dereferencing it, which is already unsafe. In that
sense, one could argue that it would be "fine" for them to be marked as thread
safe.

However it's important that they aren't thread safe to prevent types that
*contain them* from being automatically marked as thread safe. These types have
non-trivial untracked ownership, and it's unlikely that their authors were
thinking hard about thread safety.

```rust
#![feature(optin_builtin_traits)]

// A type given special thread-local meaning by some other unsafe code
struct SpecialThreadToken(u8);

impl !Send for SpecialThreadToken {}
impl !Sync for SpecialThreadToken {}
```

Note that *in and of itself* it is impossible to incorrectly derive Send and
Sync. Only types that are ascribed special meaning by other unsafe code can
possibly cause trouble by being incorrectly Send or Sync.

Most uses of raw pointers should be encapsulated behind a sufficient abstraction
that Send and Sync can be derived. For instance all of Rust's standard
collections are Send and Sync (when they contain Send and Sync types) in spite
of their pervasive use of raw pointers to manage allocations and complex
ownership. Similarly, most iterators into these collections are Send and Sync
because they largely behave like an `&` or `&mut` into the collection.
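
As a sketch of that encapsulation pattern (`MyBox` is a toy type of our own,
far less careful than anything in the standard library): a struct that owns its
raw pointer exclusively can soundly opt back in to Send and Sync.

```rust
use std::thread;

// A toy owning pointer that wraps a raw pointer but keeps Box-like ownership.
struct MyBox(*mut u32);

// SAFETY: MyBox uniquely owns its allocation, exactly like Box<u32>, so it is
// fine to move it between threads and to share &MyBox across threads.
unsafe impl Send for MyBox {}
unsafe impl Sync for MyBox {}

impl MyBox {
    fn new(v: u32) -> MyBox {
        MyBox(Box::into_raw(Box::new(v)))
    }
    fn get(&self) -> u32 {
        unsafe { *self.0 }
    }
}

impl Drop for MyBox {
    fn drop(&mut self) {
        // Reconstitute the Box so the allocation is freed exactly once.
        unsafe { drop(Box::from_raw(self.0)); }
    }
}

fn main() {
    let b = MyBox::new(42);
    // Without the unsafe impls this would not compile: `*mut u32` is not Send.
    let handle = thread::spawn(move || b.get());
    assert_eq!(handle.join().unwrap(), 42);
}
```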

TODO: better explain what can or can't be Send or Sync. Sufficient to appeal
only to data races?