last of the emphasis cleanup

Alexis Beingessner 9 years ago committed by Manish Goregaokar
parent 42582a28ed
commit 37d42cdcef

@ -12,11 +12,13 @@ it's impossible to alias a mutable reference, so it's impossible to perform a
data race. Interior mutability makes this more complicated, which is largely why data race. Interior mutability makes this more complicated, which is largely why
we have the Send and Sync traits (see below). we have the Send and Sync traits (see below).
However Rust *does not* prevent general race conditions. This is **However Rust does not prevent general race conditions.**
pretty fundamentally impossible, and probably honestly undesirable. Your hardware
is racy, your OS is racy, the other programs on your computer are racy, and the This is pretty fundamentally impossible, and probably honestly undesirable. Your
world this all runs in is racy. Any system that could genuinely claim to prevent hardware is racy, your OS is racy, the other programs on your computer are racy,
*all* race conditions would be pretty awful to use, if not just incorrect. and the world this all runs in is racy. Any system that could genuinely claim to
prevent *all* race conditions would be pretty awful to use, if not just
incorrect.
So it's perfectly "fine" for a Safe Rust program to get deadlocked or do So it's perfectly "fine" for a Safe Rust program to get deadlocked or do
something incredibly stupid with incorrect synchronization. Obviously such a something incredibly stupid with incorrect synchronization. Obviously such a
@ -46,7 +48,7 @@ thread::spawn(move || {
}); });
// Index with the value loaded from the atomic. This is safe because we // Index with the value loaded from the atomic. This is safe because we
// read the atomic memory only once, and then pass a *copy* of that value // read the atomic memory only once, and then pass a copy of that value
// to the Vec's indexing implementation. This indexing will be correctly // to the Vec's indexing implementation. This indexing will be correctly
// bounds checked, and there's no chance of the value getting changed // bounds checked, and there's no chance of the value getting changed
// in the middle. However our program may panic if the thread we spawned // in the middle. However our program may panic if the thread we spawned
@ -75,7 +77,7 @@ thread::spawn(move || {
if idx.load(Ordering::SeqCst) < data.len() { if idx.load(Ordering::SeqCst) < data.len() {
unsafe { unsafe {
// Incorrectly loading the idx *after* we did the bounds check. // Incorrectly loading the idx after we did the bounds check.
// It could have changed. This is a race condition, *and dangerous* // It could have changed. This is a race condition, *and dangerous*
// because we decided to do `get_unchecked`, which is `unsafe`. // because we decided to do `get_unchecked`, which is `unsafe`.
println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst))); println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst)));

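To make the contrast between the two hunks above concrete, here is a minimal sketch of the safe pattern: load the atomic exactly once, then let a bounds-checked accessor decide. The helper name `read_at_shared_index` is hypothetical and does not appear in the original text.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper: reads `data[idx]` where `idx` may be changed by
/// other threads at any time.
fn read_at_shared_index(data: &[u32], idx: &AtomicUsize) -> Option<u32> {
    // Load the index exactly once and keep a private copy of it.
    let i = idx.load(Ordering::SeqCst);
    // `get` bounds-checks that same copy, so a concurrent store to `idx`
    // can change which element we see, but never cause an out-of-bounds read.
    data.get(i).copied()
}
```

The result can still be "wrong" in the race-condition sense (another thread may store a different index at any moment), but it is never memory-unsafe.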
@ -70,7 +70,7 @@ struct B {
Rust *does* guarantee that two instances of A have their data laid out in Rust *does* guarantee that two instances of A have their data laid out in
exactly the same way. However Rust *does not* guarantee that an instance of A exactly the same way. However Rust *does not* guarantee that an instance of A
has the same field ordering or padding as an instance of B (in practice there's has the same field ordering or padding as an instance of B (in practice there's
no *particular* reason why they wouldn't, other than that it's not currently no particular reason why they wouldn't, other than that it's not currently
guaranteed). guaranteed).
With A and B as written, this is basically nonsensical, but several other With A and B as written, this is basically nonsensical, but several other
@ -88,9 +88,9 @@ struct Foo<T, U> {
``` ```
Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If
Rust lays out the fields in the order specified, we expect it to *pad* the Rust lays out the fields in the order specified, we expect it to pad the
values in the struct to satisfy their *alignment* requirements. So if Rust values in the struct to satisfy their alignment requirements. So if Rust
didn't reorder fields, we would expect Rust to produce the following: didn't reorder fields, we would expect it to produce the following:
```rust,ignore ```rust,ignore
struct Foo<u16, u32> { struct Foo<u16, u32> {
@ -112,7 +112,7 @@ The latter case quite simply wastes space. An optimal use of space therefore
requires different monomorphizations to have *different field orderings*. requires different monomorphizations to have *different field orderings*.
**Note: this is a hypothetical optimization that is not yet implemented in Rust **Note: this is a hypothetical optimization that is not yet implemented in Rust
**1.0 1.0**
Enums make this consideration even more complicated. Naively, an enum such as: Enums make this consideration even more complicated. Naively, an enum such as:
@ -128,7 +128,7 @@ would be laid out as:
```rust ```rust
struct FooRepr { struct FooRepr {
data: u64, // this is *really* either a u64, u32, or u8 based on `tag` data: u64, // this is either a u64, u32, or u8 based on `tag`
tag: u8, // 0 = A, 1 = B, 2 = C tag: u8, // 0 = A, 1 = B, 2 = C
} }
``` ```
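Going back to the `Foo<T, U>` monomorphizations discussed earlier in this file, the padding cost of keeping declaration order can be observed directly. This is a sketch under assumptions: the field names are guesses, `FooInOrder` is a hypothetical name, and `#[repr(C)]` is used only to force the "no reordering" scenario. The sizes in the comments assume a target where `u32` is 4-byte aligned.

```rust
use std::mem::size_of;

// `#[repr(C)]` lays fields out in declaration order, which is the
// "Rust didn't reorder fields" scenario. Field names are assumed.
#[repr(C)]
#[allow(dead_code)]
struct FooInOrder<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

// Default representation: the compiler is free to choose the field order.
#[allow(dead_code)]
struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

/// Sizes of both monomorphizations, first in declaration order, then with
/// the default representation.
fn layout_sizes() -> [usize; 4] {
    [
        size_of::<FooInOrder<u16, u32>>(), // 2 + 2 + 4 bytes, no padding
        size_of::<FooInOrder<u32, u16>>(), // padding after `count` and `data2`
        size_of::<Foo<u16, u32>>(),        // unspecified, at least 8
        size_of::<Foo<u32, u16>>(),        // unspecified, at least 8
    ]
}
```

Only the `repr(C)` sizes are pinned down by the language; the default-representation sizes are whatever the compiler chooses, which is exactly the freedom the text describes.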

@ -5,7 +5,7 @@ So what's the relationship between Safe and Unsafe Rust? How do they interact?
Rust models the separation between Safe and Unsafe Rust with the `unsafe` Rust models the separation between Safe and Unsafe Rust with the `unsafe`
keyword, which can be thought of as a sort of *foreign function interface* (FFI) keyword, which can be thought of as a sort of *foreign function interface* (FFI)
between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust
is a safe language: all the scary unsafe bits are relegated *exclusively* to FFI is a safe language: all the scary unsafe bits are relegated exclusively to FFI
*just like every other safe language*. *just like every other safe language*.
However because one language is a subset of the other, the two can be cleanly However because one language is a subset of the other, the two can be cleanly
@ -61,13 +61,13 @@ The need for unsafe traits boils down to the fundamental property of safe code:
**No matter how completely awful Safe code is, it can't cause Undefined **No matter how completely awful Safe code is, it can't cause Undefined
Behaviour.** Behaviour.**
This means that Unsafe, **the royal vanguard of Undefined Behaviour**, has to be This means that Unsafe Rust, **the royal vanguard of Undefined Behaviour**, has to be
*super paranoid* about generic safe code. Unsafe is free to trust *specific* safe *super paranoid* about generic safe code. To be clear, Unsafe Rust is totally free to trust
code (or else you would degenerate into infinite spirals of paranoid despair). specific safe code. Anything else would degenerate into infinite spirals of
It is generally regarded as ok to trust the standard library to be correct, as paranoid despair. In particular it's generally regarded as ok to trust the standard library
`std` is effectively an extension of the language (and you *really* just have to be correct. `std` is effectively an extension of the language, and you
to trust the language). If `std` fails to uphold the guarantees it declares, really just have to trust the language. If `std` fails to uphold the
then it's basically a language bug. guarantees it declares, then it's basically a language bug.
That said, it would be best to minimize *needlessly* relying on properties of That said, it would be best to minimize *needlessly* relying on properties of
concrete safe code. Bugs happen! Of course, I must reinforce that this is only concrete safe code. Bugs happen! Of course, I must reinforce that this is only
@ -75,36 +75,36 @@ a concern for Unsafe code. Safe code can blindly trust anyone and everyone
as far as basic memory-safety is concerned. as far as basic memory-safety is concerned.
On the other hand, safe traits are free to declare arbitrary contracts, but because On the other hand, safe traits are free to declare arbitrary contracts, but because
implementing them is Safe, Unsafe can't trust those contracts to actually implementing them is safe, unsafe code can't trust those contracts to actually
be upheld. This is different from the concrete case because *anyone* can be upheld. This is different from the concrete case because *anyone* can
randomly implement the interface. There is something fundamentally different randomly implement the interface. There is something fundamentally different
about trusting a *particular* piece of code to be correct, and trusting *all the about trusting a particular piece of code to be correct, and trusting *all the
code that will ever be written* to be correct. code that will ever be written* to be correct.
For instance Rust has `PartialOrd` and `Ord` traits to try to differentiate For instance Rust has `PartialOrd` and `Ord` traits to try to differentiate
between types which can "just" be compared, and those that actually implement a between types which can "just" be compared, and those that actually implement a
*total* ordering. Pretty much every API that wants to work with data that can be total ordering. Pretty much every API that wants to work with data that can be
compared *really* wants Ord data. For instance, a sorted map like BTreeMap compared wants Ord data. For instance, a sorted map like BTreeMap
*doesn't even make sense* for partially ordered types. If you claim to implement *doesn't even make sense* for partially ordered types. If you claim to implement
Ord for a type, but don't actually provide a proper total ordering, BTreeMap will Ord for a type, but don't actually provide a proper total ordering, BTreeMap will
get *really confused* and start making a total mess of itself. Data that is get *really confused* and start making a total mess of itself. Data that is
inserted may be impossible to find! inserted may be impossible to find!
But that's okay. BTreeMap is safe, so it guarantees that even if you give it a But that's okay. BTreeMap is safe, so it guarantees that even if you give it a
*completely* garbage Ord implementation, it will still do something *safe*. You completely garbage Ord implementation, it will still do something *safe*. You
won't start reading uninitialized memory or unallocated memory. In fact, BTreeMap won't start reading uninitialized or unallocated memory. In fact, BTreeMap
manages to not actually lose any of your data. When the map is dropped, all the manages to not actually lose any of your data. When the map is dropped, all the
destructors will be successfully called! Hooray! destructors will be successfully called! Hooray!
However BTreeMap is implemented using a modest spoonful of Unsafe (most collections However BTreeMap is implemented using a modest spoonful of Unsafe Rust (most collections
are). That means that it is not necessarily *trivially true* that a bad Ord are). That means that it's not necessarily *trivially true* that a bad Ord
implementation will make BTreeMap behave safely. Unsafe must be sure not to rely implementation will make BTreeMap behave safely. BTreeMap must be sure not to rely
on Ord *where safety is at stake*. Ord is provided by Safe, and safety is not on Ord *where safety is at stake*. Ord is provided by safe code, and safety is not
Safe's responsibility to uphold. safe code's responsibility to uphold.
But wouldn't it be grand if there was some way for Unsafe to trust *some* trait But wouldn't it be grand if there was some way for Unsafe to trust some trait
contracts *somewhere*? This is the problem that unsafe traits tackle: by marking contracts *somewhere*? This is the problem that unsafe traits tackle: by marking
*the trait itself* as unsafe *to implement*, Unsafe can trust the implementation *the trait itself* as unsafe to implement, unsafe code can trust the implementation
to uphold the trait's contract. Although the trait implementation may be to uphold the trait's contract. Although the trait implementation may be
incorrect in arbitrary other ways. incorrect in arbitrary other ways.
@ -126,7 +126,7 @@ But it's probably not the implementation you want.
Rust has traditionally avoided making traits unsafe because it makes Unsafe Rust has traditionally avoided making traits unsafe because it makes Unsafe
pervasive, which is not desirable. Send and Sync are unsafe because thread pervasive, which is not desirable. Send and Sync are unsafe because thread
safety is a *fundamental property* that Unsafe cannot possibly hope to defend safety is a *fundamental property* that unsafe code cannot possibly hope to defend
against in the same way it would defend against a bad Ord implementation. The against in the same way it would defend against a bad Ord implementation. The
only way to possibly defend against thread-unsafety would be to *not use only way to possibly defend against thread-unsafety would be to *not use
threading at all*. Making every load and store atomic isn't even sufficient, threading at all*. Making every load and store atomic isn't even sufficient,
@ -135,10 +135,10 @@ in memory. For instance, the pointer and capacity of a Vec must be in sync.
Even concurrent paradigms that are traditionally regarded as Totally Safe like Even concurrent paradigms that are traditionally regarded as Totally Safe like
message passing implicitly rely on some notion of thread safety -- are you message passing implicitly rely on some notion of thread safety -- are you
really message-passing if you pass a *pointer*? Send and Sync therefore require really message-passing if you pass a pointer? Send and Sync therefore require
some *fundamental* level of trust that Safe code can't provide, so they must be some fundamental level of trust that Safe code can't provide, so they must be
unsafe to implement. To help obviate the pervasive unsafety that this would unsafe to implement. To help obviate the pervasive unsafety that this would
introduce, Send (resp. Sync) is *automatically* derived for all types composed only introduce, Send (resp. Sync) is automatically derived for all types composed only
of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those
never actually say it (the remaining 1% is overwhelmingly synchronization never actually say it (the remaining 1% is overwhelmingly synchronization
primitives). primitives).
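As an illustration of how an unsafe trait lets unsafe code trust an implementation, consider the following minimal sketch. The trait `ZeroIsValid` and the function `zeroed` are hypothetical names invented for this example; they are not part of the standard library or of the original text.

```rust
/// Hypothetical unsafe trait. Contract: the all-zero bit pattern is a valid
/// value of the implementing type.
unsafe trait ZeroIsValid: Sized {}

// Each `unsafe impl` is a promise by the implementor that the contract holds.
unsafe impl ZeroIsValid for u32 {}
unsafe impl ZeroIsValid for [u8; 4] {}

/// Safe to call: the unsafe block relies only on the contract that every
/// implementor had to vouch for by writing `unsafe impl`.
fn zeroed<T: ZeroIsValid>() -> T {
    unsafe { std::mem::zeroed() }
}
```

If `ZeroIsValid` were a safe trait, anyone could implement it for a reference type without writing `unsafe`, and `zeroed` would produce a null reference. Marking the trait unsafe moves that responsibility onto the implementor.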

@ -8,20 +8,19 @@ captures this through the `Send` and `Sync` traits.
* A type is Send if it is safe to send it to another thread. * A type is Send if it is safe to send it to another thread.
* A type is Sync if it is safe to share between threads (`&T` is Send). * A type is Sync if it is safe to share between threads (`&T` is Send).
Send and Sync are *very* fundamental to Rust's concurrency story. As such, a Send and Sync are fundamental to Rust's concurrency story. As such, a
substantial amount of special tooling exists to make them work right. First and substantial amount of special tooling exists to make them work right. First and
foremost, they're *unsafe traits*. This means that they are unsafe *to foremost, they're [unsafe traits][]. This means that they are unsafe to
implement*, and other unsafe code can *trust* that they are correctly implement, and other unsafe code can trust that they are correctly
implemented. Since they're *marker traits* (they have no associated items like implemented. Since they're *marker traits* (they have no associated items like
methods), correctly implemented simply means that they have the intrinsic methods), correctly implemented simply means that they have the intrinsic
properties an implementor should have. Incorrectly implementing Send or Sync can properties an implementor should have. Incorrectly implementing Send or Sync can
cause Undefined Behaviour. cause Undefined Behaviour.
Send and Sync are also what Rust calls *opt-in builtin traits*. This means that, Send and Sync are also automatically derived traits. This means that, unlike
unlike every other trait, they are *automatically* derived: if a type is every other trait, if a type is composed entirely of Send or Sync types, then it
composed entirely of Send or Sync types, then it is Send or Sync. Almost all is Send or Sync. Almost all primitives are Send and Sync, and as a consequence
primitives are Send and Sync, and as a consequence pretty much all types you'll pretty much all types you'll ever interact with are Send and Sync.
ever interact with are Send and Sync.
Major exceptions include: Major exceptions include:
@ -37,13 +36,12 @@ sense, one could argue that it would be "fine" for them to be marked as thread
safe. safe.
However it's important that they aren't thread safe to prevent types that However it's important that they aren't thread safe to prevent types that
*contain them* from being automatically marked as thread safe. These types have contain them from being automatically marked as thread safe. These types have
non-trivial untracked ownership, and it's unlikely that their author was non-trivial untracked ownership, and it's unlikely that their author was
necessarily thinking hard about thread safety. In the case of Rc, we have a nice necessarily thinking hard about thread safety. In the case of Rc, we have a nice
example of a type that contains a `*mut` that is *definitely* not thread safe. example of a type that contains a `*mut` that is definitely not thread safe.
Types that aren't automatically derived can *opt-in* to Send and Sync by simply Types that aren't automatically derived can simply implement them if desired:
implementing them:
```rust ```rust
struct MyBox(*mut u8); struct MyBox(*mut u8);
@ -52,12 +50,13 @@ unsafe impl Send for MyBox {}
unsafe impl Sync for MyBox {} unsafe impl Sync for MyBox {}
``` ```
In the *incredibly rare* case that a type is *inappropriately* automatically In the *incredibly rare* case that a type is inappropriately automatically
derived to be Send or Sync, then one can also *unimplement* Send and Sync: derived to be Send or Sync, then one can also unimplement Send and Sync:
```rust ```rust
#![feature(optin_builtin_traits)] #![feature(optin_builtin_traits)]
// I have some magic semantics for some synchronization primitive!
struct SpecialThreadToken(u8); struct SpecialThreadToken(u8);
impl !Send for SpecialThreadToken {} impl !Send for SpecialThreadToken {}
@ -77,3 +76,5 @@ largely behave like an `&` or `&mut` into the collection.
TODO: better explain what can or can't be Send or Sync. Sufficient to appeal TODO: better explain what can or can't be Send or Sync. Sufficient to appeal
only to data races? only to data races?
[unsafe traits]: safe-unsafe-meaning.html
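The opt-in path shown for `MyBox` can be exercised end to end with a short sketch. The methods `new` and `into_inner` and the function `round_trip_through_thread` are hypothetical additions for illustration, and the sketch assumes that `MyBox` uniquely owns a heap allocation created by `Box`.

```rust
use std::thread;

// A raw pointer is neither Send nor Sync, so neither is this wrapper
// until we opt in.
struct MyBox(*mut u8);

// Assumption for this sketch: MyBox uniquely owns its allocation and has no
// thread-affine state, so moving it to another thread is fine.
unsafe impl Send for MyBox {}

impl MyBox {
    fn new(value: u8) -> MyBox {
        MyBox(Box::into_raw(Box::new(value)))
    }

    // Consumes the wrapper, frees the allocation, and returns the value.
    fn into_inner(self) -> u8 {
        // Sound because the pointer came from `Box::into_raw` in `new`
        // and `self` is consumed, so it cannot be freed twice.
        unsafe { *Box::from_raw(self.0) }
    }
}

fn round_trip_through_thread(value: u8) -> u8 {
    let boxed = MyBox::new(value);
    // Without the `unsafe impl Send` above, this `spawn` would not compile.
    thread::spawn(move || boxed.into_inner())
        .join()
        .expect("worker thread panicked")
}
```

The compiler checks only that the `unsafe impl` exists; whether the type really is safe to send is the implementor's promise.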

@ -1,14 +1,14 @@
% Subtyping and Variance % Subtyping and Variance
Although Rust doesn't have any notion of structural inheritance, it *does* Although Rust doesn't have any notion of structural inheritance, it *does*
include subtyping. In Rust, subtyping derives entirely from *lifetimes*. Since include subtyping. In Rust, subtyping derives entirely from lifetimes. Since
lifetimes are scopes, we can partially order them based on the *contains* lifetimes are scopes, we can partially order them based on the *contains*
(outlives) relationship. We can even express this as a generic bound. (outlives) relationship. We can even express this as a generic bound.
Subtyping on lifetimes in terms of that relationship: if `'a: 'b` ("a contains Subtyping on lifetimes is in terms of that relationship: if `'a: 'b` ("a contains
b" or "a outlives b"), then `'a` is a subtype of `'b`. This is a large source of b" or "a outlives b"), then `'a` is a subtype of `'b`. This is a large source of
confusion, because it seems intuitively backwards to many: the bigger scope is a confusion, because it seems intuitively backwards to many: the bigger scope is a
*sub type* of the smaller scope. *subtype* of the smaller scope.
This does in fact make sense, though. The intuitive reason for this is that if This does in fact make sense, though. The intuitive reason for this is that if
you expect an `&'a u8`, then it's totally fine for me to hand you an `&'static you expect an `&'a u8`, then it's totally fine for me to hand you an `&'static
@ -72,7 +72,7 @@ to be able to pass `&&'static str` where an `&&'a str` is expected. The
additional level of indirection does not change the desire to be able to pass additional level of indirection does not change the desire to be able to pass
longer lived things where shorter lived things are expected. longer lived things where shorter lived things are expected.
However this logic *does not* apply to `&mut`. To see why `&mut` should However this logic doesn't apply to `&mut`. To see why `&mut` should
be invariant over T, consider the following code: be invariant over T, consider the following code:
```rust,ignore ```rust,ignore
@ -109,7 +109,7 @@ between `'a` and T is that `'a` is a property of the reference itself,
while T is something the reference is borrowing. If you change T's type, then while T is something the reference is borrowing. If you change T's type, then
the source still remembers the original type. However if you change the the source still remembers the original type. However if you change the
lifetime's type, no one but the reference knows this information, so it's fine. lifetime's type, no one but the reference knows this information, so it's fine.
Put another way, `&'a mut T` owns `'a`, but only *borrows* T. Put another way: `&'a mut T` owns `'a`, but only *borrows* T.
`Box` and `Vec` are interesting cases because they're variant, but you can `Box` and `Vec` are interesting cases because they're variant, but you can
definitely store values in them! This is where Rust gets really clever: it's definitely store values in them! This is where Rust gets really clever: it's
@ -118,7 +118,7 @@ in them *via a mutable reference*! The mutable reference makes the whole type
invariant, and therefore prevents you from smuggling a short-lived type into invariant, and therefore prevents you from smuggling a short-lived type into
them. them.
Being variant *does* allows `Box` and `Vec` to be weakened when shared Being variant allows `Box` and `Vec` to be weakened when shared
immutably. So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is immutably. So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is
expected. expected.
@ -126,7 +126,7 @@ However what should happen when passing *by-value* is less obvious. It turns out
that, yes, you can use subtyping when passing by-value. That is, this works: that, yes, you can use subtyping when passing by-value. That is, this works:
```rust ```rust
fn get_box<'a>(str: &'a u8) -> Box<&'a str> { fn get_box<'a>(str: &'a str) -> Box<&'a str> {
// string literals are `&'static str`s // string literals are `&'static str`s
Box::new("hello") Box::new("hello")
} }
@ -150,7 +150,7 @@ signature:
fn foo(&'a str) -> usize; fn foo(&'a str) -> usize;
``` ```
This signature claims that it can handle any `&str` that lives *at least* as This signature claims that it can handle any `&str` that lives at least as
long as `'a`. Now if this signature was variant over `&'a str`, that long as `'a`. Now if this signature was variant over `&'a str`, that
would mean would mean
@ -159,10 +159,12 @@ fn foo(&'static str) -> usize;
``` ```
could be provided in its place, as it would be a subtype. However this function could be provided in its place, as it would be a subtype. However this function
has a *stronger* requirement: it says that it can *only* handle `&'static str`s, has a stronger requirement: it says that it can only handle `&'static str`s,
and nothing else. Therefore functions are not variant over their arguments. and nothing else. Giving `&'a str`s to it would be unsound, as it's free to
assume that what it's given lives forever. Therefore functions are not variant
over their arguments.
To see why `Fn(T) -> U` should be *variant* over U, consider the following To see why `Fn(T) -> U` should be variant over U, consider the following
function signature: function signature:
```rust,ignore ```rust,ignore
@ -177,7 +179,7 @@ therefore completely reasonable to provide
fn foo(usize) -> &'static str; fn foo(usize) -> &'static str;
``` ```
in its place. Therefore functions *are* variant over their return type. in its place. Therefore functions are variant over their return type.
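Both function cases can be checked against the compiler with a small sketch. All function names here are hypothetical. The first pair shows a `&'static str`-returning function used where a shorter-lived result is expected; the second shows that, for arguments, it is the opposite direction that works.

```rust
// Return type: a function returning `&'static str` is accepted where a
// function returning a shorter-lived `&'a str` is expected.
fn static_name(_id: usize) -> &'static str {
    "hello"
}

fn call_with_short_result<'a>(_scope: &'a u8, f: fn(usize) -> &'a str) -> &'a str {
    f(0)
}

// Arguments: a function that accepts any `&str` can stand in for one that
// only ever receives `&'static str`, but not the other way around.
fn any_len(s: &str) -> usize {
    s.len()
}

fn as_static_only() -> fn(&'static str) -> usize {
    any_len
}
```

Trying the reverse for arguments (passing a function that demands `&'static str` where one accepting any `&'a str` is required) is rejected by the compiler, which matches the reasoning above.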
`*const` has the exact same semantics as `&`, so variance follows. `*mut` on the `*const` has the exact same semantics as `&`, so variance follows. `*mut` on the
other hand can dereference to an `&mut` whether shared or not, so it is marked other hand can dereference to an `&mut` whether shared or not, so it is marked

@ -31,12 +31,12 @@ panics can only be caught by the parent thread. This means catching a panic
requires spinning up an entire OS thread! This unfortunately stands in conflict requires spinning up an entire OS thread! This unfortunately stands in conflict
with Rust's philosophy of zero-cost abstractions. with Rust's philosophy of zero-cost abstractions.
There is an *unstable* API called `catch_panic` that enables catching a panic There is an unstable API called `catch_panic` that enables catching a panic
without spawning a thread. Still, we would encourage you to only do this without spawning a thread. Still, we would encourage you to only do this
sparingly. In particular, Rust's current unwinding implementation is heavily sparingly. In particular, Rust's current unwinding implementation is heavily
optimized for the "doesn't unwind" case. If a program doesn't unwind, there optimized for the "doesn't unwind" case. If a program doesn't unwind, there
should be no runtime cost for the program being *ready* to unwind. As a should be no runtime cost for the program being *ready* to unwind. As a
consequence, *actually* unwinding will be more expensive than in e.g. Java. consequence, actually unwinding will be more expensive than in e.g. Java.
Don't build your programs to unwind under normal circumstances. Ideally, you Don't build your programs to unwind under normal circumstances. Ideally, you
should only panic for programming errors or *extreme* problems. should only panic for programming errors or *extreme* problems.
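For readers on later Rust releases: the stable function `std::panic::catch_unwind` offers similar functionality to the `catch_panic` API mentioned here. The sketch below uses it; the helper `try_divide` is a hypothetical name, and the example assumes the program is built with the default unwinding panic strategy rather than `panic=abort`.

```rust
use std::panic;

/// Hypothetical helper: integer division that reports a panic (for example
/// division by zero) as `None` instead of letting it unwind further.
fn try_divide(a: i32, b: i32) -> Option<i32> {
    // `catch_unwind` returns `Err` if the closure panicked.
    panic::catch_unwind(|| a / b).ok()
}
```

This is only a demonstration of the mechanism. In line with the advice above, an ordinary check such as `checked_div` is the better tool for an expected failure like this one.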

@ -60,7 +60,7 @@ of memory at once (e.g. half the theoretical address space). As such it's
like the standard library as much as possible, so we'll just kill the whole like the standard library as much as possible, so we'll just kill the whole
program. program.
We said we don't want to use intrinsics, so doing *exactly* what `std` does is We said we don't want to use intrinsics, so doing exactly what `std` does is
out. Instead, we'll call `std::process::exit` with some random number. out. Instead, we'll call `std::process::exit` with some random number.
```rust ```rust
@ -84,7 +84,7 @@ But Rust's only supported allocator API is so low level that we'll need to do a
fair bit of extra work. We also need to guard against some special fair bit of extra work. We also need to guard against some special
conditions that can occur with really large allocations or empty allocations. conditions that can occur with really large allocations or empty allocations.
In particular, `ptr::offset` will cause us *a lot* of trouble, because it has In particular, `ptr::offset` will cause us a lot of trouble, because it has
the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
not have dealt with this instruction, here's the basic story with GEP: alias not have dealt with this instruction, here's the basic story with GEP: alias
analysis, alias analysis, alias analysis. It's super important to an optimizing analysis, alias analysis, alias analysis. It's super important to an optimizing
@ -102,7 +102,7 @@ As a simple example, consider the following fragment of code:
If the compiler can prove that `x` and `y` point to different locations in If the compiler can prove that `x` and `y` point to different locations in
memory, the two operations can in theory be executed in parallel (by e.g. memory, the two operations can in theory be executed in parallel (by e.g.
loading them into different registers and working on them independently). loading them into different registers and working on them independently).
However in *general* the compiler can't do this because if x and y point to However the compiler can't do this in general because if x and y point to
the same location in memory, the operations need to be done to the same value, the same location in memory, the operations need to be done to the same value,
and they can't just be merged afterwards. and they can't just be merged afterwards.
@ -118,7 +118,7 @@ possible.
So that's what GEP's about, how can it cause us trouble? So that's what GEP's about, how can it cause us trouble?
The first problem is that we index into arrays with unsigned integers, but The first problem is that we index into arrays with unsigned integers, but
GEP (and as a consequence `ptr::offset`) takes a *signed integer*. This means GEP (and as a consequence `ptr::offset`) takes a signed integer. This means
that half of the seemingly valid indices into an array will overflow GEP and that half of the seemingly valid indices into an array will overflow GEP and
actually go in the wrong direction! As such we must limit all allocations to actually go in the wrong direction! As such we must limit all allocations to
`isize::MAX` elements. This actually means we only need to worry about `isize::MAX` elements. This actually means we only need to worry about
@ -138,7 +138,7 @@ However since this is a tutorial, we're not going to be particularly optimal
here, and just unconditionally check, rather than use clever platform-specific here, and just unconditionally check, rather than use clever platform-specific
`cfg`s. `cfg`s.
The other corner-case we need to worry about is *empty* allocations. There will The other corner-case we need to worry about is empty allocations. There will
be two kinds of empty allocations we need to worry about: `cap = 0` for all T, be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
and `cap > 0` for zero-sized types. and `cap > 0` for zero-sized types.
@ -165,9 +165,9 @@ protected from being allocated anyway (a whole 4k, on many platforms).
However what about for positive-sized types? That one's a bit trickier. In However what about for positive-sized types? That one's a bit trickier. In
principle, you can argue that offsetting by 0 gives LLVM no information: either principle, you can argue that offsetting by 0 gives LLVM no information: either
there's an element before the address, or after it, but it can't know which. there's an element before the address or after it, but it can't know which.
However we've chosen to conservatively assume that it may do bad things. As However we've chosen to conservatively assume that it may do bad things. As
such we *will* guard against this case explicitly. such we will guard against this case explicitly.
*Phew* *Phew*
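The guards discussed in this section can be summarised in a small size check. This is a sketch, not the book's implementation: `checked_alloc_size` is a hypothetical helper that rejects any request whose byte size overflows or exceeds `isize::MAX`, and reports `0` bytes for the empty cases so the caller knows no real allocation is needed.

```rust
use std::mem::size_of;

/// Hypothetical helper: byte size of an allocation holding `cap` values of
/// `T`, or `None` if the request is too large to offset into safely.
fn checked_alloc_size<T>(cap: usize) -> Option<usize> {
    // Guard against `cap * size_of::<T>()` wrapping around.
    let bytes = cap.checked_mul(size_of::<T>())?;
    // Offsets are signed, so anything above `isize::MAX` bytes is rejected.
    if bytes > isize::MAX as usize {
        None
    } else {
        // `Some(0)` covers both `cap == 0` and zero-sized `T`.
        Some(bytes)
    }
}
```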

@ -130,7 +130,7 @@ impl<'a, T> Drop for Drain<'a, T> {
impl<T> Vec<T> { impl<T> Vec<T> {
pub fn drain(&mut self) -> Drain<T> { pub fn drain(&mut self) -> Drain<T> {
// this is a mem::forget safety thing. If Drain is forgotten, we just // this is a mem::forget safety thing. If Drain is forgotten, we just
// leak the whole Vec's contents. Also we need to do this *eventually* // leak the whole Vec's contents. Also we need to do this eventually
// anyway, so why not do it now? // anyway, so why not do it now?
self.len = 0; self.len = 0;

@ -10,7 +10,7 @@ handling the case where the source and destination overlap (which will
definitely happen here). definitely happen here).
If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]` If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]`
using the *old* len. using the old len.
```rust,ignore ```rust,ignore
pub fn insert(&mut self, index: usize, elem: T) { pub fn insert(&mut self, index: usize, elem: T) {
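
The shift described in this hunk can be sketched against the standard `Vec<T>` rather than the tutorial's own type. This is only an illustration under that assumption: `insert_shift` is a hypothetical helper name, not something from the commit. The point it shows is that the element count passed to `ptr::copy` is computed from the old length, before the length is bumped.

```rust
use std::ptr;

/// Hypothetical helper (not part of the commit): inserts `elem` at `index`
/// by shifting the tail of a standard `Vec<T>` one slot to the right.
pub fn insert_shift<T>(v: &mut Vec<T>, index: usize, elem: T) {
    let len = v.len(); // the old len
    // `index == len` is allowed: that is simply a push.
    assert!(index <= len, "index out of bounds");
    // Make sure there is room for one more element before touching pointers.
    v.reserve(1);
    unsafe {
        let p = v.as_mut_ptr().add(index);
        // Move [index .. len] to [index+1 .. len+1]. The ranges overlap,
        // so this must be `ptr::copy` (memmove), not `copy_nonoverlapping`.
        ptr::copy(p, p.add(1), len - index);
        // The slot at `index` is now logically uninitialized; fill it
        // without dropping whatever bits are there.
        ptr::write(p, elem);
        v.set_len(len + 1);
    }
}
```

Calling `insert_shift(&mut v, 2, 3)` on `vec![1, 2, 4]` shifts one element (`len - index = 3 - 2`) and leaves `[1, 2, 3, 4]`.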

@ -21,8 +21,8 @@ read out the value pointed to at that end and move the pointer over by one. When
the two pointers are equal, we know we're done. the two pointers are equal, we know we're done.
Note that the order of read and offset are reversed for `next` and `next_back` Note that the order of read and offset are reversed for `next` and `next_back`
For `next_back` the pointer is always *after* the element it wants to read next, For `next_back` the pointer is always after the element it wants to read next,
while for `next` the pointer is always *at* the element it wants to read next. while for `next` the pointer is always at the element it wants to read next.
To see why this is, consider the case where every element but one has been To see why this is, consider the case where every element but one has been
yielded. yielded.
@ -124,7 +124,7 @@ impl<T> DoubleEndedIterator for IntoIter<T> {
``` ```
Because IntoIter takes ownership of its allocation, it needs to implement Drop Because IntoIter takes ownership of its allocation, it needs to implement Drop
to free it. However it *also* wants to implement Drop to drop any elements it to free it. However it also wants to implement Drop to drop any elements it
contains that weren't yielded. contains that weren't yielded.
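
The read/offset ordering for `next` and `next_back` can be sketched with a small two-pointer cursor over a borrowed slice. This is a simplified stand-in, not the tutorial's `IntoIter`: the `Cursor` type is hypothetical, it borrows instead of owning the allocation, it restricts `T` to `Copy` so that reading never creates a double drop, and it assumes `T` is not zero-sized.

```rust
use std::marker::PhantomData;

/// Hypothetical two-pointer cursor (assumes `size_of::<T>() > 0`).
pub struct Cursor<'a, T: Copy> {
    start: *const T, // points AT the next element for `next`
    end: *const T,   // points one PAST the next element for `next_back`
    _marker: PhantomData<&'a [T]>,
}

impl<'a, T: Copy> Cursor<'a, T> {
    pub fn new(slice: &'a [T]) -> Self {
        let start = slice.as_ptr();
        // One-past-the-end pointer of the same slice; in bounds to compute.
        let end = unsafe { start.add(slice.len()) };
        Cursor { start, end, _marker: PhantomData }
    }

    pub fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            // Read first, then offset: `start` is at the element.
            let value = self.start.read();
            self.start = self.start.add(1);
            Some(value)
        }
    }

    pub fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            // Offset first, then read: `end` is after the element.
            self.end = self.end.sub(1);
            Some(self.end.read())
        }
    }
}
```

With a single element left, `start` points at it and `end` points just past it, so either method yields it and then leaves the two pointers equal.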

@ -32,14 +32,14 @@ pub fn push(&mut self, elem: T) {
Easy! How about `pop`? Although this time the index we want to access is Easy! How about `pop`? Although this time the index we want to access is
initialized, Rust won't just let us dereference the location of memory to move initialized, Rust won't just let us dereference the location of memory to move
the value out, because that *would* leave the memory uninitialized! For this we the value out, because that would leave the memory uninitialized! For this we
need `ptr::read`, which just copies out the bits from the target address and need `ptr::read`, which just copies out the bits from the target address and
interprets it as a value of type T. This will leave the memory at this address interprets it as a value of type T. This will leave the memory at this address
*logically* uninitialized, even though there is in fact a perfectly good instance logically uninitialized, even though there is in fact a perfectly good instance
of T there. of T there.
For `pop`, if the old len is 1, we want to read out of the 0th index. So we For `pop`, if the old len is 1, we want to read out of the 0th index. So we
should offset by the *new* len. should offset by the new len.
```rust,ignore ```rust,ignore
pub fn pop(&mut self) -> Option<T> { pub fn pop(&mut self) -> Option<T> {
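
The `pop` reasoning in this hunk can be sketched against the standard `Vec<T>`. As above, this is an illustration under that assumption, and `pop_read` is a hypothetical name rather than code from the commit. It shows the two points the text makes: the value is moved out with `ptr::read`, and the offset used is the new length.

```rust
use std::ptr;

/// Hypothetical helper (not part of the commit): pops the last element of a
/// standard `Vec<T>` by reading it out through a raw pointer.
pub fn pop_read<T>(v: &mut Vec<T>) -> Option<T> {
    let old_len = v.len();
    if old_len == 0 {
        return None;
    }
    let new_len = old_len - 1;
    unsafe {
        // Shrink first, so the Vec no longer considers the slot initialized
        // and will not drop it again later.
        v.set_len(new_len);
        // If the old len was 1, this reads index 0: offset by the new len.
        // `ptr::read` copies the bits out; the slot is now logically
        // uninitialized even though the bits are still there.
        Some(ptr::read(v.as_ptr().add(new_len)))
    }
}
```

Using an owning type such as `String` makes the ownership transfer visible: the returned value is dropped exactly once, by the caller.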

@ -2,7 +2,7 @@
It's time. We're going to fight the spectre that is zero-sized types. Safe Rust It's time. We're going to fight the spectre that is zero-sized types. Safe Rust
*never* needs to care about this, but Vec is very intensive on raw pointers and *never* needs to care about this, but Vec is very intensive on raw pointers and
raw allocations, which are exactly the *only* two things that care about raw allocations, which are exactly the two things that care about
zero-sized types. We need to be careful of two things: zero-sized types. We need to be careful of two things:
* The raw allocator API has undefined behaviour if you pass in 0 for an * The raw allocator API has undefined behaviour if you pass in 0 for an
@ -22,7 +22,7 @@ So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and to be considered to store or load them. This actually extends to `ptr::read` and
`ptr::write`: they won't actually look at the pointer at all. As such we *never* need `ptr::write`: they won't actually look at the pointer at all. As such we never need
to change the pointer. to change the pointer.
Note however that our previous reliance on running out of memory before overflow is Note however that our previous reliance on running out of memory before overflow is
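
The claim that `ptr::read` and `ptr::write` never look at the pointer for a ZST can be sketched as follows. The assumptions here are mine, not the commit's: the example uses `NonNull::dangling()` from current stable Rust as a stand-in for the `heap::EMPTY` sentinel named in the text, and `Unit` and `zst_round_trip` are hypothetical names.

```rust
use std::mem;
use std::ptr::{self, NonNull};

/// A hypothetical zero-sized type used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Unit;

/// Writes and reads a ZST through a dangling (but non-null, aligned)
/// pointer. No memory is ever allocated or touched.
pub fn zst_round_trip() -> Unit {
    assert_eq!(mem::size_of::<Unit>(), 0);
    // Stand-in for the sentinel pointer: well-aligned, non-null, and not
    // backed by any allocation.
    let p: *mut Unit = NonNull::<Unit>::dangling().as_ptr();
    unsafe {
        // Both operations are no-ops for a zero-sized type, so the pointer
        // is never dereferenced and never needs to change.
        ptr::write(p, Unit);
        ptr::read(p)
    }
}
```

Because there is only one possible value of `Unit`, the value read back is necessarily the one written.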
