From 36d7b94c89e99e42c23622b387f5e385c8355703 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Tue, 28 Jul 2015 15:13:54 -0700 Subject: [PATCH] lots more felix fixes --- casts.md | 18 +++++++++++------- checked-uninit.md | 13 +++++++++++-- coercions.md | 2 +- drop-flags.md | 34 ++++++++++++++++++++++++---------- dropck.md | 12 ++++++------ exception-safety.md | 2 +- hrtb.md | 23 ++++++++++++----------- lifetime-splitting.md | 13 +++++++------ lifetimes.md | 17 ++++++++--------- safe-unsafe-meaning.md | 14 +++++++------- subtyping.md | 14 ++++++++------ unchecked-uninit.md | 16 ++++++++-------- 12 files changed, 104 insertions(+), 74 deletions(-) diff --git a/casts.md b/casts.md index a5527b2..cb12ffe 100644 --- a/casts.md +++ b/casts.md @@ -41,8 +41,7 @@ Note that lengths are not adjusted when casting raw slices - half of the original memory. Casting is not transitive, that is, even if `e as U1 as U2` is a valid -expression, `e as U2` is not necessarily so (in fact it will only be valid if -`U1` coerces to `U2`). +expression, `e as U2` is not necessarily so. For numeric casts, there are quite a few cases to consider: @@ -53,15 +52,20 @@ For numeric casts, there are quite a few cases to consider: * zero-extend if the source is unsigned * sign-extend if the source is signed * casting from a float to an integer will round the float towards zero - * **NOTE: currently this will cause Undefined Behaviour if the rounded - value cannot be represented by the target integer type**. This includes - Inf and NaN. This is a bug and will be fixed. + * **[NOTE: currently this will cause Undefined Behaviour if the rounded + value cannot be represented by the target integer type][float-int]**. + This includes Inf and NaN. This is a bug and will be fixed. 
* casting from an integer to float will produce the floating point representation of the integer, rounded if necessary (rounding strategy unspecified) * casting from an f32 to an f64 is perfect and lossless * casting from an f64 to an f32 will produce the closest possible value (rounding strategy unspecified) - * **NOTE: currently this will cause Undefined Behaviour if the value + * **[NOTE: currently this will cause Undefined Behaviour if the value is finite but larger or smaller than the largest or smallest finite - value representable by f32**. This is a bug and will be fixed. + value representable by f32][float-float]**. This is a bug and will + be fixed. + + +[float-int]: https://github.com/rust-lang/rust/issues/10184 +[float-float]: https://github.com/rust-lang/rust/issues/15536 diff --git a/checked-uninit.md b/checked-uninit.md index 8b03cd4..706016a 100644 --- a/checked-uninit.md +++ b/checked-uninit.md @@ -104,5 +104,14 @@ fn main() { ``` However reassigning `y` in this example *would* require `y` to be marked as -mutable, as a Safe Rust program could observe that the value of `y` changed. -Otherwise the variable is exactly like new. +mutable, as a Safe Rust program could observe that the value of `y` changed: + +```rust +fn main() { + let mut y = Box::new(0); + let z = y; // y is now logically uninitialized because Box isn't Copy + y = Box::new(1); // reinitialize y +} +``` + +Otherwise it's like `y` is a brand new variable. diff --git a/coercions.md b/coercions.md index 8bb8284..2e33a67 100644 --- a/coercions.md +++ b/coercions.md @@ -27,7 +27,7 @@ only implemented automatically, and enables the following transformations: * `Foo<..., T, ...>` => `Foo<..., U, ...>` where: * `T: Unsize` * `Foo` is a struct - * Only the last field has type `T` + * Only the last field of `Foo` has type `T` * `T` is not part of the type of any other fields Coercions occur at a *coercion site*. 
Any location that is explicitly typed diff --git a/drop-flags.md b/drop-flags.md index e8c331c..f95ccc0 100644 --- a/drop-flags.md +++ b/drop-flags.md @@ -2,12 +2,25 @@ The examples in the previous section introduce an interesting problem for Rust. We have seen that's possible to conditionally initialize, deinitialize, and -*reinitialize* locations of memory totally safely. For Copy types, this isn't +reinitialize locations of memory totally safely. For Copy types, this isn't particularly notable since they're just a random pile of bits. However types with destructors are a different story: Rust needs to know whether to call a destructor whenever a variable is assigned to, or a variable goes out of scope. How can it do this with conditional initialization? +Note that this is not a problem that all assignments need worry about. In +particular, assigning through a dereference unconditionally drops, and assigning +in a `let` unconditionally *doesn't* drop: + +``` +let mut x = Box::new(0); // let makes a fresh variable, so never need to drop +let y = &mut x; +*y = Box::new(1); // Deref assumes the referent is initialized, so always drops +``` + +This is only a problem when overwriting a previously initialized variable or +one of its subfields. + It turns out that Rust actually tracks whether a type should be dropped or not *at runtime*. As a variable becomes initialized and uninitialized, a *drop flag* for that variable is toggled. When a variable *might* need to be dropped, this @@ -15,7 +28,7 @@ flag is evaluated to determine if it *should* be dropped. Of course, it is *often* the case that a value's initialization state can be *statically* known at every point in the program. If this is the case, then the -compiler can theoretically generate more effecient code! For instance, straight- +compiler can theoretically generate more efficient code! 
For instance, straight- line code has such *static drop semantics*: ```rust @@ -23,8 +36,8 @@ let mut x = Box::new(0); // x was uninit; just overwrite. let mut y = x; // y was uninit; just overwrite and make x uninit. x = Box::new(0); // x was uninit; just overwrite. y = x; // y was init; Drop y, overwrite it, and make x uninit! - // y was init; Drop y! - // x was uninit; do nothing. + // y goes out of scope; y was init; Drop y! + // x goes out of scope; x was uninit; do nothing. ``` And even branched code where all branches have the same behaviour with respect @@ -40,7 +53,7 @@ if condition { drop(x) // x gets moved out; make x uninit. } x = Box::new(0); // x was uninit; just overwrite. - // x was init; Drop x! + // x goes out of scope; x was init; Drop x! ``` However code like this *requires* runtime information to correctly Drop: @@ -52,7 +65,8 @@ if condition { x = Box::new(0); // x was uninit; just overwrite. println!("{}", x); } - // x *might* be uninit; check the flag! + // x goes out of scope; x *might* be uninit; + // check the flag! ``` Of course, in this case it's trivial to retrieve static drop semantics: @@ -66,10 +80,10 @@ if condition { ``` As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a hidden -field of any type that implements Drop. Rust sets the drop flag by -overwriting the *entire* value with a particular byte. This is pretty obviously -Not The Fastest and causes a bunch of trouble with optimizing code. It's legacy -from a time when you could do much more complex conditional initialization. +field of any type that implements Drop. Rust sets the drop flag by overwriting +the *entire* value with a particular bit pattern. This is pretty obviously Not +The Fastest and causes a bunch of trouble with optimizing code. It's legacy from +a time when you could do much more complex conditional initialization. As such work is currently under way to move the flags out onto the stack frame where they more reasonably belong. 
Unfortunately, this work will take some time diff --git a/dropck.md b/dropck.md index e1a25f5..419c612 100644 --- a/dropck.md +++ b/dropck.md @@ -49,7 +49,7 @@ accidentally make dangling pointers. Consider the following simple program: struct Inspector<'a>(&'a u8); fn main() { - let (days, inspector); + let (inspector, days); days = Box::new(1); inspector = Inspector(&days); } @@ -71,7 +71,7 @@ impl<'a> Drop for Inspector<'a> { } fn main() { - let (days, inspector); + let (inspector, days); days = Box::new(1); inspector = Inspector(&days); // Let's say `days` happens to get dropped first. @@ -85,14 +85,14 @@ fn main() { ^~~~ :9:11: 15:2 note: reference must be valid for the block at 9:10... :9 fn main() { -:10 let (days, inspector); +:10 let (inspector, days); :11 days = Box::new(1); :12 inspector = Inspector(&days); :13 // Let's say `days` happens to get dropped first. :14 // Then when Inspector is dropped, it will try to read free'd memory! ... :10:27: 15:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 10:26 -:10 let (days, inspector); +:10 let (inspector, days); :11 days = Box::new(1); :12 inspector = Inspector(&days); :13 // Let's say `days` happens to get dropped first. @@ -112,8 +112,8 @@ of the finer details of how the drop checker validates types is totally up in the air. However The Big Rule is the subtlety that we have focused on this whole section: -**For a generic type to soundly implement drop, it must strictly outlive all of -its generic arguments.** +**For a generic type to soundly implement drop, its generic arguments must +strictly outlive it.** This rule is sufficient but not necessary to satisfy the drop checker. That is, if your type obeys this rule then it's *definitely* sound to drop. However diff --git a/exception-safety.md b/exception-safety.md index 9a31934..a43eec4 100644 --- a/exception-safety.md +++ b/exception-safety.md @@ -37,7 +37,7 @@ needs to be careful and consider exception safety.
## Vec::push_all `Vec::push_all` is a temporary hack to get extending a Vec by a slice reliably -effecient without specialization. Here's a simple implementation: +efficient without specialization. Here's a simple implementation: ```rust,ignore impl Vec { diff --git a/hrtb.md b/hrtb.md index 640742f..3cc06f2 100644 --- a/hrtb.md +++ b/hrtb.md @@ -1,6 +1,6 @@ % Higher-Rank Trait Bounds (HRTBs) -Rust's Fn traits are a little bit magic. For instance, we can write the +Rust's `Fn` traits are a little bit magic. For instance, we can write the following code: ```rust @@ -52,21 +52,22 @@ fn main() { } ``` -How on earth are we supposed to express the lifetimes on F's trait bound? We need -to provide some lifetime there, but the lifetime we care about can't be named until -we enter the body of `call`! Also, that isn't some fixed lifetime; call works with -*any* lifetime `&self` happens to have at that point. +How on earth are we supposed to express the lifetimes on `F`'s trait bound? We +need to provide some lifetime there, but the lifetime we care about can't be +named until we enter the body of `call`! Also, that isn't some fixed lifetime; +call works with *any* lifetime `&self` happens to have at that point. -This job requires The Magic of Higher-Rank Trait Bounds. The way we desugar -this is as follows: +This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we +desugar this is as follows: ```rust,ignore where for<'a> F: Fn(&'a (u8, u16)) -> &'a u8, ``` -(Where `Fn(a, b, c) -> d` is itself just sugar for the unstable *real* Fn trait) +(Where `Fn(a, b, c) -> d` is itself just sugar for the unstable *real* `Fn` +trait) `for<'a>` can be read as "for all choices of `'a`", and basically produces an -*inifinite list* of trait bounds that F must satisfy. Intense. There aren't many -places outside of the Fn traits where we encounter HRTBs, and even for those we -have a nice magic sugar for the common cases. +*infinite list* of trait bounds that F must satisfy. 
Intense. There aren't many +places outside of the `Fn` traits where we encounter HRTBs, and even for +those we have a nice magic sugar for the common cases. diff --git a/lifetime-splitting.md b/lifetime-splitting.md index 9b6b769..e320c5c 100644 --- a/lifetime-splitting.md +++ b/lifetime-splitting.md @@ -52,16 +52,17 @@ In order to "teach" borrowck that what we're doing is ok, we need to drop down to unsafe code. For instance, mutable slices expose a `split_at_mut` function that consumes the slice and returns *two* mutable slices. One for everything to the left of the index, and one for everything to the right. Intuitively we know -this is safe because the slices don't alias. However the implementation requires -some unsafety: +this is safe because the slices don't overlap, and therefore don't alias. However +the implementation requires some unsafety: ```rust,ignore fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) { + let len = self.len(); + let ptr = self.as_mut_ptr(); + assert!(mid <= len); unsafe { - let self2: &mut [T] = mem::transmute_copy(&self); - - (ops::IndexMut::index_mut(self, ops::RangeTo { end: mid } ), - ops::IndexMut::index_mut(self2, ops::RangeFrom { start: mid } )) + (from_raw_parts_mut(ptr, mid), + from_raw_parts_mut(ptr.offset(mid as isize), len - mid)) } } ``` diff --git a/lifetimes.md b/lifetimes.md index f4475b7..5d7b305 100644 --- a/lifetimes.md +++ b/lifetimes.md @@ -7,11 +7,10 @@ the scope it's valid for. Within a function body, Rust generally doesn't let you explicitly name the lifetimes involved. This is because it's generally not really *necessary* -to talk about lifetimes in a local context; rust has all the information and -can work out everything. It's also a good thing because the scope of a borrow -is often significantly smaller than the scope its referent is *actually* valid -for. Rust will introduce *many* anonymous scopes and temporaries to make your -code *just work*.
+to talk about lifetimes in a local context; Rust has all the information and +can work out everything as optimally as possible. Many anonymous scopes and +temporaries that you would otherwise have to write are often introduced to +make your code *just work*. However once you cross the function boundary, you need to start talking about lifetimes. Lifetimes are denoted with an apostrophe: `'a`, `'static`. To dip @@ -19,10 +18,10 @@ our toes with lifetimes, we're going to pretend that we're actually allowed to label scopes with lifetimes, and desugar the examples from the start of this chapter. -Originally, our examples made use of *aggressive* sugar -- high fructose corn syrup even -- -around scopes and lifetimes, because writing everything out explicitly is -*extremely noisy*. All Rust code relies on aggressive inference and elision of -"obvious" things. +Originally, our examples made use of *aggressive* sugar -- high fructose corn +syrup even -- around scopes and lifetimes, because writing everything out +explicitly is *extremely noisy*. All Rust code relies on aggressive inference +and elision of "obvious" things. One particularly interesting piece of sugar is that each `let` statement implicitly introduces a scope. For the most part, this doesn't really matter. However it diff --git a/safe-unsafe-meaning.md b/safe-unsafe-meaning.md index c7210f8..1a4e5b8 100644 --- a/safe-unsafe-meaning.md +++ b/safe-unsafe-meaning.md @@ -125,13 +125,13 @@ unsafe impl UnsafeOrd for MyType { But it's probably not the implementation you want. Rust has traditionally avoided making traits unsafe because it makes Unsafe -pervasive, which is not desirable. Send and Sync are unsafe is because -thread safety is a *fundamental property* that Unsafe cannot possibly hope to -defend against in the same way it would defend against a bad Ord implementation. -The only way to possibly defend against thread-unsafety would be to *not use -threading at all*. 
Making every operation atomic isn't even sufficient, because -it's possible for complex invariants to exist between disjoint locations in -memory. For instance, the pointer and capacity of a Vec must be in sync. +pervasive, which is not desirable. Send and Sync are unsafe because thread +safety is a *fundamental property* that Unsafe cannot possibly hope to defend +against in the same way it would defend against a bad Ord implementation. The +only way to possibly defend against thread-unsafety would be to *not use +threading at all*. Making every load and store atomic isn't even sufficient, +because it's possible for complex invariants to exist between disjoint locations +in memory. For instance, the pointer and capacity of a Vec must be in sync. Even concurrent paradigms that are traditionally regarded as Totally Safe like message passing implicitly rely on some notion of thread safety -- are you diff --git a/subtyping.md b/subtyping.md index 8c5ac9c..975d1c5 100644 --- a/subtyping.md +++ b/subtyping.md @@ -33,7 +33,7 @@ Variance is where things get a bit complicated. Variance is a property that *type constructors* have with respect to their arguments. A type constructor in Rust is a generic type with unbound arguments. For instance `Vec` is a type constructor that takes a `T` and returns a -`Vec<T>`. `&` and `&mut` are type constructors that take a two types: a +`Vec<T>`. `&` and `&mut` are type constructors that take two inputs: a lifetime, and a type to point to.
A type constructor's *variance* is how the subtyping of its inputs affects the @@ -54,7 +54,8 @@ Some important variances: * `&'a T` is variant over `'a` and `T` (as is `*const T` by metaphor) * `&'a mut T` is variant with over `'a` but invariant over `T` * `Fn(T) -> U` is invariant over `T`, but variant over `U` -* `Box`, `Vec`, and all other collections are variant over their contents +* `Box`, `Vec`, and all other collections are variant over the types of + their contents * `UnsafeCell`, `Cell`, `RefCell`, `Mutex` and all other interior mutability types are invariant over T (as is `*mut T` by metaphor) @@ -71,7 +72,7 @@ to be able to pass `&&'static str` where an `&&'a str` is expected. The additional level of indirection does not change the desire to be able to pass longer lived things where shorted lived things are expected. -However this logic *does not* apply to see why `&mut`. To see why &mut should +However this logic *does not* apply to `&mut`. To see why `&mut` should be invariant over T, consider the following code: ```rust,ignore @@ -117,8 +118,9 @@ in them *via a mutable reference*! The mutable reference makes the whole type invariant, and therefore prevents you from smuggling a short-lived type into them. -Being variant *does* allows them to be weakened when shared immutably. -So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is expected. +Being variant *does* allow `Box` and `Vec` to be weakened when shared +immutably. So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is +expected. However what should happen when passing *by-value* is less obvious. It turns out that, yes, you can use subtyping when passing by-value. That is, this works: @@ -178,7 +180,7 @@ fn foo(usize) -> &'static str; in its place. Therefore functions *are* variant over their return type. `*const` has the exact same semantics as `&`, so variance follows.
`*mut` on the -other hand can dereference to an &mut whether shared or not, so it is marked +other hand can dereference to an `&mut` whether shared or not, so it is marked as invariant just like cells. This is all well and good for the types the standard library provides, but diff --git a/unchecked-uninit.md b/unchecked-uninit.md index 9ab97b9..d0397c3 100644 --- a/unchecked-uninit.md +++ b/unchecked-uninit.md @@ -26,7 +26,7 @@ returns a pointer to uninitialized memory. To handle this, we must use the `ptr` module. In particular, it provides three functions that allow us to assign bytes to a location in memory without -evaluating the old value: `write`, `copy`, and `copy_nonoverlapping`. +dropping the old value: `write`, `copy`, and `copy_nonoverlapping`. * `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed to by `ptr`. @@ -35,7 +35,7 @@ evaluating the old value: `write`, `copy`, and `copy_nonoverlapping`. order is reversed!) * `ptr::copy_nonoverlapping(src, dest, count)` does what `copy` does, but a little faster on the assumption that the two ranges of memory don't overlap. - (this is equivalent to memcopy -- note that the argument order is reversed!) + (this is equivalent to memcpy -- note that the argument order is reversed!) It should go without saying that these functions, if misused, will cause serious havoc or just straight up Undefined Behaviour. The only things that these @@ -68,14 +68,14 @@ unsafe { println!("{:?}", x); ``` -It's worth noting that you don't need to worry about ptr::write-style -shenanigans with types which don't implement Drop or -contain Drop types, because Rust knows not to try to Drop them. Similarly you -should be able to assign to fields of partially initialized structs -directly if those fields don't contain any Drop types. 
+It's worth noting that you don't need to worry about `ptr::write`-style +shenanigans with types which don't implement `Drop` or contain `Drop` types, +because Rust knows not to try to drop them. Similarly you should be able to +assign to fields of partially initialized structs directly if those fields don't +contain any `Drop` types. However when working with uninitialized memory you need to be ever-vigilant for -Rust trying to Drop values you make like this before they're fully initialized. +Rust trying to drop values you make like this before they're fully initialized. Every control path through that variable's scope must initialize the value before it ends, if has a destructor. *[This includes code panicking](unwinding.html)*.