From 00790e6d0a20c7b51bae50dc8bd795d43cbe3f7f Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Mon, 29 Jun 2015 17:13:15 -0700 Subject: [PATCH] unwinding start --- raii.md | 34 +++++++++------------- unwinding.md | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 96 insertions(+), 20 deletions(-) create mode 100644 unwinding.md diff --git a/raii.md b/raii.md index 679c1dd..f85562b 100644 --- a/raii.md +++ b/raii.md @@ -1,6 +1,6 @@ % The Perils Of RAII -Ownership Based Resource Management (AKA RAII: Resource Acquisition is Initialization) is +Ownership Based Resource Management (AKA RAII: Resource Acquisition Is Initialization) is something you'll interact with a lot in Rust. Especially if you use the standard library. Roughly speaking the pattern is as follows: to acquire a resource, you create an object that @@ -38,10 +38,8 @@ treating the old copy as uninitialized -- a no-op. While Rust provides a `Default` trait for specifying the moral equivalent of a default constructor, it's incredibly rare for this trait to be used. This is because variables -aren't implicitly initialized (see [working with uninitialized memory][uninit] for details). -Default is basically only useful for generic programming. - -In concrete contexts, a type will provide a static `new` method for any +[aren't implicitly initialized][uninit]. Default is basically only useful for generic +programming. In concrete contexts, a type will provide a static `new` method for any kind of "default" constructor. This has no relation to `new` in other languages and has no special meaning. It's just a naming convention. @@ -59,20 +57,16 @@ fn drop(&mut self); ``` This method gives the type time to somehow finish what it was doing. **After `drop` is run, -Rust will recursively try to drop all of the fields of the `self` struct**. This is a +Rust will recursively try to drop all of the fields of `self`**. 
This is a convenience feature so that you don't have to write "destructor boilerplate" to drop children. If a struct has no special logic for being dropped other than dropping its children, then it means `Drop` doesn't need to be implemented at all! -**There is no way to prevent this behaviour in Rust 1.0**. +**There is no stable way to prevent this behaviour in Rust 1.0**. Note that taking `&mut self` means that even if you *could* suppress recursive Drop, Rust will prevent you from e.g. moving fields out of self. For most types, this -is totally fine: - -* They own all their data (they don't contain pointers to elsewhere). -* There's no additional state passed into drop to try to send things. -* `self` is about to be marked as uninitialized (and therefore inaccessible). +is totally fine. For instance, a custom implementation of `Box` might write `Drop` like this: @@ -120,7 +114,7 @@ impl Drop for SuperBox { } ``` -because after we deallocate the `box`'s ptr in SuperBox's destructor, Rust will +After we deallocate the `box`'s ptr in SuperBox's destructor, Rust will happily proceed to tell the box to Drop itself and everything will blow up with use-after-frees and double-frees. @@ -216,7 +210,7 @@ refers to it. The collection will sit around uselessly, holding on to its precious resources until the program terminates (at which point all those resources would have been reclaimed by the OS anyway). -We may consider a more restricted form of leak: failing to free memory that +We may consider a more restricted form of leak: failing to drop a value that is unreachable. Rust also doesn't prevent this. In fact Rust has a *function for doing this*: `mem::forget`. This function consumes the value it is passed *and then doesn't run its destructor*. @@ -232,18 +226,18 @@ It is reasonable for safe code to assume that destructor leaks do not happen, as any program that leaks destructors is probably wrong. However *unsafe* code cannot rely on destructors to be run to be *safe*. 
For most types this doesn't matter: if you leak the destructor then the type is *by definition* inaccessible, -so it doesn't matter, right? e.g. if you leak a `Box` then you waste some -memory but that's hardly going to violate memory-safety. +so it doesn't matter, right? For instance, if you leak a `Box` then you +waste some memory but that's hardly going to violate memory-safety. However where we must be careful with destructor leaks are *proxy* types. These are types which manage access to a distinct object, but don't actually own it. Proxy objects are quite rare. Proxy objects you'll need to care about -are even rarer. However we'll focus on two interesting examples in the +are even rarer. However we'll focus on three interesting examples in the standard library: * `vec::Drain` * `Rc` - +* `thread::scoped::JoinGuard` @@ -251,7 +245,7 @@ standard library: `drain` is a collections API that moves data out of the container without consuming the container. This enables us to reuse the allocation of a `Vec` -after claiming ownership over all of its contents. drain produces an iterator +after claiming ownership over all of its contents. It produces an iterator (Drain) that returns the contents of the Vec by-value. Now, consider Drain in the middle of iteration: some values have been moved out, @@ -376,7 +370,7 @@ in memory. -## thread::scoped +## thread::scoped::JoinGuard The thread::scoped API intends to allow threads to be spawned that reference data on the stack without any synchronization over that data. 
Usage looked like: diff --git a/unwinding.md b/unwinding.md new file mode 100644 index 0000000..cd9ca58 --- /dev/null +++ b/unwinding.md @@ -0,0 +1,82 @@ +% Unwinding + +Rust has a *tiered* error-handling scheme: + +* If something might reasonably be absent, Option is used +* If something goes wrong and can reasonably be handled, Result is used +* If something goes wrong and cannot reasonably be handled, the thread panics +* If something catastrophic happens, the program aborts + +Option and Result are overwhelmingly preferred in most situations, especially +since they can be promoted into a panic or abort at the API user's discretion. +However, anything and everything *can* panic, and you need to be ready for this. +Panics cause the thread to halt normal execution and unwind its stack, calling +destructors as if every function instantly returned. + +As of 1.0, Rust is of two minds when it comes to panics. In the long-long-ago, +Rust was much more like Erlang. Like Erlang, Rust had lightweight tasks, +and tasks were intended to kill themselves with a panic when they reached an +untenable state. Unlike an exception in Java or C++, a panic could not be +caught at any time. Panics could only be caught by the owner of the task, at which +point they had to be handled or *that* task would itself panic. + +Unwinding was important to this story because if a task's +destructors weren't called, it would cause memory and other system resources to +leak. Since tasks were expected to die during normal execution, this would make +Rust very poor for long-running systems! + +As the Rust we know today came to be, this style of programming grew out of +fashion in the push for less-and-less abstraction. Light-weight tasks were +killed in the name of heavy-weight OS threads. Still, panics could only be +caught by the parent thread. This meant catching a panic required spinning up +an entire OS thread! 
Although Rust maintains the philosophy that panics should +not be used for "basic" error-handling the way exceptions are in C++ or Java, it is still desirable +to not have the entire program crash in the face of a panic. + +In the near future there will be a stable interface for catching panics in an +arbitrary location, though we would encourage you to still only do this +sparingly. In particular, Rust's current unwinding implementation is heavily +optimized for the "doesn't unwind" case. If a program doesn't unwind, there +should be no runtime cost for the program being *ready* to unwind. As a +consequence, *actually* unwinding will be more expensive than in e.g. Java. +Don't build your programs to unwind under normal circumstances. Ideally, you +should only panic for programming errors. + + + + +# Exception Safety + +Being ready for unwinding is often referred to as "exception safety" +in the broader programming world. In Rust, there are two levels of exception +safety that one may concern themselves with: + +* In unsafe code, we *must* be exception safe to the point of not violating + memory safety. + +* In safe code, it is *good* to be exception safe to the point of your program + doing the right thing. + +As is the case in many places in Rust, unsafe code must be ready to deal with +bad safe code, and that includes code that panics. Code that transiently creates +unsound states must be careful that a panic does not cause that state to be +used. Generally this means ensuring that only non-panicking code is run while +these states exist, or making a guard that cleans up the state in the case of +a panic. This does not necessarily mean that the state a panic witnesses is a +fully *coherent* state. We need only guarantee that it's a *safe* state. 
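The "guard" option can be sketched as follows. This is a minimal illustration and not code from the patch: the names `SetLenOnDrop` and `extend_with_guard` are hypothetical. It takes a slightly different route from raising the length up front. The Vec's length is left alone while elements are written into reserved capacity, and a guard publishes only the number of elements actually written when it is dropped, whether that happens on normal exit or during unwinding.

```rust
use std::ptr;

/// Hypothetical guard: remembers how many elements of `vec` are initialized
/// and commits that count as the length when dropped.
struct SetLenOnDrop<'a, T> {
    vec: &'a mut Vec<T>,
    initialized: usize,
}

impl<'a, T> Drop for SetLenOnDrop<'a, T> {
    fn drop(&mut self) {
        // Runs on normal exit *and* while unwinding from a panic.
        // Exactly `initialized` elements have been written, so this length
        // never exposes uninitialized memory.
        unsafe { self.vec.set_len(self.initialized) }
    }
}

/// Hypothetical free-function stand-in for an `extend` implementation.
fn extend_with_guard<T, I: IntoIterator<Item = T>>(vec: &mut Vec<T>, iterable: I) {
    let mut iter = iterable.into_iter();
    let lower = iter.size_hint().0;
    vec.reserve(lower);

    {
        let start = vec.len();
        let mut guard = SetLenOnDrop { vec: &mut *vec, initialized: start };
        // `take(lower)` caps the unchecked writes at the reserved capacity,
        // even if the iterator's size hint was wrong.
        for item in iter.by_ref().take(lower) {
            unsafe {
                let dst = guard.vec.as_mut_ptr().add(guard.initialized);
                ptr::write(dst, item);
            }
            guard.initialized += 1;
        }
        // `guard` is dropped here and commits the new length.
    }

    // Anything beyond the size hint goes through the ordinary safe path.
    for item in iter {
        vec.push(item);
    }
}
```

If `iter.next()` panics partway through, the Vec ends up holding only the elements that were fully written. That state may not be what the caller wanted, but it is a safe one.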
+
+For instance, consider extending a Vec:
+
+```rust
+
+impl<T> Extend<T> for Vec<T> {
+    fn extend<I: IntoIterator<Item=T>>(&mut self, iterable: I) {
+        let mut iter = iterable.into_iter();
+        let size_hint = iter.size_hint().0;
+        self.reserve(size_hint);
+        unsafe { self.set_len(self.len() + size_hint); }
+
+        for
+    }
+}
+