% Leaking

Ownership-based resource management is intended to simplify composition. You
acquire resources when you create the object, and you release the resources when
it gets destroyed. Since destruction is handled for you, it means you can't
forget to release the resources, and it happens as soon as possible! Surely this
is perfect and all of our problems are solved.

Everything is terrible and we have new and exotic problems to try to solve.

Many people like to believe that Rust eliminates resource leaks. In practice,
this is basically true. You would be surprised to see a Safe Rust program
leak resources in an uncontrolled way.

However, from a theoretical perspective this is absolutely not the case, no
matter how you look at it. In the strictest sense, "leaking" is so abstract as
to be unpreventable. It's quite trivial to initialize a collection at the start
of a program, fill it with tons of objects with destructors, and then enter an
infinite event loop that never refers to it. The collection will sit around
uselessly, holding on to its precious resources until the program terminates (at
which point all those resources would have been reclaimed by the OS anyway).

We may consider a more restricted form of leak: failing to drop a value that is
unreachable. Rust also doesn't prevent this. In fact Rust *has a function for
doing this*: `mem::forget`. This function consumes the value it is passed *and
then doesn't run its destructor*.
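
As a minimal sketch of that behavior, the following compares an ordinary drop
with `mem::forget`. The `Noisy` type and the `count_drops` helper are
hypothetical names made up for this illustration; only `mem::forget` and `drop`
come from the standard library.

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical type for illustration: bumps a shared counter when dropped.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns (count after an ordinary drop, count after a forget).
fn count_drops() -> (u32, u32) {
    let drops = Cell::new(0);

    // Ordinary drop: the destructor runs and the counter goes up.
    drop(Noisy { drops: &drops });
    let after_drop = drops.get();

    // mem::forget consumes the value, but the destructor never runs,
    // so the counter stays where it was.
    mem::forget(Noisy { drops: &drops });
    let after_forget = drops.get();

    (after_drop, after_forget)
}

fn main() {
    let (after_drop, after_forget) = count_drops();
    println!("after drop: {}, after forget: {}", after_drop, after_forget);
}
```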

In the past `mem::forget` was marked as unsafe as a sort of lint against using
it, since failing to call a destructor is generally not a well-behaved thing to
do (though useful for some special unsafe code). However this was generally
determined to be an untenable stance to take: there are many ways to fail to
call a destructor in safe code. The most famous example is creating a cycle of
reference-counted pointers using interior mutability.
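
Such a cycle can be sketched in entirely safe code. The `Node` type, its
`dropped` flag, and the `make_cycle` helper are assumptions invented for this
example; the point is only that the destructor is never run even though nothing
can reach the node anymore.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical node type: it can point at another node, and it records
// whether its destructor ever ran.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    dropped: Rc<Cell<bool>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Builds a one-node cycle, lets it go out of scope, and reports whether
// the node's destructor ran.
fn make_cycle() -> bool {
    let dropped = Rc::new(Cell::new(false));
    {
        let node = Rc::new(Node {
            next: RefCell::new(None),
            dropped: dropped.clone(),
        });
        // Point the node at itself; its strong count is now 2.
        *node.next.borrow_mut() = Some(node.clone());
    } // `node` is dropped here: the count falls to 1, never to 0.
    dropped.get()
}

fn main() {
    println!("destructor ran: {}", make_cycle());
}
```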

It is reasonable for safe code to assume that destructor leaks do not happen, as
any program that leaks destructors is probably wrong. However *unsafe* code
cannot rely on destructors to be run in order to be safe. For most types this
doesn't matter: if you leak the destructor then the type is by definition
inaccessible, so it doesn't matter, right? For instance, if you leak a `Box<u8>`
then you waste some memory but that's hardly going to violate memory-safety.

However, the place where we must be careful with destructor leaks is *proxy* types. These
are types which manage access to a distinct object, but don't actually own it.
Proxy objects are quite rare. Proxy objects you'll need to care about are even
rarer. However we'll focus on three interesting examples in the standard
library:

* `vec::Drain`
* `Rc`
* `thread::scoped::JoinGuard`


## Drain

`drain` is a collections API that moves data out of the container without
consuming the container. This enables us to reuse the allocation of a `Vec`
after claiming ownership over all of its contents. It produces an iterator
(Drain) that returns the contents of the Vec by-value.

Now, consider Drain in the middle of iteration: some values have been moved out,
and others haven't. This means that part of the Vec is now full of logically
uninitialized data! We could backshift all the elements in the Vec every time we
remove a value, but this would have pretty catastrophic performance
consequences.

Instead, we would like Drain to fix the Vec's backing storage when it is
dropped. It should run itself to completion, backshift any elements that weren't
removed (drain supports subranges), and then fix Vec's `len`. It's even
unwinding-safe! Easy!

Now consider the following:

```rust,ignore
let mut vec = vec![Box::new(0); 4];

{
    // start draining, vec can no longer be accessed
    let mut drainer = vec.drain(..);

    // pull out two elements and immediately drop them
    drainer.next();
    drainer.next();

    // get rid of drainer, but don't call its destructor
    mem::forget(drainer);
}

// Oops, vec[0] was dropped, we're reading a pointer into free'd memory!
println!("{}", vec[0]);
```

This is pretty clearly Not Good. Unfortunately, we're kind of stuck between a
rock and a hard place: maintaining consistent state at every step has an
enormous cost (and would negate any benefits of the API). Failing to maintain
consistent state gives us Undefined Behavior in safe code (making the API
unsound).

So what can we do? Well, we can pick a trivially consistent state: set the Vec's
len to be 0 when we start the iteration, and fix it up if necessary in the
destructor. That way, if everything executes like normal we get the desired
behavior with minimal overhead. But if someone has the *audacity* to
mem::forget us in the middle of the iteration, all that does is *leak even more*
(and possibly leave the Vec in an unexpected but otherwise consistent state).
Since we've accepted that mem::forget is safe, this is definitely safe. We call
leaks causing more leaks a *leak amplification*.
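
To see leak amplification from the outside, here is a small sketch that forgets
a half-used `Drain` and then keeps using the Vec. The `forget_drain` helper is a
made-up name for this example. How many elements survive the forget is an
implementation detail that shouldn't be relied on, so the sketch only assumes
the Vec is still valid to use.

```rust
use std::mem;

// Forgets a half-used Drain, pushes one more element, and reports the
// length the Vec ends up with.
fn forget_drain() -> usize {
    let mut vec = vec![Box::new(0); 4];
    {
        let mut drainer = vec.drain(..);

        // pull out two elements and immediately drop them
        drainer.next();
        drainer.next();

        // skip the destructor; the elements not yet yielded may be leaked
        mem::forget(drainer);
    }
    // No dangling pointers here: the Vec may have lost elements, but
    // whatever it still claims to hold is initialized.
    vec.push(Box::new(1));
    vec.len()
}

fn main() {
    println!("len after forgetting the Drain: {}", forget_drain());
}
```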


## Rc

Rc is an interesting case because at first glance it doesn't appear to be a
proxy value at all. After all, it manages the data it points to, and dropping
all the Rcs for a value will drop that value. Leaking an Rc doesn't seem like it
would be particularly dangerous. It will leave the refcount permanently
incremented and prevent the data from being freed or dropped, but that seems
just like Box, right?

Nope.

Let's consider a simplified implementation of Rc:

```rust,ignore
struct Rc<T> {
    ptr: *mut RcBox<T>,
}

struct RcBox<T> {
    data: T,
    ref_count: usize,
}

impl<T> Rc<T> {
    fn new(data: T) -> Self {
        unsafe {
            // Wouldn't it be nice if heap::allocate worked like this?
            let ptr = heap::allocate::<RcBox<T>>();
            ptr::write(ptr, RcBox {
                data: data,
                ref_count: 1,
            });
            Rc { ptr: ptr }
        }
    }

    fn clone(&self) -> Self {
        unsafe {
            (*self.ptr).ref_count += 1;
        }
        Rc { ptr: self.ptr }
    }
}

impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        unsafe {
            (*self.ptr).ref_count -= 1;
            if (*self.ptr).ref_count == 0 {
                // drop the data and then free it
                ptr::read(self.ptr);
                heap::deallocate(self.ptr);
            }
        }
    }
}
```

This code contains an implicit and subtle assumption: `ref_count` can fit in a
`usize`, because there can't be more than `usize::MAX` Rcs in memory. However
this itself assumes that the `ref_count` accurately reflects the number of Rcs
in memory, which we know is false with `mem::forget`. Using `mem::forget` we can
overflow the `ref_count`, and then get it down to 0 with outstanding Rcs. Then
we can happily use-after-free the inner data. Bad Bad Not Good.

This can be solved by just checking the `ref_count` and doing *something*. The
standard library's stance is to just abort, because your program has become
horribly degenerate. Also *oh my gosh* it's such a ridiculous corner case.
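
A guarded increment might look roughly like the following. This is a sketch
under assumptions, not the standard library's actual code: the `RefCount` type
and its methods are invented here to isolate the counting logic from the
allocation details in the simplified Rc above.

```rust
use std::cell::Cell;
use std::process;

// Hypothetical stand-in for the `ref_count` field of the simplified Rc.
struct RefCount {
    count: Cell<usize>,
}

impl RefCount {
    fn new() -> Self {
        RefCount { count: Cell::new(1) }
    }

    // What `clone` would call instead of a bare `ref_count += 1`.
    fn increment(&self) {
        let old = self.count.get();
        if old == usize::MAX {
            // The count can only get this high if Rcs were leaked.
            // Wrapping around would eventually free data that is still
            // in use, so give up on the whole process instead.
            process::abort();
        }
        self.count.set(old + 1);
    }

    fn get(&self) -> usize {
        self.count.get()
    }
}

fn main() {
    let rc = RefCount::new();
    rc.increment();
    rc.increment();
    println!("ref_count = {}", rc.get());
}
```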


## thread::scoped::JoinGuard

The thread::scoped API intends to allow threads to be spawned that reference
data on their parent's stack without any synchronization over that data by
ensuring the parent joins the thread before any of the shared data goes out
of scope.

```rust,ignore
pub fn scoped<'a, F>(f: F) -> JoinGuard<'a>
    where F: FnOnce() + Send + 'a
```

Here `f` is some closure for the other thread to execute. Saying that
`F: Send + 'a` is saying that it closes over data that lives for `'a`, and it
either owns that data or the data was Sync (implying `&data` is Send).

Because JoinGuard has a lifetime, it keeps all the data it closes over
borrowed in the parent thread. This means the JoinGuard can't outlive
the data that the other thread is working on. When the JoinGuard *does* get
dropped it blocks the parent thread, ensuring the child terminates before any
of the closed-over data goes out of scope in the parent.

Usage looked like:

```rust,ignore
let mut data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
{
    let mut guards = vec![];
    for x in &mut data {
        // Move the mutable reference into the closure, and execute
        // it on a different thread. The closure has a lifetime bound
        // by the lifetime of the mutable reference `x` we store in it.
        // The guard that is returned is in turn assigned the lifetime
        // of the closure, so it also mutably borrows `data` as `x` did.
        // This means we cannot access `data` until the guard goes away.
        let guard = thread::scoped(move || {
            *x *= 2;
        });
        // store the thread's guard for later
        guards.push(guard);
    }
    // All guards are dropped here, forcing the threads to join
    // (this thread blocks here until the others terminate).
    // Once the threads join, the borrow expires and the data becomes
    // accessible again in this thread.
}
// data is definitely mutated here.
```

In principle, this totally works! Rust's ownership system perfectly ensures it!
...except it relies on a destructor being called to be safe.

```rust,ignore
let mut data = Box::new(0);
{
    let guard = thread::scoped(|| {
        // This is at best a data race. At worst, it's also a use-after-free.
        *data += 1;
    });
    // Because the guard is forgotten, the loan expires without blocking
    // this thread.
    mem::forget(guard);
}
// So the Box is dropped here while the scoped thread may or may not be trying
// to access it.
```

Dang. Here the destructor running was pretty fundamental to the API, and it had
to be scrapped in favor of a completely different design.
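
For comparison, here is a sketch of the closure-based style that today's
standard library offers as `std::thread::scope`. Instead of handing back a
guard whose destructor must run, `scope` joins every spawned thread before it
returns, so there is nothing for the caller to forget. The `double_all` helper
is a hypothetical name for this example.

```rust
use std::thread;

// Doubles every element, using one scoped thread per element.
fn double_all(data: &mut [i32]) {
    thread::scope(|s| {
        for x in data.iter_mut() {
            // Each thread gets its own mutable reference into `data`.
            s.spawn(move || {
                *x *= 2;
            });
        }
        // `scope` joins all spawned threads before it returns, whether
        // or not the handles returned by `spawn` were kept around.
    });
}

fn main() {
    let mut data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    double_all(&mut data);
    println!("{:?}", data);
}
```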