diff --git a/vec-alloc.md b/vec-alloc.md index 14b9b46..6f98220 100644 --- a/vec-alloc.md +++ b/vec-alloc.md @@ -1,5 +1,22 @@ % Allocating Memory +Using Unique throws a wrench in an important feature of Vec (and indeed all of +the std collections): an empty Vec doesn't actually allocate at all. So if we +can't allocate, but also can't put a null pointer in `ptr`, what do we do in +`Vec::new`? Well, we just put some other garbage in there! + +This is perfectly fine because we already have `cap == 0` as our sentinel for no +allocation. We don't even need to handle it specially in almost any code because +we usually need to check if `cap > len` or `len > 0` anyway. The traditional +Rust value to put here is `0x01`. The standard library actually exposes this +as `std::rt::heap::EMPTY`. There are quite a few places where we'll +want to use `heap::EMPTY` because there's no real allocation to talk about but +`null` would make the compiler do bad things. + +All of the `heap` API is totally unstable under the `heap_api` feature, though. +We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of +the `heap` API anyway, so let's just get that dependency over with. + So: ```rust,ignore @@ -24,15 +41,29 @@ I slipped in that assert there because zero-sized types will require some special handling throughout our code, and I want to defer the issue for now. Without this assert, some of our early drafts will do some Very Bad Things. -Next we need to figure out what to actually do when we *do* want space. For that, -we'll need to use the rest of the heap APIs. These basically allow us to -talk directly to Rust's instance of jemalloc. - -We'll also need a way to handle out-of-memory conditions. The standard library -calls the `abort` intrinsic, but calling intrinsics from normal Rust code is a -pretty bad idea. Unfortunately, the `abort` exposed by the standard library -allocates. Not something we want to do during `oom`! Instead, we'll call -`std::process::exit`. +Next we need to figure out what to actually do when we *do* want space. For +that, we'll need to use the rest of the heap APIs. These basically allow us to +talk directly to Rust's allocator (jemalloc by default). + +We'll also need a way to handle out-of-memory (OOM) conditions. The standard +library calls the `abort` intrinsic, which just calls an illegal instruction to +crash the whole program. The reason we abort and don't panic is because +unwinding can cause allocations to happen, and that seems like a bad thing to do +when your allocator just came back with "hey I don't have any more memory". + +Of course, this is a bit silly since most platforms don't actually run out of +memory in a conventional way. Your operating system will probably kill the +application by another means if you legitimately start using up all the memory. +The most likely way we'll trigger OOM is by just asking for ludicrous quantities +of memory at once (e.g. half the theoretical address space). As such it's +*probably* fine to panic and nothing bad will happen. Still, we're trying to be +like the standard library as much as possible, so we'll just kill the whole +program. + +We said we don't want to use intrinsics, so doing *exactly* what `std` does is +out. `std::rt::util::abort` actually exists, but it takes a message to print, +which will probably allocate. Also it's still unstable. Instead, we'll call +`std::process::exit` with some random number. 
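Before `oom`, it's worth making the sentinel concrete. Here's a rough sketch of the
non-allocating constructor the text above describes; the feature gates, the `heap`
import path, and the exact `Unique::new` signature are assumptions here rather than
quotes from the elided code block after "So:".

```rust,ignore
// Sketch only (requires nightly features, e.g. the `heap_api` gate
// mentioned above). Shows how the cap == 0 sentinel lets `new` avoid
// allocating entirely.
use std::rt::heap::EMPTY;
use std::ptr::Unique;
use std::mem;

pub struct Vec<T> {
    ptr: Unique<T>,
    cap: usize,
    len: usize,
}

impl<T> Vec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        unsafe {
            // cap == 0 is the "no allocation" sentinel, so ptr can hold the
            // garbage (but non-null) EMPTY value instead of a real allocation.
            Vec { ptr: Unique::new(EMPTY as *mut _), cap: 0, len: 0 }
        }
    }
}
```

The `oom` handler itself is then just a thin wrapper around `std::process::exit`: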
```rust fn oom() { @@ -51,29 +82,104 @@ else: cap *= 2 ``` -But Rust's only supported allocator API is so low level that we'll need to -do a fair bit of extra work, though. We also need to guard against some special -conditions that can occur with really large allocations. In particular, we index -into arrays using unsigned integers, but `ptr::offset` takes signed integers. This -means Bad Things will happen if we ever manage to grow to contain more than -`isize::MAX` elements. Thankfully, this isn't something we need to worry about -in most cases. +But Rust's only supported allocator API is so low level that we'll need to do a +fair bit of extra work. We also need to guard against some special +conditions that can occur with really large allocations or empty allocations. + +In particular, `ptr::offset` will cause us *a lot* of trouble, because it has +the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to +not have dealt with this instruction, here's the basic story with GEP: alias +analysis, alias analysis, alias analysis. It's super important to an optimizing +compiler to be able to reason about data dependencies and aliasing. -On 64-bit targets we're artifically limited to only 48-bits, so we'll run out -of memory far before we reach that point. However on 32-bit targets, particularly -those with extensions to use more of the address space, it's theoretically possible -to successfully allocate more than `isize::MAX` bytes of memory. Still, we only -really need to worry about that if we're allocating elements that are a byte large. -Anything else will use up too much space. +As a simple example, consider the following fragment of code: + +```rust +# let x = &mut 0; +# let y = &mut 0; +*x *= 7; +*y *= 3; +``` -However since this is a tutorial, we're not going to be particularly optimal here, -and just unconditionally check, rather than use clever platform-specific `cfg`s. +If the compiler can prove that `x` and `y` point to different locations in +memory, the two operations can in theory be executed in parallel (by e.g. +loading them into different registers and working on them independently). +However in *general* the compiler can't do this because if x and y point to +the same location in memory, the operations need to be done to the same value, +and they can't just be merged afterwards. + +When you use GEP inbounds, you are specifically telling LLVM that the offsets +you're about to do are within the bounds of a single allocated entity. The +ultimate payoff being that LLVM can assume that if two pointers are known to +point to two disjoint objects, all the offsets of those pointers are *also* +known to not alias (because you won't just end up in some random place in +memory). LLVM is heavily optimized to work with GEP offsets, and inbounds +offsets are the best of all, so it's important that we use them as much as +possible. + +So that's what GEP's about, how can it cause us trouble? + +The first problem is that we index into arrays with unsigned integers, but +GEP (and as a consequence `ptr::offset`) takes a *signed integer*. This means +that half of the seemingly valid indices into an array will overflow GEP and +actually go in the wrong direction! As such we must limit all allocations to +`isize::MAX` elements. This actually means we only need to worry about +byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of +the system's memory. 
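As a quick aside (an illustration, not from the original text), the "wrong direction"
problem is just two's complement wrap-around: cast an index above `isize::MAX` to the
signed type that `ptr::offset` wants and it becomes a large *negative* offset.

```rust
// Illustration only: a "seemingly valid" index that overflows the signed
// offset type. On a 64-bit target this is 2^63, which as an isize is
// isize::MIN, i.e. a huge negative offset.
let idx: usize = (std::isize::MAX as usize) + 1;
assert_eq!(idx as isize, std::isize::MIN);
```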
However in order to avoid subtle corner cases where someone
+reinterprets some array of `< isize::MAX` objects as bytes, std limits all
+allocations to `isize::MAX` bytes.
+
+On all 64-bit targets that Rust currently supports we're artificially limited
+to significantly less than all 64 bits of the address space (modern x64
+platforms only expose 48-bit addressing), so we can rely on just running out of
+memory first. However on 32-bit targets, particularly those with extensions to
+use more of the address space (PAE x86 or x32), it's theoretically possible to
+successfully allocate more than `isize::MAX` bytes of memory.
+
+However since this is a tutorial, we're not going to be particularly optimal
+here, and just unconditionally check, rather than use clever platform-specific
+`cfg`s.
+
+The other corner-case we need to worry about is *empty* allocations. There will
+be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
+and `cap > 0` for zero-sized types.
+
+These cases are tricky because they come
+down to what LLVM means by "allocated". LLVM's notion of an
+allocation is significantly more abstract than how we usually use it. Because
+LLVM needs to work with different languages' semantics and custom allocators,
+it can't really intimately understand allocation. Instead, the main idea behind
+allocation is "doesn't overlap with other stuff". That is, heap allocations,
+stack allocations, and globals don't randomly overlap. Yep, it's about alias
+analysis. As such, Rust can technically play a bit fast and loose with the notion of
+an allocation as long as it's *consistent*.
+
+Getting back to the empty allocation case, there are a couple of places where
+we want to offset by 0 as a consequence of generic code. The question is then:
+is it consistent to do so? For zero-sized types, we have concluded that it is
+indeed consistent to do a GEP inbounds offset by an arbitrary number of
+elements. This is a runtime no-op because every element takes up no space,
+and it's fine to pretend that there's infinite zero-sized types allocated
+at `0x01`. No allocator will ever allocate that address, because they won't
+allocate `0x00` and they generally allocate to some minimal alignment higher
+than a byte.
+
+However what about for positive-sized types? That one's a bit trickier. In
+principle, you can argue that offsetting by 0 gives LLVM no information: either
+there's an element before the address, or after it, but it can't know which.
+However we've chosen to conservatively assume that it may do bad things. As
+such we *will* guard against this case explicitly.
+
+*Phew*
+
+Ok with all the nonsense out of the way, let's actually allocate some memory:

```rust,ignore
fn grow(&mut self) {
    // this is all pretty delicate, so let's say it's all unsafe
    unsafe {
-        let align = mem::min_align_of::<T>();
+        // current API requires us to specify size and alignment manually.
+        let align = mem::align_of::<T>();
        let elem_size = mem::size_of::<T>();

        let (new_cap, ptr) = if self.cap == 0 {
diff --git a/vec-layout.md b/vec-layout.md
index 4e44084..bce9a2f 100644
--- a/vec-layout.md
+++ b/vec-layout.md
@@ -13,15 +13,64 @@ pub struct Vec<T> {
# fn main() {}
```

-And indeed this would compile. Unfortunately, it would be incorrect. The
-compiler will give us too strict variance, so e.g. an `&Vec<&'static str>`
+And indeed this would compile. Unfortunately, it would be incorrect. First, the
+compiler will give us too strict variance. So a `&Vec<&'static str>`
couldn't be used where an `&Vec<&'a str>` was expected. More importantly, it
-will give incorrect ownership information to dropck, as it will conservatively
-assume we don't own any values of type `T`. See [the chapter on ownership and
-lifetimes] (lifetimes.html) for details.
+will give incorrect ownership information to the drop checker, as it will
+conservatively assume we don't own any values of type `T`. See [the chapter
+on ownership and lifetimes][ownership] for all the details on variance and
+drop check.
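To make the variance half of that concrete, here's a small illustration (not from the
original text): it compiles with the real, covariant `std::vec::Vec`, but would be
rejected if `Vec<T>` were invariant over `T`, which is exactly what a plain `*mut T`
field would give us.

```rust
// A &Vec of longer-lived strings should be usable as a &Vec of
// shorter-lived strings. That only works if Vec<T> is variant over T.
fn take<'a>(_v: &Vec<&'a str>, _anchor: &'a str) {}

fn demo<'a>(v: &Vec<&'static str>, anchor: &'a str) {
    take(v, anchor); // fine for std's Vec; a compile error if Vec were invariant
}
# fn main() {}
```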
-As we saw in the lifetimes chapter, we should use `Unique` in place of
-`*mut T` when we have a raw pointer to an allocation we own:
+As we saw in the ownership chapter, we should use `Unique` in place of
+`*mut T` when we have a raw pointer to an allocation we own. Unique is unstable,
+so we'd like to not use it if possible, though.
+
+As a recap, Unique is a wrapper around a raw pointer that declares that:
+
+* We are variant over `T`
+* We may own a value of type `T` (for drop check)
+* We are Send/Sync if `T` is Send/Sync
+* We deref to `*mut T` (so it largely acts like a `*mut` in our code)
+* Our pointer is never null (so `Option<Unique<T>>` is null-pointer-optimized)
+
+We can implement all of the above requirements except for the last
+one in stable Rust:
+
+```rust
+use std::marker::PhantomData;
+use std::ops::Deref;
+use std::mem;
+
+struct Unique<T> {
+    ptr: *const T,           // *const for variance
+    _marker: PhantomData<T>, // For the drop checker
+}
+
+// Deriving Send and Sync is safe because we are the Unique owners
+// of this data. It's like Unique<T> is "just" T.
+unsafe impl<T: Send> Send for Unique<T> {}
+unsafe impl<T: Sync> Sync for Unique<T> {}
+
+impl<T> Unique<T> {
+    pub fn new(ptr: *mut T) -> Self {
+        Unique { ptr: ptr, _marker: PhantomData }
+    }
+}
+
+impl<T> Deref for Unique<T> {
+    type Target = *mut T;
+    fn deref(&self) -> &*mut T {
+        // There's no way to cast the *const to a *mut
+        // while also taking a reference. So we just
+        // transmute it since it's all "just pointers".
+        unsafe { mem::transmute(&self.ptr) }
+    }
+}
+```
+
+Unfortunately the mechanism for stating that your value is non-zero is
+unstable and unlikely to be stabilized soon. As such we're just going to
+take the hit and use std's Unique:

```rust
@@ -38,29 +87,11 @@ pub struct Vec<T> {
# fn main() {}
```

-As a recap, Unique is a wrapper around a raw pointer that declares that:
-
-* We may own a value of type `T`
-* We are Send/Sync iff `T` is Send/Sync
-* Our pointer is never null (and therefore `Option<Vec<T>>` is
-  null-pointer-optimized)
-
-That last point is subtle. First, it makes `Unique::new` unsafe to call, because
-putting `null` inside of it is Undefined Behaviour. It also throws a
-wrench in an important feature of Vec (and indeed all of the std collections):
-an empty Vec doesn't actually allocate at all. So if we can't allocate,
-but also can't put a null pointer in `ptr`, what do we do in
-`Vec::new`? Well, we just put some other garbage in there!
-
-This is perfectly fine because we already have `cap == 0` as our sentinel for no
-allocation. We don't even need to handle it specially in almost any code because
-we usually need to check if `cap > len` or `len > 0` anyway. The traditional
-Rust value to put here is `0x01`. The standard library actually exposes this
-as `std::rt::heap::EMPTY`. There are quite a few places where we'll want to use
-`heap::EMPTY` because there's no real allocation to talk about but `null` would
-make the compiler angry.
-
-All of the `heap` API is totally unstable under the `heap_api` feature, though.
-We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of
-the `heap` API anyway, so let's just get that dependency over with.
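As another small aside (again not from the original text), the null-pointer optimization
that the recap bullet promises is easy to observe with a size check; shared references
and `Box` carry the same "never null" guarantee that `Unique` declares.

```rust
use std::mem::size_of;

fn main() {
    // Because the pointer can never be null, `None` can be represented by
    // the all-zero bit pattern and the `Option` wrapper costs no space.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());
}
```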
+If you don't care about the null-pointer optimization, then you can use the
+stable code. However we will be designing the rest of the code around enabling
+the optimization. In particular, `Unique::new` is unsafe to call, because
+putting `null` inside of it is Undefined Behaviour. Our stable Unique doesn't
+need `new` to be unsafe because it doesn't make any interesting guarantees about
+its contents.
+
+[ownership]: ownership.html
diff --git a/vec.md b/vec.md
index a613f25..39d9686 100644
--- a/vec.md
+++ b/vec.md
@@ -2,5 +2,19 @@
To bring everything together, we're going to write `std::Vec` from scratch.
Because all the best tools for writing unsafe code are unstable, this
-project will only work on nightly (as of Rust 1.2.0).
+project will only work on nightly (as of Rust 1.2.0). With the exception of the
+allocator API, much of the unstable code we'll use is expected to be stabilized
+in a similar form as it is today.
+
+However we will generally try to avoid unstable code where possible. In
+particular we won't use any intrinsics that could make the code a little
+bit nicer or more efficient because intrinsics are permanently unstable. Although
+many intrinsics *do* become stabilized elsewhere (`std::ptr` and `std::mem`
+consist of many intrinsics).
+
+Ultimately this means our implementation may not take advantage of all
+possible optimizations, though it will be by no means *naive*. We will
+definitely get into the weeds over nitty-gritty details, even
+when the problem doesn't *really* merit it.
+
+You wanted advanced. We're gonna go advanced.