diff --git a/vec.md b/vec.md index 116626f..d38265b 100644 --- a/vec.md +++ b/vec.md @@ -1,17 +1,21 @@ % Example: Implementing Vec +TODO: audit for non-ZST offsets from heap::empty + To bring everything together, we're going to write `std::Vec` from scratch. Because the all the best tools for writing unsafe code are unstable, this project will only work on nightly (as of Rust 1.2.0). +# Layout + First off, we need to come up with the struct layout. Naively we want this design: ``` struct Vec { - ptr: *mut T, - cap: usize, - len: usize, + ptr: *mut T, + cap: usize, + len: usize, } ``` @@ -29,7 +33,7 @@ when we have a raw pointer to an allocation we own: ``` #![feature(unique)] -use std::ptr::Unique; +use std::ptr::{Unique, self}; pub struct Vec { ptr: Unique, @@ -63,6 +67,8 @@ All of the `heap` API is totally unstable under the `alloc` feature, though. We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of the `heap` API anyway, so let's just get that dependency over with. +# Allocating Memory + So: ```rust @@ -72,14 +78,14 @@ use std::rt::heap::EMPTY; use std::mem; impl Vec { - fn new() -> Self { - assert!(mem::size_of::() != 0, "We're not ready to handle ZSTs"); - unsafe { - // need to cast EMPTY to the actual ptr type we want, let - // inference handle it. - Vec { ptr: Unique::new(heap::EMPTY as *mut _), len: 0, cap: 0 } - } - } + fn new() -> Self { + assert!(mem::size_of::() != 0, "We're not ready to handle ZSTs"); + unsafe { + // need to cast EMPTY to the actual ptr type we want, let + // inference handle it. + Vec { ptr: Unique::new(heap::EMPTY as *mut _), len: 0, cap: 0 } + } + } } ``` @@ -103,10 +109,38 @@ fn oom() { } ``` -Okay, now we can write growing: +Okay, now we can write growing. Roughly, we want to have this logic: + +```text +if cap == 0: + allocate() + cap = 1 +else + reallocate + cap *= 2 +``` + +But Rust's only supported allocator API is so low level that we'll need to +do a fair bit of extra work, though. We also need to guard against some special +conditions that can occur with really large allocations. In particular, we index +into arrays using unsigned integers, but `ptr::offset` takes signed integers. This +means Bad Things will happen if we ever manage to grow to contain more than +`isize::MAX` elements. Thankfully, this isn't something we need to worry about +in most cases. + +On 64-bit targets we're artifically limited to only 48-bits, so we'll run out +of memory far before we reach that point. However on 32-bit targets, particularly +those with extensions to use more of the address space, it's theoretically possible +to successfully allocate more than `isize::MAX` bytes of memory. Still, we only +really need to worry about that if we're allocating elements that are a byte large. +Anything else will use up too much space. + +However since this is a tutorial, we're not going to be particularly optimal here, +and just unconditionally check, rather than use clever platform-specific `cfg`s. ```rust fn grow(&mut self) { + // this is all pretty delicate, so let's say it's all unsafe unsafe { let align = mem::min_align_of::(); let elem_size = mem::size_of::(); @@ -115,16 +149,31 @@ fn grow(&mut self) { let ptr = heap::allocate(elem_size, align); (1, ptr) } else { - let new_cap = 2 * self.cap; + // as an invariant, we can assume that `self.cap < isize::MAX`, + // so this doesn't need to be checked. + let new_cap = self.cap * 2; + // Similarly this can't overflow due to previously allocating this + let old_num_bytes = self.cap * elem_size; + + // check that the new allocation doesn't exceed `isize::MAX` at all + // regardless of the actual size of the capacity. This combines the + // `new_cap <= isize::MAX` and `new_num_bytes <= usize::MAX` checks + // we need to make. We lose the ability to allocate e.g. 2/3rds of + // the address space with a single Vec of i16's on 32-bit though. + // Alas, poor Yorick -- I knew him, Horatio. + assert!(old_num_bytes <= (::std::isize::MAX as usize) / 2, + "capacity overflow"); + + let new_num_bytes = old_num_bytes * 2; let ptr = heap::reallocate(*self.ptr as *mut _, - self.cap * elem_size, - new_cap * elem_size, + old_num_bytes, + new_num_bytes, align); (new_cap, ptr) }; // If allocate or reallocate fail, we'll get `null` back - if ptr.is_null() { oom() } + if ptr.is_null() { oom(); } self.ptr = Unique::new(ptr as *mut _); self.cap = new_cap; @@ -132,10 +181,628 @@ fn grow(&mut self) { } ``` -There's nothing particularly tricky in here: if we're totally empty, we need -to do a fresh allocation. Otherwise, we need to reallocate the current pointer. -Although we have a subtle bug here with the multiply overflow. +Nothing particularly tricky here. Just computing sizes and alignments and doing +some careful multiplication checks. + +# Push and Pop + +Alright. We can initialize. We can allocate. Let's actually implement some +functionality! Let's start with `push`. All it needs to do is check if we're +full to grow, unconditionally write to the next index, and then increment our +length. + +To do the write we have to be careful not to evaluate the memory we want to write +to. At worst, it's truly uninitialized memory from the allocator. At best it's the +bits of some old value we popped off. Either way, we can't just index to the memory +and dereference it, because that will evaluate the memory as a valid instance of +T. Worse, `foo[idx] = x` will try to call `drop` on the old value of `foo[idx]`! + +The correct way to do this is with `ptr::write`, which just blindly overwrites the +target address with the bits of the value we provide. No evaluation involved. + +For `push`, if the old len (before push was called) is 0, then we want to write +to the 0th index. So we should offset by the old len. + +```rust +pub fn push(&mut self, elem: T) { + if self.len == self.cap { self.grow(); } + + unsafe { + ptr::write(self.ptr.offset(self.len as isize), elem); + } + + // Can't fail, we'll OOM first. + self.len += 1; +} +``` + +Easy! How about `pop`? Although this time the index we want to access is +initialized, Rust won't just let us dereference the location of memory to move +the value out, because that *would* leave the memory uninitialized! For this we +need `ptr::read`, which just copies out the bits from the target address and +intrprets it as a value of type T. This will leave the memory at this address +*logically* uninitialized, even though there is in fact a perfectly good instance +of T there. + +For `pop`, if the old len is 1, we want to read out of the 0th index. So we +should offset by the *new* len. + +```rust +pub fn pop(&mut self) -> Option { + if self.len == 0 { + None + } else { + self.len -= 1; + unsafe { + Some(ptr::read(self.ptr.offset(self.len as isize))) + } + } +} +``` + +# Deallocating + +Next we should implement Drop so that we don't massively leaks tons of resources. +The easiest way is to just call `pop` until it yields None, and then deallocate +our buffer. Note that calling `pop` is uneeded if `T: !Drop`. In theory we can +ask Rust if T needs_drop and omit the calls to `pop`. However in practice LLVM +is *really* good at removing simple side-effect free code like this, so I wouldn't +bother unless you notice it's not being stripped (in this case it is). + +We must not call `heap::deallocate` when `self.cap == 0`, as in this case we haven't +actually allocated any memory. + + +```rust +impl Drop for Vec { + fn drop(&mut self) { + if self.cap != 0 { + while let Some(_) = self.pop() { } + + let align = mem::min_align_of::(); + let elem_size = mem::size_of::(); + let num_bytes = elem_size * self.cap; + unsafe { + heap::deallocate(*self.ptr, num_bytes, align); + } + } + } +} +``` + +# Deref + +Alright! We've got a decent minimal ArrayStack implemented. We can push, we can +pop, and we can clean up after ourselves. However there's a whole mess of functionality +we'd reasonably want. In particular, we have a proper array, but none of the slice +functionality. That's actually pretty easy to solve: we can implement `Deref`. +This will magically make our Vec coerce to and behave like a slice in all sorts of +conditions. + +All we need is `slice::from_raw_parts`. + +```rust +use std::ops::Deref; + +impl Deref for Vec { + type Target = [T]; + fn deref(&self) -> &[T] { + unsafe { + ::std::slice::from_raw_parts(*self.ptr, self.len) + } + } +} +``` + +And let's do DerefMut too: + +```rust +use std::ops::DerefMut; + +impl DerefMut for Vec { + fn deref_mut(&mut self) -> &mut [T] { + unsafe { + ::std::slice::from_raw_parts_mut(*self.ptr, self.len) + } + } +} +``` + +Now we have `len`, `first`, `last`, indexing, slicing, sorting, `iter`, `iter_mut`, +and all other sorts of bells and whistles provided by slice. Sweet! + +# Insert and Remove + +Something *not* provided but slice is `insert` and `remove`, so let's do those next. + +Insert needs to shift all the elements at the target index to the right by one. +To do this we need to use `ptr::copy`, which is our version of C's `memmove`. +This copies some chunk of memory from one location to another, correctly handling +the case where the source and destination overlap (which will definitely happen +here). + +If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]` +using the *old* len. + +```rust +pub fn insert(&mut self, index: usize, elem: T) { + // Note: `<=` because it's valid to insert after everything + // which would be equivalent to push. + assert!(index <= self.len, "index out of bounds"); + if self.cap == self.len { self.grow(); } + + unsafe { + if index < self.len { + // ptr::copy(src, dest, len): "copy from source to dest len elems" + ptr::copy(self.ptr.offset(index as isize), + self.ptr.offset(index as isize + 1), + len - index); + } + ptr::write(self.ptr.offset(index as isize), elem); + self.len += 1; + } +} +``` + +Remove behaves in the opposite manner. We need to shift all the elements from +`[i+1 .. len + 1]` to `[i .. len]` using the *new* len. + +```rust +pub fn remove(&mut self, index: usize) -> T { + // Note: `<` because it's *not* valid to remove after everything + assert!(index < self.len, "index out of bounds"); + unsafe { + self.len -= 1; + let result = ptr::read(self.ptr.offset(index as isize)); + ptr::copy(self.ptr.offset(index as isize + 1), + self.ptr.offset(index as isize), + len - index); + result + } +} +``` + +# IntoIter + +Let's move on to writing iterators. `iter` and `iter_mut` have already been +written for us thanks to The Magic of Deref. However there's two interesting +iterators that Vec provides that slices can't: `into_iter` and `drain`. + +IntoIter consumes the Vec by-value, and can consequently yield its elements +by-value. In order to enable this, IntoIter needs to take control of Vec's +allocation. + +IntoIter needs to be DoubleEnded as well, to enable reading from both ends. +Reading from the back could just be implemented as calling `pop`, but reading +from the front is harder. We could call `remove(0)` but that would be insanely +expensive. Instead we're going to just use ptr::read to copy values out of either +end of the Vec without mutating the buffer at all. + +To do this we're going to use a very common C idiom for array iteration. We'll +make two pointers; one that points to the start of the array, and one that points +to one-element past the end. When we want an element from one end, we'll read out +the value pointed to at that end and move the pointer over by one. When the two +pointers are equal, we know we're done. + +Note that the order of read and offset are reversed for `next` and `next_back` +For `next_back` the pointer is always *after* the element it wants to read next, +while for `next` the pointer is always *at* the element it wants to read next. +To see why this is, consider the case where every element but one has been yielded. + +The array looks like this: + +```text + S E +[X, X, X, O, X, X, X] +``` + +If E pointed directly at the element it wanted to yield next, it would be +indistinguishable from the case where there are no more elements to yield. + +So we're going to use the following struct: + +```rust +struct IntoIter { + buf: Unique, + cap: usize, + start: *const T, + end: *const T, +} +``` + +And initialize it like this: + +```rust +impl Vec { + fn into_iter(self) -> IntoIter { + // Can't destructure Vec since it's Drop + let ptr = self.ptr; + let cap = self.cap; + let len = self.len; + + // Make sure not to drop Vec since that will free the buffer + mem::forget(self); + + unsafe { + IntoIter { + buf: ptr, + cap: cap, + start: *ptr, + end: ptr.offset(len as isize), + } + } + } +} +``` + +Here's iterating forward: + +```rust +impl Iterator for IntoIter { + type Item = T; + fn next(&mut self) -> Option { + if self.start == self.end { + None + } else { + unsafe { + let result = ptr::read(self.start); + self.start = self.start.offset(1); + Some(result) + } + } + } + + fn size_hint(&self) -> (usize, Option) { + let len = self.end as usize - self.start as usize; + (len, Some(len)) + } +} +``` + +And here's iterating backwards. + +```rust +impl DoubleEndedIterator for IntoIter { + fn next_back(&mut self) -> Option { + if self.start == self.end { + None + } else { + unsafe { + self.end = self.end.offset(-1); + Some(ptr::read(self.end)) + } + } + } +} + +Because IntoIter takes ownership of its allocation, it needs to implement Drop +to free it. However it *also* wants to implement Drop to drop any elements it +contains that weren't yielded. + + +```rust +impl Drop for IntoIter { + fn drop(&mut self) { + if self.cap != 0 { + // drop any remaining elements + for _ in &mut *self {} + + let align = mem::min_align_of::(); + let elem_size = mem::size_of::(); + let num_bytes = elem_size * self.cap; + unsafe { + heap::deallocate(*self.buf as *mut _, num_bytes, align); + } + } + } +} +``` + +We've actually reached an interesting situation here: we've duplicated the logic +for specifying a buffer and freeing its memory. Now that we've implemented it and +identified *actual* logic duplication, this is a good time to perform some logic +compression. + +We're going to abstract out the `(ptr, cap)` pair and give them the logic for +allocating, growing, and freeing: + +```rust + +struct RawVec { + ptr: Unique, + cap: usize, +} + +impl RawVec { + fn new() -> Self { + assert!(mem::size_of::() != 0, "TODO: implement ZST support"); + unsafe { + RawVec { ptr: Unique::new(heap::EMPTY as *mut T), cap: 0 } + } + } + + // unchanged from Vec + fn grow(&mut self) { + unsafe { + let align = mem::min_align_of::(); + let elem_size = mem::size_of::(); + + let (new_cap, ptr) = if self.cap == 0 { + let ptr = heap::allocate(elem_size, align); + (1, ptr) + } else { + let new_cap = 2 * self.cap; + let ptr = heap::reallocate(*self.ptr as *mut _, + self.cap * elem_size, + new_cap * elem_size, + align); + (new_cap, ptr) + }; + + // If allocate or reallocate fail, we'll get `null` back + if ptr.is_null() { oom() } + + self.ptr = Unique::new(ptr as *mut _); + self.cap = new_cap; + } + } +} + + +impl Drop for RawVec { + fn drop(&mut self) { + if self.cap != 0 { + let align = mem::min_align_of::(); + let elem_size = mem::size_of::(); + let num_bytes = elem_size * self.cap; + unsafe { + heap::deallocate(*self.ptr as *mut _, num_bytes, align); + } + } + } +} +``` + +And change vec as follows: + +```rust +pub struct Vec { + buf: RawVec, + len: usize, +} + +impl Vec { + fn ptr(&self) -> *mut T { *self.buf.ptr } + + fn cap(&self) -> usize { self.buf.cap } + + pub fn new() -> Self { + Vec { buf: RawVec::new(), len: 0 } + } + + // push/pop/insert/remove largely unchanged: + // * `self.ptr -> self.ptr()` + // * `self.cap -> self.cap()` + // * `self.grow -> self.buf.grow()` +} + +impl Drop for Vec { + fn drop(&mut self) { + while let Some(_) = self.pop() {} + // deallocation is handled by RawVec + } +} +``` + +And finally we can really simplify IntoIter: + +```rust +struct IntoIter { + _buf: RawVec, // we don't actually care about this. Just need it to live. + start: *const T, + end: *const T, +} + +// next and next_back litterally unchanged since they never referred to the buf + +impl Drop for IntoIter { + fn drop(&mut self) { + // only need to ensure all our elements are read; + // buffer will clean itself up afterwards. + for _ in &mut *self {} + } +} + +impl Vec { + pub fn into_iter(self) -> IntoIter { + unsafe { + // need to use ptr::read to unsafely move the buf out since it's + // not Copy. + let buf = ptr::read(&self.buf); + let len = self.len; + mem::forget(self); + + IntoIter { + start: *buf.ptr, + end: buf.ptr.offset(len as isize), + _buf: buf, + } + } + } +} +``` + +Much better. + +# Drain + +Let's move on to Drain. Drain is largely the same as IntoIter, except that +instead of consuming the Vec, it borrows the Vec and leaves its allocation +free. For now we'll only implement the "basic" full-range version. + +```rust,ignore +use std::marker::PhantomData; + +struct Drain<'a, T: 'a> { + vec: PhantomData<&'a mut Vec> + start: *const T, + end: *const T, +} + +impl<'a, T> Iterator for Drain<'a, T> { + type Item = T; + fn next(&mut self) -> Option { + if self.start == self.end { + None +``` + +-- wait, this is seeming familiar. Let's do some more compression. Both +IntoIter and Drain have the exact same structure, let's just factor it out. + +```rust +struct RawValIter { + start: *const T, + end: *const T, +} + +impl RawValIter { + // unsafe to construct because it has no associated lifetimes. + // This is necessary to store a RawValIter in the same struct as + // its actual allocation. OK since it's a private implementation + // detail. + unsafe fn new(slice: &[T]) -> Self { + RawValIter { + start: slice.as_ptr(), + end: slice.as_ptr().offset(slice.len() as isize), + } + } +} + +// Iterator and DoubleEndedIterator impls identical to IntoIter. +``` + +And IntoIter becomes the following: + +``` +pub struct IntoIter { + _buf: RawVec, // we don't actually care about this. Just need it to live. + iter: RawValIter, +} + +impl Iterator for IntoIter { + type Item = T; + fn next(&mut self) -> Option { self.iter.next() } + fn size_hint(&self) -> (usize, Option) { self.iter.size_hint() } +} + +impl DoubleEndedIterator for IntoIter { + fn next_back(&mut self) -> Option { self.iter.next_back() } +} + +impl Drop for IntoIter { + fn drop(&mut self) { + for _ in &mut self.iter {} + } +} + +impl Vec { + pub fn into_iter(self) -> IntoIter { + unsafe { + let iter = RawValIter::new(&self); + let buf = ptr::read(&self.buf); + mem::forget(self); + + IntoIter { + iter: iter, + _buf: buf, + } + } + } +} +``` + +Note that I've left a few quirks in this design to make upgrading Drain to work +with arbitrary subranges a bit easier. In particular we *could* have RawValIter +drain itself on drop, but that won't work right for a more complex Drain. +We also take a slice to simplify Drain initialization. + +Alright, now Drain is really easy: + +```rust +use std::marker::PhantomData; + +pub struct Drain<'a, T: 'a> { + vec: PhantomData<&'a mut Vec>, + iter: RawValIter, +} + +impl<'a, T> Iterator for Drain<'a, T> { + type Item = T; + fn next(&mut self) -> Option { self.iter.next_back() } + fn size_hint(&self) -> (usize, Option) { self.iter.size_hint() } +} + +impl<'a, T> DoubleEndedIterator for Drain<'a, T> { + fn next_back(&mut self) -> Option { self.iter.next_back() } +} + +impl<'a, T> Drop for Drain<'a, T> { + fn drop(&mut self) { + for _ in &mut self.iter {} + } +} + +impl Vec { + pub fn drain(&mut self) -> Drain { + // this is a mem::forget safety thing. If Drain is forgotten, we just + // leak the whole Vec's contents. Also we need to do this *eventually* + // anyway, so why not do it now? + self.len = 0; + + unsafe { + Drain { + iter: RawValIter::new(&self), + vec: PhantomData, + } + } + } +} +``` + + +# Handling Zero-Sized Types + +It's time. We're going to fight the spectre that is zero-sized types. Safe Rust +*never* needs to care about this, but Vec is very intensive on raw pointers and +raw allocations, which are exactly the *only* two things that care about +zero-sized types. We need to be careful of two things: + +* The raw allocator API has undefined behaviour if you pass in 0 for an + allocation size. +* raw pointer offsets are no-ops for zero-sized types, which will break our + C-style pointer iterator + +Thankfully we abstracted out pointer-iterators and allocating handling into +RawValIter and RawVec respectively. How mysteriously convenient. + + + +## Allocating Zero-Sized Types + +So if the allocator API doesn't support zero-sized allocations, what on earth +do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation +with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs +to be considered to store or load them. This actually extends to `ptr::read` and +`ptr::write`: they won't actually look at the pointer at all. As such we *never* need +to change the pointer. + +TODO + +## Iterating Zero-Sized Types + +TODO -TODO: rest of this +## Advanced Drain +TODO? Not clear if informative