diff --git a/SUMMARY.md b/SUMMARY.md
index 8a8ea6d..4b38b7f 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -45,5 +45,6 @@
     * [Insert and Remove](vec-insert-remove.md)
     * [IntoIter](vec-into-iter.md)
     * [Drain](vec-drain.md)
+    * [Handling Zero-Sized Types](vec-zsts.md)
     * [Final Code](vec-final.md)
 * [Implementing Arc and Mutex](arc-and-mutex.md)
diff --git a/vec-zsts.md b/vec-zsts.md
new file mode 100644
index 0000000..931aed3
--- /dev/null
+++ b/vec-zsts.md
@@ -0,0 +1,176 @@
+% Handling Zero-Sized Types
+
+It's time. We're going to fight the spectre that is zero-sized types. Safe Rust
+*never* needs to care about this, but Vec leans heavily on raw pointers and
+raw allocations, which are exactly the two things that care about
+zero-sized types. We need to be careful of two things:
+
+* The raw allocator API has undefined behaviour if you pass in 0 for an
+  allocation size.
+* Raw pointer offsets are no-ops for zero-sized types, which will break our
+  C-style pointer iterator.
+
+Thankfully we abstracted out pointer iteration and allocation handling into
+RawValIter and RawVec respectively. How mysteriously convenient.
+
+
+
+
+## Allocating Zero-Sized Types
+
+So if the allocator API doesn't support zero-sized allocations, what on earth
+do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation
+with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
+to be considered to store or load them. This actually extends to `ptr::read` and
+`ptr::write`: they won't actually look at the pointer at all. As such we *never* need
+to change the pointer.
+
+Note however that our previous reliance on running out of memory before overflow is
+no longer valid with zero-sized types. We must explicitly guard against capacity
+overflow for zero-sized types.
+
+Due to our current architecture, all this means is writing three guards, one in
+each method of RawVec.
+
+```rust,ignore
+impl<T> RawVec<T> {
+    fn new() -> Self {
+        unsafe {
+            // !0 is usize::MAX. This branch should be stripped at compile time.
+            let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };
+
+            // heap::EMPTY doubles as "unallocated" and "zero-sized allocation"
+            RawVec { ptr: Unique::new(heap::EMPTY as *mut T), cap: cap }
+        }
+    }
+
+    fn grow(&mut self) {
+        unsafe {
+            let elem_size = mem::size_of::<T>();
+
+            // since we set the capacity to usize::MAX when elem_size is
+            // 0, getting to here necessarily means the Vec is overfull.
+            assert!(elem_size != 0, "capacity overflow");
+
+            let align = mem::align_of::<T>();
+
+            let (new_cap, ptr) = if self.cap == 0 {
+                let ptr = heap::allocate(elem_size, align);
+                (1, ptr)
+            } else {
+                let new_cap = 2 * self.cap;
+                let ptr = heap::reallocate(*self.ptr as *mut _,
+                                            self.cap * elem_size,
+                                            new_cap * elem_size,
+                                            align);
+                (new_cap, ptr)
+            };
+
+            // If allocate or reallocate fail, we'll get `null` back
+            if ptr.is_null() { oom() }
+
+            self.ptr = Unique::new(ptr as *mut _);
+            self.cap = new_cap;
+        }
+    }
+}
+
+impl<T> Drop for RawVec<T> {
+    fn drop(&mut self) {
+        let elem_size = mem::size_of::<T>();
+
+        // don't free zero-sized allocations, as they were never allocated.
+        if self.cap != 0 && elem_size != 0 {
+            let align = mem::align_of::<T>();
+
+            let num_bytes = elem_size * self.cap;
+            unsafe {
+                heap::deallocate(*self.ptr as *mut _, num_bytes, align);
+            }
+        }
+    }
+}
+```
+
+That's it. We support pushing and popping zero-sized types now. Our iterators
+(that aren't provided by slice Deref) are still busted, though.
+
+
+
+
+## Iterating Zero-Sized Types
+
+Zero-sized offsets are no-ops. This means that our current design will always
+initialize `start` and `end` as the same value, and our iterators will yield
+nothing.
+The current solution to this is to cast the pointers to integers,
+increment, and then cast them back:
+
+```rust,ignore
+impl<T> RawValIter<T> {
+    unsafe fn new(slice: &[T]) -> Self {
+        RawValIter {
+            start: slice.as_ptr(),
+            end: if mem::size_of::<T>() == 0 {
+                // for ZSTs, `end` is just `start` plus the element count
+                ((slice.as_ptr() as usize) + slice.len()) as *const _
+            } else if slice.len() == 0 {
+                slice.as_ptr()
+            } else {
+                slice.as_ptr().offset(slice.len() as isize)
+            }
+        }
+    }
+}
+```
+
+Now we have a different bug. Instead of our iterators not running at all, our
+iterators now run *forever*: `next` still advances `start` with a no-op offset,
+so it never reaches `end`. We need to do the same trick in our iterator impls.
+Also, our size_hint computation code will divide by 0 for ZSTs. Since we'll
+basically be treating the two pointers as if they point to bytes, we'll just
+map size 0 to divide by 1.
+
+```rust,ignore
+impl<T> Iterator for RawValIter<T> {
+    type Item = T;
+    fn next(&mut self) -> Option<T> {
+        if self.start == self.end {
+            None
+        } else {
+            unsafe {
+                let result = ptr::read(self.start);
+                self.start = if mem::size_of::<T>() == 0 {
+                    (self.start as usize + 1) as *const _
+                } else {
+                    self.start.offset(1)
+                };
+                Some(result)
+            }
+        }
+    }
+
+    fn size_hint(&self) -> (usize, Option<usize>) {
+        let elem_size = mem::size_of::<T>();
+        let len = (self.end as usize - self.start as usize)
+                  / if elem_size == 0 { 1 } else { elem_size };
+        (len, Some(len))
+    }
+}
+
+impl<T> DoubleEndedIterator for RawValIter<T> {
+    fn next_back(&mut self) -> Option<T> {
+        if self.start == self.end {
+            None
+        } else {
+            unsafe {
+                self.end = if mem::size_of::<T>() == 0 {
+                    (self.end as usize - 1) as *const _
+                } else {
+                    self.end.offset(-1)
+                };
+                Some(ptr::read(self.end))
+            }
+        }
+    }
+}
+```
+
+And that's it. Iteration works!
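
The pointer arithmetic in this patch can be sanity-checked outside the book's code with a small standalone program. What follows is a minimal sketch under stated assumptions, not the chapter's implementation: `end_marker` and `remaining` are hypothetical helpers that mirror the `end` computation in `RawValIter::new` and the length computation in `size_hint`, and the safe `wrapping_add` stands in for `offset` so that no `unsafe` is needed.

```rust
use std::mem;

/// Hypothetical helper: computes an `end` marker the way `RawValIter::new`
/// does, counting elements in the address itself when `T` is zero-sized.
fn end_marker<T>(slice: &[T]) -> *const T {
    if mem::size_of::<T>() == 0 {
        ((slice.as_ptr() as usize) + slice.len()) as *const T
    } else {
        // `wrapping_add` is used here (an assumption of this sketch) so the
        // example stays in safe code; the patch itself uses `offset`.
        slice.as_ptr().wrapping_add(slice.len())
    }
}

/// Hypothetical helper: the `size_hint` computation, mapping size 0 to
/// "divide by 1".
fn remaining<T>(start: *const T, end: *const T) -> usize {
    let elem_size = mem::size_of::<T>();
    let divisor = if elem_size == 0 { 1 } else { elem_size };
    (end as usize - start as usize) / divisor
}

fn main() {
    let units = [(), (), ()];
    let start = units.as_ptr();

    // Offsetting a pointer to a zero-sized type does not move it.
    assert_eq!(start.wrapping_add(3), start);

    // The integer-cast trick gives an `end` that is 3 "elements" away.
    let end = end_marker(&units);
    assert_eq!(remaining(start, end), 3);

    // Stepping by one address unit per element terminates after 3 steps.
    let mut cur = start;
    let mut steps = 0;
    while cur != end {
        cur = (cur as usize + 1) as *const ();
        steps += 1;
    }
    assert_eq!(steps, 3);

    // Types with a non-zero size take the ordinary path.
    let nums = [1u32, 2, 3, 4];
    assert_eq!(remaining(nums.as_ptr(), end_marker(&nums)), 4);
}
```

The pointers produced for the zero-sized case are only ever compared and counted, never dereferenced as real addresses, which is what makes treating them as plain integers workable.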