mirror of https://github.com/rust-lang/nomicon
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
179 lines
6.1 KiB
179 lines
6.1 KiB
# Handling Zero-Sized Types
|
|
|
|
It's time. We're going to fight the specter that is zero-sized types. Safe Rust
|
|
*never* needs to care about this, but Vec is very intensive on raw pointers and
|
|
raw allocations, which are exactly the two things that care about
|
|
zero-sized types. We need to be careful of two things:
|
|
|
|
* The raw allocator API has undefined behavior if you pass in 0 for an
|
|
allocation size.
|
|
* raw pointer offsets are no-ops for zero-sized types, which will break our
|
|
C-style pointer iterator.
|
|
|
|
Thankfully we abstracted out pointer-iterators and allocating handling into
|
|
`RawValIter` and `RawVec` respectively. How mysteriously convenient.
|
|
|
|
## Allocating Zero-Sized Types
|
|
|
|
So if the allocator API doesn't support zero-sized allocations, what on earth
|
|
do we store as our allocation? `NonNull::dangling()` of course! Almost every operation
|
|
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
|
|
to be considered to store or load them. This actually extends to `ptr::read` and
|
|
`ptr::write`: they won't actually look at the pointer at all. As such we never need
|
|
to change the pointer.
|
|
|
|
Note however that our previous reliance on running out of memory before overflow is
|
|
no longer valid with zero-sized types. We must explicitly guard against capacity
|
|
overflow for zero-sized types.
|
|
|
|
Due to our current architecture, all this means is writing 3 guards, one in each
|
|
method of `RawVec`.
|
|
|
|
```rust,ignore
|
|
impl<T> RawVec<T> {
|
|
fn new() -> Self {
|
|
// !0 is usize::MAX. This branch should be stripped at compile time.
|
|
let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };
|
|
|
|
// `NonNull::dangling()` doubles as "unallocated" and "zero-sized allocation"
|
|
RawVec {
|
|
ptr: NonNull::dangling(),
|
|
cap: cap,
|
|
_marker: PhantomData,
|
|
}
|
|
}
|
|
|
|
fn grow(&mut self) {
|
|
// since we set the capacity to usize::MAX when T has size 0,
|
|
// getting to here necessarily means the Vec is overfull.
|
|
assert!(mem::size_of::<T>() != 0, "capacity overflow");
|
|
|
|
let (new_cap, new_layout) = if self.cap == 0 {
|
|
(1, Layout::array::<T>(1).unwrap())
|
|
} else {
|
|
// This can't overflow because we ensure self.cap <= isize::MAX.
|
|
let new_cap = 2 * self.cap;
|
|
|
|
// `Layout::array` checks that the number of bytes is <= usize::MAX,
|
|
// but this is redundant since old_layout.size() <= isize::MAX,
|
|
// so the `unwrap` should never fail.
|
|
let new_layout = Layout::array::<T>(new_cap).unwrap();
|
|
(new_cap, new_layout)
|
|
};
|
|
|
|
// Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
|
|
assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");
|
|
|
|
let new_ptr = if self.cap == 0 {
|
|
unsafe { alloc::alloc(new_layout) }
|
|
} else {
|
|
let old_layout = Layout::array::<T>(self.cap).unwrap();
|
|
let old_ptr = self.ptr.as_ptr() as *mut u8;
|
|
unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
|
|
};
|
|
|
|
// If allocation fails, `new_ptr` will be null, in which case we abort.
|
|
self.ptr = match NonNull::new(new_ptr as *mut T) {
|
|
Some(p) => p,
|
|
None => alloc::handle_alloc_error(new_layout),
|
|
};
|
|
self.cap = new_cap;
|
|
}
|
|
}
|
|
|
|
impl<T> Drop for RawVec<T> {
|
|
fn drop(&mut self) {
|
|
let elem_size = mem::size_of::<T>();
|
|
|
|
if self.cap != 0 && elem_size != 0 {
|
|
unsafe {
|
|
alloc::dealloc(
|
|
self.ptr.as_ptr() as *mut u8,
|
|
Layout::array::<T>(self.cap).unwrap(),
|
|
);
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
That's it. We support pushing and popping zero-sized types now. Our iterators
|
|
(that aren't provided by slice Deref) are still busted, though.
|
|
|
|
## Iterating Zero-Sized Types
|
|
|
|
Zero-sized offsets are no-ops. This means that our current design will always
|
|
initialize `start` and `end` as the same value, and our iterators will yield
|
|
nothing. The current solution to this is to cast the pointers to integers,
|
|
increment, and then cast them back:
|
|
|
|
```rust,ignore
|
|
impl<T> RawValIter<T> {
|
|
unsafe fn new(slice: &[T]) -> Self {
|
|
RawValIter {
|
|
start: slice.as_ptr(),
|
|
end: if mem::size_of::<T>() == 0 {
|
|
((slice.as_ptr() as usize) + slice.len()) as *const _
|
|
} else if slice.len() == 0 {
|
|
slice.as_ptr()
|
|
} else {
|
|
slice.as_ptr().add(slice.len())
|
|
},
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
Now we have a different bug. Instead of our iterators not running at all, our
|
|
iterators now run *forever*. We need to do the same trick in our iterator impls.
|
|
Also, our size_hint computation code will divide by 0 for ZSTs. Since we'll
|
|
basically be treating the two pointers as if they point to bytes, we'll just
|
|
map size 0 to divide by 1.
|
|
|
|
```rust,ignore
|
|
impl<T> Iterator for RawValIter<T> {
|
|
type Item = T;
|
|
fn next(&mut self) -> Option<T> {
|
|
if self.start == self.end {
|
|
None
|
|
} else {
|
|
unsafe {
|
|
let result = ptr::read(self.start);
|
|
self.start = if mem::size_of::<T>() == 0 {
|
|
(self.start as usize + 1) as *const _
|
|
} else {
|
|
self.start.offset(1)
|
|
};
|
|
Some(result)
|
|
}
|
|
}
|
|
}
|
|
|
|
fn size_hint(&self) -> (usize, Option<usize>) {
|
|
let elem_size = mem::size_of::<T>();
|
|
let len = (self.end as usize - self.start as usize)
|
|
/ if elem_size == 0 { 1 } else { elem_size };
|
|
(len, Some(len))
|
|
}
|
|
}
|
|
|
|
impl<T> DoubleEndedIterator for RawValIter<T> {
|
|
fn next_back(&mut self) -> Option<T> {
|
|
if self.start == self.end {
|
|
None
|
|
} else {
|
|
unsafe {
|
|
self.end = if mem::size_of::<T>() == 0 {
|
|
(self.end as usize - 1) as *const _
|
|
} else {
|
|
self.end.offset(-1)
|
|
};
|
|
Some(ptr::read(self.end))
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
And that's it. Iteration works!
|