Adjust Vec to build on stable Rust (#223)

Co-authored-by: Yuki Okushi <jtitor@2k36.org>
Brent Kerby 4 years ago committed by GitHub
parent 132a746984
commit 951371fb74

@@ -1,41 +1,52 @@
# Allocating Memory

Using `NonNull` throws a wrench in an important feature of Vec (and indeed all of
the std collections): creating an empty Vec doesn't actually allocate at all. This
is not the same as allocating a zero-sized memory block, which is not allowed by
the global allocator (it results in undefined behavior!). So if we can't allocate,
but also can't put a null pointer in `ptr`, what do we do in `Vec::new`? Well, we
just put some other garbage in there!

This is perfectly fine because we already have `cap == 0` as our sentinel for no
allocation. We don't even need to handle it specially in almost any code because
we usually need to check if `cap > len` or `len > 0` anyway. The recommended
Rust value to put here is `mem::align_of::<T>()`. `NonNull` provides a convenience
for this: `NonNull::dangling()`. There are quite a few places where we'll
want to use `dangling` because there's no real allocation to talk about but
`null` would make the compiler do bad things.
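
If you want to see what `dangling` actually hands back, here's a quick standalone check (just an illustration, not part of our Vec): the pointer is non-null and well-aligned even though nothing was ever allocated.

```rust
use std::mem;
use std::ptr::NonNull;

fn main() {
    let p = NonNull::<u64>::dangling();
    let addr = p.as_ptr() as usize;
    // Non-null and well-aligned, despite there being no allocation behind it.
    assert!(addr != 0);
    assert!(addr % mem::align_of::<u64>() == 0);
    println!("dangling address for u64: {}", addr); // currently the alignment, 8
}
```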
So:

```rust,ignore
use std::mem;

impl<T> Vec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        Vec {
            ptr: NonNull::dangling(),
            len: 0,
            cap: 0,
            _marker: PhantomData,
        }
    }
}
# fn main() {}
```
I slipped in that assert there because zero-sized types will require some
special handling throughout our code, and I want to defer the issue for now.
Without this assert, some of our early drafts will do some Very Bad Things.
Next we need to figure out what to actually do when we *do* want space. For that,
we use the global allocation functions [`alloc`][alloc], [`realloc`][realloc],
and [`dealloc`][dealloc] which are available in stable Rust in
[`std::alloc`][std_alloc]. These functions are expected to become deprecated in
favor of the methods of [`std::alloc::Global`][Global] after this type is stabilized.

We'll also need a way to handle out-of-memory (OOM) conditions. The standard
library provides a function [`alloc::handle_alloc_error`][handle_alloc_error],
which will abort the program in a platform-specific manner.

The reason we abort and don't panic is because unwinding can cause allocations
to happen, and that seems like a bad thing to do when your allocator just came
back with "hey I don't have any more memory".
@@ -152,52 +163,48 @@ such we will guard against this case explicitly.
Ok with all the nonsense out of the way, let's actually allocate some memory:

```rust,ignore
use std::alloc::{self, Layout};

impl<T> Vec<T> {
    fn grow(&mut self) {
        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow since self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // `Layout::array` checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}
# fn main() {}
```

[Global]: ../std/alloc/struct.Global.html
[handle_alloc_error]: ../alloc/alloc/fn.handle_alloc_error.html
[alloc]: ../alloc/alloc/fn.alloc.html
[realloc]: ../alloc/alloc/fn.realloc.html
[dealloc]: ../alloc/alloc/fn.dealloc.html
[std_alloc]: ../alloc/alloc/index.html

@@ -7,20 +7,17 @@ ask Rust if `T` `needs_drop` and omit the calls to `pop`. However in practice
LLVM is *really* good at removing simple side-effect free code like this, so I
wouldn't bother unless you notice it's not being stripped (in this case it is).
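
If you did want to be explicit about it, a sketch of that (purely optional) check using the stable `mem::needs_drop` might look like this; it is not part of the code we'll actually write:

```rust,ignore
// Skip the pop loop entirely when T has no destructor. LLVM already removes
// the loop in that case, so this is only an illustration.
if self.cap != 0 {
    if mem::needs_drop::<T>() {
        while let Some(_) = self.pop() {}
    }
    // ... deallocate as shown below ...
}
```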
We must not call `alloc::dealloc` when `self.cap == 0`, as in this case we
haven't actually allocated any memory.

```rust,ignore
impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        if self.cap != 0 {
            while let Some(_) = self.pop() { }
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe {
                alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }

@@ -1,79 +1,85 @@
# The Final Code

```rust
use std::alloc::{self, Layout};
use std::marker::PhantomData;
use std::mem;
use std::ops::{Deref, DerefMut};
use std::ptr::{self, NonNull};

struct RawVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    _marker: PhantomData<T>,
}

unsafe impl<T: Send> Send for RawVec<T> {}
unsafe impl<T: Sync> Sync for RawVec<T> {}

impl<T> RawVec<T> {
    fn new() -> Self {
        // !0 is usize::MAX. This branch should be stripped at compile time.
        let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };

        // `NonNull::dangling()` doubles as "unallocated" and "zero-sized allocation"
        RawVec {
            ptr: NonNull::dangling(),
            cap: cap,
            _marker: PhantomData,
        }
    }

    fn grow(&mut self) {
        // since we set the capacity to usize::MAX when T has size 0,
        // getting to here necessarily means the Vec is overfull.
        assert!(mem::size_of::<T>() != 0, "capacity overflow");

        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow because we ensure self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // `Layout::array` checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(
            new_layout.size() <= isize::MAX as usize,
            "Allocation too large"
        );

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        let elem_size = mem::size_of::<T>();
        if self.cap != 0 && elem_size != 0 {
            unsafe {
                alloc::dealloc(
                    self.ptr.as_ptr() as *mut u8,
                    Layout::array::<T>(self.cap).unwrap(),
                );
            }
        }
    }
@@ -85,21 +91,30 @@ pub struct Vec<T> {
}

impl<T> Vec<T> {
    fn ptr(&self) -> *mut T {
        self.buf.ptr.as_ptr()
    }

    fn cap(&self) -> usize {
        self.buf.cap
    }

    pub fn new() -> Self {
        Vec {
            buf: RawVec::new(),
            len: 0,
        }
    }

    pub fn push(&mut self, elem: T) {
        if self.len == self.cap() {
            self.buf.grow();
        }

        unsafe {
            ptr::write(self.ptr().offset(self.len as isize), elem);
        }

        // Can't overflow, we'll OOM first.
        self.len += 1;
    }
@@ -108,22 +123,22 @@ impl<T> Vec<T> {
            None
        } else {
            self.len -= 1;
            unsafe { Some(ptr::read(self.ptr().offset(self.len as isize))) }
        }
    }

    pub fn insert(&mut self, index: usize, elem: T) {
        assert!(index <= self.len, "index out of bounds");
        if self.cap() == self.len {
            self.buf.grow();
        }

        unsafe {
            ptr::copy(
                self.ptr().offset(index as isize),
                self.ptr().offset(index as isize + 1),
                self.len - index,
            );
            ptr::write(self.ptr().offset(index as isize), elem);
            self.len += 1;
        }
@@ -134,9 +149,11 @@ impl<T> Vec<T> {
        unsafe {
            self.len -= 1;
            let result = ptr::read(self.ptr().offset(index as isize));
            ptr::copy(
                self.ptr().offset(index as isize + 1),
                self.ptr().offset(index as isize),
                self.len - index,
            );
            result
        }
    }
@@ -181,24 +198,16 @@ impl<T> Drop for Vec<T> {
impl<T> Deref for Vec<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        unsafe { std::slice::from_raw_parts(self.ptr(), self.len) }
    }
}

impl<T> DerefMut for Vec<T> {
    fn deref_mut(&mut self) -> &mut [T] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr(), self.len) }
    }
}

struct RawValIter<T> {
    start: *const T,
    end: *const T,
@@ -239,8 +248,8 @@ impl<T> Iterator for RawValIter<T> {
    fn size_hint(&self) -> (usize, Option<usize>) {
        let elem_size = mem::size_of::<T>();
        let len = (self.end as usize - self.start as usize) /
                  if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}
@@ -262,9 +271,6 @@ impl<T> DoubleEndedIterator for RawValIter<T> {
    }
}

pub struct IntoIter<T> {
    _buf: RawVec<T>, // we don't actually care about this. Just need it to live.
    iter: RawValIter<T>,
@@ -272,12 +278,18 @@ pub struct IntoIter<T> {
impl<T> Iterator for IntoIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        self.iter.next()
    }
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.iter.size_hint()
    }
}

impl<T> DoubleEndedIterator for IntoIter<T> {
    fn next_back(&mut self) -> Option<T> {
        self.iter.next_back()
    }
}

impl<T> Drop for IntoIter<T> {
@@ -286,9 +298,6 @@ impl<T> Drop for IntoIter<T> {
    }
}

pub struct Drain<'a, T: 'a> {
    vec: PhantomData<&'a mut Vec<T>>,
    iter: RawValIter<T>,
@@ -296,12 +305,18 @@ pub struct Drain<'a, T: 'a> {
impl<'a, T> Iterator for Drain<'a, T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        self.iter.next()
    }
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.iter.size_hint()
    }
}

impl<'a, T> DoubleEndedIterator for Drain<'a, T> {
    fn next_back(&mut self) -> Option<T> {
        self.iter.next_back()
    }
}

impl<'a, T> Drop for Drain<'a, T> {
@@ -321,6 +336,7 @@ impl<'a, T> Drop for Drain<'a, T> {
#
# mod tests {
#     use super::*;
#
#     pub fn create_push_pop() {
#         let mut v = Vec::new();
#         v.push(1);

@@ -20,12 +20,10 @@ pub fn insert(&mut self, index: usize, elem: T) {
    if self.cap == self.len { self.grow(); }

    unsafe {
        // ptr::copy(src, dest, len): "copy from src to dest len elems"
        ptr::copy(self.ptr.as_ptr().offset(index as isize),
                  self.ptr.as_ptr().offset(index as isize + 1),
                  self.len - index);
        ptr::write(self.ptr.as_ptr().offset(index as isize), elem);
        self.len += 1;
    }
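
To see concretely what that `ptr::copy` shift does, here is a small self-contained example (just an illustration, using a fixed-size buffer instead of our Vec): it opens a hole at `index` by shifting the tail one slot to the right, then writes the new element into the hole.

```rust
use std::ptr;

fn main() {
    let mut buf = [1, 2, 3, 4, 0]; // last slot is spare capacity
    let len = 4;
    let index = 1;
    unsafe {
        let p = buf.as_mut_ptr();
        // shift buf[index..len] one slot to the right
        ptr::copy(p.add(index), p.add(index + 1), len - index);
        // write the new element into the hole
        ptr::write(p.add(index), 9);
    }
    assert_eq!(buf, [1, 9, 2, 3, 4]);
}
```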

@@ -44,10 +44,11 @@ So we're going to use the following struct:

```rust,ignore
pub struct IntoIter<T> {
    buf: NonNull<T>,
    cap: usize,
    start: *const T,
    end: *const T,
    _marker: PhantomData<T>,
}
```
@@ -75,6 +76,7 @@ impl<T> Vec<T> {
            } else {
                ptr.as_ptr().offset(len as isize)
            },
            _marker: PhantomData,
        }
    }
}
@@ -134,11 +136,9 @@ impl<T> Drop for IntoIter<T> {
        if self.cap != 0 {
            // drop any remaining elements
            for _ in &mut *self {}
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe {
                alloc::dealloc(self.buf.as_ptr() as *mut u8, layout);
            }
        }
    }

@@ -22,64 +22,39 @@ conservatively assume we don't own any values of type `T`. See [the chapter
on ownership and lifetimes][ownership] for all the details on variance and
drop check.

As we saw in the ownership chapter, the standard library uses `Unique<T>` in place of
`*mut T` when it has a raw pointer to an allocation that it owns. Unique is unstable,
so we'd like to not use it if possible, though.

As a recap, Unique is a wrapper around a raw pointer that declares that:

* We are covariant over `T`
* We may own a value of type `T` (for drop check)
* We are Send/Sync if `T` is Send/Sync
* Our pointer is never null (so `Option<Vec<T>>` is null-pointer-optimized)

We can implement all of the above requirements in stable Rust. To do this, instead
of using `Unique<T>` we will use [`NonNull<T>`][NonNull], another wrapper around a
raw pointer, which gives us two of the above properties, namely it is covariant
over `T` and is declared to never be null. By adding a `PhantomData<T>` (for drop
check) and implementing Send/Sync if `T` is, we get the same results as using
`Unique<T>`:

```rust
use std::ptr::NonNull;
use std::marker::PhantomData;

pub struct Vec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
    _marker: PhantomData<T>,
}

unsafe impl<T: Send> Send for Vec<T> {}
unsafe impl<T: Sync> Sync for Vec<T> {}
# fn main() {}
```

[ownership]: ownership.html
[NonNull]: ../std/ptr/struct.NonNull.html
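
The covariance point is the subtle one, so here is a tiny standalone demonstration (not from the book's code): because `NonNull<T>` is covariant in `T`, a pointer parameterized by a longer lifetime can be used where a shorter one is expected, which is exactly what a raw `*mut T` would refuse.

```rust
use std::ptr::NonNull;

// Fine: NonNull is covariant, so `NonNull<&'static u8>` is a subtype of
// `NonNull<&'a u8>`. Writing the same signature with `*mut &'static u8`
// would fail to compile, because `*mut T` is invariant in `T`.
fn shorten<'a>(p: NonNull<&'static u8>) -> NonNull<&'a u8> {
    p
}

fn main() {
    let p: NonNull<&'static u8> = NonNull::dangling();
    let _q: NonNull<&u8> = shorten(p);
}
```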

@@ -10,56 +10,64 @@ allocating, growing, and freeing:

```rust,ignore
struct RawVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    _marker: PhantomData<T>,
}

unsafe impl<T: Send> Send for RawVec<T> {}
unsafe impl<T: Sync> Sync for RawVec<T> {}

impl<T> RawVec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "TODO: implement ZST support");
        RawVec {
            ptr: NonNull::dangling(),
            cap: 0,
            _marker: PhantomData,
        }
    }

    fn grow(&mut self) {
        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow because we ensure self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // Layout::array checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        if self.cap != 0 {
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe {
                alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }
@@ -75,18 +83,25 @@ pub struct Vec<T> {
}

impl<T> Vec<T> {
    fn ptr(&self) -> *mut T {
        self.buf.ptr.as_ptr()
    }

    fn cap(&self) -> usize {
        self.buf.cap
    }

    pub fn new() -> Self {
        Vec {
            buf: RawVec::new(),
            len: 0,
        }
    }

    // push/pop/insert/remove largely unchanged:
    // * `self.ptr.as_ptr() -> self.ptr()`
    // * `self.cap -> self.cap()`
    // * `self.grow() -> self.buf.grow()`
}

impl<T> Drop for Vec<T> {

@@ -19,7 +19,7 @@ Thankfully we abstracted out pointer-iterators and allocating handling into
## Allocating Zero-Sized Types

So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? `NonNull::dangling()` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and
`ptr::write`: they won't actually look at the pointer at all. As such we never need
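
A quick standalone illustration of that claim (not part of our Vec): reading and writing a zero-sized type through a dangling pointer is fine, because no bytes are ever touched.

```rust
use std::ptr::NonNull;

#[derive(Debug, PartialEq)]
struct Zst; // a zero-sized type

fn main() {
    let p = NonNull::<Zst>::dangling().as_ptr();
    unsafe {
        // Neither of these touches any memory: Zst occupies zero bytes.
        std::ptr::write(p, Zst);
        assert_eq!(std::ptr::read(p), Zst);
    }
}
```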
@@ -38,43 +38,49 @@ impl<T> RawVec<T> {
        // !0 is usize::MAX. This branch should be stripped at compile time.
        let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };

        // `NonNull::dangling()` doubles as "unallocated" and "zero-sized allocation"
        RawVec {
            ptr: NonNull::dangling(),
            cap: cap,
            _marker: PhantomData,
        }
    }

    fn grow(&mut self) {
        // since we set the capacity to usize::MAX when T has size 0,
        // getting to here necessarily means the Vec is overfull.
        assert!(mem::size_of::<T>() != 0, "capacity overflow");

        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow because we ensure self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // `Layout::array` checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
@@ -82,12 +88,12 @@ impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        let elem_size = mem::size_of::<T>();

        // don't free zero-sized allocations, as they were never allocated.
        if self.cap != 0 && elem_size != 0 {
            unsafe {
                alloc::dealloc(
                    self.ptr.as_ptr() as *mut u8,
                    Layout::array::<T>(self.cap).unwrap(),
                );
            }
        }
    }

@@ -1,16 +1,10 @@
# Example: Implementing Vec

To bring everything together, we're going to write `std::Vec` from scratch.
We will limit ourselves to stable Rust. In particular we won't use any
intrinsics that could make our code a little bit nicer or more efficient,
because intrinsics are permanently unstable. Although many intrinsics *do*
become stabilized elsewhere (`std::ptr` and `std::mem` consist of many
intrinsics).

Ultimately this means our implementation may not take advantage of all
possible optimizations, though it will be by no means *naive*. We will
