% Allocating Memory

Using Unique throws a wrench in an important feature of Vec (and indeed all of
the std collections): an empty Vec doesn't actually allocate at all. So if we
can't allocate, but also can't put a null pointer in `ptr`, what do we do in
`Vec::new`? Well, we just put some other garbage in there!

This is perfectly fine because we already have `cap == 0` as our sentinel for no
allocation. We don't even need to specially handle it in most code, because we
usually need to check whether `cap > len` or `len > 0` anyway. The traditional
Rust value to put here is `0x01`. The standard library actually exposes this
as `std::rt::heap::EMPTY`. There are quite a few places where we'll want to use
`heap::EMPTY` because there's no real allocation to talk about but `null` would
make the compiler do bad things.

All of the `heap` API is totally unstable under the `heap_api` feature, though.
We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of
the `heap` API anyway, so let's just get that dependency over with.

So:

```rust,ignore
#![feature(heap_api)]

use std::rt::heap;
use std::mem;

impl<T> Vec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        unsafe {
            // need to cast EMPTY to the actual ptr type we want; let
            // inference handle it.
            Vec { ptr: Unique::new(heap::EMPTY as *mut _), len: 0, cap: 0 }
        }
    }
}
```
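
Just to sanity-check the claim that an empty Vec performs no allocation, here's
a quick example against the real `std::vec::Vec` (our own type isn't usable
yet); a capacity of 0 is the observable consequence of the `cap == 0` sentinel:

```rust
let v: Vec<u32> = Vec::new();
// Nothing has been allocated yet; the capacity sentinel says so.
assert_eq!(v.capacity(), 0);
assert_eq!(v.len(), 0);
```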

I slipped in that assert there because zero-sized types will require some
special handling throughout our code, and I want to defer the issue for now.
Without this assert, some of our early drafts will do some Very Bad Things.

Next we need to figure out what to actually do when we *do* want space. For
that, we'll need to use the rest of the heap APIs. These basically allow us to
talk directly to Rust's allocator (jemalloc by default).

We'll also need a way to handle out-of-memory (OOM) conditions. The standard
library calls the `abort` intrinsic, which just executes an illegal instruction
to crash the whole program. The reason we abort and don't panic is that
unwinding can cause allocations to happen, and that seems like a bad thing to do
when your allocator just came back with "hey I don't have any more memory".

Of course, this is a bit silly since most platforms don't actually run out of
memory in a conventional way. Your operating system will probably kill the
application by another means if you legitimately start using up all the memory.
The most likely way we'll trigger OOM is by just asking for ludicrous quantities
of memory at once (e.g. half the theoretical address space). As such it's
*probably* fine to panic and nothing bad will happen. Still, we're trying to be
like the standard library as much as possible, so we'll just kill the whole
program.

We said we don't want to use intrinsics, so doing exactly what `std` does is
out. Instead, we'll call `std::process::exit` with some random number.

```rust
fn oom() {
    ::std::process::exit(-9999);
}
```

Okay, now we can write growing. Roughly, we want to have this logic:

```text
if cap == 0:
    allocate()
    cap = 1
else:
    reallocate()
    cap *= 2
```

But Rust's only supported allocator API is so low level that we'll need to do a
fair bit of extra work. We also need to guard against some special
conditions that can occur with really large allocations or empty allocations.

In particular, `ptr::offset` will cause us a lot of trouble, because it has
the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
not have dealt with this instruction, here's the basic story with GEP: alias
analysis, alias analysis, alias analysis. It's super important to an optimizing
compiler to be able to reason about data dependencies and aliasing.

As a simple example, consider the following fragment of code:

```rust
# let x = &mut 0;
# let y = &mut 0;
*x *= 7;
*y *= 3;
```

If the compiler can prove that `x` and `y` point to different locations in
memory, the two operations can in theory be executed in parallel (by e.g.
loading them into different registers and working on them independently).
However the compiler can't do this in general, because if `x` and `y` point to
the same location in memory, the operations need to be done to the same value,
and they can't just be merged afterwards.

When you use GEP inbounds, you are specifically telling LLVM that the offsets
you're about to do are within the bounds of a single "allocated" entity. The
ultimate payoff is that LLVM can assume that if two pointers are known to
point to two disjoint objects, all the offsets of those pointers are *also*
known to not alias (because you won't just end up in some random place in
memory). LLVM is heavily optimized to work with GEP offsets, and inbounds
offsets are the best of all, so it's important that we use them as much as
possible.
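
For instance, here's a minimal sketch (mine, not part of the original code) of
the inbounds pattern `ptr::offset` expects: every offset stays within a single
allocation, in this case the buffer owned by a `Vec`:

```rust
let v = vec![1u32, 2, 3, 4];
let base = v.as_ptr();
unsafe {
    // Both offsets stay inside the buffer owned by `v`, so these are
    // exactly the kind of inbounds offsets LLVM can reason about.
    assert_eq!(*base.offset(0), 1);
    assert_eq!(*base.offset(3), 4);
}
```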

So that's what GEP's about; how can it cause us trouble?

The first problem is that we index into arrays with unsigned integers, but
GEP (and as a consequence `ptr::offset`) takes a signed integer. This means
that half of the seemingly valid indices into an array will overflow GEP and
actually go in the wrong direction! As such we must limit all allocations to
`isize::MAX` elements. This actually means we only need to worry about
byte-sized objects, because e.g. more than `isize::MAX` `u16`s would truly
exhaust all of the system's memory. However in order to avoid subtle corner
cases where someone reinterprets an array of fewer than `isize::MAX` objects
as bytes, std limits all allocations to `isize::MAX` bytes.
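
As a rough sketch (the helper name here is mine, not std's public API), that
check boils down to something like this:

```rust
// Hypothetical guard: reject any allocation whose size in bytes
// doesn't fit in an `isize`.
fn alloc_guard(new_num_bytes: usize) {
    assert!(new_num_bytes <= ::std::isize::MAX as usize, "capacity overflow");
}
```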

On all 64-bit targets that Rust currently supports we're artificially limited
to significantly less than all 64 bits of the address space (modern x64
platforms only expose 48-bit addressing), so we can rely on just running out of
memory first. However on 32-bit targets, particularly those with extensions to
use more of the address space (PAE x86 or x32), it's theoretically possible to
successfully allocate more than `isize::MAX` bytes of memory.

However since this is a tutorial, we're not going to be particularly optimal
here, and will just unconditionally check, rather than use clever
platform-specific `cfg`s.

The other corner-case we need to worry about is empty allocations. There are
two kinds of empty allocations to handle: `cap == 0` for all `T`, and
`cap > 0` for zero-sized types.

These cases are tricky because they come down to what LLVM means by
"allocated". LLVM's notion of an allocation is significantly more abstract than
how we usually use it. Because LLVM needs to work with different languages'
semantics and custom allocators, it can't really intimately understand
allocation. Instead, the main idea behind allocation is "doesn't overlap with
other stuff". That is, heap allocations, stack allocations, and globals don't
randomly overlap. Yep, it's about alias analysis. As such, Rust can technically
play a bit fast and loose with the notion of an allocation as long as it's
*consistent*.

Getting back to the empty allocation case, there are a couple of places where
we want to offset by 0 as a consequence of generic code. The question is then:
is it consistent to do so? For zero-sized types, we have concluded that it is
indeed consistent to do a GEP inbounds offset by an arbitrary number of
elements. This is a runtime no-op because every element takes up no space,
and it's fine to pretend that there are infinitely many zero-sized values
allocated at `0x01`. No allocator will ever hand out that address, because it
won't allocate `0x00` and allocators generally allocate to some minimal
alignment higher than a byte. Also, the whole first page of memory is generally
protected from being allocated anyway (a whole 4k, on many platforms).
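
As a small illustration (again my own sketch, not from the original code),
offsetting a pointer to a zero-sized type really is a runtime no-op, because
each element occupies zero bytes:

```rust
let unit = ();
let p = &unit as *const ();
unsafe {
    // size_of::<()>() == 0, so stepping forward by any number of elements
    // never changes the address.
    assert_eq!(p.offset(10), p);
}
```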

However what about for positive-sized types? That one's a bit trickier. In
principle, you can argue that offsetting by 0 gives LLVM no information: either
there's an element before the address or after it, but it can't know which.
However we've chosen to conservatively assume that it may do bad things. As
such we will guard against this case explicitly.

*Phew*

Ok with all the nonsense out of the way, let's actually allocate some memory:

```rust,ignore
fn grow(&mut self) {
    // this is all pretty delicate, so let's say it's all unsafe
    unsafe {
        // current API requires us to specify size and alignment manually.
        let align = mem::align_of::<T>();
        let elem_size = mem::size_of::<T>();

        let (new_cap, ptr) = if self.cap == 0 {
            let ptr = heap::allocate(elem_size, align);
            (1, ptr)
        } else {
            // as an invariant, we can assume that `self.cap < isize::MAX`,
            // so this doesn't need to be checked.
            let new_cap = self.cap * 2;
            // Similarly this can't overflow due to previously allocating this
            let old_num_bytes = self.cap * elem_size;

            // check that the new allocation doesn't exceed `isize::MAX` at all
            // regardless of the actual size of the capacity. This combines the
            // `new_cap <= isize::MAX` and `new_num_bytes <= usize::MAX` checks
            // we need to make. We lose the ability to allocate e.g. 2/3rds of
            // the address space with a single Vec of i16's on 32-bit though.
            // Alas, poor Yorick -- I knew him, Horatio.
            assert!(old_num_bytes <= (::std::isize::MAX as usize) / 2,
                    "capacity overflow");

            let new_num_bytes = old_num_bytes * 2;
            let ptr = heap::reallocate(*self.ptr as *mut _,
                                       old_num_bytes,
                                       new_num_bytes,
                                       align);
            (new_cap, ptr)
        };

        // If allocate or reallocate fail, we'll get `null` back
        if ptr.is_null() { oom(); }

        self.ptr = Unique::new(ptr as *mut _);
        self.cap = new_cap;
    }
}
```

Nothing particularly tricky here. Just computing sizes and alignments and doing
some careful multiplication checks.