Simple Arc implementation (without Weak refs)

This is a squash of the following commits:
- Fix code, remove the WIP message (it was only relevant while writing this), and link to stable at a fixed 1.49 rather than the latest nightly
- Improve wording on deref and ignore some code blocks
- Improve wording and formatting a bit
- Fix links
- Fix links again, because we all love relative links
- Remove unnecessary Drop import
- Use Box::from_raw instead of ptr::drop_in_place, as that actually deallocates the Box (I misinterpreted the std code); fix some desync between code across sections
- Fix tests
pull/255/head
ThePuzzlemaker, committed by Alexis Beingessner
parent a8584998ea
commit 9da6fbf6cd

@@ -55,6 +55,13 @@
* [Handling Zero-Sized Types](vec-zsts.md)
* [Final Code](vec-final.md)
* [Implementing Arc and Mutex](arc-and-mutex.md)
* [Arc](arc.md)
* [Layout](arc-layout.md)
* [Base Code](arc-base.md)
* [Cloning](arc-clone.md)
* [Dropping](arc-drop.md)
* [Deref](arc-deref.md)
* [Final Code](arc-final.md)
* [FFI](ffi.md)
* [Beneath `std`](beneath-std.md)
* [#[panic_handler]](panic-handler.md)

@@ -4,4 +4,4 @@ Knowing the theory is all fine and good, but the *best* way to understand
something is to use it. To better understand atomics and interior mutability,
we'll be implementing versions of the standard library's Arc and Mutex types.
TODO: Mutex

@@ -0,0 +1,104 @@
# Base Code
Now that we've decided the layout for our implementation of `Arc`, let's create
some basic code.
## Constructing the Arc
We'll first need a way to construct an `Arc<T>`.
This is pretty simple, as we just need to box the `ArcInner<T>` and get a
`NonNull<ArcInner<T>>` pointer to it.
We start the reference count at 1, as that first reference is the current
pointer; the count is then updated as the `Arc` is cloned or dropped. It is
okay to call `unwrap()` on the `Option` returned by `NonNull::new`, as
`Box::into_raw` guarantees that the returned pointer is not null.
```rust,ignore
impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }
}
```
## Send and Sync
Since we're building a concurrency primitive, we'll need to be able to send it
across threads. Thus, we can implement the `Send` and `Sync` marker traits. For
more information on these, see [the section on `Send` and
`Sync`](send-and-sync.md).
This is okay because:
* You can only get a mutable reference to the value inside an `Arc` if it is
  the only `Arc` referencing that data
* We use atomic counters for reference counting
```rust,ignore
unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}
```
We need the `T: Sync + Send` bounds because, without them, it would be
possible to share values that are thread-unsafe across a thread boundary via
an `Arc`, which could cause data races or unsoundness.
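To make the danger concrete, here's a hedged sketch of the kind of code that
would (incorrectly) compile if we wrote the impls without those bounds. The
unbounded impls and this snippet are hypothetical, not part of our
implementation:
```rust,ignore
// Hypothetical: suppose we had written the unbounded impls instead:
//     unsafe impl<T> Send for Arc<T> {}
//     unsafe impl<T> Sync for Arc<T> {}
use std::cell::Cell;
use std::thread;

let a = Arc::new(Cell::new(0)); // `Cell<i32>` is Send but not Sync
let b = a.clone();
// Both threads could now write to the same `Cell` at once: a data race.
// (This uses the `Deref` impl from a later section to call `.set`.)
thread::spawn(move || b.set(1));
a.set(2);
```
With the `T: Sync + Send` bounds in place, `Cell<i32>` fails the `Sync`
requirement and this code is rejected.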
## Getting the `ArcInner`
We'll now want to make a private helper function, `inner()`, which just returns
the dereferenced `NonNull` pointer.
To turn the `NonNull<ArcInner<T>>` pointer into a `&ArcInner<T>`, we can call
`NonNull::as_ref`. Unlike the typical safe `as_ref` methods, this one is
unsafe, so we must call it like this:
```rust,ignore
// inside the impl<T> Arc<T> block from before:
fn inner(&self) -> &ArcInner<T> {
    unsafe { self.ptr.as_ref() }
}
```
This unsafety is okay because while this `Arc` is alive, we're guaranteed that
the inner pointer is valid.
Here's all the code from this section:
```rust,ignore
impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }

    fn inner(&self) -> &ArcInner<T> {
        // This unsafety is okay because while this Arc is alive, we're
        // guaranteed that the inner pointer is valid.
        unsafe { self.ptr.as_ref() }
    }
}

unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}
```
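As a quick sanity check, here's a hypothetical sketch of what constructing an
`Arc` looks like with the code so far (`forty_two` is an illustrative name,
not from the chapter):
```rust,ignore
let forty_two = Arc::new(42);
// `forty_two.ptr` now points at a heap-allocated
// `ArcInner { rc: AtomicUsize::new(1), data: 42 }`.
```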

@@ -0,0 +1,60 @@
# Cloning
Now that we've got some basic code set up, we'll need a way to clone the `Arc`.
Basically, we need to:
1. Get the `ArcInner` value of the `Arc`
2. Increment the atomic reference count
3. Construct a new instance of the `Arc` from the inner pointer
Steps 1 and 2 can be done together: using the `inner()` helper from before, we
can increment the atomic reference count as follows:
```rust,ignore
self.inner().rc.fetch_add(1, Ordering::Relaxed);
```
As described in [the standard library's implementation of `Arc` cloning][2]:
> Using a relaxed ordering is alright here, as knowledge of the original
> reference prevents other threads from erroneously deleting the object.
>
> As explained in the [Boost documentation][1]:
> > Increasing the reference counter can always be done with
> > memory_order_relaxed: New references to an object can only be formed from an
> > existing reference, and passing an existing reference from one thread to
> > another must already provide any required synchronization.
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html
[2]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1171-L1181
We'll need to add another import to use `Ordering`:
```rust,ignore
use std::sync::atomic::Ordering;
```
It is possible for the reference count to overflow in some contrived programs
(e.g. ones using `mem::forget`), but no reasonable program would ever hit
this, so we won't guard against it here.
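If we did want a guard, one option is to abort once the count gets implausibly
large, which is roughly what the standard library does; a sketch under that
assumption:
```rust,ignore
// Sketch of an overflow guard (not part of this chapter's code). The
// standard library aborts once the old count exceeds `isize::MAX`:
let old_rc = self.inner().rc.fetch_add(1, Ordering::Relaxed);
if old_rc > isize::MAX as usize {
    std::process::abort();
}
```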
Then, we need to return a new instance of the `Arc`:
```rust,ignore
Self {
    ptr: self.ptr,
    _marker: PhantomData,
}
```
Now, let's wrap this all up inside the `Clone` implementation:
```rust,ignore
use std::sync::atomic::Ordering;

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        // Using a relaxed ordering is alright here as knowledge of the
        // original reference prevents other threads from wrongly deleting
        // the object.
        self.inner().rc.fetch_add(1, Ordering::Relaxed);
        Self {
            ptr: self.ptr,
            _marker: PhantomData,
        }
    }
}
```
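For illustration, cloning now just bumps the count and copies the pointer (a
hypothetical usage sketch):
```rust,ignore
let foo = Arc::new(10);
let bar = foo.clone(); // the reference count goes from 1 to 2
// `foo` and `bar` share the same heap-allocated `ArcInner`.
```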

@@ -0,0 +1,25 @@
# Deref
Alright. We now have a way to make, clone, and destroy `Arc`s, but how do we get
to the data inside?
What we need now is an implementation of `Deref`.
We'll need to import the trait:
```rust,ignore
use std::ops::Deref;
```
And here's the implementation:
```rust,ignore
impl<T> Deref for Arc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner().data
    }
}
```
Pretty simple, eh? This just dereferences the `NonNull` pointer to the
`ArcInner<T>`, then takes a reference to the data inside.
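With that in place, method calls auto-deref through the `Arc`; a hypothetical
sketch:
```rust,ignore
let name = Arc::new(String::from("ferris"));
assert_eq!(name.len(), 6); // auto-deref: `&Arc<String>` -> `&String` -> `&str`
assert_eq!(*name, "ferris");
```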

@@ -0,0 +1,92 @@
# Dropping
We now need a way to decrease the reference count, and to drop the data once
the count is low enough; otherwise, the data will live on the heap forever.
To do this, we can implement `Drop`.
Basically, we need to:
1. Get the `ArcInner` value of the `Arc`
2. Decrement the reference count
3. If ours was the last remaining reference to the data:
    1. Atomically fence to prevent reordering of the use and deletion of the
       data
    2. Drop the inner data
First, we decrement the reference count (steps 1 and 2, via the `inner()`
helper). We can fold in the check from step 3 by returning early when the
previous count was not 1, since `fetch_sub` returns the value the counter held
*before* the subtraction:
```rust,ignore
if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
    return;
}
```
We then need to create an atomic fence to prevent reordering of the use and
deletion of the data. As described in [the standard library's implementation
of `Arc`][3]:
> This fence is needed to prevent reordering of use of the data and deletion of
> the data. Because it is marked `Release`, the decreasing of the reference
> count synchronizes with this `Acquire` fence. This means that use of the data
> happens before decreasing the reference count, which happens before this
> fence, which happens before the deletion of the data.
>
> As explained in the [Boost documentation][1],
>
> > It is important to enforce any possible access to the object in one
> > thread (through an existing reference) to *happen before* deleting
> > the object in a different thread. This is achieved by a "release"
> > operation after dropping a reference (any access to the object
> > through this reference must obviously happened before), and an
> > "acquire" operation before deleting the object.
>
> In particular, while the contents of an Arc are usually immutable, it's
> possible to have interior writes to something like a Mutex<T>. Since a Mutex
> is not acquired when it is deleted, we can't rely on its synchronization logic
> to make writes in thread A visible to a destructor running in thread B.
>
> Also note that the Acquire fence here could probably be replaced with an
> Acquire load, which could improve performance in highly-contended situations.
> See [2].
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html
> [2]: https://github.com/rust-lang/rust/pull/41714
[3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467
To do this, we add the following line:
```rust,ignore
atomic::fence(Ordering::Acquire);
```
We'll need to import `std::sync::atomic` itself:
```rust,ignore
use std::sync::atomic;
```
Finally, we can drop the data itself. We use `Box::from_raw` to reconstruct
the `Box`, which both drops the boxed `ArcInner<T>` (and its data) and
deallocates it at the end of the statement. `Box::from_raw` takes a `*mut T`
and not a `NonNull<T>`, so we must convert using `NonNull::as_ptr`.
```rust,ignore
unsafe { Box::from_raw(self.ptr.as_ptr()); }
```
This is safe because we know we have the last pointer to the `ArcInner` and
that the pointer is valid.
Now, let's wrap this all up inside the `Drop` implementation:
```rust,ignore
impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // This fence is needed to prevent reordering of the use and deletion
        // of the data.
        atomic::fence(Ordering::Acquire);
        // This is safe as we know we have the last pointer to the `ArcInner`
        // and that its pointer is valid.
        unsafe { Box::from_raw(self.ptr.as_ptr()); }
    }
}
```
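To see the count in action, here's a hypothetical sketch using a type with a
noisy destructor (`Loud` is an illustrative type, not from the chapter); the
data is dropped exactly once, when the last `Arc` goes away:
```rust,ignore
struct Loud;
impl Drop for Loud {
    fn drop(&mut self) {
        println!("dropped!");
    }
}

let a = Arc::new(Loud);
let b = a.clone();
drop(a); // count goes 2 -> 1; nothing is printed
drop(b); // count goes 1 -> 0; prints "dropped!" exactly once
```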

@@ -0,0 +1,80 @@
# Final Code
Here's the final code, with some added comments and re-ordered imports:
```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{self, AtomicUsize, Ordering};

pub struct Arc<T> {
    ptr: NonNull<ArcInner<T>>,
    _marker: PhantomData<ArcInner<T>>,
}

pub struct ArcInner<T> {
    rc: AtomicUsize,
    data: T,
}

impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }

    fn inner(&self) -> &ArcInner<T> {
        // This unsafety is okay because while this Arc is alive, we're
        // guaranteed that the inner pointer is valid. Also, ArcInner<T> is
        // Sync if T is Sync.
        unsafe { self.ptr.as_ref() }
    }
}

unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        // Using a relaxed ordering is alright here as knowledge of the
        // original reference prevents other threads from wrongly deleting
        // the object.
        self.inner().rc.fetch_add(1, Ordering::Relaxed);
        Self {
            ptr: self.ptr,
            _marker: PhantomData,
        }
    }
}

impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // This fence is needed to prevent reordering of the use and deletion
        // of the data.
        atomic::fence(Ordering::Acquire);
        // This is safe as we know we have the last pointer to the `ArcInner`
        // and that its pointer is valid.
        unsafe { Box::from_raw(self.ptr.as_ptr()); }
    }
}

impl<T> Deref for Arc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner().data
    }
}
```
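As a quick smoke test of the final code, something like this should work (a
hypothetical sketch, not part of the chapter):
```rust,ignore
use std::thread;

let a = Arc::new(31);
let b = a.clone();
let handle = thread::spawn(move || {
    // Read the shared data from another thread.
    assert_eq!(*b, 31);
});
assert_eq!(*a, 31);
handle.join().unwrap();
// `a` and `b` are dropped here, and the ArcInner is freed exactly once.
```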

@@ -0,0 +1,59 @@
# Layout
Let's start by making the layout for our implementation of `Arc`.
We'll need at least two things: a pointer to our data and a shared atomic
reference count. Since we're *building* `Arc`, we can't store the reference
count inside another `Arc`. Instead, we can get the best of both worlds and
store our data and the reference count in one structure and get a pointer to
that. This also prevents having to dereference two separate pointers inside our
code, which may also improve performance. (Yay!)
Naively, it'd look something like this:
```rust,ignore
use std::sync::atomic;

pub struct Arc<T> {
    ptr: *mut ArcInner<T>,
}

pub struct ArcInner<T> {
    rc: atomic::AtomicUsize,
    data: T,
}
```
This would compile; however, it would be incorrect. First of all, the compiler
will infer an overly strict variance: for example, an `Arc<&'static str>`
couldn't be used where an `Arc<&'a str>` was expected. More importantly, it
will give incorrect ownership information to the drop checker, as it will
assume we don't own any values of type `T`. As this is a structure providing
shared ownership of a value, at some point there will be an instance of this
structure that entirely owns its data. See [the chapter on ownership and
lifetimes](ownership.md) for all the details on variance and drop check.
To fix the first problem, we can use `NonNull<T>`. Note that `NonNull<T>` is a
wrapper around a raw pointer that declares that:
* We are covariant over `T`
* Our pointer is never null
To fix the second problem, we can include a `PhantomData` marker containing an
`ArcInner<T>`. This will tell the drop checker that we have some notion of
ownership of a value of `ArcInner<T>` (which itself contains some `T`).
With these changes, our new structure will look like this:
```rust,ignore
use std::marker::PhantomData;
use std::ptr::NonNull;
use std::sync::atomic::AtomicUsize;

pub struct Arc<T> {
    ptr: NonNull<ArcInner<T>>,
    _marker: PhantomData<ArcInner<T>>,
}

pub struct ArcInner<T> {
    rc: AtomicUsize,
    data: T,
}
```
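To make the variance point concrete, here's a hypothetical sketch
(`print_both` is an illustrative function, and it assumes the `Arc::new`
constructor from the next section):
```rust,ignore
fn print_both<'a>(_shared: Arc<&'a str>, _other: &'a str) {
    // ...
}

let shared: Arc<&'static str> = Arc::new("hello");
let local = String::from("world");
// With the invariant `*mut` version, `'a` would be forced to `'static`
// and `&local` wouldn't live long enough. With covariant `NonNull`, the
// `Arc<&'static str>` can shrink to `Arc<&'a str>` and this compiles:
print_both(shared, &local);
```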

@@ -0,0 +1,13 @@
# Implementing Arc
In this section, we'll be implementing a simpler version of `std::sync::Arc`.
As with [the implementation of `Vec` we made earlier](vec.md), we won't be
taking advantage of as many optimizations, intrinsics, or unstable features as
the standard library does.
This implementation is loosely based on the standard library's (technically
taken from `alloc::sync` in 1.49, as that's where it's actually implemented),
but it does not support weak references at the moment, as they make the
implementation slightly more complex.
Please note that this section is still very much a work in progress.