diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index aac99d5..c732726 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -55,6 +55,13 @@
     * [Handling Zero-Sized Types](vec-zsts.md)
     * [Final Code](vec-final.md)
 * [Implementing Arc and Mutex](arc-and-mutex.md)
+    * [Arc](arc.md)
+        * [Layout](arc-layout.md)
+        * [Base Code](arc-base.md)
+        * [Cloning](arc-clone.md)
+        * [Dropping](arc-drop.md)
+        * [Deref](arc-deref.md)
+        * [Final Code](arc-final.md)
 * [FFI](ffi.md)
 * [Beneath `std`](beneath-std.md)
     * [#[panic_handler]](panic-handler.md)
diff --git a/src/arc-and-mutex.md b/src/arc-and-mutex.md
index fedc7b8..f6c1583 100644
--- a/src/arc-and-mutex.md
+++ b/src/arc-and-mutex.md
@@ -4,4 +4,4 @@
 Knowing the theory is all fine and good, but the *best* way to understand
 something is to use it. To better understand atomics and interior mutability,
 we'll be implementing versions of the standard library's Arc and Mutex types.
-TODO: ALL OF THIS OMG
+TODO: Mutex
diff --git a/src/arc-base.md b/src/arc-base.md
new file mode 100644
index 0000000..c561e19
--- /dev/null
+++ b/src/arc-base.md
@@ -0,0 +1,104 @@
# Base Code

Now that we've decided the layout for our implementation of `Arc`, let's create
some basic code.

## Constructing the Arc

We'll first need a way to construct an `Arc<T>`.

This is pretty simple, as we just need to box the `ArcInner<T>` and get a
`NonNull<ArcInner<T>>` pointer to it.

We start the reference counter at 1, as that first reference is the current
pointer. As the `Arc` is cloned or dropped, it is updated. It is okay to call
`unwrap()` on the `Option` returned by `NonNull::new`, as `Box::into_raw`
guarantees that the pointer it returns is not null.

```rust,ignore
impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }
}
```

## Send and Sync

Since we're building a concurrency primitive, we'll need to be able to send it
across threads. Thus, we can implement the `Send` and `Sync` marker traits. For
more information on these, see [the section on `Send` and
`Sync`](send-and-sync.md).

This is okay because:
* You can only get a mutable reference to the value inside an `Arc` if it is
  the only `Arc` referencing that data
* We use atomic counters for reference counting

```rust,ignore
unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}
```

We need the bound `T: Sync + Send` because, without it, it would be possible to
share values that are thread-unsafe across a thread boundary via an `Arc`,
which could cause data races or unsoundness.

## Getting the `ArcInner`

We'll now want to make a private helper function, `inner()`, which just returns
the dereferenced `NonNull` pointer.

To dereference the `NonNull<T>` pointer into a `&T`, we can call
`NonNull::as_ref`. This is unsafe, unlike the typical `as_ref` function, so we
must call it like this:

```rust,ignore
// inside the impl<T> Arc<T> block from before:
fn inner(&self) -> &ArcInner<T> {
    unsafe { self.ptr.as_ref() }
}
```

This unsafety is okay because while this `Arc` is alive, we're guaranteed that
the inner pointer is valid.
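As a quick sanity check, here's a small, hypothetical snippet (not part of the
chapter's code) exercising what we have so far — a freshly constructed `Arc`
should report a reference count of exactly 1:

```rust,ignore
use std::sync::atomic::Ordering;

let arc = Arc::new(42);
// We hold the only reference, so the count must be 1.
assert_eq!(arc.inner().rc.load(Ordering::SeqCst), 1);
assert_eq!(arc.inner().data, 42);
```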
Here's all the code from this section:

```rust,ignore
impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }

    fn inner(&self) -> &ArcInner<T> {
        // This unsafety is okay because while this Arc is alive, we're
        // guaranteed that the inner pointer is valid.
        unsafe { self.ptr.as_ref() }
    }
}

unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}
```
diff --git a/src/arc-clone.md b/src/arc-clone.md
new file mode 100644
index 0000000..42325f0
--- /dev/null
+++ b/src/arc-clone.md
@@ -0,0 +1,60 @@
# Cloning

Now that we've got some basic code set up, we'll need a way to clone the `Arc`.

Basically, we need to:
1. Get the `ArcInner` value of the `Arc`
2. Increment the atomic reference count
3. Construct a new instance of the `Arc` from the inner pointer

Using the `inner()` helper from the last section, we can update the atomic
reference count as follows:

```rust,ignore
self.inner().rc.fetch_add(1, Ordering::Relaxed);
```

As described in [the standard library's implementation of `Arc` cloning][2]:

> Using a relaxed ordering is alright here, as knowledge of the original
> reference prevents other threads from erroneously deleting the object.
>
> As explained in the [Boost documentation][1]:
> > Increasing the reference counter can always be done with
> > memory_order_relaxed: New references to an object can only be formed from an
> > existing reference, and passing an existing reference from one thread to
> > another must already provide any required synchronization.
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html

[2]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1171-L1181

We'll need to add another import to use `Ordering`:

```rust,ignore
use std::sync::atomic::Ordering;
```

It is possible that in some contrived programs (e.g. ones that use
`mem::forget` on `Arc`s) the reference count could overflow, but no reasonable
program needs to worry about this.

Then, we need to return a new instance of the `Arc`:

```rust,ignore
Self {
    ptr: self.ptr,
    _marker: PhantomData,
}
```

Now, let's wrap this all up inside the `Clone` implementation:

```rust,ignore
use std::sync::atomic::Ordering;

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        // Using a relaxed ordering is alright here as knowledge of the original
        // reference prevents other threads from wrongly deleting the object.
        self.inner().rc.fetch_add(1, Ordering::Relaxed);
        Self {
            ptr: self.ptr,
            _marker: PhantomData,
        }
    }
}
```
diff --git a/src/arc-deref.md b/src/arc-deref.md
new file mode 100644
index 0000000..6dd04fe
--- /dev/null
+++ b/src/arc-deref.md
@@ -0,0 +1,25 @@
# Deref

Alright. We now have a way to make, clone, and destroy `Arc`s, but how do we get
to the data inside?

What we need now is an implementation of `Deref`.

We'll need to import the trait:

```rust,ignore
use std::ops::Deref;
```

And here's the implementation:

```rust,ignore
impl<T> Deref for Arc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner().data
    }
}
```

Pretty simple, eh?
This simply dereferences the `NonNull` pointer to the `ArcInner<T>`, then gets
a reference to the data inside.
diff --git a/src/arc-drop.md b/src/arc-drop.md
new file mode 100644
index 0000000..893a15a
--- /dev/null
+++ b/src/arc-drop.md
@@ -0,0 +1,92 @@
# Dropping

We now need a way to decrease the reference count and drop the data once it is
low enough; otherwise, the data will live forever on the heap.

To do this, we can implement `Drop`.

Basically, we need to:
1. Get the `ArcInner` value of the `Arc`
2. Decrement the reference count
3. If there is only one reference remaining to the data, then:
4. Insert an atomic fence to prevent reordering of the use of the data and its
   deletion, then:
5. Drop the inner data

Now, we need to decrement the reference count. We can also bring in step 3 by
returning if the reference count is not equal to 1 (as `fetch_sub` returns the
previous value):

```rust,ignore
if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
    return;
}
```

We then need to create an atomic fence to prevent reordering of the use of the
data and deletion of the data. As described in [the standard library's
implementation of `Arc`][3]:

> This fence is needed to prevent reordering of use of the data and deletion of
> the data. Because it is marked `Release`, the decreasing of the reference
> count synchronizes with this `Acquire` fence. This means that use of the data
> happens before decreasing the reference count, which happens before this
> fence, which happens before the deletion of the data.
>
> As explained in the [Boost documentation][1],
>
> > It is important to enforce any possible access to the object in one
> > thread (through an existing reference) to *happen before* deleting
> > the object in a different thread. This is achieved by a "release"
> > operation after dropping a reference (any access to the object
> > through this reference must obviously happened before), and an
> > "acquire" operation before deleting the object.
>
> In particular, while the contents of an Arc are usually immutable, it's
> possible to have interior writes to something like a Mutex. Since a Mutex
> is not acquired when it is deleted, we can't rely on its synchronization logic
> to make writes in thread A visible to a destructor running in thread B.
>
> Also note that the Acquire fence here could probably be replaced with an
> Acquire load, which could improve performance in highly-contended situations.
> See [2].
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html
> [2]: https://github.com/rust-lang/rust/pull/41714

[3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467

To do this, we do the following:

```rust,ignore
atomic::fence(Ordering::Acquire);
```

We'll need to import `std::sync::atomic` itself:

```rust,ignore
use std::sync::atomic;
```

Finally, we can drop the data itself. We use `Box::from_raw` to drop the boxed
`ArcInner<T>` and its data. This takes a `*mut T` and not a `NonNull<T>`, so we
must convert using `NonNull::as_ptr`.

```rust,ignore
unsafe { Box::from_raw(self.ptr.as_ptr()); }
```

This is safe as we know we have the last pointer to the `ArcInner` and that its
pointer is valid.
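As an aside, the standard library comment quoted above suggests that the
`Acquire` fence could probably be replaced with an `Acquire` load of the
counter. A rough sketch of what that alternative might look like (not what we
use here):

```rust,ignore
if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
    return;
}
// Synchronize with the final decrement by re-loading the counter with
// Acquire ordering instead of issuing a standalone fence.
self.inner().rc.load(Ordering::Acquire);
unsafe { Box::from_raw(self.ptr.as_ptr()); }
```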
Now, let's wrap this all up inside the `Drop` implementation:

```rust,ignore
impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // This fence is needed to prevent reordering of the use and deletion
        // of the data.
        atomic::fence(Ordering::Acquire);
        // This is safe as we know we have the last pointer to the `ArcInner`
        // and that its pointer is valid.
        unsafe { Box::from_raw(self.ptr.as_ptr()); }
    }
}
```
diff --git a/src/arc-final.md b/src/arc-final.md
new file mode 100644
index 0000000..41d364c
--- /dev/null
+++ b/src/arc-final.md
@@ -0,0 +1,80 @@
# Final Code

Here's the final code, with some added comments and re-ordered imports:

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{self, AtomicUsize, Ordering};

pub struct Arc<T> {
    ptr: NonNull<ArcInner<T>>,
    _marker: PhantomData<ArcInner<T>>,
}

pub struct ArcInner<T> {
    rc: AtomicUsize,
    data: T,
}

impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }

    fn inner(&self) -> &ArcInner<T> {
        // This unsafety is okay because while this Arc is alive, we're
        // guaranteed that the inner pointer is valid. Also, ArcInner is
        // Sync if T is Sync.
        unsafe { self.ptr.as_ref() }
    }
}

unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        // Using a relaxed ordering is alright here as knowledge of the original
        // reference prevents other threads from wrongly deleting the object.
        self.inner().rc.fetch_add(1, Ordering::Relaxed);
        Self {
            ptr: self.ptr,
            _marker: PhantomData,
        }
    }
}

impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // This fence is needed to prevent reordering of the use and deletion
        // of the data.
        atomic::fence(Ordering::Acquire);
        // This is safe as we know we have the last pointer to the `ArcInner`
        // and that its pointer is valid.
        unsafe { Box::from_raw(self.ptr.as_ptr()); }
    }
}

impl<T> Deref for Arc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner().data
    }
}
```
diff --git a/src/arc-layout.md b/src/arc-layout.md
new file mode 100644
index 0000000..a8a4c86
--- /dev/null
+++ b/src/arc-layout.md
@@ -0,0 +1,59 @@
# Layout

Let's start by making the layout for our implementation of `Arc`.

We'll need at least two things: a pointer to our data and a shared atomic
reference count. Since we're *building* `Arc`, we can't store the reference
count inside another `Arc`. Instead, we can store the data and the reference
count together in one heap-allocated structure and keep a pointer to that.
This also saves us from dereferencing two separate pointers inside our code,
which may also improve performance. (Yay!)

Naively, it'd look something like this:

```rust,ignore
use std::sync::atomic;

pub struct Arc<T> {
    ptr: *mut ArcInner<T>,
}

pub struct ArcInner<T> {
    rc: atomic::AtomicUsize,
    data: T,
}
```

This would compile; however, it would be incorrect. First of all, the compiler
will give us too strict variance.
For example, an `Arc<&'static str>` couldn't be used where an `Arc<&'a str>`
was expected. More importantly, it will give incorrect ownership information
to the drop checker, as it will assume we don't own any values of type `T`. As
this is a structure providing shared ownership of a value, at some point there
will be an instance of this structure that entirely owns its data. See [the
chapter on ownership and lifetimes](ownership.md) for all the details on
variance and drop check.

To fix the first problem, we can use `NonNull<T>`. Note that `NonNull<T>` is a
wrapper around a raw pointer that declares that:
* We are covariant over `T`
* Our pointer is never null

To fix the second problem, we can include a `PhantomData` marker containing an
`ArcInner<T>`. This will tell the drop checker that we have some notion of
ownership of a value of `ArcInner<T>` (which itself contains some `T`).

With these changes, our new structure will look like this:

```rust,ignore
use std::marker::PhantomData;
use std::ptr::NonNull;
use std::sync::atomic::AtomicUsize;

pub struct Arc<T> {
    ptr: NonNull<ArcInner<T>>,
    _marker: PhantomData<ArcInner<T>>,
}

pub struct ArcInner<T> {
    rc: AtomicUsize,
    data: T,
}
```
diff --git a/src/arc.md b/src/arc.md
new file mode 100644
index 0000000..580e662
--- /dev/null
+++ b/src/arc.md
@@ -0,0 +1,13 @@
# Implementing Arc

In this section, we'll be implementing a simpler version of `std::sync::Arc`.
As with [the implementation of `Vec` we made earlier](vec.md), we won't be
taking advantage of as many optimizations, intrinsics, or unstable code as the
standard library does.

This implementation is loosely based on the standard library's implementation
(technically taken from `alloc::sync` in 1.49, as that's where it's actually
implemented), but it will not support weak references at the moment, as they
make the implementation slightly more complex.

Please note that this section is very much a work in progress at the moment.
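As a rough preview of where we're headed, the finished type should behave much
like the standard `Arc`: cheap to clone, sendable to other threads, and
transparently dereferenceable to the inner data. Here's a small, hypothetical
usage sketch, assuming the `Arc` we build over the following sections is in
scope:

```rust,ignore
use std::thread;

let data = Arc::new(vec![1, 2, 3]);

// Each thread gets its own handle; the underlying Vec is shared and is only
// freed once the last handle is dropped.
let handles: Vec<_> = (0..4)
    .map(|_| {
        let data = data.clone();
        thread::spawn(move || {
            // Deref lets us use the Arc as if it were the Vec itself.
            assert_eq!(data.len(), 3);
        })
    })
    .collect();

for handle in handles {
    handle.join().unwrap();
}
```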