diff --git a/src/atomics/atomics.md b/src/atomics/atomics.md
index 7402d38..c28310a 100644
--- a/src/atomics/atomics.md
+++ b/src/atomics/atomics.md
@@ -30,9 +30,9 @@ three main factors at play here:
   your program at a moment's notice.
 
 The C++ memory model is fundamentally about trying to bridge the gap between
-these three, allowing users to write code for a logical and consistent abstract
-machine while the compiler and hardware deal with the madness underneath that
-makes it run fast.
+these three, allowing users to write code for a logical and consistent Abstract
+Machine (AM for short) while the compiler and hardware deal with the madness
+underneath that makes it run fast.
 
 ### Compiler Reordering
 
@@ -118,136 +118,5 @@ programming:
   incorrect. If possible, concurrent algorithms should be tested on
   weakly-ordered hardware.
 
----
-
-## Data Accesses
-
-The C++ memory model attempts to bridge the gap by allowing us to talk about the
-*causality* of our program. Generally, this is by establishing a *happens
-before* relationship between parts of the program and the threads that are
-running them. This gives the hardware and compiler room to optimize the program
-more aggressively where a strict happens-before relationship isn't established,
-but forces them to be more careful where one is established. The way we
-communicate these relationships are through *data accesses* and *atomic
-accesses*.
-
-Data accesses are the bread-and-butter of the programming world. They are
-fundamentally unsynchronized and compilers are free to aggressively optimize
-them. In particular, data accesses are free to be reordered by the compiler on
-the assumption that the program is single-threaded. The hardware is also free to
-propagate the changes made in data accesses to other threads as lazily and
-inconsistently as it wants. Most critically, data accesses are how data races
-happen. Data accesses are very friendly to the hardware and compiler, but as
-we've seen they offer *awful* semantics to try to write synchronized code with.
-Actually, that's too weak.
-
-**It is literally impossible to write correct synchronized code using only data
-accesses.**
-
-Atomic accesses are how we tell the hardware and compiler that our program is
-multi-threaded. Each atomic access can be marked with an *ordering* that
-specifies what kind of relationship it establishes with other accesses. In
-practice, this boils down to telling the compiler and hardware certain things
-they *can't* do. For the compiler, this largely revolves around re-ordering of
-instructions. For the hardware, this largely revolves around how writes are
-propagated to other threads. The set of orderings Rust exposes are:
-
-* Sequentially Consistent (SeqCst)
-* Release
-* Acquire
-* Relaxed
-
-(Note: We explicitly do not expose the C++ *consume* ordering)
-
-TODO: negative reasoning vs positive reasoning? TODO: "can't forget to
-synchronize"
-
-## Sequentially Consistent
-
-Sequentially Consistent is the most powerful of all, implying the restrictions
-of all other orderings. Intuitively, a sequentially consistent operation
-cannot be reordered: all accesses on one thread that happen before and after a
-SeqCst access stay before and after it. A data-race-free program that uses
-only sequentially consistent atomics and data accesses has the very nice
-property that there is a single global execution of the program's instructions
-that all threads agree on. This execution is also particularly nice to reason
-about: it's just an interleaving of each thread's individual executions. This
-does not hold if you start using the weaker atomic orderings.
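To get a feel for what that guarantee buys you, here is a minimal sketch of the classic "store buffering" pattern (the statics `X` and `Y` and the two-thread setup are purely illustrative): because all four SeqCst operations take part in one global order, at least one thread has to observe the other's store. With `Relaxed`, or even acquire/release, both loads returning `false` would be a permitted outcome.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Two independent flags, one written by each thread.
static X: AtomicBool = AtomicBool::new(false);
static Y: AtomicBool = AtomicBool::new(false);

fn main() {
    let a = thread::spawn(|| {
        X.store(true, Ordering::SeqCst);
        Y.load(Ordering::SeqCst) // did we see the other thread's store?
    });
    let b = thread::spawn(|| {
        Y.store(true, Ordering::SeqCst);
        X.load(Ordering::SeqCst) // did we see the other thread's store?
    });

    let (saw_y, saw_x) = (a.join().unwrap(), b.join().unwrap());

    // Every SeqCst operation participates in a single global order, so the
    // two stores cannot both be "invisible": at least one load returns true.
    // Weaker orderings would allow (false, false).
    assert!(saw_y || saw_x);
}
```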
-
-The relative developer-friendliness of sequential consistency doesn't come for
-free. Even on strongly-ordered platforms sequential consistency involves
-emitting memory fences.
-
-In practice, sequential consistency is rarely necessary for program correctness.
-However sequential consistency is definitely the right choice if you're not
-confident about the other memory orders. Having your program run a bit slower
-than it needs to is certainly better than it running incorrectly! It's also
-mechanically trivial to downgrade atomic operations to have a weaker
-consistency later on. Just change `SeqCst` to `Relaxed` and you're done! Of
-course, proving that this transformation is *correct* is a whole other matter.
-
-## Acquire-Release
-
-Acquire and Release are largely intended to be paired. Their names hint at their
-use case: they're perfectly suited for acquiring and releasing locks, and
-ensuring that critical sections don't overlap.
-
-Intuitively, an acquire access ensures that every access after it stays after
-it. However operations that occur before an acquire are free to be reordered to
-occur after it. Similarly, a release access ensures that every access before it
-stays before it. However operations that occur after a release are free to be
-reordered to occur before it.
-
-When thread A releases a location in memory and then thread B subsequently
-acquires *the same* location in memory, causality is established. Every write
-(including non-atomic and relaxed atomic writes) that happened before A's
-release will be observed by B after its acquisition. However no causality is
-established with any other threads. Similarly, no causality is established
-if A and B access *different* locations in memory.
-
-Basic use of release-acquire is therefore simple: you acquire a location of
-memory to begin the critical section, and then release that location to end it.
-For instance, a simple spinlock might look like:
-
-```rust
-use std::sync::Arc;
-use std::sync::atomic::{AtomicBool, Ordering};
-use std::thread;
-
-fn main() {
-    let lock = Arc::new(AtomicBool::new(false)); // value answers "am I locked?"
-
-    // ... distribute lock to threads somehow ...
-
-    // Try to acquire the lock by setting it to true
-    while lock.compare_and_swap(false, true, Ordering::Acquire) { }
-    // broke out of the loop, so we successfully acquired the lock!
-
-    // ... scary data accesses ...
-
-    // ok we're done, release the lock
-    lock.store(false, Ordering::Release);
-}
-```
-
-On strongly-ordered platforms most accesses have release or acquire semantics,
-making release and acquire often totally free. This is not the case on
-weakly-ordered platforms.
-
-## Relaxed
-
-Relaxed accesses are the absolute weakest. They can be freely re-ordered and
-provide no happens-before relationship. Still, relaxed operations are still
-atomic. That is, they don't count as data accesses and any read-modify-write
-operations done to them occur atomically. Relaxed operations are appropriate for
-things that you definitely want to happen, but don't particularly otherwise care
-about. For instance, incrementing a counter can be safely done by multiple
-threads using a relaxed `fetch_add` if you're not using the counter to
-synchronize any other accesses.
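To make that counter idea concrete, here is a minimal sketch (the `COUNTER` static and the thread/iteration counts are arbitrary choices): the increments themselves are `Relaxed` because the counter isn't used to synchronize anything else, and it's the joins at the end that make the final value reliably visible to the main thread.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// A shared counter that isn't used to synchronize any other data.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..1000 {
                    // Atomic, so no increments are lost, but Relaxed:
                    // it establishes no happens-before relationship.
                    COUNTER.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // The joins (not the Relaxed ordering) are what guarantee the main
    // thread sees every increment here.
    assert_eq!(COUNTER.load(Ordering::Relaxed), 4 * 1000);
}
```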
-
-There's rarely a benefit in making an operation relaxed on strongly-ordered
-platforms, since they usually provide release-acquire semantics anyway. However
-relaxed operations can be cheaper on weakly-ordered platforms.
-
 [C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf
 [C++-model]: https://en.cppreference.com/w/cpp/atomic/memory_order