clarify atomics

pull/10/head
Alexis Beingessner 10 years ago committed by Manish Goregaokar
parent e48ef911f0
commit ebee2651ec

@@ -10,8 +10,8 @@ C.

Trying to fully explain the model in this book is fairly hopeless. It's defined
in terms of madness-inducing causality graphs that require a full book to properly
understand in a practical way. If you want all the nitty-gritty details, you
should check out [C's specification (Section 7.17)][C11-model]. Still, we'll try
to cover the basics and some of the problems Rust developers face.

The C11 memory model is fundamentally about trying to bridge the gap between
the semantics we want, the optimizations compilers want, and the inconsistent
@@ -127,7 +127,8 @@ to propagate the changes made in data accesses to other threads

as lazily and inconsistently as it wants. Most critically, data accesses are
how data races happen. Data accesses are very friendly to the hardware and
compiler, but as we've seen they offer *awful* semantics to try to
write synchronized code with. Actually, that's too weak. *It is literally
impossible to write correct synchronized code using only data accesses*.

Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with
@@ -146,29 +147,33 @@ exposes are:

(Note: We explicitly do not expose the C11 *consume* ordering)

TODO: negative reasoning vs positive reasoning?
TODO: "can't forget to synchronize"

# Sequentially Consistent

Sequentially Consistent is the most powerful of all, implying the restrictions
of all other orderings. Intuitively, a sequentially consistent operation *cannot*
be reordered: all accesses on one thread that happen before and after it *stay*
before and after it. A data-race-free program that uses only sequentially consistent
atomics and data accesses has the very nice property that there is a single global
execution of the program's instructions that all threads agree on. This execution
is also particularly nice to reason about: it's just an interleaving of each thread's
individual executions. This *does not* hold if you start using the weaker atomic
orderings.

The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms, sequential consistency involves
emitting memory fences.

In practice, sequential consistency is rarely necessary for program correctness.
However, sequential consistency is definitely the right choice if you're not
confident about the other memory orders. Having your program run a bit slower
than it needs to is certainly better than it running incorrectly! It's also
*mechanically* trivial to downgrade atomic operations to have a weaker
consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're done! Of
course, proving that this transformation is *correct* is a whole other matter.
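
To make the "single global execution" property concrete, here is a minimal
sketch (not part of the original chapter) of the classic store-buffering test
using only sequentially consistent operations. The statics `X` and `Y` and the
thread structure are illustrative assumptions:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Two flags, one per thread. Statics keep the sketch short; Arcs work too.
static X: AtomicBool = AtomicBool::new(false);
static Y: AtomicBool = AtomicBool::new(false);

fn main() {
    let t1 = thread::spawn(|| {
        X.store(true, Ordering::SeqCst);
        // SeqCst puts this load at a definite point in the single global
        // order, somewhere after our store to X.
        Y.load(Ordering::SeqCst)
    });
    let t2 = thread::spawn(|| {
        Y.store(true, Ordering::SeqCst);
        X.load(Ordering::SeqCst)
    });

    let r1 = t1.join().unwrap();
    let r2 = t2.join().unwrap();

    // In the single global interleaving, one of the two stores comes first,
    // so at least one thread must observe the other's store.
    assert!(r1 || r2);
}
```

Under `SeqCst` both loads returning `false` is impossible; under
`Acquire`/`Release` or `Relaxed` it would be a legal outcome. This is exactly
the kind of guarantee you have to re-prove when you downgrade orderings.
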
@@ -179,20 +184,42 @@ Acquire and Release are largely intended to be paired. Their names hint at

their use case: they're perfectly suited for acquiring and releasing locks,
and ensuring that critical sections don't overlap.

Intuitively, an acquire access ensures that every access after it *stays* after
it. However operations that occur before an acquire are free to be reordered to
occur after it. Similarly, a release access ensures that every access before it
*stays* before it. However operations that occur after a release are free to
be reordered to occur before it.

When thread A releases a location in memory and then thread B subsequently
acquires *the same* location in memory, causality is established. Every write
that happened *before* A's release will be observed by B *after* its acquire.
However no causality is established with any other threads. Similarly, no
causality is established if A and B access *different* locations in memory.

Basic use of release-acquire is therefore simple: you acquire a location of
memory to begin the critical section, and then release that location to end it.
For instance, a simple spinlock might look like:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let lock = Arc::new(AtomicBool::new(true)); // value answers "am I locked?"

    // ... distribute lock to threads somehow ...

    // Try to acquire the lock by setting it to false
    while !lock.compare_and_swap(true, false, Ordering::Acquire) { }
    // broke out of the loop, so we successfully acquired the lock!

    // ... scary data accesses ...

    // ok we're done, release the lock
    lock.store(true, Ordering::Release);
}
```
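
As a follow-up, here is one hypothetical way to "distribute lock to threads
somehow": clone the `Arc` into each spawned worker and run the same
acquire/release protocol. The worker count and loop body are assumptions made
for illustration:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let lock = Arc::new(AtomicBool::new(true)); // true means "unlocked"

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let lock = lock.clone();
            thread::spawn(move || {
                // Spin until we win the swap from unlocked (true) to locked (false)
                while !lock.compare_and_swap(true, false, Ordering::Acquire) { }
                // ... critical section: protected data accesses go here ...
                lock.store(true, Ordering::Release);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```
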
On strongly-ordered platforms most accesses have release or acquire semantics,
making release and acquire often totally free. This is not the case on
@@ -205,10 +232,12 @@ weakly-ordered platforms.

Relaxed accesses are the absolute weakest. They can be freely re-ordered and
provide no happens-before relationship. Still, relaxed operations *are* still
atomic. That is, they don't count as data accesses and any read-modify-write
operations done to them occur atomically. Relaxed operations are appropriate for
things that you definitely want to happen, but don't otherwise particularly care
about. For instance, incrementing a counter can be safely done by multiple
threads using a relaxed `fetch_add` if you're not using the counter to
synchronize any other accesses.
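
As a sketch of that counter pattern (not from the original text), the following
uses a relaxed `fetch_add` from several threads; the static `COUNTER`, the
thread count, and the iteration count are arbitrary choices:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// A shared event counter. Relaxed is enough because nothing else is
// synchronized through this value; each increment just needs to be atomic.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let handles: Vec<_> = (0..8)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..1000 {
                    // A single atomic read-modify-write: no increments are
                    // lost even though the ordering is Relaxed.
                    COUNTER.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // join() establishes the happens-before edge that makes the final value
    // visible here; the Relaxed ordering itself provides no such guarantee.
    assert_eq!(COUNTER.load(Ordering::Relaxed), 8 * 1000);
}
```
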
There's rarely a benefit in making an operation relaxed on strongly-ordered
platforms, since they usually provide release-acquire semantics anyway. However
@@ -219,4 +248,4 @@ relaxed operations can be cheaper on weakly-ordered platforms.

[C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf
[C11-model]: http://www.open-std.org/jtc1/sc22/wg14/www/standards.html#9899
