From 1fe545747967ff9bc196ba6fea6a7a26fed9cd5b Mon Sep 17 00:00:00 2001 From: Yuki Okushi Date: Tue, 8 Jun 2021 10:10:39 +0900 Subject: [PATCH 1/2] Fix minor style issues --- src/README.md | 2 +- src/SUMMARY.md | 100 ++++++++++++++++++------------------- src/aliasing.md | 8 +-- src/arc-clone.md | 10 +++- src/arc-drop.md | 10 +++- src/arc-final.md | 1 + src/arc-layout.md | 3 ++ src/atomics.md | 38 +++----------- src/beneath-std.md | 8 +-- src/casts.md | 34 ++++++------- src/coercions.md | 18 +++---- src/conversions.md | 1 - src/dropck.md | 6 +-- src/exception-safety.md | 8 --- src/exotic-sizes.md | 25 ++-------- src/ffi.md | 29 ++++++----- src/leaking.md | 8 --- src/lifetime-mismatch.md | 5 +- src/lifetimes.md | 14 ++---- src/other-reprs.md | 27 ++-------- src/ownership.md | 1 - src/panic-handler.md | 2 +- src/subtyping.md | 7 +-- src/transmutes.md | 7 ++- src/unbounded-lifetimes.md | 1 - src/unchecked-uninit.md | 11 ++-- src/vec-into-iter.md | 1 - src/vec-zsts.md | 6 --- src/what-unsafe-does.md | 32 ++++++------ src/working-with-unsafe.md | 1 - 30 files changed, 169 insertions(+), 255 deletions(-) diff --git a/src/README.md b/src/README.md index cc50b5f..d9770de 100644 --- a/src/README.md +++ b/src/README.md @@ -1,6 +1,6 @@ # The Rustonomicon -#### The Dark Arts of Unsafe Rust +## The Dark Arts of Unsafe Rust > THE KNOWLEDGE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT diff --git a/src/SUMMARY.md b/src/SUMMARY.md index fbd9cc0..72f92f4 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -3,64 +3,64 @@ [Introduction](README.md) * [Meet Safe and Unsafe](meet-safe-and-unsafe.md) - * [How Safe and Unsafe Interact](safe-unsafe-meaning.md) - * [What Unsafe Can Do](what-unsafe-does.md) - * [Working with Unsafe](working-with-unsafe.md) + * [How Safe and Unsafe Interact](safe-unsafe-meaning.md) + * [What Unsafe Can Do](what-unsafe-does.md) + * [Working with Unsafe](working-with-unsafe.md) * [Data Layout](data.md) - * [repr(Rust)](repr-rust.md) - * [Exotically Sized Types](exotic-sizes.md) - * [Other reprs](other-reprs.md) + * [repr(Rust)](repr-rust.md) + * [Exotically Sized Types](exotic-sizes.md) + * [Other reprs](other-reprs.md) * [Ownership](ownership.md) - * [References](references.md) - * [Aliasing](aliasing.md) - * [Lifetimes](lifetimes.md) - * [Limits of Lifetimes](lifetime-mismatch.md) - * [Lifetime Elision](lifetime-elision.md) - * [Unbounded Lifetimes](unbounded-lifetimes.md) - * [Higher-Rank Trait Bounds](hrtb.md) - * [Subtyping and Variance](subtyping.md) - * [Drop Check](dropck.md) - * [PhantomData](phantom-data.md) - * [Splitting Borrows](borrow-splitting.md) + * [References](references.md) + * [Aliasing](aliasing.md) + * [Lifetimes](lifetimes.md) + * [Limits of Lifetimes](lifetime-mismatch.md) + * [Lifetime Elision](lifetime-elision.md) + * [Unbounded Lifetimes](unbounded-lifetimes.md) + * [Higher-Rank Trait Bounds](hrtb.md) + * [Subtyping and Variance](subtyping.md) + * [Drop Check](dropck.md) + * [PhantomData](phantom-data.md) + * [Splitting Borrows](borrow-splitting.md) * [Type Conversions](conversions.md) - * [Coercions](coercions.md) - * [The Dot Operator](dot-operator.md) - * [Casts](casts.md) - * [Transmutes](transmutes.md) + * [Coercions](coercions.md) + * [The Dot Operator](dot-operator.md) + * [Casts](casts.md) + * [Transmutes](transmutes.md) * [Uninitialized Memory](uninitialized.md) - * [Checked](checked-uninit.md) - * [Drop Flags](drop-flags.md) - * [Unchecked](unchecked-uninit.md) + * [Checked](checked-uninit.md) + * [Drop Flags](drop-flags.md) + * [Unchecked](unchecked-uninit.md) * [Ownership Based Resource Management](obrm.md) - * [Constructors](constructors.md) - * [Destructors](destructors.md) - * [Leaking](leaking.md) + * [Constructors](constructors.md) + * [Destructors](destructors.md) + * [Leaking](leaking.md) * [Unwinding](unwinding.md) - * [Exception Safety](exception-safety.md) - * [Poisoning](poisoning.md) + * [Exception Safety](exception-safety.md) + * [Poisoning](poisoning.md) * [Concurrency](concurrency.md) - * [Races](races.md) - * [Send and Sync](send-and-sync.md) - * [Atomics](atomics.md) + * [Races](races.md) + * [Send and Sync](send-and-sync.md) + * [Atomics](atomics.md) * [Implementing Vec](vec.md) - * [Layout](vec-layout.md) - * [Allocating](vec-alloc.md) - * [Push and Pop](vec-push-pop.md) - * [Deallocating](vec-dealloc.md) - * [Deref](vec-deref.md) - * [Insert and Remove](vec-insert-remove.md) - * [IntoIter](vec-into-iter.md) - * [RawVec](vec-raw.md) - * [Drain](vec-drain.md) - * [Handling Zero-Sized Types](vec-zsts.md) - * [Final Code](vec-final.md) + * [Layout](vec-layout.md) + * [Allocating](vec-alloc.md) + * [Push and Pop](vec-push-pop.md) + * [Deallocating](vec-dealloc.md) + * [Deref](vec-deref.md) + * [Insert and Remove](vec-insert-remove.md) + * [IntoIter](vec-into-iter.md) + * [RawVec](vec-raw.md) + * [Drain](vec-drain.md) + * [Handling Zero-Sized Types](vec-zsts.md) + * [Final Code](vec-final.md) * [Implementing Arc and Mutex](arc-and-mutex.md) - * [Arc](arc.md) - * [Layout](arc-layout.md) - * [Base Code](arc-base.md) - * [Cloning](arc-clone.md) - * [Dropping](arc-drop.md) - * [Final Code](arc-final.md) + * [Arc](arc.md) + * [Layout](arc-layout.md) + * [Base Code](arc-base.md) + * [Cloning](arc-clone.md) + * [Dropping](arc-drop.md) + * [Final Code](arc-final.md) * [FFI](ffi.md) * [Beneath `std`](beneath-std.md) - * [#[panic_handler]](panic-handler.md) + * [#[panic_handler]](panic-handler.md) diff --git a/src/aliasing.md b/src/aliasing.md index ebdec9a..d319e41 100644 --- a/src/aliasing.md +++ b/src/aliasing.md @@ -14,10 +14,7 @@ don't happen unless you tell it otherwise. For more details, see the With that said, here's our working definition: variables and pointers *alias* if they refer to overlapping regions of memory. - - - -# Why Aliasing Matters +## Why Aliasing Matters So why should we care about aliasing? @@ -130,6 +127,3 @@ Of course, a full aliasing model for Rust must also take into consideration thin function calls (which may mutate things we don't see), raw pointers (which have no aliasing requirements on their own), and UnsafeCell (which lets the referent of an `&` be mutated). - - - diff --git a/src/arc-clone.md b/src/arc-clone.md index ad98800..96b8bd6 100644 --- a/src/arc-clone.md +++ b/src/arc-clone.md @@ -3,15 +3,18 @@ Now that we've got some basic code set up, we'll need a way to clone the `Arc`. Basically, we need to: + 1. Increment the atomic reference count 2. Construct a new instance of the `Arc` from the inner pointer First, we need to get access to the `ArcInner`: + ```rust,ignore let inner = unsafe { self.ptr.as_ref() }; ``` We can update the atomic reference count as follows: + ```rust,ignore let old_rc = inner.rc.fetch_add(1, Ordering::???); ``` @@ -26,11 +29,13 @@ is described more in [the section on the `Drop` implementation for ordering, see [the section on atomics](atomics.md). Thus, the code becomes this: + ```rust,ignore let old_rc = inner.rc.fetch_add(1, Ordering::Relaxed); ``` We'll need to add another import to use `Ordering`: + ```rust,ignore use std::sync::atomic::Ordering; ``` @@ -54,7 +59,8 @@ probably incredibly degenerate) if the reference count reaches `isize::MAX` probably not about 2 billion threads (or about **9 quintillion** on some 64-bit machines) incrementing the reference count at once. This is what we'll do. -It's pretty simple to implement this behaviour: +It's pretty simple to implement this behavior: + ```rust,ignore if old_rc >= isize::MAX as usize { std::process::abort(); @@ -62,6 +68,7 @@ if old_rc >= isize::MAX as usize { ``` Then, we need to return a new instance of the `Arc`: + ```rust,ignore Self { ptr: self.ptr, @@ -70,6 +77,7 @@ Self { ``` Now, let's wrap this all up inside the `Clone` implementation: + ```rust,ignore use std::sync::atomic::Ordering; diff --git a/src/arc-drop.md b/src/arc-drop.md index 0b3cfe5..ccbb57c 100644 --- a/src/arc-drop.md +++ b/src/arc-drop.md @@ -6,13 +6,15 @@ low enough, otherwise the data will live forever on the heap. To do this, we can implement `Drop`. Basically, we need to: + 1. Decrement the reference count 2. If there is only one reference remaining to the data, then: 3. Atomically fence the data to prevent reordering of the use and deletion of the data -4. Drop the inner data +4. Drop the inner data First, we'll need to get access to the `ArcInner`: + ```rust,ignore let inner = unsafe { self.ptr.as_ref() }; ``` @@ -21,6 +23,7 @@ Now, we need to decrement the reference count. To streamline our code, we can also return if the returned value from `fetch_sub` (the value of the reference count before decrementing it) is not equal to `1` (which happens when we are not the last reference to the data). + ```rust,ignore if inner.rc.fetch_sub(1, Ordering::Relaxed) != 1 { return; @@ -53,17 +56,19 @@ implementation of `Arc`][3]: > Also note that the Acquire fence here could probably be replaced with an > Acquire load, which could improve performance in highly-contended situations. > See [2]. -> +> > [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html > [2]: https://github.com/rust-lang/rust/pull/41714 [3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467 To do this, we do the following: + ```rust,ignore atomic::fence(Ordering::Acquire); ``` We'll need to import `std::sync::atomic` itself: + ```rust,ignore use std::sync::atomic; ``` @@ -80,6 +85,7 @@ This is safe as we know we have the last pointer to the `ArcInner` and that its pointer is valid. Now, let's wrap this all up inside the `Drop` implementation: + ```rust,ignore impl Drop for Arc { fn drop(&mut self) { diff --git a/src/arc-final.md b/src/arc-final.md index 4fcc274..b9c362d 100644 --- a/src/arc-final.md +++ b/src/arc-final.md @@ -1,6 +1,7 @@ # Final Code Here's the final code, with some added comments and re-ordered imports: + ```rust use std::marker::PhantomData; use std::ops::Deref; diff --git a/src/arc-layout.md b/src/arc-layout.md index 55d4793..cbe0ad3 100644 --- a/src/arc-layout.md +++ b/src/arc-layout.md @@ -21,6 +21,7 @@ pointer to the T's allocation, we might as well put the reference count in that same allocation. Naively, it would look something like this: + ```rust,ignore use std::sync::atomic; @@ -45,6 +46,7 @@ all the details on variance and drop check. To fix the first problem, we can use `NonNull`. Note that `NonNull` is a wrapper around a raw pointer that declares that: + * We are variant over `T` * Our pointer is never null @@ -53,6 +55,7 @@ To fix the second problem, we can include a `PhantomData` marker containing an ownership of a value of `ArcInner` (which itself contains some `T`). With these changes we get our final structure: + ```rust,ignore use std::marker::PhantomData; use std::ptr::NonNull; diff --git a/src/atomics.md b/src/atomics.md index 2883107..c663f98 100644 --- a/src/atomics.md +++ b/src/atomics.md @@ -22,10 +22,7 @@ semantics we want, the optimizations compilers want, and the inconsistent chaos our hardware wants. *We* would like to just write programs and have them do exactly what we said but, you know, fast. Wouldn't that be great? - - - -# Compiler Reordering +## Compiler Reordering Compilers fundamentally want to be able to do all sorts of complicated transformations to reduce data dependencies and eliminate dead code. In @@ -54,10 +51,7 @@ able to make these kinds of optimizations, because they can seriously improve performance. On the other hand, we'd also like to be able to depend on our program *doing the thing we said*. - - - -# Hardware Reordering +## Hardware Reordering On the other hand, even if the compiler totally understood what we wanted and respected our wishes, our hardware might instead get us in trouble. Trouble @@ -110,11 +104,7 @@ programming: incorrect. If possible, concurrent algorithms should be tested on weakly-ordered hardware. - - - - -# Data Accesses +## Data Accesses The C++ memory model attempts to bridge the gap by allowing us to talk about the *causality* of our program. Generally, this is by establishing a *happens @@ -156,9 +146,7 @@ propagated to other threads. The set of orderings Rust exposes are: TODO: negative reasoning vs positive reasoning? TODO: "can't forget to synchronize" - - -# Sequentially Consistent +## Sequentially Consistent Sequentially Consistent is the most powerful of all, implying the restrictions of all other orderings. Intuitively, a sequentially consistent operation @@ -182,10 +170,7 @@ mechanically trivial to downgrade atomic operations to have a weaker consistency later on. Just change `SeqCst` to `Relaxed` and you're done! Of course, proving that this transformation is *correct* is a whole other matter. - - - -# Acquire-Release +## Acquire-Release Acquire and Release are largely intended to be paired. Their names hint at their use case: they're perfectly suited for acquiring and releasing locks, and @@ -200,8 +185,8 @@ reordered to occur before it. When thread A releases a location in memory and then thread B subsequently acquires *the same* location in memory, causality is established. Every write (including non-atomic and relaxed atomic writes) that happened before A's -release will be observed by B after its acquisition. However no causality is -established with any other threads. Similarly, no causality is established +release will be observed by B after its acquisition. However no causality is +established with any other threads. Similarly, no causality is established if A and B access *different* locations in memory. Basic use of release-acquire is therefore simple: you acquire a location of @@ -233,10 +218,7 @@ On strongly-ordered platforms most accesses have release or acquire semantics, making release and acquire often totally free. This is not the case on weakly-ordered platforms. - - - -# Relaxed +## Relaxed Relaxed accesses are the absolute weakest. They can be freely re-ordered and provide no happens-before relationship. Still, relaxed operations are still @@ -251,9 +233,5 @@ There's rarely a benefit in making an operation relaxed on strongly-ordered platforms, since they usually provide release-acquire semantics anyway. However relaxed operations can be cheaper on weakly-ordered platforms. - - - - [C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf [C++-model]: https://en.cppreference.com/w/cpp/atomic/memory_order diff --git a/src/beneath-std.md b/src/beneath-std.md index 2c759f0..6f05182 100644 --- a/src/beneath-std.md +++ b/src/beneath-std.md @@ -4,7 +4,7 @@ This section documents (or will document) features that are provided by the stan that `#![no_std]` developers have to deal with (i.e. provide) to build `#![no_std]` binary crates. A (likely incomplete) list of such features is shown below: -- #[lang = "eh_personality"] -- #[lang = "start"] -- #[lang = "termination"] -- #[panic_implementation] +- `#[lang = "eh_personality"]` +- `#[lang = "start"]` +- `#[lang = "termination"]` +- `#[panic_implementation]` diff --git a/src/casts.md b/src/casts.md index 014c100..a851612 100644 --- a/src/casts.md +++ b/src/casts.md @@ -23,18 +23,18 @@ Here's an exhaustive list of all the true casts. For brevity, we will use `*` to denote either a `*const` or `*mut`, and `integer` to denote any integral primitive: - * `*T as *U` where `T, U: Sized` - * `*T as *U` TODO: explain unsized situation - * `*T as integer` - * `integer as *T` - * `number as number` - * `field-less enum as integer` - * `bool as integer` - * `char as integer` - * `u8 as char` - * `&[T; n] as *const T` - * `fn as *T` where `T: Sized` - * `fn as integer` +* `*T as *U` where `T, U: Sized` +* `*T as *U` TODO: explain unsized situation +* `*T as integer` +* `integer as *T` +* `number as number` +* `field-less enum as integer` +* `bool as integer` +* `char as integer` +* `u8 as char` +* `&[T; n] as *const T` +* `fn as *T` where `T: Sized` +* `fn as integer` Note that lengths are not adjusted when casting raw slices - `*const [u16] as *const [u8]` creates a slice that only includes @@ -49,13 +49,13 @@ For numeric casts, there are quite a few cases to consider: * casting from a larger integer to a smaller integer (e.g. u32 -> u8) will truncate * casting from a smaller integer to a larger integer (e.g. u8 -> u32) will - * zero-extend if the source is unsigned - * sign-extend if the source is signed + * zero-extend if the source is unsigned + * sign-extend if the source is signed * casting from a float to an integer will round the float towards zero and produces a "saturating cast" when the float is outside the integer's range - * floats that are too big turn into the largest possible integer - * floats that are too small produce the smallest possible integer - * NaN produces zero + * floats that are too big turn into the largest possible integer + * floats that are too small produce the smallest possible integer + * NaN produces zero * casting from an integer to float will produce the floating point representation of the integer, rounded if necessary (rounding to nearest, ties to even) diff --git a/src/coercions.md b/src/coercions.md index 065a9e1..0a51bb4 100644 --- a/src/coercions.md +++ b/src/coercions.md @@ -12,10 +12,10 @@ Coercion is allowed between the following types: * Transitivity: `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3` * Pointer Weakening: - * `&mut T` to `&T` - * `*mut T` to `*const T` - * `&T` to `*const T` - * `&mut T` to `*mut T` + * `&mut T` to `&T` + * `*mut T` to `*const T` + * `&T` to `*const T` + * `&mut T` to `*mut T` * Unsizing: `T` to `U` if `T` implements `CoerceUnsized` * Deref coercion: Expression `&x` of type `&T` to `&*x` of type `&U` if `T` derefs to `U` (i.e. `T: Deref`) * Non-capturing closure to a function pointer ([RFC 1558], e.g. `|| 8usize` to `fn() -> usize`) @@ -29,11 +29,11 @@ only implemented automatically, and enables the following transformations: * `[T; n]` => `[T]` * `T` => `dyn Trait` where `T: Trait` * `Foo<..., T, ...>` => `Foo<..., U, ...>` where: - * `T: Unsize` - * `Foo` is a struct - * Only the last field of `Foo` has type involving `T` - * `T` is not part of the type of any other fields - * `Bar: Unsize>`, if the last field of `Foo` has type `Bar` + * `T: Unsize` + * `Foo` is a struct + * Only the last field of `Foo` has type involving `T` + * `T` is not part of the type of any other fields + * `Bar: Unsize>`, if the last field of `Foo` has type `Bar` Coercions occur at a *coercion site*. Any location that is explicitly typed will cause a coercion to its type. If inference is necessary, the coercion will diff --git a/src/conversions.md b/src/conversions.md index 388d003..4c29fd8 100644 --- a/src/conversions.md +++ b/src/conversions.md @@ -31,4 +31,3 @@ fn reinterpret(foo: Foo) -> Bar { But this is, at best, annoying. For common conversions, Rust provides more ergonomic alternatives. - diff --git a/src/dropck.md b/src/dropck.md index 41c5afc..30d2337 100644 --- a/src/dropck.md +++ b/src/dropck.md @@ -203,7 +203,7 @@ of an inspector's destructor might access that borrowed data. Therefore, the drop checker forces all borrowed data in a value to strictly outlive that value. -# An Escape Hatch +## An Escape Hatch The precise rules that govern drop checking may be less restrictive in the future. @@ -322,13 +322,13 @@ attribute makes the type vulnerable to misuse that the borrower checker will not catch, inviting havoc. It is better to avoid adding the attribute. -# A related side note about drop order +## A related side note about drop order While the drop order of fields inside a struct is defined, relying on it is fragile and subtle. When the order matters, it is better to use the [`ManuallyDrop`] wrapper. -# Is that all about drop checker? +## Is that all about drop checker? It turns out that when writing unsafe code, we generally don't need to worry at all about doing the right thing for the drop checker. However there diff --git a/src/exception-safety.md b/src/exception-safety.md index 0a63764..e0dd926 100644 --- a/src/exception-safety.md +++ b/src/exception-safety.md @@ -30,10 +30,6 @@ it is not uncommon for Unsafe code to work with arrays of temporarily uninitialized data while repeatedly invoking caller-provided code. Such code needs to be careful and consider exception safety. - - - - ## Vec::push_all `Vec::push_all` is a temporary hack to get extending a Vec by a slice reliably @@ -69,10 +65,6 @@ we *did* clone are dropped, we can set the `len` every loop iteration. If we just want to guarantee that uninitialized memory can't be observed, we can set the `len` after the loop. - - - - ## BinaryHeap::sift_up Bubbling an element up a heap is a bit more complicated than extending a Vec. diff --git a/src/exotic-sizes.md b/src/exotic-sizes.md index fa997c0..6cbbf83 100644 --- a/src/exotic-sizes.md +++ b/src/exotic-sizes.md @@ -3,11 +3,7 @@ Most of the time, we expect types to have a statically known and positive size. This isn't always the case in Rust. - - - - -# Dynamically Sized Types (DSTs) +## Dynamically Sized Types (DSTs) Rust supports Dynamically Sized Types (DSTs): types without a statically known size or alignment. On the surface, this is a bit nonsensical: Rust *must* @@ -69,11 +65,7 @@ fn main() { (Yes, custom DSTs are a largely half-baked feature for now.) - - - - -# Zero Sized Types (ZSTs) +## Zero Sized Types (ZSTs) Rust also allows types to be specified that occupy no space: @@ -121,9 +113,7 @@ type. [alloc]: ../std/alloc/trait.GlobalAlloc.html#tymethod.alloc [ub]: what-unsafe-does.html - - -# Empty Types +## Empty Types Rust also enables types to be declared that *cannot even be instantiated*. These types can only be talked about at the type level, and never at the value level. @@ -177,11 +167,7 @@ into a reference without any safety problems. It still doesn't prevent you from trying to read or write values, but at least it compiles to a no-op instead of UB. - - - - -# Extern Types +## Extern Types There is [an accepted RFC][extern-types] to add proper types with an unknown size, called *extern types*, which would let Rust developers model things like C's `void*` @@ -189,9 +175,6 @@ and other "declared but never defined" types more accurately. However as of Rust 2018, the feature is stuck in limbo over how `size_of::()` should behave. - - - [dst-issue]: https://github.com/rust-lang/rust/issues/26403 [extern-types]: https://github.com/rust-lang/rfcs/blob/master/text/1861-extern-types.md [`str`]: ../std/primitive.str.html diff --git a/src/ffi.md b/src/ffi.md index a8383d6..811a900 100644 --- a/src/ffi.md +++ b/src/ffi.md @@ -1,6 +1,6 @@ # Foreign Function Interface -# Introduction +## Introduction This guide will use the [snappy](https://github.com/google/snappy) compression/decompression library as an introduction to writing bindings for @@ -85,7 +85,7 @@ extern { # fn main() {} ``` -# Creating a safe interface +## Creating a safe interface The raw C API needs to be wrapped to provide memory safety and make use of higher-level concepts like vectors. A library can choose to expose only the safe, high-level interface and hide the unsafe @@ -234,7 +234,7 @@ mod tests { } ``` -# Destructors +## Destructors Foreign libraries often hand off ownership of resources to the calling code. When this occurs, we must use Rust's destructors to provide safety and guarantee @@ -242,7 +242,7 @@ the release of these resources (especially in the case of panic). For more about destructors, see the [Drop trait](../std/ops/trait.Drop.html). -# Callbacks from C code to Rust functions +## Callbacks from C code to Rust functions Some external libraries require the usage of callbacks to report back their current state or intermediate data to the caller. @@ -295,7 +295,6 @@ void trigger_callback() { In this example Rust's `main()` will call `trigger_callback()` in C, which would, in turn, call back to `callback()` in Rust. - ## Targeting callbacks to Rust objects The former example showed how a global function can be called from C code. @@ -384,7 +383,7 @@ This can be achieved by unregistering the callback in the object's destructor and designing the library in a way that guarantees that no callback will be performed after deregistration. -# Linking +## Linking The `link` attribute on `extern` blocks provides the basic building block for instructing rustc how it will link to native libraries. There are two accepted @@ -433,7 +432,7 @@ A few examples of how this model can be used are: On macOS, frameworks behave with the same semantics as a dynamic library. -# Unsafe blocks +## Unsafe blocks Some operations, like dereferencing raw pointers or calling functions that have been marked unsafe are only allowed inside unsafe blocks. Unsafe blocks isolate unsafety and are a promise to @@ -448,7 +447,7 @@ unsafe fn kaboom(ptr: *const i32) -> i32 { *ptr } This function can only be called from an `unsafe` block or another `unsafe` function. -# Accessing foreign globals +## Accessing foreign globals Foreign APIs often export a global variable which could do something like track global state. In order to access these variables, you declare them in `extern` @@ -498,7 +497,7 @@ fn main() { Note that all interaction with a `static mut` is unsafe, both reading and writing. Dealing with global mutable state requires a great deal of care. -# Foreign calling conventions +## Foreign calling conventions Most foreign code exposes a C ABI, and Rust uses the platform's C calling convention by default when calling foreign functions. Some foreign functions, most notably the Windows API, use other calling @@ -540,7 +539,7 @@ however, windows uses the `C` calling convention, so `C` would be used. This means that in our previous example, we could have used `extern "system" { ... }` to define a block for all windows systems, not only x86 ones. -# Interoperability with foreign code +## Interoperability with foreign code Rust guarantees that the layout of a `struct` is compatible with the platform's representation in C only if the `#[repr(C)]` attribute is applied to it. @@ -565,7 +564,7 @@ The [`libc` crate on crates.io][libc] includes type aliases and function definitions for the C standard library in the `libc` module, and Rust links against `libc` and `libm` by default. -# Variadic functions +## Variadic functions In C, functions can be 'variadic', meaning they accept a variable number of arguments. This can be achieved in Rust by specifying `...` within the argument list of a foreign function declaration: @@ -590,7 +589,7 @@ Normal Rust functions can *not* be variadic: fn foo(x: i32, ...) { } ``` -# The "nullable pointer optimization" +## The "nullable pointer optimization" Certain Rust types are defined to never be `null`. This includes references (`&T`, `&mut T`), boxes (`Box`), and function pointers (`extern "abi" fn()`). When @@ -655,7 +654,7 @@ void register(void (*f)(int (*)(int), int)) { No `transmute` required! -# Calling Rust code from C +## Calling Rust code from C You may wish to compile Rust code in a way so that it can be called from C. This is fairly easy, but requires a few things: @@ -673,7 +672,7 @@ discussed above in "[Foreign Calling Conventions](ffi.html#foreign-calling-conventions)". The `no_mangle` attribute turns off Rust's name mangling, so that it is easier to link to. -# FFI and panics +## FFI and panics It’s important to be mindful of `panic!`s when working with FFI. A `panic!` across an FFI boundary is undefined behavior. If you’re writing code that may @@ -702,7 +701,7 @@ for more information. [`catch_unwind`]: ../std/panic/fn.catch_unwind.html -# Representing opaque structs +## Representing opaque structs Sometimes, a C library wants to provide a pointer to something, but not let you know the internal details of the thing it wants. The simplest way is to use a diff --git a/src/leaking.md b/src/leaking.md index 3b04e28..42684e5 100644 --- a/src/leaking.md +++ b/src/leaking.md @@ -49,8 +49,6 @@ library: * `Rc` * `thread::scoped::JoinGuard` - - ## Drain `drain` is a collections API that moves data out of the container without @@ -105,9 +103,6 @@ mem::forget us in the middle of the iteration, all that does is *leak even more* Since we've accepted that mem::forget is safe, this is definitely safe. We call leaks causing more leaks a *leak amplification*. - - - ## Rc Rc is an interesting case because at first glance it doesn't appear to be a @@ -177,9 +172,6 @@ This can be solved by just checking the `ref_count` and doing *something*. The standard library's stance is to just abort, because your program has become horribly degenerate. Also *oh my gosh* it's such a ridiculous corner case. - - - ## thread::scoped::JoinGuard The thread::scoped API intended to allow threads to be spawned that reference diff --git a/src/lifetime-mismatch.md b/src/lifetime-mismatch.md index c53648c..005396e 100644 --- a/src/lifetime-mismatch.md +++ b/src/lifetime-mismatch.md @@ -70,9 +70,7 @@ blows up in our face! This program is clearly correct according to the reference semantics we actually care about, but the lifetime system is too coarse-grained to handle that. - - -# Improperly reduced borrows +## Improperly reduced borrows The following code fails to compile, because Rust doesn't understand that the borrow is no longer needed and conservatively falls back to using a whole scope for it. @@ -120,5 +118,4 @@ error[E0499]: cannot borrow `*map` as mutable more than once at a time | |_____- returning this value requires that `*map` is borrowed for `'m` ``` - [ex2]: lifetimes.html#example-aliasing-a-mutable-reference diff --git a/src/lifetimes.md b/src/lifetimes.md index c6c0e84..1fd43ab 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -87,9 +87,7 @@ z = y; } ``` - - -# Example: references that outlive referents +## Example: references that outlive referents Alright, let's look at some of those examples from before: @@ -169,11 +167,7 @@ we could have returned an `&'a str` would have been if it was in a field of the can be considered to reside at the bottom of the stack; though this limits our implementation *just a bit*.) - - - - -# Example: aliasing a mutable reference +## Example: aliasing a mutable reference How about the other example: @@ -222,9 +216,7 @@ to the compiler. However it does mean that several programs that are totally correct with respect to Rust's *true* semantics are rejected because lifetimes are too dumb. - - -# The area covered by a lifetime +## The area covered by a lifetime The lifetime (sometimes called a *borrow*) is *alive* from the place it is created to its last use. The borrowed thing needs to outlive only borrows that diff --git a/src/other-reprs.md b/src/other-reprs.md index a7cc675..10f337b 100644 --- a/src/other-reprs.md +++ b/src/other-reprs.md @@ -3,10 +3,7 @@ Rust allows you to specify alternative data layout strategies from the default. There's also the [unsafe code guidelines] (note that it's **NOT** normative). - - - -# repr(C) +## repr(C) This is the most important `repr`. It has fairly simple intent: do what C does. The order, size, and alignment of fields is exactly what you would expect from C @@ -57,9 +54,7 @@ construct an instance of an enum that does not match one of its variants. (This allows exhaustive matches to continue to be written and compiled as normal.) - - -# repr(transparent) +## repr(transparent) This can only be used on structs with a single non-zero-sized field (there may be additional zero-sized fields). The effect is that the layout and ABI of the @@ -75,9 +70,7 @@ Foo(f32)` to always have the same ABI as `f32`. More details are in the [RFC][rfc-transparent]. - - -# repr(u*), repr(i*) +## repr(u*), repr(i*) These specify the size to make a fieldless enum. If the discriminant overflows the integer it has to fit in, it will produce a compile-time error. You can @@ -101,10 +94,7 @@ optimization. These reprs have no effect on a struct. - - - -# repr(packed) +## repr(packed) `repr(packed)` forces Rust to strip any padding, and only align the type to a byte. This may improve the memory footprint, but will likely have other negative @@ -124,10 +114,7 @@ this should not be used. This repr is a modifier on `repr(C)` and `repr(Rust)`. - - - -# repr(align(n)) +## repr(align(n)) `repr(align(n))` (where `n` is a power of two) forces the type to have an alignment of *at least* n. @@ -139,10 +126,6 @@ kinds of concurrent code). This is a modifier on `repr(C)` and `repr(Rust)`. It is incompatible with `repr(packed)`. - - - - [unsafe code guidelines]: https://rust-lang.github.io/unsafe-code-guidelines/layout.html [drop flags]: drop-flags.html [ub loads]: https://github.com/rust-lang/rust/issues/27060 diff --git a/src/ownership.md b/src/ownership.md index dd9e9db..4790632 100644 --- a/src/ownership.md +++ b/src/ownership.md @@ -63,4 +63,3 @@ naive scope analysis would be insufficient to prevent this bug, because `data` does in fact live as long as we needed. However it was *changed* while we had a reference into it. This is why Rust requires any references to freeze the referent and its owners. - diff --git a/src/panic-handler.md b/src/panic-handler.md index b06707b..c951772 100644 --- a/src/panic-handler.md +++ b/src/panic-handler.md @@ -1,4 +1,4 @@ -## #[panic_handler] +# #[panic_handler] `#[panic_handler]` is used to define the behavior of `panic!` in `#![no_std]` applications. The `#[panic_handler]` attribute must be applied to a function with signature `fn(&PanicInfo) diff --git a/src/subtyping.md b/src/subtyping.md index 79b6408..56d1496 100644 --- a/src/subtyping.md +++ b/src/subtyping.md @@ -15,7 +15,6 @@ we will then relate it back to how subtyping actually occurs in Rust. So here's our simple extension, *Objective Rust*, featuring three new types: - ```rust trait Animal { fn snuggle(&self); @@ -133,10 +132,7 @@ because nothing ever has type `'a`. Lifetimes only occur as part of some larger like `&'a u32` or `IterMut<'a, u32>`. To apply lifetime subtyping, we need to know how to compose subtyping. Once again, we need *variance*. - - - -# Variance +## Variance Variance is where things get a bit complicated. @@ -442,4 +438,3 @@ struct MyType<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H, In, Out, Mixed> { k2: Mixed, // invariant over Mixed, because invariance wins all conflicts } ``` - diff --git a/src/transmutes.md b/src/transmutes.md index 3733c13..1b2d699 100644 --- a/src/transmutes.md +++ b/src/transmutes.md @@ -20,9 +20,9 @@ boggling. it may produce a surprising type to satisfy inference. * Transmuting an `&` to `&mut` is UB. - * Transmuting an `&` to `&mut` is *always* UB. - * No you can't do it. - * No you're not special. + * Transmuting an `&` to `&mut` is *always* UB. + * No you can't do it. + * No you're not special. * Transmuting to a reference without an explicitly provided lifetime produces an [unbounded lifetime]. @@ -50,7 +50,6 @@ Also of course you can get all of the functionality of these functions using raw pointer casts or `union`s, but without any of the lints or other basic sanity checks. Raw pointer casts and `union`s do not magically avoid the above rules. - [unbounded lifetime]: ./unbounded-lifetimes.md [transmute]: ../std/mem/fn.transmute.html [transmute_copy]: ../std/mem/fn.transmute_copy.html diff --git a/src/unbounded-lifetimes.md b/src/unbounded-lifetimes.md index b41cf8b..07f1d6c 100644 --- a/src/unbounded-lifetimes.md +++ b/src/unbounded-lifetimes.md @@ -33,4 +33,3 @@ way to bound a lifetime is to return it from a function with a bound lifetime. However if this is unacceptable, the reference can be placed in a location with a specific lifetime. Unfortunately it's impossible to name all lifetimes involved in a function. - diff --git a/src/unchecked-uninit.md b/src/unchecked-uninit.md index 29e1b6a..e6df9cd 100644 --- a/src/unchecked-uninit.md +++ b/src/unchecked-uninit.md @@ -79,11 +79,13 @@ This code proceeds in three steps: acknowledge that by providing appropriate methods). It's worth spending a bit more time on the loop in the middle, and in particular -the assignment operator and its interaction with `drop`. If we would have -written something like +the assignment operator and its interaction with `drop`. If we would have +written something like: + ```rust,ignore *x[i].as_mut_ptr() = Box::new(i as u32); // WRONG! ``` + we would actually overwrite a `Box`, leading to `drop` of uninitialized data, which will cause much sadness and pain. @@ -126,6 +128,7 @@ Note that, to use the `ptr` methods, you need to first obtain a *raw pointer* to the data you want to initialize. It is illegal to construct a *reference* to uninitialized data, which implies that you have to be careful when obtaining said raw pointer: + * For an array of `T`, you can use `base_ptr.add(idx)` where `base_ptr: *mut T` to compute the address of array index `idx`. This relies on how arrays are laid out in memory. @@ -133,9 +136,9 @@ how arrays are laid out in memory. also cannot use `&mut base_ptr.field` as that would be creating a reference. Thus, it is currently not possible to create a raw pointer to a field of a partially initialized struct, and also not possible to initialize a single -field of a partially initialized struct. (A +field of a partially initialized struct. (a [solution to this problem](https://github.com/rust-lang/rfcs/pull/2582) is being -worked on.) +worked on). One last remark: when reading old Rust code, you might stumble upon the deprecated `mem::uninitialized` function. That function used to be the only way diff --git a/src/vec-into-iter.md b/src/vec-into-iter.md index 03c2a9d..39057d4 100644 --- a/src/vec-into-iter.md +++ b/src/vec-into-iter.md @@ -129,7 +129,6 @@ Because IntoIter takes ownership of its allocation, it needs to implement Drop to free it. However it also wants to implement Drop to drop any elements it contains that weren't yielded. - ```rust,ignore impl Drop for IntoIter { fn drop(&mut self) { diff --git a/src/vec-zsts.md b/src/vec-zsts.md index d1b8fe8..500b6f3 100644 --- a/src/vec-zsts.md +++ b/src/vec-zsts.md @@ -13,9 +13,6 @@ zero-sized types. We need to be careful of two things: Thankfully we abstracted out pointer-iterators and allocating handling into `RawValIter` and `RawVec` respectively. How mysteriously convenient. - - - ## Allocating Zero-Sized Types So if the allocator API doesn't support zero-sized allocations, what on earth @@ -103,9 +100,6 @@ impl Drop for RawVec { That's it. We support pushing and popping zero-sized types now. Our iterators (that aren't provided by slice Deref) are still busted, though. - - - ## Iterating Zero-Sized Types Zero-sized offsets are no-ops. This means that our current design will always diff --git a/src/what-unsafe-does.md b/src/what-unsafe-does.md index c82c70c..6a3c79f 100644 --- a/src/what-unsafe-does.md +++ b/src/what-unsafe-does.md @@ -24,22 +24,22 @@ language cares about is preventing the following things: not support * Producing invalid values (either alone or as a field of a compound type such as `enum`/`struct`/array/tuple): - * a `bool` that isn't 0 or 1 - * an `enum` with an invalid discriminant - * a null `fn` pointer - * a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF] - * a `!` (all values are invalid for this type) - * an integer (`i*`/`u*`), floating point value (`f*`), or raw pointer read from - [uninitialized memory][], or uninitialized memory in a `str`. - * a reference/`Box` that is dangling, unaligned, or points to an invalid value. - * a wide reference, `Box`, or raw pointer that has invalid metadata: - * `dyn Trait` metadata is invalid if it is not a pointer to a vtable for - `Trait` that matches the actual dynamic trait the pointer or reference points to - * slice metadata is invalid if the length is not a valid `usize` - (i.e., it must not be read from uninitialized memory) - * a type with custom invalid values that is one of those values, such as a - [`NonNull`] that is null. (Requesting custom invalid values is an unstable - feature, but some stable libstd types, like `NonNull`, make use of it.) + * a `bool` that isn't 0 or 1 + * an `enum` with an invalid discriminant + * a null `fn` pointer + * a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF] + * a `!` (all values are invalid for this type) + * an integer (`i*`/`u*`), floating point value (`f*`), or raw pointer read from + [uninitialized memory][], or uninitialized memory in a `str`. + * a reference/`Box` that is dangling, unaligned, or points to an invalid value. + * a wide reference, `Box`, or raw pointer that has invalid metadata: + * `dyn Trait` metadata is invalid if it is not a pointer to a vtable for + `Trait` that matches the actual dynamic trait the pointer or reference points to + * slice metadata is invalid if the length is not a valid `usize` + (i.e., it must not be read from uninitialized memory) + * a type with custom invalid values that is one of those values, such as a + [`NonNull`] that is null. (Requesting custom invalid values is an unstable + feature, but some stable libstd types, like `NonNull`, make use of it.) "Producing" a value happens any time a value is assigned, passed to a function/primitive operation or returned from a function/primitive operation. diff --git a/src/working-with-unsafe.md b/src/working-with-unsafe.md index 935ce58..0649121 100644 --- a/src/working-with-unsafe.md +++ b/src/working-with-unsafe.md @@ -117,4 +117,3 @@ it prevents us from having to trust all the safe code in the universe from messi with our trusted state. Safety lives! - From 4b6eb0ff96022fd7323c6270f6ad7fa5973fa510 Mon Sep 17 00:00:00 2001 From: Yuki Okushi Date: Tue, 8 Jun 2021 10:32:26 +0900 Subject: [PATCH 2/2] Update some wording making reference to issues/RFCs --- src/exotic-sizes.md | 6 +++--- src/other-reprs.md | 7 ++++--- src/unchecked-uninit.md | 2 +- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/src/exotic-sizes.md b/src/exotic-sizes.md index 6cbbf83..4357d80 100644 --- a/src/exotic-sizes.md +++ b/src/exotic-sizes.md @@ -172,10 +172,10 @@ of UB. There is [an accepted RFC][extern-types] to add proper types with an unknown size, called *extern types*, which would let Rust developers model things like C's `void*` and other "declared but never defined" types more accurately. However as of -Rust 2018, the feature is stuck in limbo over how `size_of::()` -should behave. +Rust 2018, [the feature is stuck in limbo over how `size_of_val::()` +should behave][extern-types-issue]. -[dst-issue]: https://github.com/rust-lang/rust/issues/26403 [extern-types]: https://github.com/rust-lang/rfcs/blob/master/text/1861-extern-types.md +[extern-types-issue]: https://github.com/rust-lang/rust/issues/43467 [`str`]: ../std/primitive.str.html [slice]: ../std/primitive.slice.html diff --git a/src/other-reprs.md b/src/other-reprs.md index 10f337b..c2eea41 100644 --- a/src/other-reprs.md +++ b/src/other-reprs.md @@ -15,7 +15,7 @@ reinterpreting values as a different type. We strongly recommend using [rust-bindgen][] and/or [cbindgen][] to manage your FFI boundaries for you. The Rust team works closely with those projects to ensure that they work robustly and are compatible with current and future guarantees -about type layouts and reprs. +about type layouts and `repr`s. The interaction of `repr(C)` with Rust's more exotic data layout features must be kept in mind. Due to its dual purpose as "for FFI" and "for layout control", @@ -92,7 +92,7 @@ manipulate its tag and fields. See [the RFC][really-tagged] for details. Adding an explicit `repr` to an enum suppresses the null-pointer optimization. -These reprs have no effect on a struct. +These `repr`s have no effect on a struct. ## repr(packed) @@ -107,7 +107,8 @@ compiler might be able to paper over alignment issues with shifts and masks. However if you take a reference to a packed field, it's unlikely that the compiler will be able to emit code to avoid an unaligned load. -**[As of Rust 2018, this still can cause undefined behavior.][ub loads]** +[As this can cause undefined behavior][ub loads], the lint has been implemented +and it will become a hard error. `repr(packed)` is not to be used lightly. Unless you have extreme requirements, this should not be used. diff --git a/src/unchecked-uninit.md b/src/unchecked-uninit.md index e6df9cd..3c6913c 100644 --- a/src/unchecked-uninit.md +++ b/src/unchecked-uninit.md @@ -137,7 +137,7 @@ also cannot use `&mut base_ptr.field` as that would be creating a reference. Thus, it is currently not possible to create a raw pointer to a field of a partially initialized struct, and also not possible to initialize a single field of a partially initialized struct. (a -[solution to this problem](https://github.com/rust-lang/rfcs/pull/2582) is being +[solution to this problem](https://github.com/rust-lang/rust/issues/64490) is being worked on). One last remark: when reading old Rust code, you might stumble upon the