diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 5558603..066afca 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -4,7 +4,7 @@ on: merge_group: env: - MDBOOK_VERSION: 0.4.40 + MDBOOK_VERSION: 0.4.45 jobs: test: diff --git a/book.toml b/book.toml index 693aca4..b2c0d11 100644 --- a/book.toml +++ b/book.toml @@ -32,4 +32,4 @@ git-repository-url = "https://github.com/rust-lang/nomicon" "./arc.html" = "./arc-mutex/arc.html" [rust] -edition = "2021" +edition = "2024" diff --git a/src/arc-mutex/arc-drop.md b/src/arc-mutex/arc-drop.md index 3dd9f03..db59860 100644 --- a/src/arc-mutex/arc-drop.md +++ b/src/arc-mutex/arc-drop.md @@ -51,7 +51,7 @@ implementation of `Arc`][3]: > > "acquire" operation before deleting the object. > > In particular, while the contents of an Arc are usually immutable, it's -> possible to have interior writes to something like a Mutex. Since a Mutex +> possible to have interior writes to something like a `Mutex`. Since a Mutex > is not acquired when it is deleted, we can't rely on its synchronization logic > to make writes in thread A visible to a destructor running in thread B. > diff --git a/src/beneath-std.md b/src/beneath-std.md index da2cc50..4c5bcca 100644 --- a/src/beneath-std.md +++ b/src/beneath-std.md @@ -30,46 +30,10 @@ We will probably need a nightly version of the compiler to produce a `#![no_std]` executable because on many platforms, we have to provide the `eh_personality` [lang item], which is unstable. -Controlling the entry point is possible in two ways: the `#[start]` attribute, -or overriding the default shim for the C `main` function with your own. -Additionally, it's required to define a [panic handler function](panic-handler.html). 
- -The function marked `#[start]` is passed the command line parameters -in the same format as C (aside from the exact integer types being used): - -```rust -#![feature(start, lang_items, core_intrinsics, rustc_private)] -#![allow(internal_features)] -#![no_std] +You will need to define a symbol for the entry point that is suitable for your target. For example, `main`, `_start`, `WinMain`, or whatever starting point is relevant for your target. +Additionally, you need to use the `#![no_main]` attribute to prevent the compiler from attempting to generate an entry point itself. -// Necessary for `panic = "unwind"` builds on cfg(unix) platforms. -#![feature(panic_unwind)] -extern crate unwind; - -// Pull in the system libc library for what crt0.o likely requires. -#[cfg(not(windows))] -extern crate libc; - -use core::panic::PanicInfo; - -// Entry point for this program. -#[start] -fn main(_argc: isize, _argv: *const *const u8) -> isize { - 0 -} - -// These functions are used by the compiler, but not for an empty program like this. -// They are normally provided by `std`. -#[lang = "eh_personality"] -fn rust_eh_personality() {} -#[panic_handler] -fn panic_handler(_info: &PanicInfo) -> ! { core::intrinsics::abort() } -``` - -To override the compiler-inserted `main` shim, we have to disable it -with `#![no_main]` and then create the appropriate symbol with the -correct ABI and the correct name, which requires overriding the -compiler's name mangling too: +Additionally, it's required to define a [panic handler function](panic-handler.html). ```rust #![feature(lang_items, core_intrinsics, rustc_private)] @@ -89,7 +53,7 @@ use core::ffi::{c_char, c_int}; use core::panic::PanicInfo; // Entry point for this program. 
-#[no_mangle] // ensure that this symbol is included in the output as `main` +#[unsafe(no_mangle)] // ensure that this symbol is included in the output as `main` extern "C" fn main(_argc: c_int, _argv: *const *const c_char) -> c_int { 0 } diff --git a/src/exotic-sizes.md b/src/exotic-sizes.md index 5e6a395..f8c6602 100644 --- a/src/exotic-sizes.md +++ b/src/exotic-sizes.md @@ -8,7 +8,7 @@ This isn't always the case in Rust. Rust supports Dynamically Sized Types (DSTs): types without a statically known size or alignment. On the surface, this is a bit nonsensical: Rust *must* know the size and alignment of something in order to correctly work with it! In -this regard, DSTs are not normal types. Because they lack a statically known +this regard, DSTs are not normal types. Since they lack a statically known size, these types can only exist behind a pointer. Any pointer to a DST consequently becomes a *wide* pointer consisting of the pointer and the information that "completes" them (more on this below). @@ -40,7 +40,7 @@ struct MySuperSlice { } ``` -Although such a type is largely useless without a way to construct it. Currently the +Unfortunately, such a type is largely useless without a way to construct it. Currently the only properly supported way to create a custom DST is by making your type generic and performing an *unsizing coercion*: diff --git a/src/ffi.md b/src/ffi.md index b76f0b2..76e6950 100644 --- a/src/ffi.md +++ b/src/ffi.md @@ -31,7 +31,7 @@ compile if snappy is installed: use libc::size_t; #[link(name = "snappy")] -extern { +unsafe extern "C" { fn snappy_max_compressed_length(source_length: size_t) -> size_t; } @@ -64,7 +64,7 @@ The `extern` block can be extended to cover the entire snappy API: use libc::{c_int, size_t}; #[link(name = "snappy")] -extern { +unsafe extern { fn snappy_compress(input: *const u8, input_length: size_t, compressed: *mut u8, @@ -251,7 +251,7 @@ First, we assume you have a lib crate named as `rust_from_c`. 
`lib.rs` should have Rust code as following: ```rust -#[no_mangle] +#[unsafe(no_mangle)] pub extern "C" fn hello_from_rust() { println!("Hello from Rust!"); } @@ -331,7 +331,7 @@ extern fn callback(a: i32) { } #[link(name = "extlib")] -extern { +unsafe extern { fn register_callback(cb: extern fn(i32)) -> i32; fn trigger_callback(); } @@ -383,7 +383,7 @@ struct RustObject { // Other members... } -extern "C" fn callback(target: *mut RustObject, a: i32) { +unsafe extern "C" fn callback(target: *mut RustObject, a: i32) { println!("I'm called from C with value {0}", a); unsafe { // Update the value in RustObject with the value received from the callback: @@ -392,9 +392,9 @@ extern "C" fn callback(target: *mut RustObject, a: i32) { } #[link(name = "extlib")] -extern { +unsafe extern { fn register_callback(target: *mut RustObject, - cb: extern fn(*mut RustObject, i32)) -> i32; + cb: unsafe extern fn(*mut RustObject, i32)) -> i32; fn trigger_callback(); } @@ -523,7 +523,7 @@ blocks with the `static` keyword: ```rust,ignore #[link(name = "readline")] -extern { +unsafe extern { static rl_readline_version: libc::c_int; } @@ -543,7 +543,7 @@ use std::ffi::CString; use std::ptr; #[link(name = "readline")] -extern { +unsafe extern { static mut rl_prompt: *const libc::c_char; } @@ -573,7 +573,7 @@ conventions. Rust provides a way to tell the compiler which convention to use: #[cfg(all(target_os = "win32", target_arch = "x86"))] #[link(name = "kernel32")] #[allow(non_snake_case)] -extern "stdcall" { +unsafe extern "stdcall" { fn SetEnvironmentVariableA(n: *const u8, v: *const u8) -> libc::c_int; } # fn main() { } @@ -635,7 +635,7 @@ In C, functions can be 'variadic', meaning they accept a variable number of argu be achieved in Rust by specifying `...` within the argument list of a foreign function declaration: ```no_run -extern { +unsafe extern { fn foo(x: i32, ...); } @@ -685,7 +685,7 @@ we have function pointers flying across the FFI boundary in both directions. 
use libc::c_int; # #[cfg(hidden)] -extern "C" { +unsafe extern "C" { /// Registers the callback. fn register(cb: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, c_int) -> c_int>); } @@ -750,8 +750,8 @@ mechanisms (notably C++'s `try`/`catch`). ```rust,ignore -#[no_mangle] -extern "C-unwind" fn example() { +#[unsafe(no_mangle)] +unsafe extern "C-unwind" fn example() { panic!("Uh oh"); } ``` @@ -780,13 +780,13 @@ If the C++ frames have objects, their destructors will be called. ```rust,ignore #[link(...)] -extern "C-unwind" { +unsafe extern "C-unwind" { // A C++ function that may throw an exception fn may_throw(); } -#[no_mangle] -extern "C-unwind" fn rust_passthrough() { +#[unsafe(no_mangle)] +unsafe extern "C-unwind" fn rust_passthrough() { let b = Box::new(5); unsafe { may_throw(); } println!("{:?}", &b); @@ -816,7 +816,7 @@ will be printed. ### `panic` can be stopped at an ABI boundary ```rust -#[no_mangle] +#[unsafe(no_mangle)] extern "C" fn assert_nonzero(input: u32) { assert!(input != 0) } @@ -833,7 +833,7 @@ process if it panics, you must use [`catch_unwind`]: ```rust use std::panic::catch_unwind; -#[no_mangle] +#[unsafe(no_mangle)] pub extern "C" fn oh_no() -> i32 { let result = catch_unwind(|| { panic!("Oops!"); @@ -867,7 +867,7 @@ We can represent this in Rust with the `c_void` type: ```rust,ignore -extern "C" { +unsafe extern "C" { pub fn foo(arg: *mut libc::c_void); pub fn bar(arg: *mut libc::c_void); } @@ -891,18 +891,18 @@ To do this in Rust, let’s create our own opaque types: ```rust #[repr(C)] pub struct Foo { - _data: [u8; 0], + _data: (), _marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, } #[repr(C)] pub struct Bar { - _data: [u8; 0], + _data: (), _marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, } -extern "C" { +unsafe extern "C" { pub fn foo(arg: *mut Foo); pub fn bar(arg: *mut Bar); } diff --git a/src/intro.md b/src/intro.md index 323c0ce..fbe6e37 100644 --- a/src/intro.md +++ b/src/intro.md @@ -39,7 +39,7 @@ Topics that are 
within the scope of this book include: the meaning of (un)safety The Rustonomicon is not a place to exhaustively describe the semantics and guarantees of every single API in the standard library, nor is it a place to exhaustively describe every feature of Rust. -Unless otherwise noted, Rust code in this book uses the Rust 2021 edition. +Unless otherwise noted, Rust code in this book uses the Rust 2024 edition. [trpl]: ../book/index.html [ref]: ../reference/index.html diff --git a/src/other-reprs.md b/src/other-reprs.md index 289da57..6b397c9 100644 --- a/src/other-reprs.md +++ b/src/other-reprs.md @@ -7,7 +7,8 @@ There's also the [unsafe code guidelines] (note that it's **NOT** normative). This is the most important `repr`. It has fairly simple intent: do what C does. The order, size, and alignment of fields is exactly what you would expect from C -or C++. Any type you expect to pass through an FFI boundary should have +or C++. The type is also passed across `extern "C"` function call boundaries the +same way C would pass the corresponding type. Any type you expect to pass through an FFI boundary should have `repr(C)`, as C is the lingua-franca of the programming world. This is also necessary to soundly do more elaborate tricks with data layout such as reinterpreting values as a different type. @@ -86,10 +87,14 @@ be 0. However Rust will not allow you to create an enum where two variants have the same discriminant. The term "fieldless enum" only means that the enum doesn't have data in any -of its variants. A fieldless enum without a `repr(u*)` or `repr(C)` is -still a Rust native type, and does not have a stable ABI representation. -Adding a `repr` causes it to be treated exactly like the specified -integer type for ABI purposes. +of its variants. A fieldless enum without a `repr` is +still a Rust native type, and does not have a stable layout or representation. 
+Adding a `repr(u*)`/`repr(i*)` causes it to be treated exactly like the specified +integer type for layout purposes (except that the compiler will still exploit its +knowledge of "invalid" values at this type to optimize enum layout, such as when +this enum is wrapped in `Option`). Note that the function call ABI for these +types is still in general unspecified, except that across `extern "C"` calls they +are ABI-compatible with C enums of the same sign and size. If the enum has fields, the effect is similar to the effect of `repr(C)` in that there is a defined layout of the type. This makes it possible to @@ -119,31 +124,34 @@ assert_eq!(16, size_of::<MyReprOption<&u16>>()); This optimization still applies to fieldless enums with an explicit `repr(u*)`, `repr(i*)`, or `repr(C)`. -## repr(packed) +## repr(packed), repr(packed(n)) -`repr(packed)` forces Rust to strip any padding, and only align the type to a -byte. This may improve the memory footprint, but will likely have other negative -side-effects. +`repr(packed(n))` (where `n` is a power of two) forces the type to have an +alignment of *at most* `n`. Most commonly used without an explicit `n`, +`repr(packed)` is equivalent to `repr(packed(1))` which forces Rust to strip +any padding, and only align the type to a byte. This may improve the memory +footprint, but will likely have other negative side-effects. -In particular, most architectures *strongly* prefer values to be aligned. This -may mean the unaligned loads are penalized (x86), or even fault (some ARM -chips). For simple cases like directly loading or storing a packed field, the -compiler might be able to paper over alignment issues with shifts and masks. -However if you take a reference to a packed field, it's unlikely that the -compiler will be able to emit code to avoid an unaligned load. +In particular, most architectures *strongly* prefer values to be naturally +aligned. This may mean that unaligned loads are penalized (x86), or even fault +(some ARM chips). 
For simple cases like directly loading or storing a packed +field, the compiler might be able to paper over alignment issues with shifts +and masks. However if you take a reference to a packed field, it's unlikely +that the compiler will be able to emit code to avoid an unaligned load. [As this can cause undefined behavior][ub loads], the lint has been implemented and it will become a hard error. -`repr(packed)` is not to be used lightly. Unless you have extreme requirements, -this should not be used. +`repr(packed)/repr(packed(n))` is not to be used lightly. Unless you have +extreme requirements, this should not be used. -This repr is a modifier on `repr(C)` and `repr(Rust)`. +This repr is a modifier on `repr(C)` and `repr(Rust)`. For FFI compatibility +you most likely always want to be explicit: `repr(C, packed)`. ## repr(align(n)) `repr(align(n))` (where `n` is a power of two) forces the type to have an -alignment of *at least* n. +alignment of *at least* `n`. This enables several tricks, like making sure neighboring elements of an array never share the same cache line with each other (which may speed up certain diff --git a/src/races.md b/src/races.md index aaeaf5b..6f8b9a3 100644 --- a/src/races.md +++ b/src/races.md @@ -60,8 +60,8 @@ thread::spawn(move || { println!("{}", data[idx.load(Ordering::SeqCst)]); ``` -We can cause a data race if we instead do the bound check in advance, and then -unsafely access the data with an unchecked value: +We can cause a race condition to violate memory safety if we instead do the bound +check in advance, and then unsafely access the data with an unchecked value: ```rust,no_run use std::thread; diff --git a/src/references.md b/src/references.md index 294fe1c..2625c0c 100644 --- a/src/references.md +++ b/src/references.md @@ -1,6 +1,6 @@ # References -There are two kinds of reference: +There are two kinds of references: * Shared reference: `&` * Mutable reference: `&mut` diff --git a/src/send-and-sync.md b/src/send-and-sync.md 
index 808a5c3..c475da5 100644 --- a/src/send-and-sync.md +++ b/src/send-and-sync.md @@ -89,7 +89,7 @@ to the heap. # pub use ::std::os::raw::{c_int, c_void}; # #[allow(non_camel_case_types)] # pub type size_t = usize; -# extern "C" { pub fn posix_memalign(memptr: *mut *mut c_void, align: size_t, size: size_t) -> c_int; } +# unsafe extern "C" { pub fn posix_memalign(memptr: *mut *mut c_void, align: size_t, size: size_t) -> c_int; } # } use std::{ mem::{align_of, size_of}, @@ -225,7 +225,7 @@ allocation done on another thread. We can check this is true in the docs for # struct Carton<T>(std::ptr::NonNull<T>); # mod libc { # pub use ::std::os::raw::c_void; -# extern "C" { pub fn free(p: *mut c_void); } +# unsafe extern "C" { pub fn free(p: *mut c_void); } # } impl<T> Drop for Carton<T> { fn drop(&mut self) { @@ -253,6 +253,6 @@ only to data races? [box-is-special]: https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/ [deref-doc]: https://doc.rust-lang.org/core/ops/trait.Deref.html [deref-mut-doc]: https://doc.rust-lang.org/core/ops/trait.DerefMut.html -[mutex-guard-not-send-docs-rs]: https://doc.rust-lang.org/std/sync/struct.MutexGuard.html#impl-Send +[mutex-guard-not-send-docs-rs]: https://doc.rust-lang.org/std/sync/struct.MutexGuard.html#impl-Send-for-MutexGuard%3C'_,+T%3E [mutex-guard-not-send-comment]: https://github.com/rust-lang/rust/issues/23465#issuecomment-82730326 [libc-free-docs]: https://linux.die.net/man/3/free diff --git a/src/unchecked-uninit.md b/src/unchecked-uninit.md index b3dd31c..5665996 100644 --- a/src/unchecked-uninit.md +++ b/src/unchecked-uninit.md @@ -134,7 +134,7 @@ to compute the address of array index `idx`. This relies on how arrays are laid out in memory. * For a struct, however, in general we do not know how it is laid out, and we also cannot use `&mut base_ptr.field` as that would be creating a -reference. So, you must carefully use the [`addr_of_mut`] macro. This creates +reference. 
So, you must carefully use the [raw reference][raw_reference] syntax. This creates a raw pointer to the field without creating an intermediate reference: ```rust @@ -147,7 +147,7 @@ struct Demo { let mut uninit = MaybeUninit::<Demo>::uninit(); // `&uninit.as_mut().field` would create a reference to an uninitialized `bool`, // and thus be Undefined Behavior! -let f1_ptr = unsafe { ptr::addr_of_mut!((*uninit.as_mut_ptr()).field) }; +let f1_ptr = unsafe { &raw mut (*uninit.as_mut_ptr()).field }; unsafe { f1_ptr.write(true); } let init = unsafe { uninit.assume_init() }; @@ -167,7 +167,7 @@ it around at all, be sure to be *really* careful. [`MaybeUninit`]: ../core/mem/union.MaybeUninit.html [assume_init]: ../core/mem/union.MaybeUninit.html#method.assume_init [`ptr`]: ../core/ptr/index.html -[`addr_of_mut`]: ../core/ptr/macro.addr_of_mut.html +[raw_reference]: ../reference/types/pointer.html#r-type.pointer.raw.constructor [`write`]: ../core/ptr/fn.write.html [`copy`]: ../std/ptr/fn.copy.html [`copy_nonoverlapping`]: ../std/ptr/fn.copy_nonoverlapping.html
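
The switch from `addr_of_mut!` to `&raw mut` in the last hunk can be tried out in isolation. The following is a minimal sketch, not code from the book: the `Config` struct and the `build_config` helper are hypothetical names introduced here for illustration, and it assumes a toolchain where the raw reference syntax is stable (Rust 1.82 or later).

```rust
use std::mem::MaybeUninit;

// Hypothetical struct used only for this illustration.
struct Config {
    enabled: bool,
    retries: u32,
}

/// Initializes a `Config` field by field without ever creating a
/// reference to uninitialized memory. (Hypothetical helper.)
fn build_config(enabled: bool, retries: u32) -> Config {
    let mut uninit = MaybeUninit::<Config>::uninit();
    let base: *mut Config = uninit.as_mut_ptr();
    unsafe {
        // `&raw mut` produces a raw pointer to the field directly;
        // no intermediate `&mut` to uninitialized data is formed.
        let enabled_ptr = &raw mut (*base).enabled;
        let retries_ptr = &raw mut (*base).retries;
        // `write` stores the value without reading or dropping the old contents.
        enabled_ptr.write(enabled);
        retries_ptr.write(retries);
        // Every field has been written, so the value is fully initialized.
        uninit.assume_init()
    }
}
```

Calling `build_config(true, 3)` returns a fully initialized `Config`. The point of the sketch is that each field pointer is obtained without going through `&mut`, which is the same property the `addr_of_mut!` macro provided before this change.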