Merge branch 'master' into handle-drop-zst

pull/425/head
pwbh 2 weeks ago committed by GitHub
commit 0d116cfab0
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@ -4,7 +4,7 @@ on:
merge_group: merge_group:
env: env:
MDBOOK_VERSION: 0.4.40 MDBOOK_VERSION: 0.4.45
jobs: jobs:
test: test:

@ -32,4 +32,4 @@ git-repository-url = "https://github.com/rust-lang/nomicon"
"./arc.html" = "./arc-mutex/arc.html" "./arc.html" = "./arc-mutex/arc.html"
[rust] [rust]
edition = "2021" edition = "2024"

@ -51,7 +51,7 @@ implementation of `Arc`][3]:
> > "acquire" operation before deleting the object. > > "acquire" operation before deleting the object.
> >
> In particular, while the contents of an Arc are usually immutable, it's > In particular, while the contents of an Arc are usually immutable, it's
> possible to have interior writes to something like a Mutex<T>. Since a Mutex > possible to have interior writes to something like a `Mutex<T>`. Since a Mutex
> is not acquired when it is deleted, we can't rely on its synchronization logic > is not acquired when it is deleted, we can't rely on its synchronization logic
> to make writes in thread A visible to a destructor running in thread B. > to make writes in thread A visible to a destructor running in thread B.
> >
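Reviewer note: the quoted passage explains why the last owner must perform an "acquire" before deleting the object. Below is a minimal sketch of that rule, not the book's or the standard library's actual `Arc`. `MiniArc` and `Inner` are hypothetical names, and the sketch leaves out the reference-count overflow guard and the `PhantomData` marker that a real implementation would carry.

```rust
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// Hypothetical heap block: the reference count lives next to the data.
struct Inner<T> {
    rc: AtomicUsize,
    data: T,
}

// Hypothetical, simplified stand-in for `Arc<T>`.
pub struct MiniArc<T> {
    ptr: NonNull<Inner<T>>,
}

unsafe impl<T: Send + Sync> Send for MiniArc<T> {}
unsafe impl<T: Send + Sync> Sync for MiniArc<T> {}

impl<T> MiniArc<T> {
    pub fn new(data: T) -> Self {
        let boxed = Box::new(Inner { rc: AtomicUsize::new(1), data });
        MiniArc { ptr: NonNull::from(Box::leak(boxed)) }
    }
}

impl<T> Deref for MiniArc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // The allocation stays alive while any handle exists.
        unsafe { &self.ptr.as_ref().data }
    }
}

impl<T> Clone for MiniArc<T> {
    fn clone(&self) -> Self {
        let inner = unsafe { self.ptr.as_ref() };
        // Taking a new handle does not need to synchronize with anything.
        // (No overflow guard here; this is a sketch.)
        inner.rc.fetch_add(1, Ordering::Relaxed);
        MiniArc { ptr: self.ptr }
    }
}

impl<T> Drop for MiniArc<T> {
    fn drop(&mut self) {
        let inner = unsafe { self.ptr.as_ref() };
        // Release: publish every write made through this handle.
        if inner.rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // Acquire: the last owner must observe writes made through all other
        // handles (for example interior writes to a `Mutex<T>`) before the
        // destructor runs and the memory is freed.
        fence(Ordering::Acquire);
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
    }
}
```

With this pairing, a write made on thread A before it drops its handle is visible to the destructor that runs on thread B.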

@ -30,46 +30,10 @@ We will probably need a nightly version of the compiler to produce
a `#![no_std]` executable because on many platforms, we have to provide the a `#![no_std]` executable because on many platforms, we have to provide the
`eh_personality` [lang item], which is unstable. `eh_personality` [lang item], which is unstable.
Controlling the entry point is possible in two ways: the `#[start]` attribute, You will need to define a symbol for the entry point that is suitable for your target. For example, `main`, `_start`, `WinMain`, or whatever starting point is relevant for your target.
or overriding the default shim for the C `main` function with your own. Additionally, you need to use the `#![no_main]` attribute to prevent the compiler from attempting to generate an entry point itself.
Additionally, it's required to define a [panic handler function](panic-handler.html).
The function marked `#[start]` is passed the command line parameters
in the same format as C (aside from the exact integer types being used):
```rust
#![feature(start, lang_items, core_intrinsics, rustc_private)]
#![allow(internal_features)]
#![no_std]
// Necessary for `panic = "unwind"` builds on cfg(unix) platforms. Additionally, it's required to define a [panic handler function](panic-handler.html).
#![feature(panic_unwind)]
extern crate unwind;
// Pull in the system libc library for what crt0.o likely requires.
#[cfg(not(windows))]
extern crate libc;
use core::panic::PanicInfo;
// Entry point for this program.
#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
0
}
// These functions are used by the compiler, but not for an empty program like this.
// They are normally provided by `std`.
#[lang = "eh_personality"]
fn rust_eh_personality() {}
#[panic_handler]
fn panic_handler(_info: &PanicInfo) -> ! { core::intrinsics::abort() }
```
To override the compiler-inserted `main` shim, we have to disable it
with `#![no_main]` and then create the appropriate symbol with the
correct ABI and the correct name, which requires overriding the
compiler's name mangling too:
```rust ```rust
#![feature(lang_items, core_intrinsics, rustc_private)] #![feature(lang_items, core_intrinsics, rustc_private)]
@ -89,7 +53,7 @@ use core::ffi::{c_char, c_int};
use core::panic::PanicInfo; use core::panic::PanicInfo;
// Entry point for this program. // Entry point for this program.
#[no_mangle] // ensure that this symbol is included in the output as `main` #[unsafe(no_mangle)] // ensure that this symbol is included in the output as `main`
extern "C" fn main(_argc: c_int, _argv: *const *const c_char) -> c_int { extern "C" fn main(_argc: c_int, _argv: *const *const c_char) -> c_int {
0 0
} }

@ -8,7 +8,7 @@ This isn't always the case in Rust.
Rust supports Dynamically Sized Types (DSTs): types without a statically Rust supports Dynamically Sized Types (DSTs): types without a statically
known size or alignment. On the surface, this is a bit nonsensical: Rust *must* known size or alignment. On the surface, this is a bit nonsensical: Rust *must*
know the size and alignment of something in order to correctly work with it! In know the size and alignment of something in order to correctly work with it! In
this regard, DSTs are not normal types. Because they lack a statically known this regard, DSTs are not normal types. Since they lack a statically known
size, these types can only exist behind a pointer. Any pointer to a size, these types can only exist behind a pointer. Any pointer to a
DST consequently becomes a *wide* pointer consisting of the pointer and the DST consequently becomes a *wide* pointer consisting of the pointer and the
information that "completes" them (more on this below). information that "completes" them (more on this below).
@ -40,7 +40,7 @@ struct MySuperSlice {
} }
``` ```
Although such a type is largely useless without a way to construct it. Currently the Unfortunately, such a type is largely useless without a way to construct it. Currently the
only properly supported way to create a custom DST is by making your type generic only properly supported way to create a custom DST is by making your type generic
and performing an *unsizing coercion*: and performing an *unsizing coercion*:
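Reviewer note: this hunk stops just before the example it introduces. As a rough sketch of what an unsizing coercion on a generic type looks like: `MySuperSliceable` and `unsizing_demo` are names chosen for this note, and the two-word pointer size is what current compilers produce for slice-tailed types, not something the hunk itself states.

```rust
use std::mem::size_of;

// Generic over a possibly unsized tail; the unsized field must come last.
struct MySuperSliceable<T: ?Sized> {
    info: u32,
    data: T,
}

// Returns the `info` field, a copy of the slice tail, and the size of the
// wide reference measured in machine words.
pub fn unsizing_demo() -> (u32, Vec<u8>, usize) {
    let sized: MySuperSliceable<[u8; 8]> = MySuperSliceable {
        info: 17,
        data: [0; 8],
    };

    // Unsizing coercion: `[u8; 8]` becomes `[u8]`, and the reference becomes
    // a wide pointer that also carries the slice length.
    let dynamic: &MySuperSliceable<[u8]> = &sized;

    let words = size_of::<&MySuperSliceable<[u8]>>() / size_of::<usize>();
    (dynamic.info, dynamic.data.to_vec(), words)
}
```

The length of the tail is recovered from the wide pointer, which is the information that "completes" the pointer in the sense of the DST paragraph above.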

@ -31,7 +31,7 @@ compile if snappy is installed:
use libc::size_t; use libc::size_t;
#[link(name = "snappy")] #[link(name = "snappy")]
extern { unsafe extern "C" {
fn snappy_max_compressed_length(source_length: size_t) -> size_t; fn snappy_max_compressed_length(source_length: size_t) -> size_t;
} }
@ -64,7 +64,7 @@ The `extern` block can be extended to cover the entire snappy API:
use libc::{c_int, size_t}; use libc::{c_int, size_t};
#[link(name = "snappy")] #[link(name = "snappy")]
extern { unsafe extern {
fn snappy_compress(input: *const u8, fn snappy_compress(input: *const u8,
input_length: size_t, input_length: size_t,
compressed: *mut u8, compressed: *mut u8,
@ -251,7 +251,7 @@ First, we assume you have a lib crate named as `rust_from_c`.
`lib.rs` should have Rust code as following: `lib.rs` should have Rust code as following:
```rust ```rust
#[no_mangle] #[unsafe(no_mangle)]
pub extern "C" fn hello_from_rust() { pub extern "C" fn hello_from_rust() {
println!("Hello from Rust!"); println!("Hello from Rust!");
} }
@ -331,7 +331,7 @@ extern fn callback(a: i32) {
} }
#[link(name = "extlib")] #[link(name = "extlib")]
extern { unsafe extern {
fn register_callback(cb: extern fn(i32)) -> i32; fn register_callback(cb: extern fn(i32)) -> i32;
fn trigger_callback(); fn trigger_callback();
} }
@ -383,7 +383,7 @@ struct RustObject {
// Other members... // Other members...
} }
extern "C" fn callback(target: *mut RustObject, a: i32) { unsafe extern "C" fn callback(target: *mut RustObject, a: i32) {
println!("I'm called from C with value {0}", a); println!("I'm called from C with value {0}", a);
unsafe { unsafe {
// Update the value in RustObject with the value received from the callback: // Update the value in RustObject with the value received from the callback:
@ -392,9 +392,9 @@ extern "C" fn callback(target: *mut RustObject, a: i32) {
} }
#[link(name = "extlib")] #[link(name = "extlib")]
extern { unsafe extern {
fn register_callback(target: *mut RustObject, fn register_callback(target: *mut RustObject,
cb: extern fn(*mut RustObject, i32)) -> i32; cb: unsafe extern fn(*mut RustObject, i32)) -> i32;
fn trigger_callback(); fn trigger_callback();
} }
@ -523,7 +523,7 @@ blocks with the `static` keyword:
<!-- ignore: requires libc crate --> <!-- ignore: requires libc crate -->
```rust,ignore ```rust,ignore
#[link(name = "readline")] #[link(name = "readline")]
extern { unsafe extern {
static rl_readline_version: libc::c_int; static rl_readline_version: libc::c_int;
} }
@ -543,7 +543,7 @@ use std::ffi::CString;
use std::ptr; use std::ptr;
#[link(name = "readline")] #[link(name = "readline")]
extern { unsafe extern {
static mut rl_prompt: *const libc::c_char; static mut rl_prompt: *const libc::c_char;
} }
@ -573,7 +573,7 @@ conventions. Rust provides a way to tell the compiler which convention to use:
#[cfg(all(target_os = "win32", target_arch = "x86"))] #[cfg(all(target_os = "win32", target_arch = "x86"))]
#[link(name = "kernel32")] #[link(name = "kernel32")]
#[allow(non_snake_case)] #[allow(non_snake_case)]
extern "stdcall" { unsafe extern "stdcall" {
fn SetEnvironmentVariableA(n: *const u8, v: *const u8) -> libc::c_int; fn SetEnvironmentVariableA(n: *const u8, v: *const u8) -> libc::c_int;
} }
# fn main() { } # fn main() { }
@ -635,7 +635,7 @@ In C, functions can be 'variadic', meaning they accept a variable number of argu
be achieved in Rust by specifying `...` within the argument list of a foreign function declaration: be achieved in Rust by specifying `...` within the argument list of a foreign function declaration:
```no_run ```no_run
extern { unsafe extern {
fn foo(x: i32, ...); fn foo(x: i32, ...);
} }
@ -685,7 +685,7 @@ we have function pointers flying across the FFI boundary in both directions.
use libc::c_int; use libc::c_int;
# #[cfg(hidden)] # #[cfg(hidden)]
extern "C" { unsafe extern "C" {
/// Registers the callback. /// Registers the callback.
fn register(cb: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, c_int) -> c_int>); fn register(cb: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, c_int) -> c_int>);
} }
@ -750,8 +750,8 @@ mechanisms (notably C++'s `try`/`catch`).
<!-- ignore: using unstable feature --> <!-- ignore: using unstable feature -->
```rust,ignore ```rust,ignore
#[no_mangle] #[unsafe(no_mangle)]
extern "C-unwind" fn example() { unsafe extern "C-unwind" fn example() {
panic!("Uh oh"); panic!("Uh oh");
} }
``` ```
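Reviewer note: the FFI hunks repeatedly switch to `unsafe extern` blocks and `#[unsafe(no_mangle)]`. A small self-contained sketch of both spellings follows, assuming a toolchain of Rust 1.82 or later and a platform where C's `size_t` matches `usize`. It binds `strlen` from the C standard library, which `std` already links; `c_string_len` and `double_it` are hypothetical names used only here.

```rust
use std::ffi::{c_char, c_int, CStr};

// The block itself is marked `unsafe`: whoever writes it vouches that the
// signatures match the C declarations.
unsafe extern "C" {
    // Assumption: `size_t` is `usize` on the target.
    fn strlen(s: *const c_char) -> usize;
}

// Safe wrapper: `CStr` guarantees a valid, NUL-terminated buffer.
pub fn c_string_len(s: &CStr) -> usize {
    // Calling a foreign function still requires an `unsafe` block.
    unsafe { strlen(s.as_ptr()) }
}

// Exported under its unmangled name. The attribute is `unsafe` because a
// clashing symbol name elsewhere in the program could cause undefined behavior.
#[unsafe(no_mangle)]
pub extern "C" fn double_it(x: c_int) -> c_int {
    x.wrapping_mul(2)
}
```

Both spellings are also accepted in earlier editions on a recent enough compiler, so the examples can be migrated before the edition field is changed.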
@ -780,13 +780,13 @@ If the C++ frames have objects, their destructors will be called.
<!-- ignore: using unstable feature --> <!-- ignore: using unstable feature -->
```rust,ignore ```rust,ignore
#[link(...)] #[link(...)]
extern "C-unwind" { unsafe extern "C-unwind" {
// A C++ function that may throw an exception // A C++ function that may throw an exception
fn may_throw(); fn may_throw();
} }
#[no_mangle] #[unsafe(no_mangle)]
extern "C-unwind" fn rust_passthrough() { unsafe extern "C-unwind" fn rust_passthrough() {
let b = Box::new(5); let b = Box::new(5);
unsafe { may_throw(); } unsafe { may_throw(); }
println!("{:?}", &b); println!("{:?}", &b);
@ -816,7 +816,7 @@ will be printed.
### `panic` can be stopped at an ABI boundary ### `panic` can be stopped at an ABI boundary
```rust ```rust
#[no_mangle] #[unsafe(no_mangle)]
extern "C" fn assert_nonzero(input: u32) { extern "C" fn assert_nonzero(input: u32) {
assert!(input != 0) assert!(input != 0)
} }
@ -833,7 +833,7 @@ process if it panics, you must use [`catch_unwind`]:
```rust ```rust
use std::panic::catch_unwind; use std::panic::catch_unwind;
#[no_mangle] #[unsafe(no_mangle)]
pub extern "C" fn oh_no() -> i32 { pub extern "C" fn oh_no() -> i32 {
let result = catch_unwind(|| { let result = catch_unwind(|| {
panic!("Oops!"); panic!("Oops!");
@ -867,7 +867,7 @@ We can represent this in Rust with the `c_void` type:
<!-- ignore: requires libc crate --> <!-- ignore: requires libc crate -->
```rust,ignore ```rust,ignore
extern "C" { unsafe extern "C" {
pub fn foo(arg: *mut libc::c_void); pub fn foo(arg: *mut libc::c_void);
pub fn bar(arg: *mut libc::c_void); pub fn bar(arg: *mut libc::c_void);
} }
@ -891,18 +891,18 @@ To do this in Rust, lets create our own opaque types:
```rust ```rust
#[repr(C)] #[repr(C)]
pub struct Foo { pub struct Foo {
_data: [u8; 0], _data: (),
_marker: _marker:
core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
} }
#[repr(C)] #[repr(C)]
pub struct Bar { pub struct Bar {
_data: [u8; 0], _data: (),
_marker: _marker:
core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
} }
extern "C" { unsafe extern "C" {
pub fn foo(arg: *mut Foo); pub fn foo(arg: *mut Foo);
pub fn bar(arg: *mut Bar); pub fn bar(arg: *mut Bar);
} }

@ -39,7 +39,7 @@ Topics that are within the scope of this book include: the meaning of (un)safety
The Rustonomicon is not a place to exhaustively describe the semantics and guarantees of every single API in the standard library, nor is it a place to exhaustively describe every feature of Rust. The Rustonomicon is not a place to exhaustively describe the semantics and guarantees of every single API in the standard library, nor is it a place to exhaustively describe every feature of Rust.
Unless otherwise noted, Rust code in this book uses the Rust 2021 edition. Unless otherwise noted, Rust code in this book uses the Rust 2024 edition.
[trpl]: ../book/index.html [trpl]: ../book/index.html
[ref]: ../reference/index.html [ref]: ../reference/index.html

@ -7,7 +7,8 @@ There's also the [unsafe code guidelines] (note that it's **NOT** normative).
This is the most important `repr`. It has fairly simple intent: do what C does. This is the most important `repr`. It has fairly simple intent: do what C does.
The order, size, and alignment of fields is exactly what you would expect from C The order, size, and alignment of fields is exactly what you would expect from C
or C++. Any type you expect to pass through an FFI boundary should have or C++. The type is also passed across `extern "C"` function call boundaries the
same way C would pass the corresponding type. Any type you expect to pass through an FFI boundary should have
`repr(C)`, as C is the lingua-franca of the programming world. This is also `repr(C)`, as C is the lingua-franca of the programming world. This is also
necessary to soundly do more elaborate tricks with data layout such as necessary to soundly do more elaborate tricks with data layout such as
reinterpreting values as a different type. reinterpreting values as a different type.
@ -86,10 +87,14 @@ be 0. However Rust will not allow you to create an enum where two variants have
the same discriminant. the same discriminant.
The term "fieldless enum" only means that the enum doesn't have data in any The term "fieldless enum" only means that the enum doesn't have data in any
of its variants. A fieldless enum without a `repr(u*)` or `repr(C)` is of its variants. A fieldless enum without a `repr` is
still a Rust native type, and does not have a stable ABI representation. still a Rust native type, and does not have a stable layout or representation.
Adding a `repr` causes it to be treated exactly like the specified Adding a `repr(u*)`/`repr(i*)` causes it to be treated exactly like the specified
integer type for ABI purposes. integer type for layout purposes (except that the compiler will still exploit its
knowledge of "invalid" values at this type to optimize enum layout, such as when
this enum is wrapped in `Option`). Note that the function call ABI for these
types is still in general unspecified, except that across `extern "C"` calls they
are ABI-compatible with C enums of the same sign and size.
If the enum has fields, the effect is similar to the effect of `repr(C)` If the enum has fields, the effect is similar to the effect of `repr(C)`
in that there is a defined layout of the type. This makes it possible to in that there is a defined layout of the type. This makes it possible to
@ -119,31 +124,34 @@ assert_eq!(16, size_of::<MyReprOption<&u16>>());
This optimization still applies to fieldless enums with an explicit `repr(u*)`, `repr(i*)`, or `repr(C)`. This optimization still applies to fieldless enums with an explicit `repr(u*)`, `repr(i*)`, or `repr(C)`.
## repr(packed) ## repr(packed), repr(packed(n))
`repr(packed)` forces Rust to strip any padding, and only align the type to a `repr(packed(n))` (where `n` is a power of two) forces the type to have an
byte. This may improve the memory footprint, but will likely have other negative alignment of *at most* `n`. Most commonly used without an explicit `n`,
side-effects. `repr(packed)` is equivalent to `repr(packed(1))` which forces Rust to strip
any padding, and only align the type to a byte. This may improve the memory
footprint, but will likely have other negative side-effects.
In particular, most architectures *strongly* prefer values to be aligned. This In particular, most architectures *strongly* prefer values to be naturally
may mean the unaligned loads are penalized (x86), or even fault (some ARM aligned. This may mean that unaligned loads are penalized (x86), or even fault
chips). For simple cases like directly loading or storing a packed field, the (some ARM chips). For simple cases like directly loading or storing a packed
compiler might be able to paper over alignment issues with shifts and masks. field, the compiler might be able to paper over alignment issues with shifts
However if you take a reference to a packed field, it's unlikely that the and masks. However if you take a reference to a packed field, it's unlikely
compiler will be able to emit code to avoid an unaligned load. that the compiler will be able to emit code to avoid an unaligned load.
[As this can cause undefined behavior][ub loads], the lint has been implemented [As this can cause undefined behavior][ub loads], the lint has been implemented
and it will become a hard error. and it will become a hard error.
`repr(packed)` is not to be used lightly. Unless you have extreme requirements, `repr(packed)/repr(packed(n))` is not to be used lightly. Unless you have
this should not be used. extreme requirements, this should not be used.
This repr is a modifier on `repr(C)` and `repr(Rust)`. This repr is a modifier on `repr(C)` and `repr(Rust)`. For FFI compatibility
you most likely always want to be explicit: `repr(C, packed)`.
## repr(align(n)) ## repr(align(n))
`repr(align(n))` (where `n` is a power of two) forces the type to have an `repr(align(n))` (where `n` is a power of two) forces the type to have an
alignment of *at least* n. alignment of *at least* `n`.
This enables several tricks, like making sure neighboring elements of an array This enables several tricks, like making sure neighboring elements of an array
never share the same cache line with each other (which may speed up certain never share the same cache line with each other (which may speed up certain
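Reviewer note: the "at most `n`" and "at least `n`" wording for `packed(n)` and `align(n)` can be checked with a quick sketch. The type and function names below are hypothetical, and the numbers in the comments assume a typical target where `u32` has an alignment of 4.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

#[allow(dead_code)]
#[repr(C)]
struct Natural { tag: u8, value: u32 } // 3 padding bytes after `tag`

#[allow(dead_code)]
#[repr(C, packed)]
struct Packed1 { tag: u8, value: u32 } // same as packed(1): no padding

#[allow(dead_code)]
#[repr(C, packed(2))]
struct Packed2 { tag: u8, value: u32 } // fields aligned to at most 2

#[allow(dead_code)]
#[repr(align(64))]
struct CacheAligned(u8); // aligned to at least 64

// (size, alignment) for each type, in declaration order.
pub fn layout_report() -> [(usize, usize); 4] {
    [
        (size_of::<Natural>(), align_of::<Natural>()),           // (8, 4)
        (size_of::<Packed1>(), align_of::<Packed1>()),           // (5, 1)
        (size_of::<Packed2>(), align_of::<Packed2>()),           // (6, 2)
        (size_of::<CacheAligned>(), align_of::<CacheAligned>()), // (64, 64)
    ]
}

pub fn read_packed_field() -> u32 {
    let p = Packed1 { tag: 1, value: 0xDEAD_BEEF };
    // `&p.value` would be rejected because the field may be misaligned.
    // Going through a raw pointer and an unaligned read is allowed.
    unsafe { ptr::addr_of!(p.value).read_unaligned() }
}
```

Copying the field by value (`let v = p.value;`) also works for `Copy` fields; only taking a reference to the field is the problem.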

@ -60,8 +60,8 @@ thread::spawn(move || {
println!("{}", data[idx.load(Ordering::SeqCst)]); println!("{}", data[idx.load(Ordering::SeqCst)]);
``` ```
We can cause a data race if we instead do the bound check in advance, and then We can cause a race condition to violate memory safety if we instead do the bound
unsafely access the data with an unchecked value: check in advance, and then unsafely access the data with an unchecked value:
```rust,no_run ```rust,no_run
use std::thread; use std::thread;

@ -1,6 +1,6 @@
# References # References
There are two kinds of reference: There are two kinds of references:
* Shared reference: `&` * Shared reference: `&`
* Mutable reference: `&mut` * Mutable reference: `&mut`

@ -89,7 +89,7 @@ to the heap.
# pub use ::std::os::raw::{c_int, c_void}; # pub use ::std::os::raw::{c_int, c_void};
# #[allow(non_camel_case_types)] # #[allow(non_camel_case_types)]
# pub type size_t = usize; # pub type size_t = usize;
# extern "C" { pub fn posix_memalign(memptr: *mut *mut c_void, align: size_t, size: size_t) -> c_int; } # unsafe extern "C" { pub fn posix_memalign(memptr: *mut *mut c_void, align: size_t, size: size_t) -> c_int; }
# } # }
use std::{ use std::{
mem::{align_of, size_of}, mem::{align_of, size_of},
@ -225,7 +225,7 @@ allocation done on another thread. We can check this is true in the docs for
# struct Carton<T>(std::ptr::NonNull<T>); # struct Carton<T>(std::ptr::NonNull<T>);
# mod libc { # mod libc {
# pub use ::std::os::raw::c_void; # pub use ::std::os::raw::c_void;
# extern "C" { pub fn free(p: *mut c_void); } # unsafe extern "C" { pub fn free(p: *mut c_void); }
# } # }
impl<T> Drop for Carton<T> { impl<T> Drop for Carton<T> {
fn drop(&mut self) { fn drop(&mut self) {
@ -253,6 +253,6 @@ only to data races?
[box-is-special]: https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/ [box-is-special]: https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/
[deref-doc]: https://doc.rust-lang.org/core/ops/trait.Deref.html [deref-doc]: https://doc.rust-lang.org/core/ops/trait.Deref.html
[deref-mut-doc]: https://doc.rust-lang.org/core/ops/trait.DerefMut.html [deref-mut-doc]: https://doc.rust-lang.org/core/ops/trait.DerefMut.html
[mutex-guard-not-send-docs-rs]: https://doc.rust-lang.org/std/sync/struct.MutexGuard.html#impl-Send [mutex-guard-not-send-docs-rs]: https://doc.rust-lang.org/std/sync/struct.MutexGuard.html#impl-Send-for-MutexGuard%3C'_,+T%3E
[mutex-guard-not-send-comment]: https://github.com/rust-lang/rust/issues/23465#issuecomment-82730326 [mutex-guard-not-send-comment]: https://github.com/rust-lang/rust/issues/23465#issuecomment-82730326
[libc-free-docs]: https://linux.die.net/man/3/free [libc-free-docs]: https://linux.die.net/man/3/free

@ -134,7 +134,7 @@ to compute the address of array index `idx`. This relies on
how arrays are laid out in memory. how arrays are laid out in memory.
* For a struct, however, in general we do not know how it is laid out, and we * For a struct, however, in general we do not know how it is laid out, and we
also cannot use `&mut base_ptr.field` as that would be creating a also cannot use `&mut base_ptr.field` as that would be creating a
reference. So, you must carefully use the [`addr_of_mut`] macro. This creates reference. So, you must carefully use the [raw reference][raw_reference] syntax. This creates
a raw pointer to the field without creating an intermediate reference: a raw pointer to the field without creating an intermediate reference:
```rust ```rust
@ -147,7 +147,7 @@ struct Demo {
let mut uninit = MaybeUninit::<Demo>::uninit(); let mut uninit = MaybeUninit::<Demo>::uninit();
// `&uninit.as_mut().field` would create a reference to an uninitialized `bool`, // `&uninit.as_mut().field` would create a reference to an uninitialized `bool`,
// and thus be Undefined Behavior! // and thus be Undefined Behavior!
let f1_ptr = unsafe { ptr::addr_of_mut!((*uninit.as_mut_ptr()).field) }; let f1_ptr = unsafe { &raw mut (*uninit.as_mut_ptr()).field };
unsafe { f1_ptr.write(true); } unsafe { f1_ptr.write(true); }
let init = unsafe { uninit.assume_init() }; let init = unsafe { uninit.assume_init() };
@ -167,7 +167,7 @@ it around at all, be sure to be *really* careful.
[`MaybeUninit`]: ../core/mem/union.MaybeUninit.html [`MaybeUninit`]: ../core/mem/union.MaybeUninit.html
[assume_init]: ../core/mem/union.MaybeUninit.html#method.assume_init [assume_init]: ../core/mem/union.MaybeUninit.html#method.assume_init
[`ptr`]: ../core/ptr/index.html [`ptr`]: ../core/ptr/index.html
[`addr_of_mut`]: ../core/ptr/macro.addr_of_mut.html [raw_reference]: ../reference/types/pointer.html#r-type.pointer.raw.constructor
[`write`]: ../core/ptr/fn.write.html [`write`]: ../core/ptr/fn.write.html
[`copy`]: ../std/ptr/fn.copy.html [`copy`]: ../std/ptr/fn.copy.html
[`copy_nonoverlapping`]: ../std/ptr/fn.copy_nonoverlapping.html [`copy_nonoverlapping`]: ../std/ptr/fn.copy_nonoverlapping.html
