many many pnkfelix fixes

pull/10/head
Alexis Beingessner 9 years ago committed by Manish Goregaokar
parent 35f68b4107
commit fadf50dc7d

@ -9,19 +9,24 @@ is not always the case, however.
# Dynamically Sized Types (DSTs) # Dynamically Sized Types (DSTs)
Rust also supports types without a statically known size. On the surface, this Rust in fact supports Dynamically Sized Types (DSTs): types without a statically
is a bit nonsensical: Rust *must* know the size of something in order to work known size or alignment. On the surface, this is a bit nonsensical: Rust *must*
with it! DSTs are generally produced as views, or through type-erasure of types know the size and alignment of something in order to correctly work with it! In
that *do* have a known size. Due to their lack of a statically known size, these this regard, DSTs are not normal types. Due to their lack of a statically known
types can only exist *behind* some kind of pointer. They consequently produce a size, these types can only exist behind some kind of pointer. Any pointer to a
*fat* pointer consisting of the pointer and the information that *completes* DST consequently becomes a *fat* pointer consisting of the pointer and the
them. information that "completes" them (more on this below).
For instance, the slice type, `[T]`, is some statically unknown number of There are two major DSTs exposed by the language: trait objects, and slices.
elements stored contiguously. `&[T]` consequently consists of a `(&T, usize)`
pair that specifies where the slice starts, and how many elements it contains. A trait object represents some type that implements the traits it specifies.
Similarly, Trait Objects support interface-oriented type erasure through a The exact original type is *erased* in favour of runtime reflection
`(data_ptr, vtable_ptr)` pair. with a vtable containing all the information necessary to use the type.
This is the information that completes a trait object: a pointer to its vtable.
A slice is simply a view into some contiguous storage -- typically an array or
`Vec`. The information that completes a slice is just the number of elements
it points to.
Structs can actually store a single DST directly as their last field, but this Structs can actually store a single DST directly as their last field, but this
makes them a DST as well: makes them a DST as well:
@ -34,8 +39,8 @@ struct Foo {
} }
``` ```
**NOTE: As of Rust 1.0 struct DSTs are broken if the last field has **NOTE: [As of Rust 1.0 struct DSTs are broken if the last field has
a variable position based on its alignment.** a variable position based on its alignment][dst-issue].**
@ -56,22 +61,32 @@ struct Baz {
} }
``` ```
On their own, ZSTs are, for obvious reasons, pretty useless. However as with On their own, Zero Sized Types (ZSTs) are, for obvious reasons, pretty useless.
many curious layout choices in Rust, their potential is realized in a generic However as with many curious layout choices in Rust, their potential is realized
context. in a generic context: Rust largely understands that any operation that produces
or stores a ZST can be reduced to a no-op. First off, storing it doesn't even
Rust largely understands that any operation that produces or stores a ZST can be make sense -- it doesn't occupy any space. Also there's only one value of that
reduced to a no-op. For instance, a `HashSet<T>` can be effeciently implemented type, so anything that loads it can just produce it from the aether -- which is
as a thin wrapper around `HashMap<T, ()>` because all the operations `HashMap` also a no-op since it doesn't occupy any space.
normally does to store and retrieve values will be completely stripped in
monomorphization. One of the most extreme examples of this is Sets and Maps. Given a
`Map<Key, Value>`, it is common to implement a `Set<Key>` as just a thin wrapper
Similarly `Result<(), ()>` and `Option<()>` are effectively just fancy `bool`s. around `Map<Key, UselessJunk>`. In many languages, this would necessitate
allocating space for UselessJunk and doing work to store and load UselessJunk
only to discard it. Proving this unnecessary would be a difficult analysis for
the compiler.
However in Rust, we can just say that `Set<Key> = Map<Key, ()>`. Now Rust
statically knows that every load and store is useless, and no allocation has any
size. The result is that the monomorphized code is basically a custom
implementation of a HashSet with none of the overhead that HashMap would have to
support values.
Safe code need not worry about ZSTs, but *unsafe* code must be careful about the Safe code need not worry about ZSTs, but *unsafe* code must be careful about the
consequence of types with no size. In particular, pointer offsets are no-ops, consequence of types with no size. In particular, pointer offsets are no-ops,
and standard allocators (including jemalloc, the one used by Rust) generally and standard allocators (including jemalloc, the one used by default in Rust)
consider passing in `0` as Undefined Behaviour. generally consider passing in `0` for the size of an allocation as Undefined
Behaviour.
@ -93,11 +108,12 @@ return a Result in general, but a specific case actually is infallible. It's
actually possible to communicate this at the type level by returning a actually possible to communicate this at the type level by returning a
`Result<T, Void>`. Consumers of the API can confidently unwrap such a Result `Result<T, Void>`. Consumers of the API can confidently unwrap such a Result
knowing that it's *statically impossible* for this value to be an `Err`, as knowing that it's *statically impossible* for this value to be an `Err`, as
this would require providing a value of type Void. this would require providing a value of type `Void`.
In principle, Rust can do some interesting analyses and optimizations based In principle, Rust can do some interesting analyses and optimizations based
on this fact. For instance, `Result<T, Void>` could be represented as just `T`, on this fact. For instance, `Result<T, Void>` could be represented as just `T`,
because the Err case doesn't actually exist. The following *could* also compile: because the `Err` case doesn't actually exist. The following *could* also
compile:
```rust,ignore ```rust,ignore
enum Void {} enum Void {}
@ -116,3 +132,6 @@ actually valid to construct, but dereferencing them is Undefined Behaviour
because that doesn't actually make sense. That is, you could model C's `void *` because that doesn't actually make sense. That is, you could model C's `void *`
type with `*const Void`, but this doesn't necessarily gain anything over using type with `*const Void`, but this doesn't necessarily gain anything over using
e.g. `*const ()`, which *is* safe to randomly dereference. e.g. `*const ()`, which *is* safe to randomly dereference.
[dst-issue]: https://github.com/rust-lang/rust/issues/26403
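
As a reviewer's aside on the DST and ZST paragraphs above, the "fat pointer" and "no-op storage" claims can be spot-checked with `size_of`. This is a minimal sketch, not part of the commit: the helper function names are hypothetical, the `dyn` keyword is newer syntax than the Rust 1.0 era this text targets, and the two-word size of fat pointers is what current compilers produce rather than something this chapter promises.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Size in bytes of an ordinary ("thin") reference to a sized type.
fn thin_ref_size() -> usize {
    size_of::<&u8>()
}

/// Size in bytes of a slice reference: data pointer plus element count.
fn slice_ref_size() -> usize {
    size_of::<&[u8]>()
}

/// Size in bytes of a trait object reference: data pointer plus vtable pointer.
fn trait_object_ref_size() -> usize {
    size_of::<&dyn Debug>()
}

/// Size in bytes of a pair whose second half is the zero-sized type `()`.
fn pair_with_unit_size() -> usize {
    size_of::<(u32, ())>()
}
```

On mainstream targets the two fat references come out at twice the size of the thin one, and the `()` half of the pair contributes nothing, which is the property `Set<Key> = Map<Key, ()>` relies on.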

@ -2,7 +2,7 @@
Programmers in safe "high-level" languages face a fundamental dilemma. On one Programmers in safe "high-level" languages face a fundamental dilemma. On one
hand, it would be *really* great to just say what you want and not worry about hand, it would be *really* great to just say what you want and not worry about
how it's done. On the other hand, that can lead to some *really* poor how it's done. On the other hand, that can lead to unacceptably poor
performance. It may be necessary to drop down to less clear or idiomatic performance. It may be necessary to drop down to less clear or idiomatic
practices to get the performance characteristics you want. Or maybe you just practices to get the performance characteristics you want. Or maybe you just
throw up your hands in disgust and decide to shell out to an implementation in throw up your hands in disgust and decide to shell out to an implementation in
@ -12,21 +12,22 @@ Worse, when you want to talk directly to the operating system, you *have* to
talk to an unsafe language: *C*. C is ever-present and unavoidable. It's the talk to an unsafe language: *C*. C is ever-present and unavoidable. It's the
lingua-franca of the programming world. lingua-franca of the programming world.
Even other safe languages generally expose C interfaces for the world at large! Even other safe languages generally expose C interfaces for the world at large!
Regardless of *why* you're doing it, as soon as your program starts talking to Regardless of why you're doing it, as soon as your program starts talking to
C it stops being safe. C it stops being safe.
With that said, Rust is *totally* a safe programming language. With that said, Rust is *totally* a safe programming language.
Well, Rust *has* a safe programming language. Let's step back a bit. Well, Rust *has* a safe programming language. Let's step back a bit.
Rust can be thought of as being composed of two Rust can be thought of as being composed of two programming languages: *Safe
programming languages: *Safe* and *Unsafe*. Safe is For Reals Totally Safe. Rust* and *Unsafe Rust*. Safe Rust is For Reals Totally Safe. Unsafe Rust,
Unsafe, unsurprisingly, is *not* For Reals Totally Safe. In fact, Unsafe lets unsurprisingly, is *not* For Reals Totally Safe. In fact, Unsafe Rust lets you
you do some really crazy unsafe things. do some really crazy unsafe things.
Safe is *the* Rust programming language. If all you do is write Safe Rust, Safe Rust is the *true* Rust programming language. If all you do is write Safe
you will never have to worry about type-safety or memory-safety. You will never Rust, you will never have to worry about type-safety or memory-safety. You will
endure a null or dangling pointer, or any of that Undefined Behaviour nonsense. never endure a null or dangling pointer, or any of that Undefined Behaviour
nonsense.
*That's totally awesome*. *That's totally awesome*.
@ -69,17 +70,16 @@ language cares about is preventing the following things:
* A non-utf8 `str` * A non-utf8 `str`
* Unwinding into another language * Unwinding into another language
* Causing a [data race][race] * Causing a [data race][race]
* Double-dropping a value
That's it. That's all the Undefined Behaviour baked into Rust. Of course, unsafe That's it. That's all the causes of Undefined Behaviour baked into Rust. Of
functions and traits are free to declare arbitrary other constraints that a course, unsafe functions and traits are free to declare arbitrary other
program must maintain to avoid Undefined Behaviour. However these are generally constraints that a program must maintain to avoid Undefined Behaviour. However,
just things that will transitively lead to one of the above problems. Some generally violations of these constraints will just transitively lead to one of
additional constraints may also derive from compiler intrinsics that make special the above problems. Some additional constraints may also derive from compiler
assumptions about how code can be optimized. intrinsics that make special assumptions about how code can be optimized.
Rust is otherwise quite permissive with respect to other dubious operations. Rust Rust is otherwise quite permissive with respect to other dubious operations.
considers it "safe" to: Rust considers it "safe" to:
* Deadlock * Deadlock
* Have a [race condition][race] * Have a [race condition][race]

@ -12,21 +12,21 @@ The order, size, and alignment of fields is exactly what you would expect from C
or C++. Any type you expect to pass through an FFI boundary should have or C++. Any type you expect to pass through an FFI boundary should have
`repr(C)`, as C is the lingua-franca of the programming world. This is also `repr(C)`, as C is the lingua-franca of the programming world. This is also
necessary to soundly do more elaborate tricks with data layout such as necessary to soundly do more elaborate tricks with data layout such as
reintepretting values as a different type. reinterpreting values as a different type.
However, the interaction with Rust's more exotic data layout features must be However, the interaction with Rust's more exotic data layout features must be
kept in mind. Due to its dual purpose as "for FFI" and "for layout control", kept in mind. Due to its dual purpose as "for FFI" and "for layout control",
`repr(C)` can be applied to types that will be nonsensical or problematic if `repr(C)` can be applied to types that will be nonsensical or problematic if
passed through the FFI boundary. passed through the FFI boundary.
* ZSTs are still zero-sized, even though this is not a standard behaviour in * ZSTs are still zero-sized, even though this is not a standard behaviour in
C, and is explicitly contrary to the behaviour of an empty type in C++, which C, and is explicitly contrary to the behaviour of an empty type in C++, which
still consumes a byte of space. still consumes a byte of space.
* DSTs, tuples, and tagged unions are not a concept in C and as such are never * DSTs, tuples, and tagged unions are not a concept in C and as such are never
FFI safe. FFI safe.
* **The [drop flag][] will still be added** * **If the type would have any [drop flags][], they will still be added**
* This is equivalent to one of `repr(u*)` (see the next section) for enums. The * This is equivalent to one of `repr(u*)` (see the next section) for enums. The
chosen size is the default enum size for the target platform's C ABI. Note that chosen size is the default enum size for the target platform's C ABI. Note that
@ -39,10 +39,10 @@ compiled with certain flags.
# repr(u8), repr(u16), repr(u32), repr(u64) # repr(u8), repr(u16), repr(u32), repr(u64)
These specify the size to make a C-like enum. If the discriminant overflows the These specify the size to make a C-like enum. If the discriminant overflows the
integer it has to fit in, it will be an error. You can manually ask Rust to integer it has to fit in, it will produce a compile-time error. You can manually
allow this by setting the overflowing element to explicitly be 0. However Rust ask Rust to allow this by setting the overflowing element to explicitly be 0.
will not allow you to create an enum where two variants have the same However Rust will not allow you to create an enum where two variants have the
discriminant. same discriminant.
On non-C-like enums, this will inhibit certain optimizations like the null- On non-C-like enums, this will inhibit certain optimizations like the null-
pointer optimization. pointer optimization.
@ -65,9 +65,12 @@ compiler might be able to paper over alignment issues with shifts and masks.
However if you take a reference to a packed field, it's unlikely that the However if you take a reference to a packed field, it's unlikely that the
compiler will be able to emit code to avoid an unaligned load. compiler will be able to emit code to avoid an unaligned load.
**[As of Rust 1.0 this can cause undefined behaviour.][ub loads]**
`repr(packed)` is not to be used lightly. Unless you have extreme requirements, `repr(packed)` is not to be used lightly. Unless you have extreme requirements,
this should not be used. this should not be used.
This repr is a modifier on `repr(C)` and `repr(rust)`. This repr is a modifier on `repr(C)` and `repr(rust)`.
[drop flag]: drop-flags.html [drop flags]: drop-flags.html
[ub loads]: https://github.com/rust-lang/rust/issues/27060
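
As a reviewer's aside on the `repr(C)` and `repr(u8)` text above, the layout claims can be checked directly. This is a sketch under stated assumptions: `Header`, `Color`, and `Empty` are hypothetical types invented for illustration, and the expected numbers assume a target where `u32` is 4-byte aligned and `u16` is 2-byte aligned.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical FFI header. `repr(C)` keeps the fields in declaration order,
/// so padding is inserted after `tag` and after `flags`.
#[allow(dead_code)]
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
    flags: u16,
}

/// Hypothetical C-like enum whose discriminant is stored in one byte.
#[allow(dead_code)]
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// A `repr(C)` type with no fields is still zero-sized in Rust.
#[repr(C)]
struct Empty;

/// Returns (size, alignment) of `Header` in bytes.
fn header_layout() -> (usize, usize) {
    (size_of::<Header>(), align_of::<Header>())
}

/// Size of the `repr(u8)` enum in bytes.
fn color_size() -> usize {
    size_of::<Color>()
}

/// The discriminant of `Color::Blue`, read back as a `u8`.
fn blue_tag() -> u8 {
    Color::Blue as u8
}

/// Size of the empty `repr(C)` struct in bytes.
fn empty_size() -> usize {
    size_of::<Empty>()
}
```

Under those assumptions `Header` occupies 12 bytes with 4-byte alignment, `Color` occupies 1 byte, and `Empty` occupies none.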

@ -5,16 +5,17 @@ memory-safe and efficient, while avoiding garbage collection. Before getting
into the ownership system in detail, we will consider the motivation of this into the ownership system in detail, we will consider the motivation of this
design. design.
We will assume that you accept that garbage collection is not always an optimal We will assume that you accept that garbage collection (GC) is not always an
solution, and that it is desirable to manually manage memory to some extent. optimal solution, and that it is desirable to manually manage memory in some
If you do not accept this, might I interest you in a different language? contexts. If you do not accept this, might I interest you in a different
language?
Regardless of your feelings on GC, it is pretty clearly a *massive* boon to Regardless of your feelings on GC, it is pretty clearly a *massive* boon to
making code safe. You never have to worry about things going away *too soon* making code safe. You never have to worry about things going away *too soon*
(although whether you still *wanted* to be pointing at that thing is a different (although whether you still *wanted* to be pointing at that thing is a different
issue...). This is a pervasive problem that C and C++ need to deal with. issue...). This is a pervasive problem that C and C++ programs need to deal
Consider this simple mistake that all of us who have used a non-GC'd language with. Consider this simple mistake that all of us who have used a non-GC'd
have made at one point: language have made at one point:
```rust,ignore ```rust,ignore
fn as_str(data: &u32) -> &str { fn as_str(data: &u32) -> &str {
@ -40,7 +41,7 @@ be forced to accept your program on the assumption that it is correct.
This will never happen to Rust. It's up to the programmer to prove to the This will never happen to Rust. It's up to the programmer to prove to the
compiler that everything is sound. compiler that everything is sound.
Of course, rust's story around ownership is much more complicated than just Of course, Rust's story around ownership is much more complicated than just
verifying that references don't escape the scope of their referent. That's verifying that references don't escape the scope of their referent. That's
because ensuring pointers are always valid is much more complicated than this. because ensuring pointers are always valid is much more complicated than this.
For instance in this code, For instance in this code,

@ -1,5 +1,19 @@
% repr(Rust) % repr(Rust)
First and foremost, all types have an alignment specified in bytes. The
alignment of a type specifies what addresses are valid to store the value at. A
value of alignment `n` must only be stored at an address that is a multiple of
`n`. So alignment 2 means you must be stored at an even address, and 1 means
that you can be stored anywhere. Alignment is at least 1, and always a power of
2. Most primitives are generally aligned to their size, although this is
platform-specific behaviour. In particular, on x86 `u64` and `f64` may be only
aligned to 32 bits.
A type's size must always be a multiple of its alignment. This ensures that an
array of that type may always be indexed by offsetting by a multiple of its
size. Note that the size and alignment of a type may not be known
statically in the case of [dynamically sized types][dst].
Rust gives you the following ways to lay out composite data: Rust gives you the following ways to lay out composite data:
* structs (named product types) * structs (named product types)
@ -9,17 +23,10 @@ Rust gives you the following ways to lay out composite data:
An enum is said to be *C-like* if none of its variants have associated data. An enum is said to be *C-like* if none of its variants have associated data.
For all these, individual fields are aligned to their preferred alignment. For Composite structures will have an alignment equal to the maximum
primitives this is usually equal to their size. For instance, a u32 will be of their fields' alignment. Rust will consequently insert padding where
aligned to a multiple of 32 bits, and a u16 will be aligned to a multiple of 16 necessary to ensure that all fields are properly aligned and that the overall
bits. Note that some primitives may be emulated on different platforms, and as type's size is a multiple of its alignment. For instance:
such may have strange alignment. For instance, a u64 on x86 may actually be
emulated as a pair of u32s, and thus only have 32-bit alignment.
Composite structures will have a preferred alignment equal to the maximum
of their fields' preferred alignment, and a size equal to a multiple of their
preferred alignment. This ensures that arrays of T can be correctly iterated
by offsetting by their size. So for instance,
```rust ```rust
struct A { struct A {
@ -29,12 +36,24 @@ struct A {
} }
``` ```
will have a size that is a multiple of 32-bits, and 32-bit alignment. will be 32-bit aligned assuming these primitives are aligned to their size.
It will therefore have a size that is a multiple of 32-bits. It will potentially
*really* become:
There is *no indirection* for these types; all data is stored contiguously as you would ```rust
expect in C. However with the exception of arrays (which are densely packed and struct A {
in-order), the layout of data is not by default specified in Rust. Given the two a: u8,
following struct definitions: _pad1: [u8; 3], // to align `b`
b: u32,
c: u16,
_pad2: [u8; 2], // to make overall size multiple of 4
}
```
There is *no indirection* for these types; all data is stored contiguously as
you would expect in C. However with the exception of arrays (which are densely
packed and in-order), the layout of data is not by default specified in Rust.
Given the two following struct definitions:
```rust ```rust
struct A { struct A {
@ -48,13 +67,15 @@ struct B {
} }
``` ```
Rust *does* guarantee that two instances of A have their data laid out in exactly Rust *does* guarantee that two instances of A have their data laid out in
the same way. However Rust *does not* guarantee that an instance of A has the same exactly the same way. However Rust *does not* guarantee that an instance of A
field ordering or padding as an instance of B (in practice there's no *particular* has the same field ordering or padding as an instance of B (in practice there's
reason why they wouldn't, other than that its not currently guaranteed). no *particular* reason why they wouldn't, other than that it's not currently
guaranteed).
With A and B as written, this is basically nonsensical, but several other features With A and B as written, this is basically nonsensical, but several other
of Rust make it desirable for the language to play with data layout in complex ways. features of Rust make it desirable for the language to play with data layout in
complex ways.
For instance, consider this struct: For instance, consider this struct:
@ -66,10 +87,10 @@ struct Foo<T, U> {
} }
``` ```
Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If Rust lays out the Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If
fields in the order specified, we expect it to *pad* the values in the struct to satisfy Rust lays out the fields in the order specified, we expect it to *pad* the
their *alignment* requirements. So if Rust didn't reorder fields, we would expect Rust to values in the struct to satisfy their *alignment* requirements. So if Rust
produce the following: didn't reorder fields, we would expect Rust to produce the following:
```rust,ignore ```rust,ignore
struct Foo<u16, u32> { struct Foo<u16, u32> {
@ -87,10 +108,11 @@ struct Foo<u32, u16> {
} }
``` ```
The latter case quite simply wastes space. An optimal use of space therefore requires The latter case quite simply wastes space. An optimal use of space therefore
different monomorphizations to have *different field orderings*. requires different monomorphizations to have *different field orderings*.
**Note: this is a hypothetical optimization that is not yet implemented in Rust 1.0** **Note: this is a hypothetical optimization that is not yet implemented in Rust
1.0**
Enums make this consideration even more complicated. Naively, an enum such as: Enums make this consideration even more complicated. Naively, an enum such as:
@ -121,8 +143,10 @@ by using null as a special value. The net result is that
There are many types in Rust that are, or contain, "not null" pointers such as There are many types in Rust that are, or contain, "not null" pointers such as
`Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine
nested enums pooling their tags into a single descriminant, as they are by nested enums pooling their tags into a single discriminant, as they are by
definition known to have a limited range of valid values. In principle enums can definition known to have a limited range of valid values. In principle enums can
use fairly elaborate algorithms to cache bits throughout nested types with use fairly elaborate algorithms to cache bits throughout nested types with
special constrained representations. As such it is *especially* desirable that special constrained representations. As such it is *especially* desirable that
we leave enum layout unspecified today. we leave enum layout unspecified today.
[dst]: exotic-sizes.html#dynamically-sized-types-(dsts)
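
As a reviewer's aside on the alignment and null-pointer-optimization text above, the stated properties can be checked without depending on the unspecified field order. This is a minimal sketch: the helper names are hypothetical, and the alignment check tests the weaker property that a struct is aligned at least as strictly as each of its fields.

```rust
use std::mem::{align_of, size_of};

/// The example struct from the text; its field order is left to the compiler.
#[allow(dead_code)]
struct A {
    a: u8,
    b: u32,
    c: u16,
}

/// A type's size must be a multiple of its alignment.
fn size_is_multiple_of_align<T>() -> bool {
    size_of::<T>() % align_of::<T>() == 0
}

/// `A` must be aligned at least as strictly as its most-aligned field.
fn struct_align_covers_fields() -> bool {
    let max_field = align_of::<u8>()
        .max(align_of::<u32>())
        .max(align_of::<u16>());
    align_of::<A>() >= max_field
}

/// `Box<u32>` is never null, so `None` can reuse the null bit pattern
/// and `Option<Box<u32>>` needs no separate tag.
fn option_box_is_pointer_sized() -> bool {
    size_of::<Option<Box<u32>>>() == size_of::<Box<u32>>()
}
```

None of these checks depends on how the compiler orders the fields of `A`, so they should hold whether or not the field-reordering optimization is ever implemented.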

@ -1,29 +1,30 @@
% How Safe and Unsafe Interact % How Safe and Unsafe Interact
So what's the relationship between Safe and Unsafe? How do they interact? So what's the relationship between Safe and Unsafe Rust? How do they interact?
Rust models the seperation between Safe and Unsafe with the `unsafe` keyword, which Rust models the separation between Safe and Unsafe Rust with the `unsafe`
can be thought as a sort of *foreign function interface* (FFI) between Safe and Unsafe. keyword, which can be thought of as a sort of *foreign function interface* (FFI)
This is the magic behind why we can say Safe is a safe language: all the scary unsafe between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust
bits are relagated *exclusively* to FFI *just like every other safe language*. is a safe language: all the scary unsafe bits are relegated *exclusively* to FFI
*just like every other safe language*.
However because one language is a subset of the other, the two can be cleanly However because one language is a subset of the other, the two can be cleanly
intermixed as long as the boundary between Safe and Unsafe is denoted with the intermixed as long as the boundary between Safe and Unsafe Rust is denoted with
`unsafe` keyword. No need to write headers, initialize runtimes, or any of that the `unsafe` keyword. No need to write headers, initialize runtimes, or any of
other FFI boiler-plate. that other FFI boiler-plate.
There are several places `unsafe` can appear in Rust today, which can largely be There are several places `unsafe` can appear in Rust today, which can largely be
grouped into two categories: grouped into two categories:
* There are unchecked contracts here. To declare you understand this, I require * There are unchecked contracts here. To declare you understand this, I require
you to write `unsafe` elsewhere: you to write `unsafe` elsewhere:
* On functions, `unsafe` is declaring the function to be unsafe to call. Users * On functions, `unsafe` is declaring the function to be unsafe to call.
of the function must check the documentation to determine what this means, Users of the function must check the documentation to determine what this
and then have to write `unsafe` somewhere to identify that they're aware of means, and then have to write `unsafe` somewhere to identify that they're
the danger. aware of the danger.
* On trait declarations, `unsafe` is declaring that *implementing* the trait * On trait declarations, `unsafe` is declaring that *implementing* the trait
is an unsafe operation, as it has contracts that other unsafe code is free to is an unsafe operation, as it has contracts that other unsafe code is free
trust blindly. (More on this below.) to trust blindly. (More on this below.)
* I am declaring that I have, to the best of my knowledge, adhered to the * I am declaring that I have, to the best of my knowledge, adhered to the
unchecked contracts: unchecked contracts:
@ -64,9 +65,9 @@ This means that Unsafe, **the royal vanguard of Undefined Behaviour**, has to be
*super paranoid* about generic safe code. Unsafe is free to trust *specific* safe *super paranoid* about generic safe code. Unsafe is free to trust *specific* safe
code (or else you would degenerate into infinite spirals of paranoid despair). code (or else you would degenerate into infinite spirals of paranoid despair).
It is generally regarded as ok to trust the standard library to be correct, as It is generally regarded as ok to trust the standard library to be correct, as
std is effectively an extension of the language (and you *really* just have to trust `std` is effectively an extension of the language (and you *really* just have
the language). If `std` fails to uphold the guarantees it declares, then it's to trust the language). If `std` fails to uphold the guarantees it declares,
basically a language bug. then it's basically a language bug.
That said, it would be best to minimize *needlessly* relying on properties of That said, it would be best to minimize *needlessly* relying on properties of
concrete safe code. Bugs happen! Of course, I must reinforce that this is only concrete safe code. Bugs happen! Of course, I must reinforce that this is only
@ -89,7 +90,7 @@ Ord for a type, but don't actually provide a proper total ordering, BTreeMap wil
get *really confused* and start making a total mess of itself. Data that is get *really confused* and start making a total mess of itself. Data that is
inserted may be impossible to find! inserted may be impossible to find!
But that's ok. BTreeMap is safe, so it guarantees that even if you give it a But that's okay. BTreeMap is safe, so it guarantees that even if you give it a
*completely* garbage Ord implementation, it will still do something *safe*. You *completely* garbage Ord implementation, it will still do something *safe*. You
won't start reading uninitialized memory or unallocated memory. In fact, BTreeMap won't start reading uninitialized memory or unallocated memory. In fact, BTreeMap
manages to not actually lose any of your data. When the map is dropped, all the manages to not actually lose any of your data. When the map is dropped, all the
@ -104,7 +105,24 @@ Safe's responsibility to uphold.
But wouldn't it be grand if there was some way for Unsafe to trust *some* trait But wouldn't it be grand if there was some way for Unsafe to trust *some* trait
contracts *somewhere*? This is the problem that unsafe traits tackle: by marking contracts *somewhere*? This is the problem that unsafe traits tackle: by marking
*the trait itself* as unsafe *to implement*, Unsafe can trust the implementation *the trait itself* as unsafe *to implement*, Unsafe can trust the implementation
to be correct. to uphold the trait's contract, although the trait implementation may be
incorrect in arbitrary other ways.
For instance, given a hypothetical UnsafeOrd trait, this is technically a valid
implementation:
```rust
# use std::cmp::Ordering;
# struct MyType;
# pub unsafe trait UnsafeOrd { fn cmp(&self, other: &Self) -> Ordering; }
unsafe impl UnsafeOrd for MyType {
fn cmp(&self, other: &Self) -> Ordering {
Ordering::Equal
}
}
```
But it's probably not the implementation you want.
Rust has traditionally avoided making traits unsafe because it makes Unsafe Rust has traditionally avoided making traits unsafe because it makes Unsafe
pervasive, which is not desirable. Send and Sync are unsafe because pervasive, which is not desirable. Send and Sync are unsafe because

@ -1,8 +1,8 @@
% Working with Unsafe % Working with Unsafe
Rust generally only gives us the tools to talk about Unsafe in a scoped and Rust generally only gives us the tools to talk about Unsafe Rust in a scoped and
binary manner. Unfortunately, reality is significantly more complicated than that. binary manner. Unfortunately, reality is significantly more complicated than
For instance, consider the following toy function: that. For instance, consider the following toy function:
```rust ```rust
fn index(idx: usize, arr: &[u8]) -> Option<u8> { fn index(idx: usize, arr: &[u8]) -> Option<u8> {
@ -35,10 +35,15 @@ fn index(idx: usize, arr: &[u8]) -> Option<u8> {
This program is now unsound, and yet *we only modified safe code*. This is the This program is now unsound, and yet *we only modified safe code*. This is the
fundamental problem of safety: it's non-local. The soundness of our unsafe fundamental problem of safety: it's non-local. The soundness of our unsafe
operations necessarily depends on the state established by "safe" operations. operations necessarily depends on the state established by otherwise
Although safety *is* modular (we *still* don't need to worry about about "safe" operations.
unrelated safety issues like uninitialized memory), it quickly contaminates the
surrounding code. Safety is modular in the sense that opting into unsafety doesn't require you
to consider arbitrary other kinds of badness. For instance, doing an unchecked
index into a slice doesn't mean you suddenly need to worry about the slice being
null or containing uninitialized memory. Nothing fundamentally changes. However
safety *isn't* modular in the sense that programs are inherently stateful and
your unsafe operations may depend on arbitrary other state.
Trickier than that is when we get into actual statefulness. Consider a simple Trickier than that is when we get into actual statefulness. Consider a simple
implementation of `Vec`: implementation of `Vec`:
@ -84,10 +89,10 @@ fn make_room(&mut self) {
} }
``` ```
This code is safe, but it is also completely unsound. Changing the capacity This code is 100% Safe Rust but it is also completely unsound. Changing the
violates the invariants of Vec (that `cap` reflects the allocated space in the capacity violates the invariants of Vec (that `cap` reflects the allocated space
Vec). This is not something the rest of Vec can guard against. It *has* to in the Vec). This is not something the rest of Vec can guard against. It *has*
trust the capacity field because there's no way to verify it. to trust the capacity field because there's no way to verify it.
`unsafe` does more than pollute a whole function: it pollutes a whole *module*. `unsafe` does more than pollute a whole function: it pollutes a whole *module*.
Generally, the only bullet-proof way to limit the scope of unsafe code is at the Generally, the only bullet-proof way to limit the scope of unsafe code is at the
@ -102,9 +107,13 @@ as Vec.
It is therefore possible for us to write a completely safe abstraction that It is therefore possible for us to write a completely safe abstraction that
relies on complex invariants. This is *critical* to the relationship between relies on complex invariants. This is *critical* to the relationship between
Safe Rust and Unsafe Rust. We have already seen that Unsafe code must trust Safe Rust and Unsafe Rust. We have already seen that Unsafe code must trust
*some* Safe code, but can't trust *arbitrary* Safe code. However if Unsafe *some* Safe code, but can't trust *generic* Safe code. It can't trust an
couldn't prevent client Safe code from messing with its state in arbitrary ways, arbitrary implementor of a trait or any function that was passed to it to be
safety would be a lost cause. well-behaved in a way that safe code doesn't care about.
However if unsafe code couldn't prevent client safe code from messing with its
state in arbitrary ways, safety would be a lost cause. Thankfully, it *can*
prevent arbitrary code from messing with critical state due to privacy.
Safety lives! Safety lives!
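
As a reviewer's aside on the privacy argument above, here is a minimal sketch of a module boundary protecting an invariant that unsafe code relies on. `ByteStack` is a hypothetical type invented for illustration, not the `Vec` from the chapter; the assumption is that nothing inside the `stack` module breaks `len <= data.len()`.

```rust
mod stack {
    /// Hypothetical byte stack. Invariant: `len <= data.len()`.
    /// Both fields are private, so only code in this module can break it.
    pub struct ByteStack {
        data: Vec<u8>,
        len: usize,
    }

    impl ByteStack {
        /// Takes ownership of `data`; the invariant holds because `len == data.len()`.
        pub fn new(data: Vec<u8>) -> ByteStack {
            let len = data.len();
            ByteStack { data, len }
        }

        /// Removes and returns the top byte, or `None` if the stack is empty.
        pub fn pop(&mut self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            self.len -= 1;
            // SAFETY: `len` was at most `data.len()` and greater than zero,
            // so after the decrement it is a valid index into `data`.
            Some(unsafe { *self.data.get_unchecked(self.len) })
        }

        /// Number of bytes still on the stack.
        pub fn len(&self) -> usize {
            self.len
        }
    }
}
```

Code outside `stack` can call `new`, `pop`, and `len`, but cannot assign to `len` directly, so the unchecked index stays sound no matter what safe client code does.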
