mirror of https://github.com/rust-lang/nomicon
parent
5f3cec4a00
commit
165132a1ad
@ -1,20 +1,39 @@
|
|||||||
% The Unsafe Rust Programming Language
|
% The Advanced Rust Programming Language
|
||||||
|
|
||||||
# NOTE: This is a draft document, and may contain serious errors
|
# NOTE: This is a draft document, and may contain serious errors
|
||||||
|
|
||||||
**This document is about advanced functionality and low-level development practices
|
So you've played around with Rust a bit. You've written a few simple programs and
|
||||||
in the Rust Programming Language. Most of the things discussed won't matter
|
you think you grok the basics. Maybe you've even read through
|
||||||
to the average Rust programmer. However if you wish to correctly write unsafe
|
*[The Rust Programming Language][trpl]*. Now you want to get neck-deep in all the
|
||||||
code in Rust, this text contains invaluable information.**
|
nitty-gritty details of the language. You want to know those weird corner-cases.
|
||||||
|
You want to know what the heck `unsafe` really means, and how to properly use it.
|
||||||
|
This is the book for you.
|
||||||
|
|
||||||
The Unsafe Rust Programming Language (TURPL) seeks to complement
|
To be clear, this book goes into *serious* detail. We're going to dig into
|
||||||
[The Rust Programming Language Book][trpl] (TRPL).
|
exception-safety and pointer aliasing. We're going to talk about memory
|
||||||
Where TRPL introduces the language and teaches the basics, TURPL dives deep into
|
models. We're even going to do some type-theory. This is stuff that you
|
||||||
the specification of the language, and all the nasty bits necessary to write
|
absolutely *don't* need to know to write fast and safe Rust programs.
|
||||||
Unsafe Rust. TURPL does not assume you have read TRPL, but does assume you know
|
You could probably close this book *right now* and still have a productive
|
||||||
the basics of the language and systems programming. We will not explain the
|
and happy career in Rust.
|
||||||
stack or heap. We will not explain the basic syntax.
|
|
||||||
|
|
||||||
|
However, if you intend to write unsafe code -- or just *really* want to dig into
|
||||||
|
the guts of the language -- this book contains *invaluable* information.
|
||||||
|
|
||||||
|
Unlike *The Rust Programming Language*, we *will* be assuming considerable prior
|
||||||
|
knowledge. In particular, you should be comfortable with:
|
||||||
|
|
||||||
|
* Basic Systems Programming:
|
||||||
|
    * Pointers
|
||||||
|
    * [The stack and heap][]
|
||||||
|
    * The memory hierarchy (caches)
|
||||||
|
    * Threads
|
||||||
|
|
||||||
|
* [Basic Rust][]
|
||||||
|
|
||||||
|
Due to the nature of advanced Rust programming, we will be spending a lot of time
|
||||||
|
talking about *safety* and *guarantees*. In particular, a significant portion of
|
||||||
|
the book will be dedicated to correctly writing and understanding Unsafe Rust.
|
||||||
|
|
||||||
[trpl]: https://doc.rust-lang.org/book/
|
[trpl]: https://doc.rust-lang.org/book/
|
||||||
|
[The stack and heap]: https://doc.rust-lang.org/book/the-stack-and-the-heap.html
|
||||||
|
[Basic Rust]: https://doc.rust-lang.org/book/syntax-and-semantics.html
|
||||||
|
@ -1,82 +1,98 @@
|
|||||||
% Meet Safe and Unsafe
|
% Meet Safe and Unsafe
|
||||||
|
|
||||||
Safe and Unsafe are Rust's chief engineers.
|
Programmers in safe "high-level" languages face a fundamental dilemma. On one
|
||||||
|
hand, it would be *really* great to just say what you want and not worry about
|
||||||
TODO: ADORABLE PICTURES OMG
|
how it's done. On the other hand, that can lead to some *really* poor
|
||||||
|
performance. It may be necessary to drop down to less clear or idiomatic
|
||||||
Unsafe handles all the dangerous internal stuff. They build the foundations
|
practices to get the performance characteristics you want. Or maybe you just
|
||||||
and handle all the dangerous materials. By all accounts, Unsafe is really a bit
|
throw up your hands in disgust and decide to shell out to an implementation in
|
||||||
unproductive, because the nature of their work means that they have to spend a
|
a less sugary-wonderful *unsafe* language.
|
||||||
lot of time checking and double-checking everything. What if there's an earthquake
|
|
||||||
on a leap year? Are we ready for that? Unsafe better be, because if they get
|
|
||||||
*anything* wrong, everything will blow up! What Unsafe brings to the table is
|
|
||||||
*quality*, not quantity. Still, nothing would ever get done if everything was
|
|
||||||
built to Unsafe's standards!
|
|
||||||
|
|
||||||
That's where Safe comes in. Safe has to handle *everything else*. Since Safe needs
|
|
||||||
to *get work done*, they've grown to be fairly careless and clumsy! Safe doesn't worry
|
|
||||||
about all the crazy eventualities that Unsafe does, because life is too short to deal
|
|
||||||
with leap-year-earthquakes. Of course, this means there are some jobs that Safe just
|
|
||||||
can't handle. Safe is all about quantity over quality.
|
|
||||||
|
|
||||||
Unsafe loves Safe to bits, but knows that they *can never trust them to do the
|
|
||||||
right thing*. Still, Unsafe acknowledges that not every problem needs quite the
|
|
||||||
attention to detail that they apply. Indeed, Unsafe would *love* if Safe could do
|
|
||||||
*everything* for them. To accomplish this, Unsafe spends most of their time
|
|
||||||
building *safe abstractions*. These abstractions handle all the nitty-gritty
|
|
||||||
details for Safe, and choose good defaults so that the simplest solution (which
|
|
||||||
Safe will inevitably use) is usually the *right* one. Once a safe abstraction is
|
|
||||||
built, Unsafe ideally needs to never work on it again, and Safe can blindly use
|
|
||||||
it in all their work.
|
|
||||||
|
|
||||||
Unsafe's attention to detail means that all the things that they mark as ok for
|
|
||||||
Safe to use can be combined in arbitrarily ridiculous ways, and all the rules
|
|
||||||
that Unsafe is forced to uphold will never be violated. If they *can* be violated
|
|
||||||
by Safe, that means *Unsafe*'s the one in the wrong. Safe can work carelessly,
|
|
||||||
knowing that if anything blows up, it's not *their* fault. Safe can also call in
|
|
||||||
Unsafe at any time if there's a hard problem they can't quite work out, or if they
|
|
||||||
can't meet the client's quality demands. Of course, Unsafe will beg and plead Safe
|
|
||||||
to try their latest safe abstraction first!
|
|
||||||
|
|
||||||
In addition to being adorable, Safe and Unsafe are what makes Rust possible.
|
|
||||||
Rust can be thought of as two different languages: Safe Rust, and Unsafe Rust.
|
|
||||||
Any time someone praises the guarantees of Rust, they are almost surely talking about
|
|
||||||
Safe. However Safe is not sufficient to write every program. For that,
|
|
||||||
we need the Unsafe superset.
|
|
||||||
|
|
||||||
Most fundamentally, writing bindings to other languages
|
|
||||||
(such as the C exposed by your operating system) is never going to be safe. Rust
|
|
||||||
can't control what other languages do to program execution! However Unsafe is
|
|
||||||
also necessary to construct fundamental abstractions where the type system is not
|
|
||||||
sufficient to automatically prove what you're doing is sound.
|
|
||||||
|
|
||||||
Indeed, the Rust standard library is implemented in Rust, and it makes substantial
|
|
||||||
use of Unsafe for implementing IO, memory allocation, collections,
|
|
||||||
synchronization, and other low-level computational primitives.
|
|
||||||
|
|
||||||
Upon hearing this, many wonder why they would not simply use C or C++ in place of
|
|
||||||
Rust (or just use a "real" safe language). If we're going to do unsafe things, why not
|
|
||||||
lean on these much more established languages?
|
|
||||||
|
|
||||||
The most important difference between C++ and Rust is a matter of defaults:
|
|
||||||
Rust is 100% safe by default. Even when you *opt out* of safety in Rust, it is a modular
|
|
||||||
action. In deciding to work with unchecked uninitialized memory, this does not
|
|
||||||
suddenly make dangling or null pointers a problem. When using unchecked indexing on `x`,
|
|
||||||
one does not have to suddenly worry about indexing out of bounds on `y`.
|
|
||||||
C and C++, by contrast, have pervasive unsafety baked into the language. Even the
|
|
||||||
modern best practices like `unique_ptr` have various safety pitfalls.
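
The modularity claim above can be sketched as follows. This is only an illustration under stated assumptions: `sum_first_n` and `checked_lookup` are hypothetical names, not anything from the book. Opting out of bounds checks on `x` inside one `unsafe` block leaves every other access, such as the lookup on `y`, fully checked.

```rust
// Hypothetical helper: sums the first `n` elements of `x` without
// per-element bounds checks.
fn sum_first_n(x: &[u32], n: usize) -> u32 {
    // One up-front check establishes the invariant the unsafe block relies on.
    assert!(n <= x.len());
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: `i < n` and `n <= x.len()`, so `i` is in bounds for `x`.
        total = total.wrapping_add(unsafe { *x.get_unchecked(i) });
    }
    total
}

// Hypothetical helper: an ordinary checked access on a different slice.
// Nothing done to `x` above changes how `y` behaves.
fn checked_lookup(y: &[u32], i: usize) -> Option<u32> {
    y.get(i).copied()
}
```

An out-of-range index passed to `checked_lookup` still yields `None` rather than reading past the end of `y`.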
|
|
||||||
|
|
||||||
It cannot be emphasized enough that Unsafe should be regarded as an exceptional
|
|
||||||
thing, not a normal one. Unsafe is often the domain of *fundamental libraries*: anything that needs
|
|
||||||
to make FFI bindings or define core abstractions. These fundamental libraries then expose
|
|
||||||
a safe interface for intermediate libraries and applications to build upon. And these
|
|
||||||
safe interfaces make an important promise: if your application segfaults, it's not your
|
|
||||||
fault. *They* have a bug.
|
|
||||||
|
|
||||||
And really, how is that different from *any* safe language? Python, Ruby, and Java libraries
|
|
||||||
can internally do all sorts of nasty things. The languages themselves are no
|
|
||||||
different. Safe languages *regularly* have bugs that cause critical vulnerabilities.
|
|
||||||
The fact that Rust is written with a healthy spoonful of Unsafe is no different.
|
|
||||||
However it *does* mean that Rust doesn't need to fall back to the pervasive unsafety of
|
|
||||||
C to do the nasty things that need to get done.
|
|
||||||
|
|
||||||
|
Worse, when you want to talk directly to the operating system, you *have* to
|
||||||
|
talk to an unsafe language: *C*. C is ever-present and unavoidable. It's the
|
||||||
|
lingua franca of the programming world.
|
||||||
|
Even other safe languages generally expose C interfaces for the world at large!
|
||||||
|
Regardless of *why* you're doing it, as soon as your program starts talking to
|
||||||
|
C it stops being safe.
|
||||||
|
|
||||||
|
With that said, Rust is *totally* a safe programming language.
|
||||||
|
|
||||||
|
Well, Rust *has* a safe programming language. Let's step back a bit.
|
||||||
|
|
||||||
|
Rust can be thought of as being composed of two
|
||||||
|
programming languages: *Safe* and *Unsafe*. Safe is For Reals Totally Safe.
|
||||||
|
Unsafe, unsurprisingly, is *not* For Reals Totally Safe. In fact, Unsafe lets
|
||||||
|
you do some really crazy unsafe things.
|
||||||
|
|
||||||
|
Safe is *the* Rust programming language. If all you do is write Safe Rust,
|
||||||
|
you will never have to worry about type-safety or memory-safety. You will never
|
||||||
|
endure a null or dangling pointer, or any of that Undefined Behaviour nonsense.
|
||||||
|
|
||||||
|
*That's totally awesome*.
|
||||||
|
|
||||||
|
The standard library also gives you enough utilities out-of-the-box that you'll
|
||||||
|
be able to write awesome high-performance applications and libraries in pure
|
||||||
|
idiomatic Safe Rust.
|
||||||
|
|
||||||
|
But maybe you want to talk to another language. Maybe you're writing a
|
||||||
|
low-level abstraction not exposed by the standard library. Maybe you're
|
||||||
|
*writing* the standard library (which is written entirely in Rust). Maybe you
|
||||||
|
need to do something the type-system doesn't understand and just *frob some dang
|
||||||
|
bits*. Maybe you need Unsafe Rust.
|
||||||
|
|
||||||
|
Unsafe Rust is exactly like Safe Rust with *all* the same rules and semantics.
|
||||||
|
However, Unsafe Rust lets you do some *extra* things that are Definitely Not Safe.
|
||||||
|
|
||||||
|
The only things that are different in Unsafe Rust are that you can:
|
||||||
|
|
||||||
|
* Dereference raw pointers
|
||||||
|
* Call `unsafe` functions (including C functions, intrinsics, and the raw allocator)
|
||||||
|
* Implement `unsafe` traits
|
||||||
|
* Mutate statics
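
As a rough illustration of the four items listed above, here is a minimal sketch. All names (`COUNTER`, `read_raw`, `ZeroSafe`, `demo`) are hypothetical and chosen only for this example; each use of an unsafe capability is marked with the `unsafe` keyword.

```rust
// Hypothetical names throughout; none of these come from the book.

// A mutable static: reading or writing it requires `unsafe`.
static mut COUNTER: u32 = 0;

// An `unsafe` function: the caller must guarantee that `ptr` is non-null,
// aligned, and points to a live `i32`.
unsafe fn read_raw(ptr: *const i32) -> i32 {
    unsafe { *ptr }
}

// An `unsafe` trait: implementors promise a property the compiler cannot check.
unsafe trait ZeroSafe {}

// Implementing an `unsafe` trait requires `unsafe impl`.
unsafe impl ZeroSafe for u8 {}

fn demo() -> (i32, i32, u32) {
    let x = 7;
    // Creating a raw pointer is safe; only dereferencing it is not.
    let p = &x as *const i32;

    // 1. Dereference a raw pointer.
    let direct = unsafe { *p };

    // 2. Call an `unsafe` function.
    let via_fn = unsafe { read_raw(p) };

    // 3. Mutate a static (sound here only because nothing else touches it).
    let count = unsafe {
        COUNTER = COUNTER + 1;
        COUNTER
    };

    (direct, via_fn, count)
}
```

Note that creating the raw pointer `p` needs no `unsafe` at all; the keyword is required only at the point where the compiler can no longer verify the operation.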
|
||||||
|
|
||||||
|
That's it. The reason these operations are relegated to Unsafe is that misusing
|
||||||
|
any of these things will cause the ever-dreaded Undefined Behaviour. Invoking
|
||||||
|
Undefined Behaviour gives the compiler full rights to do arbitrarily bad things
|
||||||
|
to your program. You definitely *should not* invoke Undefined Behaviour.
|
||||||
|
|
||||||
|
Unlike in C, Undefined Behaviour is pretty limited in scope in Rust. All the core
|
||||||
|
language cares about is preventing the following things:
|
||||||
|
|
||||||
|
* Dereferencing null or dangling pointers
|
||||||
|
* Reading [uninitialized memory][]
|
||||||
|
* Breaking the [pointer aliasing rules][]
|
||||||
|
* Producing invalid primitive values:
|
||||||
|
    * dangling/null references
|
||||||
|
    * a `bool` that isn't 0 or 1
|
||||||
|
    * an undefined `enum` discriminant
|
||||||
|
    * a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
|
||||||
|
    * a non-UTF-8 `str`
|
||||||
|
* Unwinding into another language
|
||||||
|
* Causing a [data race][race]
|
||||||
|
* Double-dropping a value
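
To make the "invalid primitive values" entries above concrete, here is a small sketch using the checked constructors in the standard library. The wrapper names `checked_char` and `checked_str` are hypothetical; `char::from_u32` and `std::str::from_utf8` are real standard library functions that refuse to produce an out-of-range `char` or a non-UTF-8 `str`. Their unchecked `unsafe` counterparts skip exactly these checks.

```rust
// Hypothetical wrapper: returns `None` for code points outside
// [0x0, 0xD7FF] and [0xE000, 0x10FFFF].
fn checked_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

// Hypothetical wrapper: returns `None` if `bytes` is not valid UTF-8.
fn checked_str(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

For example, `checked_char(0xD800)` is `None` because 0xD800 is a surrogate, and `checked_str(&[0xFF])` is `None` because 0xFF can never appear in UTF-8.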
|
||||||
|
|
||||||
|
That's it. That's all the Undefined Behaviour baked into Rust. Of course, unsafe
|
||||||
|
functions and traits are free to declare arbitrary other constraints that a
|
||||||
|
program must maintain to avoid Undefined Behaviour. However, these are generally
|
||||||
|
just things that will transitively lead to one of the above problems. Some
|
||||||
|
additional constraints may also derive from compiler intrinsics that make special
|
||||||
|
assumptions about how code can be optimized.
|
||||||
|
|
||||||
|
Rust is otherwise quite permissive with respect to other dubious operations. Rust
|
||||||
|
considers it "safe" to:
|
||||||
|
|
||||||
|
* Deadlock
|
||||||
|
* Have a [race condition][race]
|
||||||
|
* Leak memory
|
||||||
|
* Fail to call destructors
|
||||||
|
* Overflow integers
|
||||||
|
* Abort the program
|
||||||
|
* Delete the production database
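
A few of the items above can be demonstrated without writing `unsafe` at all. The sketch below is illustrative only, with hypothetical function names: it leaks memory, skips a destructor, and overflows an integer using nothing but safe standard library calls.

```rust
use std::mem;

// Hypothetical helper: leaks a heap allocation and skips a destructor,
// all in safe code.
fn leak_and_forget() -> &'static str {
    // `Box::leak` gives up ownership; the allocation is never freed.
    let leaked: &'static str = Box::leak(String::from("never freed").into_boxed_str());
    // `mem::forget` takes ownership without running `Vec`'s destructor.
    mem::forget(vec![1, 2, 3]);
    leaked
}

// Hypothetical helper: integer overflow is well-defined, not Undefined
// Behaviour. Wrapping and checked arithmetic make the intent explicit.
fn overflow_examples() -> (u8, Option<u8>) {
    (255u8.wrapping_add(1), 255u8.checked_add(1))
}
```

Neither function can cause memory unsafety, which is why Rust classifies these operations as safe even though they are usually a sign of a bug.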
|
||||||
|
|
||||||
|
However, any program that actually manages to do such a thing is *probably*
|
||||||
|
incorrect. Rust provides lots of tools to make these things rare, but
|
||||||
|
these problems are considered impractical to categorically prevent.
|
||||||
|
|
||||||
|
[pointer aliasing rules]: references.html
|
||||||
|
[uninitialized memory]: uninitialized.html
|
||||||
|
[race]: races.html
|
||||||
|