poke at data and conversions more

pull/10/head
Alexis Beingessner 10 years ago committed by Manish Goregaokar
parent 931728a825
commit c403766c45

@ -32,21 +32,6 @@ more ergonomic alternatives.
# Auto-Deref
(Maybe nix this in favour of receiver coercions)
Deref is a trait that allows you to overload the unary `*` to specify a type
you dereference to. This is largely only intended to be implemented by pointer
types like `&`, `Box`, and `Rc`. The dot operator will automatically perform
dereferencing, so that `foo.bar()` will work uniformly on `Foo`, `&Foo`,
`&&Foo`, `&Rc<Box<&mut&Box<Foo>>>` and so on. Search bottoms out on the *first* match,
so implementing methods on pointers is generally to be avoided, as it will shadow
"actual" methods.
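A minimal sketch of the auto-deref behaviour described in this removed section. The type `Foo`, its method `bar`, and the helper `call_through_pointers` are hypothetical names used only for illustration; the point is that a single method defined on `Foo` can be called through several layers of pointers.

```rust
use std::rc::Rc;

struct Foo;

impl Foo {
    // The only `bar` in this sketch, defined on `Foo` itself.
    fn bar(&self) -> &'static str {
        "Foo::bar"
    }
}

// Calls `bar` through several layers of pointers and collects the results.
fn call_through_pointers() -> Vec<&'static str> {
    let foo = Foo;
    let by_ref: &Foo = &foo;
    let by_ref_ref: &&Foo = &by_ref;
    let nested: Rc<Box<Foo>> = Rc::new(Box::new(Foo));
    // Each call site uses the same `.bar()` syntax; the dot operator
    // dereferences as many times as needed to find the method.
    vec![foo.bar(), by_ref.bar(), by_ref_ref.bar(), nested.bar()]
}

fn main() {
    for name in call_through_pointers() {
        assert_eq!(name, "Foo::bar");
    }
}
```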
# Coercions # Coercions
Types can implicitly be coerced to change in certain contexts. These changes are Types can implicitly be coerced to change in certain contexts. These changes are
@ -58,88 +43,42 @@ Here's all the kinds of coercion:
Coercion is allowed between the following types: Coercion is allowed between the following types:
* `T` to `U` if `T` is a [subtype](lifetimes.html#subtyping-and-variance) * Subtyping: `T` to `U` if `T` is a [subtype](lifetimes.html#subtyping-and-variance)
of `U` (the 'identity' case); of `U`
* Transitivity: `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3`
* `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3` * Pointer Weakening:
(transitivity case); * `&mut T` to `&T`
* `*mut T` to `*const T`
* `&mut T` to `&T`; * `&T` to `*const T`
* `&mut T` to `*mut T`
* `*mut T` to `*const T`; * Unsizing: `T` to `U` if `T` implements `CoerceUnsized<U>`
* `&T` to `*const T`; `CoerceUnsized<Pointer<U>> for Pointer<T>` where T: Unsize<U> is implemented
for all pointer types (including smart pointers like Box and Rc). Unsize is
* `&mut T` to `*mut T`; only implemented automatically, and enables the following transformations:
* `T` to `U` if `T` implements `CoerceUnsized<U>` (see below) and `T = Foo<...>` * `[T, ..n]` => `[T]`
and `U = Foo<...>`; * `T` => `Trait` where `T: Trait`
* `SubTrait` => `Trait` where `SubTrait: Trait` (TODO: is this now implied by the previous?)
* From TyCtor(`T`) to TyCtor(coerce_inner(`T`)); * `Foo<..., T, ...>` => `Foo<..., U, ...>` where:
* T: Unsize<U>
where TyCtor(`T`) is one of `&T`, `&mut T`, `*const T`, `*mut T`, or `Box<T>`. * `Foo` is a struct
And where coerce_inner is defined as * Only the last field has type `T`
* `T` is not part of the type of any other fields
* coerce_inner(`[T, ..n]`) = `[T]`; (note that this also applies to tuples as an anonymous struct `Tuple3<T, U, V>`)
* coerce_inner(`T`) = `U` where `T` is a concrete type which implements the Coercions occur at a *coercion site*. Any location that is explicitly typed
trait `U`; will cause a coercion to its type. If inference is necessary, the coercion will
not be performed. Exhaustively, the coercion sites for an expression `e` to
* coerce_inner(`T`) = `U` where `T` is a sub-trait of `U`; type `U` are:
* coerce_inner(`Foo<..., T, ...>`) = `Foo<..., coerce_inner(T), ...>` where * let statements, statics, and consts: `let x: U = e`
`Foo` is a struct and only the last field has type `T` and `T` is not part of * Arguments to functions: `takes_a_U(e)`
the type of any other fields; * Any expression that will be returned: `fn foo() -> U { e }`
* Struct literals: `Foo { some_u: e }`
* coerce_inner(`(..., T)`) = `(..., coerce_inner(T))`. * Array literals: `let x: [U; 10] = [e, ..]`
* Tuple literals: `let x: (U, ..) = (e, ..)`
Coercions only occur at a *coercion site*. Exhaustively, the coercion sites * The last expression in a block: `let x: U = { ..; e }`
are:
* In `let` statements where an explicit type is given: in `let _: U = e;`, `e`
is coerced to have type `U`;
* In statics and consts, similarly to `let` statements;
* In argument position for function calls. The value being coerced is the actual
parameter and it is coerced to the type of the formal parameter. For example,
where `foo` is defined as `fn foo(x: U) { ... }` and is called with `foo(e);`,
`e` is coerced to have type `U`;
* Where a field of a struct or variant is instantiated. E.g., where `struct Foo
{ x: U }` and the instantiation is `Foo { x: e }`, `e` is coerced to have
type `U`;
* The result of a function, either the final line of a block if it is not semi-
colon terminated or any expression in a `return` statement. For example, for
`fn foo() -> U { e }`, `e` is coerced to have type `U`;
If the expression in one of these coercion sites is a coercion-propagating
expression, then the relevant sub-expressions in that expression are also
coercion sites. Propagation recurses from these new coercion sites. Propagating
expressions and their relevant sub-expressions are:
* array literals, where the array has type `[U, ..n]`, each sub-expression in
the array literal is a coercion site for coercion to type `U`;
* array literals with repeating syntax, where the array has type `[U, ..n]`, the
repeated sub-expression is a coercion site for coercion to type `U`;
* tuples, where a tuple is a coercion site to type `(U_0, U_1, ..., U_n)`, each
sub-expression is a coercion site for the respective type, e.g., the zero-th
sub-expression is a coercion site to `U_0`;
* the box expression, if the expression has type `Box<U>`, the sub-expression is
a coercion site to `U`;
* parenthesised sub-expressions (`(e)`), if the expression has type `U`, then
the sub-expression is a coercion site to `U`;
* blocks, if a block has type `U`, then the last expression in the block (if it
is not semicolon-terminated) is a coercion site to `U`. This includes blocks
which are part of control flow statements, such as `if`/`else`, if the block
has a known type.
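The coercion kinds and coercion sites listed above can be sketched as follows. This is an illustration, not part of the commit: the helpers `total` and `freeze` are hypothetical, and the trait object is written with the `dyn` keyword used by current Rust, whereas the text above predates it and writes a bare `Trait`.

```rust
use std::fmt::Debug;

// Hypothetical helper: the parameter type is explicit, so each call
// is a coercion site for its argument.
fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Hypothetical helper: the returned expression is a coercion site,
// so `&mut i32` is weakened to `&i32`.
fn freeze(x: &mut i32) -> &i32 {
    x
}

fn main() {
    let mut n = 5;

    // Pointer weakening at a `let` with an explicit type: `&mut i32` to `&i32`.
    let r: &i32 = &mut n;
    assert_eq!(*r, 5);

    // Unsizing at a function argument: `&[i32; 3]` to `&[i32]`.
    let arr = [1, 2, 3];
    assert_eq!(total(&arr), 6);

    // Unsizing to a trait object: `Box<i32>` to `Box<dyn Debug>`.
    let b: Box<dyn Debug> = Box::new(7);
    assert_eq!(format!("{:?}", b), "7");

    // Pointer weakening from a reference to a raw pointer: `&i32` to `*const i32`.
    let p: *const i32 = &n;
    assert_eq!(unsafe { *p }, 5);

    // Coercion in return position.
    assert_eq!(*freeze(&mut n), 5);
}
```

Each coercion here happens only because the target type is written out at the site; replacing `let r: &i32` with `let r` would leave `r` as a `&mut i32`.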
Note that we do not perform coercions when matching traits (except for Note that we do not perform coercions when matching traits (except for
receivers, see below). If there is an impl for some type `U` and `T` coerces to receivers, see below). If there is an impl for some type `U` and `T` coerces to
@ -147,29 +86,32 @@ receivers, see below). If there is an impl for some type `U` and `T` coerces to
following will not type check, even though it is OK to coerce `t` to `&T` and following will not type check, even though it is OK to coerce `t` to `&T` and
there is an impl for `&T`: there is an impl for `&T`:
``` ```rust
struct T;
trait Trait {} trait Trait {}
fn foo<X: Trait>(t: X) {} fn foo<X: Trait>(t: X) {}
impl<'a> Trait for &'a T {} impl<'a> Trait for &'a i32 {}
fn main() { fn main() {
let t: &mut T = &mut T; let t: &mut i32 = &mut 0;
foo(t); //~ ERROR failed to find an implementation of trait Trait for &mut T foo(t);
} }
``` ```
In a cast expression, `e as U`, the compiler will first attempt to coerce `e` to ```text
`U`, only if that fails will the conversion rules for casts (see below) be <anon>:10:5: 10:8 error: the trait `Trait` is not implemented for the type `&mut i32` [E0277]
applied. <anon>:10 foo(t);
^~~
```
# The Dot Operator
TODO: receiver coercions? The dot operator will perform a lot of magic to convert types. It will perform
auto-referencing, auto-dereferencing, and coercion until types match.
TODO: steal information from http://stackoverflow.com/questions/28519997/what-are-rusts-exact-auto-dereferencing-rules/28552082#28552082
# Casts # Casts
@ -178,21 +120,21 @@ cast, but some conversions *require* a cast. These "true casts" are generally re
as dangerous or problematic actions. True casts revolve around raw pointers and as dangerous or problematic actions. True casts revolve around raw pointers and
the primitive numeric types. True casts aren't checked. the primitive numeric types. True casts aren't checked.
Here's an exhaustive list of all the true casts: Here's an exhaustive list of all the true casts. For brevity, we will use `*`
to denote either a `*const` or `*mut`, and `integer` to denote any integral primitive:
* `e` has type `T` and `T` coerces to `U`; *coercion-cast*
* `e` has type `*T`, `U` is `*U_0`, and either `U_0: Sized` or * `*T as *U` where `T, U: Sized`
unsize_kind(`T`) = unsize_kind(`U_0`); *ptr-ptr-cast* * `*T as *U` TODO: explain unsized situation
* `e` has type `*T` and `U` is a numeric type, while `T: Sized`; *ptr-addr-cast* * `*T as integer`
* `e` is an integer and `U` is `*U_0`, while `U_0: Sized`; *addr-ptr-cast* * `integer as *T`
* `e` has type `T` and `T` and `U` are any numeric types; *numeric-cast* * `number as number`
* `e` is a C-like enum and `U` is an integer type; *enum-cast* * `C-like-enum as integer`
* `e` has type `bool` or `char` and `U` is an integer; *prim-int-cast* * `bool as integer`
* `e` has type `u8` and `U` is `char`; *u8-char-cast* * `char as integer`
* `e` has type `&[T; n]` and `U` is `*const T`; *array-ptr-cast* * `u8 as char`
* `e` is a function pointer type and `U` has type `*T`, * `&[T; n] as *const T`
while `T: Sized`; *fptr-ptr-cast* * `fn as *T` where `T: Sized`
* `e` is a function pointer type and `U` is an integer; *fptr-addr-cast* * `fn as integer`
where `&.T` and `*T` are references of either mutability, where `&.T` and `*T` are references of either mutability,
and where unsize_kind(`T`) is the kind of the unsize info and where unsize_kind(`T`) is the kind of the unsize info

@ -7,7 +7,7 @@ represented in Rust.
# The rust repr # The Rust repr
Rust gives you the following ways to lay out composite data: Rust gives you the following ways to lay out composite data:
@ -16,12 +16,14 @@ Rust gives you the following ways to lay out composite data:
* arrays (homogeneous product types) * arrays (homogeneous product types)
* enums (named sum types -- tagged unions) * enums (named sum types -- tagged unions)
For all these, individual fields are aligned to their preferred alignment. An enum is said to be *C-like* if none of its variants have associated data.
For primitives this is equal to
their size. For instance, a u32 will be aligned to a multiple of 32 bits, and a u16 will For all these, individual fields are aligned to their preferred alignment. For
be aligned to a multiple of 16 bits. Composite structures will have their size rounded primitives this is usually equal to their size. For instance, a u32 will be
up to be a multiple of the highest alignment required by their fields, and an alignment aligned to a multiple of 32 bits, and a u16 will be aligned to a multiple of 16
requirement equal to the highest alignment required by their fields. So for instance, bits. Composite structures will have their size rounded up to be a multiple of
the highest alignment required by their fields, and an alignment requirement
equal to the highest alignment required by their fields. So for instance,
```rust ```rust
struct A { struct A {
@ -127,6 +129,9 @@ In principle enums can use fairly elaborate algorithms to cache bits throughout
with special constrained representations. As such it is *especially* desirable that we leave with special constrained representations. As such it is *especially* desirable that we leave
enum layout unspecified today. enum layout unspecified today.
# Dynamically Sized Types (DSTs) # Dynamically Sized Types (DSTs)
Rust also supports types without a statically known size. On the surface, Rust also supports types without a statically known size. On the surface,
@ -219,15 +224,14 @@ struct Foo {
``` ```
For details as to *why* this is done, and how to make it not happen, check out For details as to *why* this is done, and how to make it not happen, check out
[SOME OTHER SECTION]. [TODO: SOME OTHER SECTION].
# Alternative representations # Alternative representations
Rust allows you to specify alternative data layout strategies from the default Rust Rust allows you to specify alternative data layout strategies from the default.
one.
@ -241,32 +245,44 @@ to soundly do more elaborate tricks with data layout such as reintepretting valu
as a different type. as a different type.
However, the interaction with Rust's more exotic data layout features must be kept However, the interaction with Rust's more exotic data layout features must be kept
in mind. Due to its dual purpose as a "for FFI" and "for layout control", repr(C) in mind. Due to its dual purpose as "for FFI" and "for layout control", `repr(C)`
can be applied to types that will be nonsensical or problematic if passed through can be applied to types that will be nonsensical or problematic if passed through
the FFI boundary. the FFI boundary.
* ZSTs are still zero-sized, even though this is not a standard behaviour * ZSTs are still zero-sized, even though this is not a standard behaviour
in C, and is explicitly contrary to the behaviour of an empty type in C++, which in C, and is explicitly contrary to the behaviour of an empty type in C++, which
still consumes a byte of space. still consumes a byte of space.
* DSTs are not a concept in C * DSTs, tuples, and tagged unions are not a concept in C and as such are never
FFI safe.
* **The drop flag will still be added** * **The drop flag will still be added**
* This is equivalent to repr(u32) for enums (see below) * This is equivalent to `repr(u32)` for enums (see below)
## repr(packed) ## repr(packed)
`repr(packed)` forces rust to strip any padding it would normally apply. `repr(packed)` forces rust to strip any padding, and only align the type to a
This may improve the memory footprint of a type, but will have negative byte. This may improve the memory footprint, but will likely have other
side-effects from "field access is heavily penalized" to "completely breaks negative side-effects.
everything" based on target platform.
In particular, most architectures *strongly* prefer values to be aligned. This
may mean that unaligned loads are penalized (x86), or even fault (ARM). The
compiler may also have trouble with references to unaligned fields.
`repr(packed)` is not to be used lightly. Unless you have extreme requirements,
this should not be used.
This repr is a modifier on `repr(C)` and `repr(rust)`.
## repr(u8), repr(u16), repr(u32), repr(u64) ## repr(u8), repr(u16), repr(u32), repr(u64)
These specify the size to make a c-like enum (one which has no values in its variants). These specify the size to make a C-like enum. If the discriminant overflows the
integer it has to fit in, it will be an error. You can manually ask Rust to
allow this by setting the overflowing element to explicitly be 0. However Rust
will not allow you to create an enum where two variants have the same discriminant.
These reprs have no effect on a struct or a non-C-like enum.
