poke at data and conversions more

Alexis Beingessner 10 years ago committed by Manish Goregaokar
parent 931728a825
commit c403766c45

@ -32,21 +32,6 @@ more ergonomic alternatives.
# Auto-Deref
(Maybe nix this in favour of receiver coercions)
Deref is a trait that allows you to overload the unary `*` to specify a type
you dereference to. This is largely only intended to be implemented by pointer
types like `&`, `Box`, and `Rc`. The dot operator will automatically perform
dereferencing, so that `foo.bar()` will work uniformly on `Foo`, `&Foo`,
`&&Foo`, `&Rc<Box<&mut&Box<Foo>>>` and so on. Search bottoms out on the *first* match,
so implementing methods on pointers is generally to be avoided, as it will shadow
"actual" methods.
# Coercions
Types can implicitly be coerced to change in certain contexts. These changes are
@ -58,88 +43,42 @@ Here's all the kinds of coercion:
Coercion is allowed between the following types:
* `T` to `U` if `T` is a [subtype](lifetimes.html#subtyping-and-variance)
of `U` (the 'identity' case);
* `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3`
(transitivity case);
* `&mut T` to `&T`;
* `*mut T` to `*const T`;
* `&T` to `*const T`;
* `&mut T` to `*mut T`;
* `T` to `U` if `T` implements `CoerceUnsized<U>` (see below) and `T = Foo<...>`
and `U = Foo<...>`;
* From TyCtor(`T`) to TyCtor(coerce_inner(`T`));
where TyCtor(`T`) is one of `&T`, `&mut T`, `*const T`, `*mut T`, or `Box<T>`.
And where coerce_inner is defined as
* coerce_inner(`[T, ..n]`) = `[T]`;
* coerce_inner(`T`) = `U` where `T` is a concrete type which implements the
trait `U`;
* coerce_inner(`T`) = `U` where `T` is a sub-trait of `U`;
* coerce_inner(`Foo<..., T, ...>`) = `Foo<..., coerce_inner(T), ...>` where
`Foo` is a struct and only the last field has type `T` and `T` is not part of
the type of any other fields;
* coerce_inner(`(..., T)`) = `(..., coerce_inner(T))`.
Coercions only occur at a *coercion site*. Exhaustively, the coercion sites
are:
* In `let` statements where an explicit type is given: in `let _: U = e;`, `e`
is coerced to have type `U`;
* In statics and consts, similarly to `let` statements;
* In argument position for function calls. The value being coerced is the actual
parameter and it is coerced to the type of the formal parameter. For example,
where `foo` is defined as `fn foo(x: U) { ... }` and is called with `foo(e);`,
`e` is coerced to have type `U`;
* Where a field of a struct or variant is instantiated. E.g., where `struct Foo
{ x: U }` and the instantiation is `Foo { x: e }`, `e` is coerced to have
type `U`;
* The result of a function, either the final line of a block if it is not
semicolon-terminated or any expression in a `return` statement. For example, for
`fn foo() -> U { e }`, `e` is coerced to have type `U`;
If the expression in one of these coercion sites is a coercion-propagating
expression, then the relevant sub-expressions in that expression are also
coercion sites. Propagation recurses from these new coercion sites. Propagating
expressions and their relevant sub-expressions are:
* array literals, where the array has type `[U, ..n]`, each sub-expression in
the array literal is a coercion site for coercion to type `U`;
* array literals with repeating syntax, where the array has type `[U, ..n]`, the
repeated sub-expression is a coercion site for coercion to type `U`;
* tuples, where a tuple is a coercion site to type `(U_0, U_1, ..., U_n)`, each
sub-expression is a coercion site for the respective type, e.g., the zero-th
sub-expression is a coercion site to `U_0`;
* the box expression, if the expression has type `Box<U>`, the sub-expression is
a coercion site to `U`;
* parenthesised sub-expressions (`(e)`), if the expression has type `U`, then
the sub-expression is a coercion site to `U`;
* blocks, if a block has type `U`, then the last expression in the block (if it
is not semicolon-terminated) is a coercion site to `U`. This includes blocks
which are part of control flow statements, such as `if`/`else`, if the block
has a known type.
* Subtyping: `T` to `U` if `T` is a [subtype](lifetimes.html#subtyping-and-variance)
of `U`
* Transitivity: `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3`
* Pointer Weakening:
* `&mut T` to `&T`
* `*mut T` to `*const T`
* `&T` to `*const T`
* `&mut T` to `*mut T`
* Unsizing: `T` to `U` if `T` implements `CoerceUnsized<U>`
`CoerceUnsized<Pointer<U>> for Pointer<T> where T: Unsize<U>` is implemented
for all pointer types (including smart pointers like `Box` and `Rc`). `Unsize` is
only implemented automatically, and enables the following transformations:
* `[T, ..n]` => `[T]`
* `T` => `Trait` where `T: Trait`
* `SubTrait` => `Trait` where `SubTrait: Trait` (TODO: is this now implied by the previous?)
* `Foo<..., T, ...>` => `Foo<..., U, ...>` where:
* `T: Unsize<U>`
* `Foo` is a struct
* Only the last field has type `T`
* `T` is not part of the type of any other fields
(note that this also applies to tuples, treated as an anonymous struct `Tuple3<T, U, V>`)
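The unsizing transformations listed above can be sketched as follows. This is a minimal illustration, not taken from the original text: the function names are hypothetical, and it uses current Rust syntax (`[T; n]` and `dyn Trait`) rather than the older `[T, ..n]` and bare-trait notation used in the list.

```rust
use std::fmt::Debug;
use std::rc::Rc;

// Array to slice: a `&[i32; 3]` argument unsizes to `&[i32]`.
fn slice_len(xs: &[i32]) -> usize {
    xs.len()
}

// Concrete type to trait object: `Box<i32>` unsizes to `Box<dyn Debug>`
// at the return site.
fn boxed_debug() -> Box<dyn Debug> {
    Box::new(7i32)
}

// Smart pointers participate too: `Rc<[u8; 4]>` unsizes to `Rc<[u8]>`.
fn rc_slice() -> Rc<[u8]> {
    Rc::new([1u8, 2, 3, 4])
}

fn main() {
    let array = [1, 2, 3];
    println!("{}", slice_len(&array));
    println!("{:?}", boxed_debug());
    println!("{}", rc_slice().len());
}
```

In each case the pointer type stays the same and only the pointee loses its statically known size.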
Coercions occur at a *coercion site*. Any location that is explicitly typed
will cause a coercion to its type. If inference is necessary, the coercion will
not be performed. Exhaustively, the coercion sites for an expression `e` to
type `U` are:
* let statements, statics, and consts: `let x: U = e`
* Arguments to functions: `takes_a_U(e)`
* Any expression that will be returned: `fn foo() -> U { e }`
* Struct literals: `Foo { some_u: e }`
* Array literals: `let x: [U; 10] = [e, ..]`
* Tuple literals: `let x: (U, ..) = (e, ..)`
* The last expression in a block: `let x: U = { ..; e }`
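As a rough sketch of the coercion sites listed above, the following applies the same array-to-slice coercion (`&[i32; 3]` to `&[i32]`) at each site. The names `Holder`, `takes_a_u`, `returns_a_u`, and `demo` are made up for illustration.

```rust
// Hypothetical struct whose field has the target type `U = &[i32]`.
struct Holder<'a> {
    some_u: &'a [i32],
}

// Function argument site.
fn takes_a_u(u: &[i32]) -> usize {
    u.len()
}

// Returned expression site: `e` is a `&[i32; 3]` coerced to `&[i32]`.
fn returns_a_u(e: &[i32; 3]) -> &[i32] {
    e
}

fn demo() -> usize {
    let e = [1, 2, 3];

    let x: &[i32] = &e;                  // explicitly typed `let`
    let n = takes_a_u(&e);               // function argument
    let h = Holder { some_u: &e };       // struct literal field
    let arr: [&[i32]; 2] = [&e, &e];     // array literal elements
    let tup: (&[i32], u8) = (&e, 0);     // tuple literal elements
    let blk: &[i32] = { &e };            // last expression in a block

    // Without an annotation nothing is coerced: `y` stays a `&[i32; 3]`.
    let y = &e;

    x.len() + n + h.some_u.len() + arr[0].len() + tup.0.len() + blk.len()
        + returns_a_u(y).len()
}

fn main() {
    println!("{}", demo());
}
```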
Note that we do not perform coercions when matching traits (except for
receivers, see below). If there is an impl for some type `U` and `T` coerces to
@ -147,29 +86,32 @@ receivers, see below). If there is an impl for some type `U` and `T` coerces to
following will not type check, even though it is OK to coerce `t` to `&T` and
there is an impl for `&T`:
```
struct T;
```rust
trait Trait {}
fn foo<X: Trait>(t: X) {}
impl<'a> Trait for &'a T {}
impl<'a> Trait for &'a i32 {}
fn main() {
let t: &mut T = &mut T;
foo(t); //~ ERROR failed to find an implementation of trait Trait for &mut T
let t: &mut i32 = &mut 0;
foo(t);
}
```
In a cast expression, `e as U`, the compiler will first attempt to coerce `e` to
`U`; only if that fails will the conversion rules for casts (see below) be
applied.
```text
<anon>:10:5: 10:8 error: the trait `Trait` is not implemented for the type `&mut i32` [E0277]
<anon>:10 foo(t);
^~~
```
# The Dot Operator
TODO: receiver coercions?
The dot operator will perform a lot of magic to convert types. It will perform
auto-referencing, auto-dereferencing, and coercion until types match.
TODO: steal information from http://stackoverflow.com/questions/28519997/what-are-rusts-exact-auto-dereferencing-rules/28552082#28552082
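Pending the fuller write-up, here is a small sketch of what the dot operator does in practice. The type `Foo` and its method `bar` are placeholders; the point is only that one method taking `&self` can be called through several layers of pointers without writing any explicit `*` or `&`.

```rust
use std::rc::Rc;

struct Foo {
    value: u32,
}

impl Foo {
    // Takes `&self`, so the dot operator must auto-reference a plain `Foo`
    // and auto-dereference anything that wraps one.
    fn bar(&self) -> u32 {
        self.value
    }
}

fn call_through_layers() -> (u32, u32, u32, u32) {
    let plain = Foo { value: 1 };
    let by_ref = &Foo { value: 2 };
    let by_ref_ref = &&Foo { value: 3 };
    let nested = Rc::new(Box::new(Foo { value: 4 }));

    (
        plain.bar(),      // auto-ref: `Foo` to `&Foo`
        by_ref.bar(),     // already `&Foo`
        by_ref_ref.bar(), // auto-deref: `&&Foo` to `&Foo`
        nested.bar(),     // auto-deref through `Rc` and `Box`
    )
}

fn main() {
    println!("{:?}", call_through_layers());
}
```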
# Casts
@ -178,21 +120,21 @@ cast, but some conversions *require* a cast. These "true casts" are generally re
as dangerous or problematic actions. True casts revolve around raw pointers and
the primitive numeric types. True casts aren't checked.
Here's an exhaustive list of all the true casts:
* `e` has type `T` and `T` coerces to `U`; *coercion-cast*
* `e` has type `*T`, `U` is `*U_0`, and either `U_0: Sized` or
unsize_kind(`T`) = unsize_kind(`U_0`); *ptr-ptr-cast*
* `e` has type `*T` and `U` is a numeric type, while `T: Sized`; *ptr-addr-cast*
* `e` is an integer and `U` is `*U_0`, while `U_0: Sized`; *addr-ptr-cast*
* `e` has type `T` and `T` and `U` are any numeric types; *numeric-cast*
* `e` is a C-like enum and `U` is an integer type; *enum-cast*
* `e` has type `bool` or `char` and `U` is an integer; *prim-int-cast*
* `e` has type `u8` and `U` is `char`; *u8-char-cast*
* `e` has type `&[T; n]` and `U` is `*const T`; *array-ptr-cast*
* `e` is a function pointer type and `U` has type `*T`,
while `T: Sized`; *fptr-ptr-cast*
* `e` is a function pointer type and `U` is an integer; *fptr-addr-cast*
Here's an exhaustive list of all the true casts. For brevity, we will use `*`
to denote either a `*const` or `*mut`, and `integer` to denote any integral primitive:
* `*T as *U` where `T, U: Sized`
* `*T as *U` TODO: explain unsized situation
* `*T as integer`
* `integer as *T`
* `number as number`
* `C-like-enum as integer`
* `bool as integer`
* `char as integer`
* `u8 as char`
* `&[T; n] as *const T`
* `fn as *T` where `T: Sized`
* `fn as integer`
where `&.T` and `*T` are references of either mutability,
and where unsize_kind(`T`) is the kind of the unsize info
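A few of the casts in the list can be sketched like this. The enum `Level` and the function names are hypothetical; the numeric results follow from the fact that these casts are unchecked, so an out-of-range integer is silently truncated.

```rust
#[derive(Clone, Copy)]
enum Level {
    Low = 1,
    High = 10,
}

fn numeric_casts() -> (u8, i32, u32, char, u8, u8) {
    let truncated = 300i32 as u8;      // number as number: 300 mod 256
    let from_bool = true as i32;       // bool as integer
    let from_char = 'a' as u32;        // char as integer
    let to_char = 65u8 as char;        // u8 as char
    let low = Level::Low as u8;        // C-like-enum as integer
    let high = Level::High as u8;
    (truncated, from_bool, from_char, to_char, low, high)
}

fn pointer_round_trip() -> i32 {
    let value = 42i32;
    let ptr = &value as *const i32;    // coercion-cast: `&T` to `*const T`
    let addr = ptr as usize;           // `*T as integer`
    let back = addr as *const i32;     // `integer as *T`
    // Sound here only because `value` is still alive and `back` holds
    // exactly the address we started with.
    unsafe { *back }
}

fn array_ptr_first() -> i32 {
    let arr = [5, 6, 7];
    let first = &arr as *const i32;    // `&[T; n] as *const T`
    unsafe { *first }
}

fn main() {
    println!("{:?}", numeric_casts());
    println!("{}", pointer_round_trip());
    println!("{}", array_ptr_first());
}
```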

@ -7,7 +7,7 @@ represented in Rust.
# The rust repr
# The Rust repr
Rust gives you the following ways to lay out composite data:
@ -16,12 +16,14 @@ Rust gives you the following ways to lay out composite data:
* arrays (homogeneous product types)
* enums (named sum types -- tagged unions)
For all these, individual fields are aligned to their preferred alignment.
For primitives this is equal to
their size. For instance, a u32 will be aligned to a multiple of 32 bits, and a u16 will
be aligned to a multiple of 16 bits. Composite structures will have their size rounded
up to be a multiple of the highest alignment required by their fields, and an alignment
requirement equal to the highest alignment required by their fields. So for instance,
An enum is said to be *C-like* if none of its variants have associated data.
For all these, individual fields are aligned to their preferred alignment. For
primitives this is usually equal to their size. For instance, a u32 will be
aligned to a multiple of 32 bits, and a u16 will be aligned to a multiple of 16
bits. Composite structures will have their size rounded up to be a multiple of
the highest alignment required by their fields, and an alignment requirement
equal to the highest alignment required by their fields. So for instance,
```rust
struct A {
@ -127,6 +129,9 @@ In principle enums can use fairly elaborate algorithms to cache bits throughout
with special constrained representations. As such it is *especially* desirable that we leave
enum layout unspecified today.
# Dynamically Sized Types (DSTs)
Rust also supports types without a statically known size. On the surface,
@ -219,15 +224,14 @@ struct Foo {
```
For details as to *why* this is done, and how to make it not happen, check out
[SOME OTHER SECTION].
[TODO: SOME OTHER SECTION].
# Alternative representations
Rust allows you to specify alternative data layout strategies from the default Rust
one.
Rust allows you to specify alternative data layout strategies from the default.
@ -241,32 +245,44 @@ to soundly do more elaborate tricks with data layout such as reintepretting valu
as a different type.
However, the interaction with Rust's more exotic data layout features must be kept
in mind. Due to its dual purpose as a "for FFI" and "for layout control", repr(C)
in mind. Due to its dual purpose as "for FFI" and "for layout control", `repr(C)`
can be applied to types that will be nonsensical or problematic if passed through
the FFI boundary.
* ZSTs are still zero-sized, even though this is not a standard behaviour
in C, and is explicitly contrary to the behaviour of an empty type in C++, which
still consumes a byte of space.
in C, and is explicitly contrary to the behaviour of an empty type in C++, which
still consumes a byte of space.
* DSTs are not a concept in C
* DSTs, tuples, and tagged unions are not concepts in C and as such are never
FFI safe.
* **The drop flag will still be added**
* This is equivalent to repr(u32) for enums (see below)
* This is equivalent to `repr(u32)` for enums (see below)
## repr(packed)
`repr(packed)` forces rust to strip any padding it would normally apply.
This may improve the memory footprint of a type, but will have negative
side-effects from "field access is heavily penalized" to "completely breaks
everything" based on target platform.
`repr(packed)` forces Rust to strip any padding, and only align the type to a
byte. This may improve the memory footprint, but will likely have other
negative side-effects.
In particular, most architectures *strongly* prefer values to be aligned. This
may mean that unaligned loads are penalized (x86), or even fault (ARM). The
compiler may also have trouble with references to unaligned fields.
`repr(packed)` is not to be used lightly. Unless you have extreme requirements,
this should not be used.
This repr is a modifier on `repr(C)` and `repr(Rust)`.
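To make the trade-off concrete, here is a small sketch comparing a `repr(C)` struct with its packed counterpart. The struct and function names are illustrative, and the sizes noted in the comments assume a target where `u32` has 4-byte alignment, which holds on mainstream platforms but is not guaranteed everywhere.

```rust
use std::mem::{align_of, size_of};

// With `repr(C)`, padding is inserted so that `b` is 4-byte aligned.
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
    c: u8,
}

// With `packed`, the fields are laid out back to back and the whole
// struct is only aligned to a byte.
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
    c: u8,
}

// Returns (size, align) of `Padded` followed by (size, align) of `Packed`.
fn layout() -> (usize, usize, usize, usize) {
    (
        size_of::<Padded>(),
        align_of::<Padded>(),
        size_of::<Packed>(),
        align_of::<Packed>(),
    )
}

fn read_packed_field() -> u32 {
    let p = Packed { a: 1, b: 0xDEAD_BEEF, c: 2 };
    // Copy the field out by value. Taking `&p.b` would create a reference
    // to a possibly unaligned `u32`, which is the problem described above.
    let b = p.b;
    b
}

fn main() {
    println!("{:?}", layout());
    println!("{:#x}", read_packed_field());
}
```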
## repr(u8), repr(u16), repr(u32), repr(u64)
These specify the size to make a c-like enum (one which has no values in its variants).
These specify the size to make a C-like enum. If the discriminant overflows the
integer it has to fit in, it will be an error. You can manually ask Rust to
allow this by setting the overflowing element to explicitly be 0. However, Rust
will not allow you to create an enum where two variants have the same
discriminant.
These reprs have no effect on structs or non-C-like enums.
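A short sketch of the effect on C-like enums, using made-up enums `Small` and `Wide`:

```rust
use std::mem::size_of;

// Every discriminant must fit in a `u8`; 255 is the largest allowed value.
#[repr(u8)]
#[derive(Clone, Copy)]
enum Small {
    A = 1,
    B = 2,
    C = 255,
}

// The same kind of enum, but stored as a `u32`.
#[repr(u32)]
#[derive(Clone, Copy)]
enum Wide {
    A,
    B,
}

// Returns the sizes in bytes of `Small` and `Wide`.
fn enum_sizes() -> (usize, usize) {
    (size_of::<Small>(), size_of::<Wide>())
}

fn discriminants() -> (u8, u8, u8, u32, u32) {
    (
        Small::A as u8,
        Small::B as u8,
        Small::C as u8,
        Wide::A as u32,
        Wide::B as u32,
    )
}

fn main() {
    println!("{:?}", enum_sizes());
    println!("{:?}", discriminants());
}
```

Adding a further variant after `C` without an explicit value would need discriminant 256, which does not fit in a `u8` and is rejected at compile time.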
