7.3 KiB
% Subtyping and Variance
Although Rust doesn't have any notion of inheritance, it does include subtyping. In Rust, subtyping derives entirely from lifetimes. Since lifetimes are scopes, we can partially order them based on the contains (outlives) relationship. We can even express this as a generic bound.
Subtyping on lifetimes in terms of that relationship: if 'a: 'b
("a contains b" or "a outlives b"), then 'a
is a subtype of 'b
. This is a
large source of confusion, because it seems intuitively backwards to many:
the bigger scope is a sub type of the smaller scope.
This does in fact make sense, though. The intuitive reason for this is that if
you expect an &'a u8
, then it's totally fine for me to hand you an &'static u8
,
in the same way that if you expect an Animal in Java, it's totally fine for me to
hand you a Cat. Cats are just Animals and more, just as 'static
is just 'a
and more.
(Note, the subtyping relationship and typed-ness of lifetimes is a fairly arbitrary construct that some disagree with. However it simplifies our analysis to treat lifetimes and types uniformly.)
Higher-ranked lifetimes are also subtypes of every concrete lifetime. This is because taking an arbitrary lifetime is strictly more general than taking a specific one.
Variance
Variance is where things get a bit complicated.
Variance is a property that type constructors have. A type constructor in Rust
is a generic type with unbound arguments. For instance Vec
is a type constructor
that takes a T
and returns a Vec<T>
. &
and &mut
are type constructors that
take a two types: a lifetime, and a type to point to.
A type constructor's variance is how the subtyping of its inputs affects the subtyping of its outputs. There are two kinds of variance in Rust:
- F is variant if
T
being a subtype ofU
impliesF<T>
is a subtype ofF<U>
- F is invariant otherwise (no subtyping relation can be derived)
(For those of you who are familiar with variance from other languages, what we refer to as "just" variance is in fact covariance. Rust does not have contravariance. Historically Rust did have some contravariance but it was scrapped due to poor interactions with other features.)
Some important variances:
&
is variant (as is*const
by metaphor)&mut
is invariantFn(T) -> U
is invariant with respect toT
, but variant with respect toU
Box
,Vec
, and all other collections are variantUnsafeCell
,Cell
,RefCell
,Mutex
and all "interior mutability" types are invariant (as is*mut
by metaphor)
To understand why these variances are correct and desirable, we will consider several
examples. We have already covered why &
should be variant when introducing subtyping:
it's desirable to be able to pass longer-lived things where shorter-lived things are
needed.
To see why &mut
should be invariant, consider the following code:
fn overwrite<T: Copy>(input: &mut T, new: &mut T) {
*input = *new;
}
fn main() {
let mut forever_str: &'static str = "hello";
{
let string = String::from("world");
overwrite(&mut forever_str, &mut &*string);
}
// Oops, printing free'd memory
println!("{}", forever_str);
}
The signature of overwrite
is clearly valid: it takes mutable references to
two values of the same type, and overwrites one with the other. If &mut
was
variant, then &mut &'a str
would be a subtype of &mut &'static str
, since
&'a str
is a subtype of &'static str
. Therefore the lifetime of
forever_str
would successfully be "shrunk" down to the shorter lifetime of
string
, and overwrite
would be called successfully. string
would
subsequently be dropped, and forever_str
would point to freed memory when we
print it! Therefore &mut
should be invariant.
This is the general theme of variance vs invariance: if variance would allow you to store a short-lived value in a longer-lived slot, then you must be invariant.
Box
and Vec
are interesting cases because they're variant, but you can
definitely store values in them! This is where Rust gets really clever: it's
fine for them to be variant because you can only store values
in them via a mutable reference! The mutable reference makes the whole type
invariant, and therefore prevents you from smuggling a short-lived type into
them.
Being variant does allows them to be weakened when shared immutably.
So you can pass a &Box<&'static str>
where a &Box<&'a str>
is expected.
However what should happen when passing by-value is less obvious. It turns out that, yes, you can use subtyping when passing by-value. That is, this works:
fn get_box<'a>(&'a u8) -> Box<&'a str> {
// string literals are `&'static str`s
Box::new("hello")
}
Weakening when you pass by-value is fine because there's no one else who
"remembers" the old lifetime in the Box. The reason a variant &mut
was
trouble was because there's always someone else who remembers the original
subtype: the actual owner.
The invariance of the cell types can be seen as follows: &
is like an &mut
for a
cell, because you can still store values in them through an &
. Therefore cells
must be invariant to avoid lifetime smuggling.
Fn
is the most subtle case because it has mixed variance. To see why
Fn(T) -> U
should be invariant over T, consider the following function
signature:
// 'a is derived from some parent scope
fn foo(&'a str) -> usize;
This signature claims that it can handle any &str
that lives at least as long
as 'a
. Now if this signature was variant with respect to &str
, that would mean
fn foo(&'static str) -> usize;
could be provided in its place, as it would be a subtype. However this function
has a stronger requirement: it says that it can only handle &'static str
s,
and nothing else. Therefore functions are not variant over their arguments.
To see why Fn(T) -> U
should be variant over U, consider the following
function signature:
// 'a is derived from some parent scope
fn foo(usize) -> &'a str;
This signature claims that it will return something that outlives 'a
. It is
therefore completely reasonable to provide
fn foo(usize) -> &'static str;
in its place. Therefore functions are variant over their return type.
*const
has the exact same semantics as &
, so variance follows. *mut
on the
other hand can dereference to an &mut whether shared or not, so it is marked
as invariant just like cells.
This is all well and good for the types the standard library provides, but
how is variance determined for type that you define? A struct, informally
speaking, inherits the variance of its fields. If a struct Foo
has a generic argument A
that is used in a field a
, then Foo's variance
over A
is exactly a
's variance. However this is complicated if A
is used
in multiple fields.
- If all uses of A are variant, then Foo is variant over A
- Otherwise, Foo is invariant over A
struct Foo<'a, 'b, A, B, C, D, E, F, G, H> {
a: &'a A, // variant over 'a and A
b: &'b mut B, // invariant over 'b and B
c: *const C, // variant over C
d: *mut D, // invariant over D
e: Vec<E>, // variant over E
f: Cell<F>, // invariant over F
g: G // variant over G
h1: H // would also be variant over H except...
h2: Cell<H> // invariant over H, because invariance wins
}