|
|
@@ -3,53 +3,135 @@
|
|
|
|
Subtyping is a relationship between types that allows statically typed
|
|
|
|
Subtyping is a relationship between types that allows statically typed
|
|
|
|
languages to be a bit more flexible and permissive.
|
|
|
|
languages to be a bit more flexible and permissive.
|
|
|
|
|
|
|
|
|
|
|
|
The most common and easy to understand example of this can be found in
|
|
|
|
Subtyping in Rust is a bit different from subtyping in other languages. This
|
|
|
|
languages with inheritance. Consider an Animal type which has an `eat()`
|
|
|
|
makes it harder to give simple examples, which is a problem since subtyping,
|
|
|
|
method, and a Cat type which extends Animal, adding a `meow()` method.
|
|
|
|
and especially variance, are already hard to understand properly. As in,
|
|
|
|
Without subtyping, if someone were to write a `feed(Animal)` function, they
|
|
|
|
even compiler writers mess it up all the time.
|
|
|
|
wouldn't be able to pass a Cat to this function, because a Cat isn't *exactly*
|
|
|
|
|
|
|
|
an Animal. But being able to pass a Cat where an Animal is expected seems
|
|
|
|
To keep things simple, this section will consider a small extension to the
|
|
|
|
fairly reasonable. After all, a Cat is just an Animal *and more*. Something
|
|
|
|
Rust language that adds a new and simpler subtyping relationship. After
|
|
|
|
having extra features that can be ignored shouldn't be any impediment to
|
|
|
|
establishing concepts and issues under this simpler system,
|
|
|
|
using it!
|
|
|
|
we will then relate it back to how subtyping actually occurs in Rust.
|
|
|
|
|
|
|
|
|
|
|
|
This is exactly what subtyping lets us do. Because a Cat is an Animal *and more*
|
|
|
|
So here's our simple extension, *Objective Rust*, featuring three new types:
|
|
|
|
we say that Cat is a *subtype* of Animal. We then say that anywhere a value of
|
|
|
|
|
|
|
|
a certain type is expected, a value with a subtype can also be supplied. Ok
|
|
|
|
|
|
|
|
actually it's a lot more complicated and subtle than that, but that's the
|
|
|
|
```rust
|
|
|
|
basic intuition that gets you by in 99% of the cases. We'll cover why it's
|
|
|
|
trait Animal {
|
|
|
|
*only* 99% later in this section.
|
|
|
|
fn snuggle(&self);
|
|
|
|
|
|
|
|
fn eat(&mut self);
|
|
|
|
Although Rust doesn't have any notion of structural inheritance, it *does*
|
|
|
|
}
|
|
|
|
include subtyping. In Rust, subtyping derives entirely from lifetimes. Since
|
|
|
|
|
|
|
|
lifetimes are regions of code, we can partially order them based on the
|
|
|
|
trait Cat: Animal {
|
|
|
|
*contains* (outlives) relationship.
|
|
|
|
fn meow(&self);
|
|
|
|
|
|
|
|
}
|
|
|
|
Subtyping on lifetimes is in terms of that relationship: if `'big: 'small`
|
|
|
|
|
|
|
|
("big contains small" or "big outlives small"), then `'big` is a subtype
|
|
|
|
trait Dog: Animal {
|
|
|
|
|
|
|
|
fn bark(&self);
|
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
But unlike normal traits, we can use them as concrete and sized types, just like structs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Now, say we have a very simple function that takes an Animal, like this:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
|
|
|
|
fn love(pet: Animal) {
|
|
|
|
|
|
|
|
pet.snuggle();
|
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
By default, static types must match *exactly* for a program to compile. As such,
|
|
|
|
|
|
|
|
this code won't compile:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
|
|
|
|
let mr_snuggles: Cat = ...;
|
|
|
|
|
|
|
|
love(mr_snuggles); // ERROR: expected Animal, found Cat
|
|
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mr. Snuggles is a Cat, and Cats aren't *exactly* Animals, so we can't love him! 😿
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This is annoying because Cats *are* Animals. They support every operation
|
|
|
|
|
|
|
|
an Animal supports, so intuitively `love` shouldn't care if we pass it a `Cat`.
|
|
|
|
|
|
|
|
We should be able to just **forget** the non-animal parts of our `Cat`, as they
|
|
|
|
|
|
|
|
aren't necessary to love it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This is exactly the problem that *subtyping* is intended to fix. Because Cats are just
|
|
|
|
|
|
|
|
Animals **and more**, we say Cat is a *subtype* of Animal (because Cats are a *subset*
|
|
|
|
|
|
|
|
of all the Animals). Equivalently, we say that Animal is a *supertype* of Cat.
|
|
|
|
|
|
|
|
With subtypes, we can tweak our overly strict static type system
|
|
|
|
|
|
|
|
with a simple rule: anywhere a value of type `T` is expected, we will also
|
|
|
|
|
|
|
|
accept values that are subtypes of `T`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Or more concretely: anywhere an Animal is expected, a Cat or Dog will also work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
As we will see throughout the rest of this section, subtyping is a lot more complicated
|
|
|
|
|
|
|
|
and subtle than this, but this simple rule is a very good 99% intuition. And unless you
|
|
|
|
|
|
|
|
write unsafe code, the compiler will automatically handle all the corner cases for you.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
But this is the Rustonomicon. We're writing unsafe code, so we need to understand how
|
|
|
|
|
|
|
|
this stuff really works, and how we can mess it up.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The core problem is that this rule, naively applied, will lead to *meowing Dogs*. That is,
|
|
|
|
|
|
|
|
we can convince someone that a Dog is actually a Cat. This completely destroys the fabric
|
|
|
|
|
|
|
|
of our static type system, making it worse than useless (and leading to Undefined Behaviour).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Here's a simple example of this happening when we apply subtyping in a completely naive
|
|
|
|
|
|
|
|
"find and replace" way.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
|
|
|
|
fn evil_feeder(pet: &mut Animal) {
|
|
|
|
|
|
|
|
let spike: Dog = ...;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
// `pet` is an Animal, and Dog is a subtype of Animal,
|
|
|
|
|
|
|
|
// so this should be fine, right..?
|
|
|
|
|
|
|
|
*pet = spike;
|
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
fn main() {
|
|
|
|
|
|
|
|
let mut mr_snuggles: Cat = ...;
|
|
|
|
|
|
|
|
evil_feeder(&mut mr_snuggles); // Replaces mr_snuggles with a Dog
|
|
|
|
|
|
|
|
mr_snuggles.meow(); // OH NO, MEOWING DOG!
|
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Clearly, we need a more robust system than "find and replace". That system is *variance*,
|
|
|
|
|
|
|
|
which is a set of rules governing how subtyping should compose. Most importantly, variance
|
|
|
|
|
|
|
|
defines situations where subtyping should be disabled.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
But before we get into variance, let's take a quick peek at where subtyping actually occurs in
|
|
|
|
|
|
|
|
Rust: *lifetimes*!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> NOTE: The typed-ness of lifetimes is a fairly arbitrary construct that some
|
|
|
|
|
|
|
|
> disagree with. However it simplifies our analysis to treat lifetimes
|
|
|
|
|
|
|
|
> and types uniformly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Lifetimes are just regions of code, and regions can be partially ordered with the *contains*
|
|
|
|
|
|
|
|
(outlives) relationship. Subtyping on lifetimes is in terms of that relationship:
|
|
|
|
|
|
|
|
if `'big: 'small` ("big contains small" or "big outlives small"), then `'big` is a subtype
|
|
|
|
of `'small`. This is a large source of confusion, because it seems backwards
|
|
|
|
of `'small`. This is a large source of confusion, because it seems backwards
|
|
|
|
to many: the bigger region is a *subtype* of the smaller region. But it makes
|
|
|
|
to many: the bigger region is a *subtype* of the smaller region. But it makes
|
|
|
|
sense if you consider our Animal example: *Cat* is an Animal *and more*,
|
|
|
|
sense if you consider our Animal example: Cat is an Animal *and more*,
|
|
|
|
just as `'big` is `'small` *and more*.
|
|
|
|
just as `'big` is `'small` *and more*.
|
|
|
|
|
|
|
|
|
|
|
|
Put another way, if someone wants a reference that lives for `'small`,
|
|
|
|
Put another way, if someone wants a reference that lives for `'small`,
|
|
|
|
usually what they actually mean is that they want a reference that lives
|
|
|
|
usually what they actually mean is that they want a reference that lives
|
|
|
|
for *at least* `'small`. They don't actually care if the lifetimes match
|
|
|
|
for *at least* `'small`. They don't actually care if the lifetimes match
|
|
|
|
exactly. For this reason `'static`, the forever lifetime, is a subtype
|
|
|
|
exactly. So it should be ok for us to **forget** that something lives for
|
|
|
|
of every lifetime.
|
|
|
|
`'big` and only remember that it lives for `'small`.
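
As a small illustration of forgetting `'big` in favour of `'small`, here is a minimal sketch. The helper `pick_shorter` is a hypothetical name introduced only for this example; its signature forces both arguments to share a single lifetime, so the `'static` reference has to be treated as a shorter-lived one.

```rust
// Hypothetical helper: both arguments must share one lifetime `'small`.
fn pick_shorter<'small>(a: &'small str, b: &'small str) -> &'small str {
    if a.len() <= b.len() { a } else { b }
}

fn main() {
    // String literals are `&'static str`.
    let forever: &'static str = "meow";
    {
        let local = String::from("bark bark");
        // `&local` cannot live for `'static`, so `'small` is at most this block.
        // `forever` is still accepted: `&'static str` is used as `&'small str`.
        let shorter = pick_shorter(forever, &local);
        println!("{}", shorter);
    }
}
```

Nothing about `forever` changes here; the callee simply never learns that it could have relied on a longer lifetime.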
|
|
|
|
|
|
|
|
|
|
|
|
Higher-ranked lifetimes are also subtypes of every concrete lifetime. This is
|
|
|
|
The meowing dog problem for lifetimes will result in us being able to
|
|
|
|
because taking an arbitrary lifetime is strictly more general than taking a
|
|
|
|
store a short-lived reference in a place that expects a longer-lived one,
|
|
|
|
specific one.
|
|
|
|
creating a dangling reference and letting us use-after-free.
|
|
|
|
|
|
|
|
|
|
|
|
(The typed-ness of lifetimes is a fairly arbitrary construct that some
|
|
|
|
It will be useful to note that `'static`, the forever lifetime, is a subtype of
|
|
|
|
disagree with. However it simplifies our analysis to treat lifetimes
|
|
|
|
every lifetime because by definition it outlives everything. We will be using
|
|
|
|
and types uniformly.)
|
|
|
|
this relationship in later examples to keep them as simple as possible.
|
|
|
|
|
|
|
|
|
|
|
|
However you can't write a function that takes a value of type `'a`! Lifetimes
|
|
|
|
|
|
|
|
are always just part of another type, so we need a way of handling that.
|
|
|
|
|
|
|
|
To handle it, we need to talk about *variance*.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
With all that said, we still have no idea how to actually *use* subtyping of lifetimes,
|
|
|
|
|
|
|
|
because nothing ever has type `'a`. Lifetimes only occur as part of some larger type
|
|
|
|
|
|
|
|
like `&'a u32` or `IterMut<'a, u32>`. To apply lifetime subtyping, we need to know
|
|
|
|
|
|
|
|
how to compose subtyping. Once again, we need *variance*.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@@ -59,200 +141,288 @@ To handle it, we need to talk about *variance*.
|
|
|
|
Variance is where things get a bit complicated.
|
|
|
|
Variance is where things get a bit complicated.
|
|
|
|
|
|
|
|
|
|
|
|
Variance is a property that *type constructors* have with respect to their
|
|
|
|
Variance is a property that *type constructors* have with respect to their
|
|
|
|
arguments. A type constructor in Rust is a generic type with unbound arguments.
|
|
|
|
arguments. A type constructor in Rust is any generic type with unbound arguments.
|
|
|
|
For instance `Vec` is a type constructor that takes a `T` and returns a
|
|
|
|
For instance `Vec` is a type constructor that takes a type `T` and returns
|
|
|
|
`Vec<T>`. `&` and `&mut` are type constructors that take two inputs: a
|
|
|
|
`Vec<T>`. `&` and `&mut` are type constructors that take two inputs: a
|
|
|
|
lifetime, and a type to point to.
|
|
|
|
lifetime, and a type to point to.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> NOTE: For convenience we will often refer to `F<T>` as a type constructor just so
|
|
|
|
|
|
|
|
> that we can easily talk about `T`. Hopefully this is clear in context.
|
|
|
|
|
|
|
|
|
|
|
|
A type constructor F's *variance* is how the subtyping of its inputs affects the
|
|
|
|
A type constructor F's *variance* is how the subtyping of its inputs affects the
|
|
|
|
subtyping of its outputs. There are three kinds of variance in Rust:
|
|
|
|
subtyping of its outputs. There are three kinds of variance in Rust. Given two
|
|
|
|
|
|
|
|
types `Sub` and `Super`, where `Sub` is a subtype of `Super`:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* `F` is *covariant* if `F<Sub>` is a subtype of `F<Super>` (subtyping "passes through")
|
|
|
|
|
|
|
|
* `F` is *contravariant* if `F<Super>` is a subtype of `F<Sub>` (subtyping is "inverted")
|
|
|
|
|
|
|
|
* `F` is *invariant* otherwise (no subtyping relationship exists)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If `F` has multiple type parameters, we can talk about the individual variances
|
|
|
|
|
|
|
|
by saying that, for example, `F<T, U>` is covariant over `T` and invariant over `U`.
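
One way to convince yourself that a type constructor is covariant is to write an identity function that only type-checks if subtyping "passes through". The sketch below assumes nothing beyond the standard library; the function names `shorten_vec` and `shorten_ref` are made up for illustration.

```rust
// Compiles only because `Vec<T>` is covariant over `T`:
// `&'static str` is a subtype of `&'a str`, so `Vec<&'static str>`
// is a subtype of `Vec<&'a str>`.
fn shorten_vec<'a>(v: Vec<&'static str>) -> Vec<&'a str> {
    v
}

// Compiles because `&'a T` is covariant over both `'a` and `T`.
fn shorten_ref<'a>(r: &'static &'static str) -> &'a &'a str {
    r
}

fn main() {
    let pets = shorten_vec(vec!["cat", "dog"]);
    println!("{:?}", pets);

    static NAME: &str = "spike";
    println!("{}", shorten_ref(&NAME));
}
```

Swapping `Vec` for an invariant constructor such as `Cell` in `shorten_vec` would be expected to make the function fail to compile, which is a handy way to probe the variance of a type.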
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
It is very useful to keep in mind that covariance is, in practical terms, "the"
|
|
|
|
|
|
|
|
variance. Almost all consideration of variance is in terms of whether something
|
|
|
|
|
|
|
|
should be covariant or invariant. Actually witnessing contravariance is quite difficult
|
|
|
|
|
|
|
|
in Rust, though it does in fact exist.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Here is a table of important variances which the rest of this section will be devoted
|
|
|
|
|
|
|
|
to trying to explain:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| | | 'a | T | U |
|
|
|
|
|
|
|
|
|---|-----------------|:---------:|:-----------------:|:---------:|
|
|
|
|
|
|
|
|
| * | `&'a T ` | covariant | covariant | |
|
|
|
|
|
|
|
|
| * | `&'a mut T` | covariant | invariant | |
|
|
|
|
|
|
|
|
| * | `Box<T>` | | covariant | |
|
|
|
|
|
|
|
|
| | `Vec<T>` | | covariant | |
|
|
|
|
|
|
|
|
| * | `UnsafeCell<T>` | | invariant | |
|
|
|
|
|
|
|
|
| | `Cell<T>` | | invariant | |
|
|
|
|
|
|
|
|
| * | `fn(T) -> U` | | **contra**variant | covariant |
|
|
|
|
|
|
|
|
| | `*const T` | | covariant | |
|
|
|
|
|
|
|
|
| | `*mut T` | | invariant | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The types with \*'s are the ones we will be focusing on, as they are in
|
|
|
|
|
|
|
|
some sense "fundamental". All the others can be understood by analogy to these:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* Vec and all other owning pointers and collections follow the same logic as Box
|
|
|
|
|
|
|
|
* Cell and all other interior mutability types follow the same logic as UnsafeCell
|
|
|
|
|
|
|
|
* `*const` follows the logic of `&T`
|
|
|
|
|
|
|
|
* `*mut` follows the logic of `&mut T` (or `UnsafeCell<T>`)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> NOTE: the *only* source of contravariance in the language is the arguments to
|
|
|
|
|
|
|
|
> a function, which is why it really doesn't come up much in practice. Invoking
|
|
|
|
|
|
|
|
> contravariance involves higher-order programming with function pointers that
|
|
|
|
|
|
|
|
> take references with specific lifetimes (as opposed to the usual "any lifetime",
|
|
|
|
|
|
|
|
> which gets into higher rank lifetimes, which work independently of subtyping).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ok, that's enough type theory! Let's try to apply the concept of variance to Rust
|
|
|
|
|
|
|
|
and look at some examples.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
First off, let's revisit the meowing dog example:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
|
|
|
|
fn evil_feeder(pet: &mut Animal) {
|
|
|
|
|
|
|
|
let spike: Dog = ...;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
// `pet` is an Animal, and Dog is a subtype of Animal,
|
|
|
|
|
|
|
|
// so this should be fine, right..?
|
|
|
|
|
|
|
|
*pet = spike;
|
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
fn main() {
|
|
|
|
|
|
|
|
let mut mr_snuggles: Cat = ...;
|
|
|
|
|
|
|
|
evil_feeder(&mut mr_snuggles); // Replaces mr_snuggles with a Dog
|
|
|
|
|
|
|
|
mr_snuggles.meow(); // OH NO, MEOWING DOG!
|
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
* F is *covariant* over `T` if `T` being a subtype of `U` implies
|
|
|
|
If we look at our table of variances, we see that `&mut T` is *invariant* over `T`.
|
|
|
|
`F<T>` is a subtype of `F<U>` (subtyping "passes through")
|
|
|
|
As it turns out, this completely fixes the issue! With invariance, the fact that
|
|
|
|
* F is *contravariant* over `T` if `T` being a subtype of `U` implies
|
|
|
|
Cat is a subtype of Animal doesn't matter; `&mut Cat` still won't be a subtype of
|
|
|
|
`F<U>` is a subtype of `F<T>` (subtyping is "inverted")
|
|
|
|
`&mut Animal`. The static type checker will then correctly stop us from passing
|
|
|
|
* F is *invariant* over `T` otherwise (no subtyping relation can be derived)
|
|
|
|
a Cat into `evil_feeder`.
|
|
|
|
|
|
|
|
|
|
|
|
It should be noted that covariance is *far* more common and important than
|
|
|
|
The soundness of subtyping is based on the idea that it's ok to forget unnecessary
|
|
|
|
contravariance in Rust. The existence of contravariance in Rust can mostly
|
|
|
|
details. But with references, there's always someone that remembers those details:
|
|
|
|
be ignored.
|
|
|
|
the value being referenced. That value expects those details to keep being true,
|
|
|
|
|
|
|
|
and may behave incorrectly if its expectations are violated.
|
|
|
|
|
|
|
|
|
|
|
|
Some important variances (which we will explain in detail below):
|
|
|
|
The problem with making `&mut T` covariant over `T` is that it gives us the power
|
|
|
|
|
|
|
|
to modify the original value *when we don't remember all of its constraints*.
|
|
|
|
|
|
|
|
And so, we can make someone have a Dog when they're certain they still have a Cat.
|
|
|
|
|
|
|
|
|
|
|
|
* `&'a T` is covariant over `'a` and `T` (as is `*const T` by metaphor)
|
|
|
|
With that established, we can easily see why `&T` being covariant over `T` *is*
|
|
|
|
* `&'a mut T` is covariant over `'a` but invariant over `T`
|
|
|
|
sound: it doesn't let you modify the value, only look at it. Without any way to
|
|
|
|
* `fn(T) -> U` is **contra**variant over `T`, but covariant over `U`
|
|
|
|
mutate, there's no way for us to mess with any details. We can also see why
|
|
|
|
* `Box`, `Vec`, and all other collections are covariant over the types of
|
|
|
|
`UnsafeCell` and all the other interior mutability types must be invariant: they
|
|
|
|
their contents
|
|
|
|
make `&T` work like `&mut T`!
|
|
|
|
* `UnsafeCell<T>`, `Cell<T>`, `RefCell<T>`, `Mutex<T>` and all other
|
|
|
|
|
|
|
|
interior mutability types are invariant over T (as is `*mut T` by metaphor)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
To understand why these variances are correct and desirable, we will consider
|
|
|
|
Now what about the lifetime on references? Why is it ok for both kinds of references
|
|
|
|
several examples.
|
|
|
|
to be covariant over their lifetimes? Well, here's a two-pronged argument:
|
|
|
|
|
|
|
|
|
|
|
|
We have already covered why `&'a T` should be covariant over `'a` when
|
|
|
|
First and foremost, subtyping references based on their lifetimes is *the entire point
|
|
|
|
introducing subtyping: it's desirable to be able to pass longer-lived things
|
|
|
|
of subtyping in Rust*. The only reason we have subtyping is so we can pass
|
|
|
|
where shorter-lived things are needed.
|
|
|
|
long-lived things where short-lived things are expected. So it better work!
|
|
|
|
|
|
|
|
|
|
|
|
Similar reasoning applies to why it should be covariant over T: it's reasonable
|
|
|
|
Second, and more seriously, lifetimes are only a part of the reference itself. The
|
|
|
|
to be able to pass `&&'static str` where an `&&'a str` is expected. The
|
|
|
|
type of the referent is shared knowledge, which is why adjusting that type in only
|
|
|
|
additional level of indirection doesn't change the desire to be able to pass
|
|
|
|
one place (the reference) can lead to issues. But if you shrink down a reference's
|
|
|
|
longer lived things where shorter lived things are expected.
|
|
|
|
lifetime when you hand it to someone, that lifetime information isn't shared in
|
|
|
|
|
|
|
|
any way. There are now two independent references with independent lifetimes.
|
|
|
|
|
|
|
|
There's no way to mess with the original reference's lifetime using the other one.
|
|
|
|
|
|
|
|
|
|
|
|
However this logic doesn't apply to `&mut`. To see why `&mut` should
|
|
|
|
Or rather, the only way to mess with someone's lifetime is to build a meowing dog.
|
|
|
|
be invariant over T, consider the following code:
|
|
|
|
But as soon as you try to build a meowing dog, the lifetime should be wrapped up
|
|
|
|
|
|
|
|
in an invariant type, preventing the lifetime from being shrunk. To understand this
|
|
|
|
|
|
|
|
better, let's port the meowing dog problem over to real Rust.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
In the meowing dog problem we take a subtype (Cat), convert it into a supertype
|
|
|
|
|
|
|
|
(Animal), and then use that fact to overwrite the subtype with a value that satisfies
|
|
|
|
|
|
|
|
the constraints of the supertype but not the subtype (Dog).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
So with lifetimes, we want to take a long-lived thing, convert it into a
|
|
|
|
|
|
|
|
short-lived thing, and then use that to write something that doesn't live long
|
|
|
|
|
|
|
|
enough into the place expecting something long-lived.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Here it is:
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
```rust,ignore
|
|
|
|
fn overwrite<T: Copy>(input: &mut T, new: &mut T) {
|
|
|
|
fn evil_feeder<T>(input: &mut T, val: T) {
|
|
|
|
*input = *new;
|
|
|
|
*input = val;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
fn main() {
|
|
|
|
fn main() {
|
|
|
|
let mut forever_str: &'static str = "hello";
|
|
|
|
let mut mr_snuggles: &'static str = "meow! :3"; // mr. snuggles forever!!
|
|
|
|
{
|
|
|
|
{
|
|
|
|
let string = String::from("world");
|
|
|
|
let spike = String::from("bark! >:V");
|
|
|
|
overwrite(&mut forever_str, &mut &*string);
|
|
|
|
let spike_str: &str = &spike; // Only lives for the block
|
|
|
|
|
|
|
|
evil_feeder(&mut mr_snuggles, spike_str); // EVIL!
|
|
|
|
}
|
|
|
|
}
|
|
|
|
// Oops, printing free'd memory
|
|
|
|
println!("{}", mr_snuggles); // Use after free?
|
|
|
|
println!("{}", forever_str);
|
|
|
|
|
|
|
|
}
|
|
|
|
}
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
The signature of `overwrite` is clearly valid: it takes mutable references to
|
|
|
|
And what do we get when we run this?
|
|
|
|
two values of the same type, and overwrites one with the other.
|
|
|
|
|
|
|
|
|
|
|
|
```text
|
|
|
|
But, if `&mut T` was covariant over T, then `&mut &'static str` would be a
|
|
|
|
error[E0597]: `spike` does not live long enough
|
|
|
|
subtype of `&mut &'a str`, since `&'static str` is a subtype of `&'a str`.
|
|
|
|
--> src/main.rs:9:32
|
|
|
|
Therefore the lifetime of `forever_str` would successfully be "shrunk" down
|
|
|
|
|
|
|
|
|
to the shorter lifetime of `string`, and `overwrite` would be called successfully.
|
|
|
|
9 | let spike_str: &str = &spike;
|
|
|
|
`string` would subsequently be dropped, and `forever_str` would point to
|
|
|
|
| ^^^^^ borrowed value does not live long enough
|
|
|
|
freed memory when we print it! Therefore `&mut` should be invariant.
|
|
|
|
10 | evil_feeder(&mut mr_snuggles, spike_str);
|
|
|
|
|
|
|
|
11 | }
|
|
|
|
This is the general theme of variance vs invariance: if variance would allow you
|
|
|
|
| - borrowed value only lives until here
|
|
|
|
to store a short-lived value in a longer-lived slot, then invariance must be used.
|
|
|
|
|
|
|
|
|
|
|
|
|
= note: borrowed value must be valid for the static lifetime...
|
|
|
|
More generally, the soundness of subtyping and variance is based on the idea that its ok to
|
|
|
|
```
|
|
|
|
forget details, but with mutable references there's always someone (the original
|
|
|
|
|
|
|
|
value being referenced) that remembers the forgotten details and will assume
|
|
|
|
Good, it doesn't compile! Let's break down what's happening here in detail.
|
|
|
|
that those details haven't changed. If we do something to invalidate those details,
|
|
|
|
|
|
|
|
the original location can behave unsoundly.
|
|
|
|
First let's look at the new `evil_feeder` function:
|
|
|
|
|
|
|
|
|
|
|
|
However it *is* sound for `&'a mut T` to be covariant over `'a`. The key difference
|
|
|
|
|
|
|
|
between `'a` and T is that `'a` is a property of the reference itself,
|
|
|
|
|
|
|
|
while T is something the reference is borrowing. If you change T's type, then
|
|
|
|
|
|
|
|
the source still remembers the original type. However if you change the
|
|
|
|
|
|
|
|
lifetime's type, no one but the reference knows this information, so it's fine.
|
|
|
|
|
|
|
|
Put another way: `&'a mut T` owns `'a`, but only *borrows* T.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
`Box` and `Vec` are interesting cases because they're covariant, but you can
|
|
|
|
|
|
|
|
definitely store values in them! This is where Rust's typesystem allows it to
|
|
|
|
|
|
|
|
be a bit more clever than others. To understand why it's sound for owning
|
|
|
|
|
|
|
|
containers to be covariant over their contents, we must consider
|
|
|
|
|
|
|
|
the two ways in which a mutation may occur: by-value or by-reference.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If mutation is by-value, then the old location that remembers extra details is
|
|
|
|
|
|
|
|
moved out of, meaning it can't use the value anymore. So we simply don't need to
|
|
|
|
|
|
|
|
worry about anyone remembering dangerous details. Put another way, applying
|
|
|
|
|
|
|
|
subtyping when passing by-value *destroys details forever*. For example, this
|
|
|
|
|
|
|
|
compiles and is fine:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust
|
|
|
|
```rust
|
|
|
|
fn get_box<'a>(str: &'a str) -> Box<&'a str> {
|
|
|
|
fn evil_feeder<T>(input: &mut T, val: T) {
|
|
|
|
// String literals are `&'static str`s, but it's fine for us to
|
|
|
|
*input = val;
|
|
|
|
// "forget" this and let the caller think the string won't live that long.
|
|
|
|
|
|
|
|
Box::new("hello")
|
|
|
|
|
|
|
|
}
|
|
|
|
}
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
If mutation is by-reference, then our container is passed as `&mut Vec<T>`. But
|
|
|
|
All it does is take a mutable reference and a value and overwrite the referent with it.
|
|
|
|
`&mut` is invariant over its value, so `&mut Vec<T>` is actually invariant over `T`.
|
|
|
|
What's important about this function is that it creates a type equality constraint. It
|
|
|
|
So the fact that `Vec<T>` is covariant over `T` doesn't matter at all when
|
|
|
|
clearly says in its signature the referent and the value must be the *exact same* type.
|
|
|
|
mutating by-reference.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
But being covariant still allows `Box` and `Vec` to be weakened when shared
|
|
|
|
Meanwhile, in the caller we pass in `&mut &'static str` and `&'spike_str str`.
|
|
|
|
immutably. So you can pass a `&Vec<&'static str>` where a `&Vec<&'a str>` is
|
|
|
|
|
|
|
|
expected.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The invariance of the cell types can be seen as follows: `&` is like an `&mut`
|
|
|
|
Because `&mut T` is invariant over `T`, the compiler concludes it can't apply any subtyping
|
|
|
|
for a cell, because you can still store values in them through an `&`. Therefore
|
|
|
|
to the first argument, and so `T` must be exactly `&'static str`.
|
|
|
|
cells must be invariant to avoid lifetime smuggling.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
`fn` is the most subtle case because they have mixed variance, and in fact are
|
|
|
|
The other argument is only an `&'a str`, which *is* covariant over `'a`. So the compiler
|
|
|
|
the only source of **contra**variance. To see why `fn(T) -> U` should be contravariant
|
|
|
|
adopts a constraint: `&'spike_str str` must be a subtype of `&'static str` (inclusive),
|
|
|
|
over T, consider the following function signature:
|
|
|
|
which in turn implies `'spike_str` must be a subtype of `'static` (inclusive). Which is to say,
|
|
|
|
|
|
|
|
`'spike_str` must contain `'static`. But only one thing contains `'static` -- `'static` itself!
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
This is why we get an error when we try to assign `&spike` to `spike_str`. The
|
|
|
|
// 'a is derived from some parent scope
|
|
|
|
compiler has worked backwards to conclude `spike_str` must live forever, and `&spike`
|
|
|
|
fn foo(&'a str) -> usize;
|
|
|
|
simply can't live that long.
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
So even though references are covariant over their lifetimes, they "inherit" invariance
|
|
|
|
|
|
|
|
whenever they're put into a context that could do something bad with that. In this case,
|
|
|
|
|
|
|
|
we inherited invariance as soon as we put our reference inside an `&mut T`.
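
For contrast, here is a sketch of two uses of the same function that do compile. It reuses the `&mut T` / `T` signature from above under the hypothetical name `feeder`; the wrapper `demo` exists only so the result can be inspected. The first call keeps `T` at `&'static str` on both sides. The second call first copies the reference by value, which gives an independent reference whose lifetime is free to shrink.

```rust
fn feeder<T>(input: &mut T, val: T) {
    *input = val;
}

fn demo() -> (&'static str, String) {
    let mut mr_snuggles: &'static str = "meow! :3";
    // Fine: both arguments agree that `T` is exactly `&'static str`.
    feeder(&mut mr_snuggles, "purr");

    let seen_by_pet;
    {
        let spike = String::from("bark! >:V");
        // Copying the reference by value applies covariance: `pet` is a
        // separate, shorter-lived reference, and `mr_snuggles` is untouched.
        let mut pet: &str = mr_snuggles;
        // Here `T` is a short-lived `&str`, which `spike` can satisfy.
        feeder(&mut pet, spike.as_str());
        seen_by_pet = pet.to_string();
    }
    // `mr_snuggles` still points at a `'static` string.
    (mr_snuggles, seen_by_pet)
}

fn main() {
    let (original, copy) = demo();
    println!("{} / {}", original, copy);
}
```

The overwrite through `pet` never reaches `mr_snuggles`, so no short-lived reference ends up in a slot that expects `'static`.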
|
|
|
|
|
|
|
|
|
|
|
|
This signature claims that it can handle any `&str` that lives at least as
|
|
|
|
As it turns out, the argument for why it's ok for Box (and Vec, HashMap, etc.) to
|
|
|
|
long as `'a`. Now if this signature was **co**variant over `&'a str`, that
|
|
|
|
be covariant is pretty similar to the argument for why it's ok for
|
|
|
|
would mean
|
|
|
|
lifetimes to be covariant: as soon as you try to stuff them in something like a
|
|
|
|
|
|
|
|
mutable reference, they inherit invariance and you're prevented from doing anything
|
|
|
|
|
|
|
|
bad.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
However Box makes it easier to focus on the by-value aspect of references that we
|
|
|
|
|
|
|
|
partially glossed over.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Unlike a lot of languages which allow values to be freely aliased at all times,
|
|
|
|
|
|
|
|
Rust has a very strict rule: if you're allowed to mutate or move a value, you
|
|
|
|
|
|
|
|
are guaranteed to be the only one with access to it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Consider the following code:
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
```rust,ignore
|
|
|
|
fn foo(&'static str) -> usize;
|
|
|
|
let mr_snuggles: Box<Cat> = ..;
|
|
|
|
|
|
|
|
let spike: Box<Dog> = ..;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
let mut pet: Box<Animal>;
|
|
|
|
|
|
|
|
pet = mr_snuggles;
|
|
|
|
|
|
|
|
pet = spike;
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
could be provided in its place, as it would be a subtype. However this function
|
|
|
|
There is no problem at all with the fact that we have forgotten that `mr_snuggles` was a Cat,
|
|
|
|
has a stronger requirement: it says that it can only handle `&'static str`s,
|
|
|
|
or that we overwrote him with a Dog, because as soon as we moved `mr_snuggles` to a variable
|
|
|
|
and nothing else. Giving `&'a str`s to it would be unsound, as it's free to
|
|
|
|
that only knew he was an Animal, **we destroyed the only thing in the universe that
|
|
|
|
assume that what it's given lives forever. Therefore functions definitely shouldn't
|
|
|
|
remembered he was a Cat**!
|
|
|
|
be **co**variant over their arguments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
However if we flip it around and use **contra**variance, it *does* work! If
|
|
|
|
In contrast to the argument about immutable references being soundly covariant because they
|
|
|
|
something expects a function which can handle strings that live forever,
|
|
|
|
don't let you change anything, owned values can be covariant because they make you
|
|
|
|
it makes perfect sense to instead provide a function that can handle
|
|
|
|
change *everything*. There is no connection between old locations and new locations.
|
|
|
|
strings that live for *less* than forever. So
|
|
|
|
Applying by-value subtyping is an irreversible act of knowledge destruction, and
|
|
|
|
|
|
|
|
without any memory of how things used to be, no one can be tricked into acting on
|
|
|
|
|
|
|
|
that old information!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Only one thing left to explain: function pointers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
To see why `fn(T) -> U` should be covariant over `U`, consider the following signature:
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
```rust,ignore
|
|
|
|
fn foo(&'a str) -> usize;
|
|
|
|
fn get_animal() -> Animal;
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
can be passed where
|
|
|
|
This function claims to produce an Animal. As such, it is perfectly valid to
|
|
|
|
|
|
|
|
provide a function with the following signature instead:
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
```rust,ignore
|
|
|
|
fn foo(&'static str) -> usize;
|
|
|
|
fn get_animal() -> Cat;
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
is expected.
|
|
|
|
After all, Cats are Animals, so always producing a Cat is a perfectly valid way
|
|
|
|
|
|
|
|
to produce Animals. Or to relate it back to real Rust: if we need a function
|
|
|
|
|
|
|
|
that is supposed to produce something that lives for `'short`, it's perfectly
|
|
|
|
|
|
|
|
fine for it to produce something that lives for `'long`. We don't care, we can
|
|
|
|
|
|
|
|
just forget that fact.
|
|
|
|
|
|
|
|
|
|
|
|
To see why `fn(T) -> U` should be **co**variant over U, consider the following
|
|
|
|
However, the same logic does not apply to *arguments*. Consider trying to satisfy:
|
|
|
|
function signature:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
```rust,ignore
|
|
|
|
// 'a is derived from some parent scope
|
|
|
|
fn handle_animal(Animal);
|
|
|
|
fn foo(usize) -> &'a str;
|
|
|
|
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
This signature claims that it will return something that outlives `'a`. It is
|
|
|
|
with
|
|
|
|
therefore completely reasonable to provide
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```rust,ignore
|
|
|
|
```rust,ignore
|
|
|
|
fn foo(usize) -> &'static str;
|
|
|
|
fn handle_animal(Cat);
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
in its place, as it does indeed return things that outlive `'a`. Therefore
|
|
|
|
The first function can accept Dogs, but the second function absolutely can't.
|
|
|
|
functions are covariant over their return type.
|
|
|
|
Covariance doesn't work here. But if we flip it around, it actually *does*
|
|
|
|
|
|
|
|
work! If we need a function that can handle Cats, a function that can handle *any*
|
|
|
|
|
|
|
|
Animal will surely work fine. Or to relate it back to real Rust: if we need a
|
|
|
|
|
|
|
|
function that can handle anything that lives for at least `'long`, it's perfectly
|
|
|
|
|
|
|
|
fine for it to be able to handle anything that lives for at least `'short`.
|
|
|
|
|
|
|
|
|
|
|
|
`*const` has the exact same semantics as `&`, so variance follows. `*mut` on the
|
|
|
|
And that's why function types, unlike anything else in the language, are
|
|
|
|
other hand can dereference to an `&mut` whether shared or not, so it is marked
|
|
|
|
**contra**variant over their arguments.
|
|
|
|
as invariant just like cells.
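
The two function-pointer rules discussed above (covariant over the return type, contravariant over the arguments) can be checked against real Rust lifetimes. The following is only a minimal sketch: `widen_arg`, `shorten_ret`, `str_len`, `greeting`, and `demo_fn_variance` are hypothetical names made up for illustration and do not appear in the original text.

```rust
// Contravariance over arguments: a function that can handle strings living
// for only `'short` may be used where one handling `&'static str` is expected.
// (All names in this block are hypothetical, for illustration only.)
fn widen_arg<'short>(f: fn(&'short str) -> usize) -> fn(&'static str) -> usize {
    f
}

// Covariance over the return type: a function producing `&'static str` may be
// used where one producing a shorter-lived `&'short str` is expected.
fn shorten_ret<'short>(f: fn() -> &'static str) -> fn() -> &'short str {
    f
}

fn str_len(s: &str) -> usize {
    s.len()
}

fn greeting() -> &'static str {
    "hello"
}

// Exercises both conversions and returns the two observed lengths.
fn demo_fn_variance() -> (usize, usize) {
    let f = widen_arg(str_len);
    let g = shorten_ret(greeting);
    (f("forever"), g().len())
}
```

Swapping the lifetimes in either helper (for example, returning `fn(&'short str) -> usize` from an `fn(&'static str) -> usize`) should be rejected by the compiler, which is the unsound direction described in the text.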
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This is all well and good for the types the standard library provides, but
|
|
|
|
Now, this is all well and good for the types the standard library provides, but
|
|
|
|
how is variance determined for types that *you* define? A struct, informally
|
|
|
|
how is variance determined for types that *you* define? A struct, informally
|
|
|
|
speaking, inherits the variance of its fields. If a struct `Foo`
|
|
|
|
speaking, inherits the variance of its fields. If a struct `MyType`
|
|
|
|
has a generic argument `A` that is used in a field `a`, then Foo's variance
|
|
|
|
has a generic argument `A` that is used in a field `a`, then MyType's variance
|
|
|
|
over `A` is exactly `a`'s variance. However, if `A` is used in multiple fields:
|
|
|
|
over `A` is exactly `a`'s variance over `A`.
|
|
|
|
|
|
|
|
|
|
|
|
* If all uses of A are covariant, then Foo is covariant over A
|
|
|
|
However, if `A` is used in multiple fields:
|
|
|
|
* If all uses of A are contravariant, then Foo is contravariant over A
|
|
|
|
|
|
|
|
* Otherwise, Foo is invariant over A
|
|
|
|
* If all uses of `A` are covariant, then MyType is covariant over `A`
|
|
|
|
|
|
|
|
* If all uses of `A` are contravariant, then MyType is contravariant over `A`
|
|
|
|
|
|
|
|
* Otherwise, MyType is invariant over `A`
|
|
|
|
|
|
|
|
|
|
|
|
```rust
|
|
|
|
```rust
|
|
|
|
use std::cell::Cell;
|
|
|
|
use std::cell::Cell;
|
|
|
|
|
|
|
|
|
|
|
|
struct Foo<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H, In, Out, Mixed> {
|
|
|
|
struct MyType<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H, In, Out, Mixed> {
|
|
|
|
a: &'a A, // covariant over 'a and A
|
|
|
|
a: &'a A, // covariant over 'a and A
|
|
|
|
b: &'b mut B, // covariant over 'b and invariant over B
|
|
|
|
b: &'b mut B, // covariant over 'b and invariant over B
|
|
|
|
|
|
|
|
|
|
|
@ -272,3 +442,4 @@ struct Foo<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H, In, Out, Mixed> {
|
|
|
|
k2: Mixed, // invariant over Mixed, because invariance wins all conflicts
|
|
|
|
k2: Mixed, // invariant over Mixed, because invariance wins all conflicts
|
|
|
|
}
|
|
|
|
}
|
|
|
|
```
|
|
|
|
```
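
As a smaller illustration of a struct inheriting the variance of its fields, here is a sketch with one field each. The types `CovariantRef` and `InvariantCell` and the functions `shrink` and `demo_struct_variance` are hypothetical names chosen for this example, not part of the original text.

```rust
use std::cell::Cell;

// Covariant over 'a, because its only field `&'a str` is covariant over 'a.
struct CovariantRef<'a> {
    r: &'a str,
}

// Invariant over 'a, because `Cell<T>` is invariant over `T`.
struct InvariantCell<'a> {
    r: Cell<&'a str>,
}

// Compiles thanks to covariance: a longer lifetime can be shortened.
fn shrink<'short>(x: CovariantRef<'static>) -> CovariantRef<'short> {
    x
}

// The analogous function for the invariant struct is expected to be rejected:
// fn shrink_cell<'short>(x: InvariantCell<'static>) -> InvariantCell<'short> { x }

// Builds one value of each type and returns the lengths of the held strings.
fn demo_struct_variance() -> (usize, usize) {
    let c = shrink(CovariantRef { r: "static text" });
    let i = InvariantCell { r: Cell::new("unchanged") };
    (c.r.len(), i.r.get().len())
}
```

Uncommenting `shrink_cell` should produce a lifetime error, since an invariant parameter cannot be changed at all.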
|
|
|
|
|
|
|
|
|
|
|
|