From 4bf5fa6c99757311eb2da051b2c1ee0b67dff639 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Wed, 15 Mar 2017 16:27:04 +0000 Subject: [PATCH 01/12] First attempt at extending the lifetimes section. --- src/lifetimes.md | 58 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/src/lifetimes.md b/src/lifetimes.md index e2f0cc8..27fe650 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -213,3 +213,61 @@ totally ok*, because it keeps us from spending all day explaining our program to the compiler. However it does mean that several programs that are totally correct with respect to Rust's *true* semantics are rejected because lifetimes are too dumb. + +# Unifying Lifetimes + +XXX: is unification a good term to be using? + +Often the Rust compiler must prove that two references with different lifetimes +are compatible. We call this *unification* of lifetimes. + +Consider the following program: + +```rust,ignore +fn main() { + let s1 = String::from("short"); + { + let s2 = String::from("a long long long string"); + println!("{}", min(&s1, &s2)); + } +} + +fn min<'a>(x: &'a str, y: &'a str) -> &'a str { + if x.len() < y.len() { + return x; + } else { + return y; + } +} +``` + +The idea is that `min()` returns a reference to the shorter of the two strings +referenced by its arguments, but *without* allocating a new string. + +The two references passed at the call-site of `min()` have different lifetimes +since their referents are defined in different scopes. In this example, the +string value `s1` is valid longer than the string value `s2`, yet the signature +of `min()` requires that these two references have the same lifetime. +Furthermore, the returned string reference must share this same lifetime too. +Essentially, the signature of `min()` asks the compiler to find a single +lifetime in the *caller* under which the three references annotated `'a` remain +valid. 
+ +Rust will try to do this by *converting* each of these three reference +lifetimes in `main()` into *one* lifetime which is shorter than, or equally as +long as, each of the references in isolation. A reference `&'o T` can be +converted to to `&'p T` if (and only if) it can be proven that `'o` lives as +long as (or longer than) `'p`. In our example the references `'&s1`, `&s2` and +the returned reference can all be shown to be valid as long as the implicit +scope created by the `let s2` binding (see above for information on implicit +scopes introduced by `let` bindings). So in this case, we can prove that the +lifetimes of `&s1`, `&s2` and the returned reference can be unified, and as a +result he compiler accepts the program. + +If, on the other hand, the compiler cannot find such a lifetime, then the +lifetime constraints described by the program are inconsistent, and the +compiler will reject the program. For example + +```rust,ignore +XXX +``` From 4b794ca07784f863baac9ec629d5113140ba2983 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Wed, 15 Mar 2017 16:34:36 +0000 Subject: [PATCH 02/12] Tweaks. --- src/lifetimes.md | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 27fe650..45c4bcb 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -228,11 +228,11 @@ fn main() { let s1 = String::from("short"); { let s2 = String::from("a long long long string"); - println!("{}", min(&s1, &s2)); + println!("{}", shortest(&s1, &s2)); } } -fn min<'a>(x: &'a str, y: &'a str) -> &'a str { +fn shortest<'a>(x: &'a str, y: &'a str) -> &'a str { if x.len() < y.len() { return x; } else { @@ -241,20 +241,20 @@ fn min<'a>(x: &'a str, y: &'a str) -> &'a str { } ``` -The idea is that `min()` returns a reference to the shorter of the two strings -referenced by its arguments, but *without* allocating a new string. 
+The idea is that `shortest()` returns a reference to the shorter of the two +strings referenced by its arguments, but *without* allocating a new string. -The two references passed at the call-site of `min()` have different lifetimes -since their referents are defined in different scopes. In this example, the -string value `s1` is valid longer than the string value `s2`, yet the signature -of `min()` requires that these two references have the same lifetime. -Furthermore, the returned string reference must share this same lifetime too. -Essentially, the signature of `min()` asks the compiler to find a single -lifetime in the *caller* under which the three references annotated `'a` remain -valid. +The two references passed at the call-site of `shortest()` have different lifetimes +since their referents are defined in different scopes. The +string value `s1` is live longer than the string value `s2`, yet the signature +of `shortest()` requires that these two references have the same lifetime. +Furthermore, the returned string reference must also share this same lifetime too. +So how does the compiler make this so? -Rust will try to do this by *converting* each of these three reference -lifetimes in `main()` into *one* lifetime which is shorter than, or equally as +Essentially, the signature of `shortest()` asks the compiler to find a single +lifetime in the *caller* under which the three references annotated `'a` remain +valid. Rust will try to *convert* each of these three reference +lifetimes in `main()` into *one* *unified* lifetime which is shorter than, or equally as long as, each of the references in isolation. A reference `&'o T` can be converted to to `&'p T` if (and only if) it can be proven that `'o` lives as long as (or longer than) `'p`. In our example the references `'&s1`, `&s2` and @@ -266,7 +266,7 @@ result he compiler accepts the program. 
If, on the other hand, the compiler cannot find such a lifetime, then the lifetime constraints described by the program are inconsistent, and the -compiler will reject the program. For example +compiler will reject the program. For example: ```rust,ignore XXX From 0a6ad6179b0efadcf4e0f8d7623dad53e1cab7b8 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Wed, 15 Mar 2017 16:44:18 +0000 Subject: [PATCH 03/12] More tweaks, shorten. --- src/lifetimes.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 45c4bcb..9bb6624 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -246,19 +246,18 @@ strings referenced by its arguments, but *without* allocating a new string. The two references passed at the call-site of `shortest()` have different lifetimes since their referents are defined in different scopes. The -string value `s1` is live longer than the string value `s2`, yet the signature +string reference `&s1` is valid longer than the string reference `&s2`, yet the signature of `shortest()` requires that these two references have the same lifetime. Furthermore, the returned string reference must also share this same lifetime too. -So how does the compiler make this so? +So how does the compiler make sure this is the case? -Essentially, the signature of `shortest()` asks the compiler to find a single -lifetime in the *caller* under which the three references annotated `'a` remain -valid. Rust will try to *convert* each of these three reference -lifetimes in `main()` into *one* *unified* lifetime which is shorter than, or equally as -long as, each of the references in isolation. A reference `&'o T` can be +At the call-site of `shortest()`, the compiler must try to *convert* the lifetimes of +the references corresponding to a reference marked `&'a` in the function signature +into a single *unified* lifetime. 
This new lifetime must be shorter than, or equally as +long as, each of the reference lifetimes in isolation. A reference `&'o T` can be converted to to `&'p T` if (and only if) it can be proven that `'o` lives as long as (or longer than) `'p`. In our example the references `'&s1`, `&s2` and -the returned reference can all be shown to be valid as long as the implicit +the returned reference can all be shown to be valid for the scope created by the `let s2` binding (see above for information on implicit scopes introduced by `let` bindings). So in this case, we can prove that the lifetimes of `&s1`, `&s2` and the returned reference can be unified, and as a From 005053e382cd9a8fd62a165bd4c47e4a52ecdfd4 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Wed, 15 Mar 2017 16:47:43 +0000 Subject: [PATCH 04/12] More shortening. --- src/lifetimes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 9bb6624..65f8088 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -252,7 +252,7 @@ Furthermore, the returned string reference must also share this same lifetime to So how does the compiler make sure this is the case? At the call-site of `shortest()`, the compiler must try to *convert* the lifetimes of -the references corresponding to a reference marked `&'a` in the function signature +the references marked `&'a` in the `shortest()` function signature into a single *unified* lifetime. This new lifetime must be shorter than, or equally as long as, each of the reference lifetimes in isolation. A reference `&'o T` can be converted to to `&'p T` if (and only if) it can be proven that `'o` lives as @@ -260,7 +260,7 @@ long as (or longer than) `'p`. In our example the references `'&s1`, `&s2` and the returned reference can all be shown to be valid for the scope created by the `let s2` binding (see above for information on implicit scopes introduced by `let` bindings). 
So in this case, we can prove that the -lifetimes of `&s1`, `&s2` and the returned reference can be unified, and as a +lifetimes can be unified, and as a result he compiler accepts the program. If, on the other hand, the compiler cannot find such a lifetime, then the From 87b4ca44fa14d523e7a0d1a0897ddf0a6974589d Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Thu, 16 Mar 2017 12:56:30 +0000 Subject: [PATCH 05/12] Try to explain via de-sugaring, as is the style of the nomicon. --- src/lifetimes.md | 80 ++++++++++++++++++++++++++++-------------------- 1 file changed, 54 insertions(+), 26 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 65f8088..78d4867 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -226,13 +226,11 @@ Consider the following program: ```rust,ignore fn main() { let s1 = String::from("short"); - { - let s2 = String::from("a long long long string"); - println!("{}", shortest(&s1, &s2)); - } + let s2 = String::from("a long long long string"); + println!("{}", shortest(&s1, &s2)); } -fn shortest<'a>(x: &'a str, y: &'a str) -> &'a str { +fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { if x.len() < y.len() { return x; } else { @@ -243,30 +241,60 @@ fn shortest<'a>(x: &'a str, y: &'a str) -> &'a str { } ``` The idea is that `shortest()` returns a reference to the shorter of the two strings referenced by its arguments, but *without* allocating a new string. +Let's de-sugar main so we can see the implicit lifetimes: + +```rust,ignore +fn main() { + 'a { + let s1 = String::from("short"); + 'b { + let s2 = String::from("a long long long string"); + 'c { + // Like before, an anonymous scope introduced since &s1 doesn't + // need to last as long as s1 itself, and similarly for s2. + println!("{}", shortest(&'b s1, &'a s2)); + } + } + } +} +``` -The two references passed at the call-site of `shortest()` have different lifetimes -since their referents are defined in different scopes. 
The -string reference `&s1` is valid longer than the string reference `&s2`, yet the signature -of `shortest()` requires that these two references have the same lifetime. -Furthermore, the returned string reference must also share this same lifetime too. -So how does the compiler make sure this is the case? +Now we see that the references passed as arguments to `shortest()`, i.e. `&s1` +and `&s2`, actually have different lifetimes, `&'a` and `&'b`, since their +referents are defined in different scopes. However, the signature of +`shortest()` requires that these two references (and also the returned +reference, which has lifetime `'c`) have the same lifetime. So how does the +compiler make sure this is the case? At the call-site of `shortest()`, the compiler must try to *convert* the lifetimes of -the references marked `&'a` in the `shortest()` function signature -into a single *unified* lifetime. This new lifetime must be shorter than, or equally as -long as, each of the reference lifetimes in isolation. A reference `&'o T` can be -converted to to `&'p T` if (and only if) it can be proven that `'o` lives as -long as (or longer than) `'p`. In our example the references `'&s1`, `&s2` and -the returned reference can all be shown to be valid for the -scope created by the `let s2` binding (see above for information on implicit -scopes introduced by `let` bindings). So in this case, we can prove that the -lifetimes can be unified, and as a -result he compiler accepts the program. - -If, on the other hand, the compiler cannot find such a lifetime, then the -lifetime constraints described by the program are inconsistent, and the -compiler will reject the program. For example: +the references marked `&'a` in the signature of `shortest()` +into a single compatible lifetime. This new lifetime must be shorter than, or equally as +long as, all three of the original reference lifetimes involved. 
Therefore a reference `&'o` can be +converted to to `&'p` if `'o` lives at least as long as `'p`. + +In our example: + + * `&'b` outlives `'c`, so `&'b` can be converted to `&'c`. + * `&'a` outlives `'c`, so `&'a` can be converted to `&'c`. + * The returned reference has lifetime `&'c` already. + +So these references are unified in `&'c`, therefore the lifetime constraints +imposed by `shortest()` are consistent, and the compiler accepts the program. + +Now consider a slight variation on `main()` like this: ```rust,ignore -XXX +fn main() { + let a = String::from("short"); + { + let c: &str; + let b = String::from("a long long long string"); + c = min(&a, &b); + + } +} ``` + +XXX Desugar. + +XXX Explain why it doesn't work. From c5c93f2d4c315c96b4f05c3c2c7ec6942a3d3218 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Thu, 16 Mar 2017 15:53:45 +0000 Subject: [PATCH 06/12] Tweaks, but something is wrong. --- src/lifetimes.md | 79 ++++++++++++++++++++++++++++-------------------- 1 file changed, 46 insertions(+), 33 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 78d4867..fd2604f 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -214,20 +214,15 @@ to the compiler. However it does mean that several programs that are totally correct with respect to Rust's *true* semantics are rejected because lifetimes are too dumb. -# Unifying Lifetimes - -XXX: is unification a good term to be using? - -Often the Rust compiler must prove that two references with different lifetimes -are compatible. We call this *unification* of lifetimes. +# A More Involved Example. 
Consider the following program: ```rust,ignore fn main() { let s1 = String::from("short"); - let s2 = String::from("a long long long string"); - println!("{}", shortest(&s1, &s2)); + let s2r = &String::from("a long long long string"); + println!("{}", shortest(&s1, s2r)); } fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { @@ -241,18 +236,17 @@ fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { The idea is that `shortest()` returns a reference to the shorter of the two strings referenced by its arguments, but *without* allocating a new string. -Let's de-sugar main so we can see the implicit lifetimes: +Let's de-sugar `main()` so we can see the implicit lifetimes: ```rust,ignore fn main() { 'a { let s1 = String::from("short"); - 'b { - let s2 = String::from("a long long long string"); + b' { + let s2r: &'b = &'b String::from("a long long long string"); 'c { - // Like before, an anonymous scope introduced since &s1 doesn't - // need to last as long as s1 itself, and similarly for s2. - println!("{}", shortest(&'b s1, &'a s2)); + // Annonymous scope for the borrow of s1 + println!("{}", shortest(&'c s1, s2r)); } } } @@ -260,8 +254,8 @@ fn main() { ``` Now we see that the references passed as arguments to `shortest()`, i.e. `&s1` -and `&s2`, actually have different lifetimes, `&'a` and `&'b`, since their -referents are defined in different scopes. However, the signature of +and `&s2`, actually have different lifetimes (`&'b` and `&'c` respectively), since the borrows occur in different scopes. +However, the signature of `shortest()` requires that these two references (and also the returned reference, which has lifetime `'c`) have the same lifetime. So how does the compiler make sure this is the case? @@ -269,32 +263,51 @@ compiler make sure this is the case? At the call-site of `shortest()`, the compiler must try to *convert* the lifetimes of the references marked `&'a` in the signature of `shortest()` into a single compatible lifetime. 
This new lifetime must be shorter than, or equally as -long as, all three of the original reference lifetimes involved. Therefore a reference `&'o` can be -converted to to `&'p` if `'o` lives at least as long as `'p`. - -In our example: +long as, all three of the original reference lifetimes involved. In other words, we must convert to the shortest of the three lifetimes to `&'c`. Conversion from a reference `&'o` can be converted to to `&'p` if `'o` lives at least as long as `'p`, therefore: - * `&'b` outlives `'c`, so `&'b` can be converted to `&'c`. - * `&'a` outlives `'c`, so `&'a` can be converted to `&'c`. + * `The first argument &'c s1` already has lifetime `&'c`, so we don't have to do anything here. + * `&'b` outlives `&'c`, so we can convert `s2r: &'b` to `s2r: &'c`. * The returned reference has lifetime `&'c` already. -So these references are unified in `&'c`, therefore the lifetime constraints -imposed by `shortest()` are consistent, and the compiler accepts the program. +After conversion, the call-site satisfies the signature of `shorter()`, we have +proven the lifetimes in this program to be consistent, and therefore the +compiler accepts the program. Now consider a slight variation on `main()` like this: ```rust,ignore -fn main() { - let a = String::from("short"); - { - let c: &str; - let b = String::from("a long long long string"); - c = min(&a, &b); +fn main() { + let s1 = String::from("short"); + let res; + let s2 = String::from("a long long long string"); + res = shortest(&s1, &s2); + println!("{}", res); +} +``` +De-sugared it looks like this: + +```rust,ignore +fn main() { + 'a { + let s1 = String::from("short"); + 'b { + let res: &'b str; + 'c { + let s2 = String::from("a long long long string"); + 'd { + // Annonymous scope for the borrows of s1 and s2 + // Assigning to the outer scope causes s1 and s2 to have 'b + res: &'b = shortest(&'d s1, &'d s2); + println!("{}", res); + } + } + } } } ``` -XXX Desugar. - -XXX Explain why it doesn't work. 
+XXX: Something is wrong. The above program does not compile, so we should be +able to show that the lifetimes are inconsistent. To do so we would to be have +to show that we can't convert `&'b` to `&'d`, but since `&'b` outlives `&'d`, +we can. Hrmm. From 7b93ec842845add7997e89d45f808070c5ec12a0 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Fri, 17 Mar 2017 12:05:18 +0000 Subject: [PATCH 07/12] Try again. --- src/lifetimes.md | 195 ++++++++++++++++++++++++++++++++++++----------- 1 file changed, 149 insertions(+), 46 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index fd2604f..c7af936 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -214,15 +214,102 @@ to the compiler. However it does mean that several programs that are totally correct with respect to Rust's *true* semantics are rejected because lifetimes are too dumb. -# A More Involved Example. +# Inter-procedural Borrow Checking of Function Arguments Consider the following program: ```rust,ignore fn main() { let s1 = String::from("short"); - let s2r = &String::from("a long long long string"); - println!("{}", shortest(&s1, s2r)); + let s2 = String::from("a long long long string"); + print_shortest(&s1, &s2); +} + +fn shortest<'k>(x: &'k str, y: &'k str) { + if x.len() < y.len() { + println!("{}", x); + } else { + println!("{}", y); + } +} +``` + +`print_shortest()` simply prints the shorter of its two pass-by-reference +string arguments. In Rust, each let binding has its own scope. 
Let's make the +scopes introduced to `main()` explicit: + +```rust,ignore +fn main() { + 's1 { + let s1 = String::from("short"); + 's2 { + let s2 = String::from("a long long long string"); + print_shortest(&s1, &s2); + } +} +``` + +And now let's explicitly mark the lifetimes of each reference's referent too: + +```rust,ignore +fn main() { + 's1 { + let s1 = String::from("short"); + 's2 { + let s2 = String::from("a long long long string"); + print_shortest(&'s1 s1, &'s2 s2); + } +} +``` + +Now we see that the references passed as arguments to `print_shortest()` +actually have different lifetimes (and thus a different type!) since the values +they refer to were introduced in different scopes. At the call site of +`print_shortest()` the compiler must now check that the lifetimes in the +*caller* (`main()`) are consistent with the lifetimes in the signature of the +*callee* (`print_shortest()`). + +The signature of `print_shortest()` simply requires that both of it's arguments +have the same lifetime (because both arguments are marked with the same +lifetime identifier in the signature). If in `main()` we had done: + +```rust,ignore +print_shortest(&s1, &s1); + +``` + +Then this the consistency is trivially proven, since both arguments would have +the same lifetime `&'s1` at the call-site. However, for our example, the +arguments have different lifetimes. We don't want Rust to reject the program +because it actually is safe. Instead the compiler uses some rules for +converting between lifetimes whilst retaining referential safety. The first +such rule is as follows: + +> A function argument of type `&'p T` can be coerced with an argument of type +> `&'q T` if the lifetime of `&'p T` is equal or longer than `&'q T`. + +At our call site, the type of the arguments are `&'s1 str` and `&'s2 str`, and +we know that a `&'s1 str' outlives an `&'s2 str`, so we can substitute `&'s1 +s1` with `&'s2 s2`. 
After this both arguments are of lifetime `&'s2` and the +call-site is consistent with the signature of `print_shortest()`. + +[More formally, the basis for the above rule is in *type variance*. Under this +model, you would consider a longer lifetime a sub-type of a shorter lifetime, +and for function arguments to be *co-variant*. However, an understanding of +variance isn't strictly required to understand the Rust borrow checker. We've +tried here to instead to explain using intuitive terminlolgy.] + +# Inter-procedural Borrow Checking of Function Return Values + +Now consider a slight variation of this example: + +```rust,ignore +fn main() { + let s1 = String::from("short"); + let res; + let s2 = String::from("a long long long string"); + res = shortest(&s1, &s2); + println!("{}", res); } fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { @@ -234,80 +321,96 @@ fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { } ``` -The idea is that `shortest()` returns a reference to the shorter of the two -strings referenced by its arguments, but *without* allocating a new string. -Let's de-sugar `main()` so we can see the implicit lifetimes: +`print_shortest()` has been renamed to `shortest()`, which instead of printing, +now returns the shorter of the two strings. It does this using only references +for efficiency, avoiding the need to re-allocate a new string to pass back to +`main()`. The responsibility of printing the result has been shifted to `main()`. 
+ +Let's again de-sugar `main()` by adding explicit scopes and lifetimes: ```rust,ignore fn main() { - 'a { + 's1 { let s1 = String::from("short"); - b' { - let s2r: &'b = &'b String::from("a long long long string"); - 'c { - // Annonymous scope for the borrow of s1 - println!("{}", shortest(&'c s1, s2r)); + 'res { + let res: &'res str; + 's2 { + let s2 = String::from("a long long long string"); + res: &'res: str = shortest(&'s1 s1, &'s2 s2); + println!("{}", res); } } } } ``` -Now we see that the references passed as arguments to `shortest()`, i.e. `&s1` -and `&s2`, actually have different lifetimes (`&'b` and `&'c` respectively), since the borrows occur in different scopes. -However, the signature of -`shortest()` requires that these two references (and also the returned -reference, which has lifetime `'c`) have the same lifetime. So how does the -compiler make sure this is the case? +Again, at the call-site of `shortest()` the comipiler needs to check the +consistency of the arguments in the caller with the signature of the callee. +The signature of `shortest()` fisrt says that the two reference arguments have +the same lifetime, which can be prove ok in the same way as before, thus giving +us: -At the call-site of `shortest()`, the compiler must try to *convert* the lifetimes of -the references marked `&'a` in the signature of `shortest()` -into a single compatible lifetime. This new lifetime must be shorter than, or equally as -long as, all three of the original reference lifetimes involved. In other words, we must convert to the shortest of the three lifetimes to `&'c`. Conversion from a reference `&'o` can be converted to to `&'p` if `'o` lives at least as long as `'p`, therefore: +```rust,ignore +res: &'res = shortest(&'s2 s1, &'s2 s2); +``` - * `The first argument &'c s1` already has lifetime `&'c`, so we don't have to do anything here. - * `&'b` outlives `&'c`, so we can convert `s2r: &'b` to `s2r: &'c`. 
- * The returned reference has lifetime `&'c` already. +But we now have the additional reference to check. We must now prove that the +returned reference can have the same lifetime as the arguments of lifetime +'&'s2'. This brings us to a second rule: -After conversion, the call-site satisfies the signature of `shorter()`, we have -proven the lifetimes in this program to be consistent, and therefore the -compiler accepts the program. +> The return value of a function `&'r T` can be converted to an argument `&'s T` +> if the lifetime of `&'r T` is equal or shorter than `&'s T`. -Now consider a slight variation on `main()` like this: +To make our program compile, we would have to subsitute `res: &'res` for `res: +&'s2`, but we can't since `&'res` in fact out-lives `&'s2`. This program is in +fact inconsistent and the compiler rightfully rejects the program because we +try make a reference (`&'res`) which outlives one of the values it refer to +(`&'s2`). + +[Formally, function return values are said to be *contravariant*, the opposite +of *covariant*.] + +How can we fix this porgram? Well if you were to swap the `let s2 = ...` with +the `res = ...` line, you would have: ```rust,ignore fn main() { let s1 = String::from("short"); - let res; let s2 = String::from("a long long long string"); + let res; res = shortest(&s1, &s2); println!("{}", res); } ``` -De-sugared it looks like this: +Which de-sugars to: ```rust,ignore fn main() { - 'a { + 's1 { let s1 = String::from("short"); - 'b { - let res: &'b str; - 'c { - let s2 = String::from("a long long long string"); - 'd { - // Annonymous scope for the borrows of s1 and s2 - // Assigning to the outer scope causes s1 and s2 to have 'b - res: &'b = shortest(&'d s1, &'d s2); - println!("{}", res); - } + 's2 { + let s2 = String::from("a long long long string"); + 'res { + let res: &'res str; + res: &'res str = shortest(&'s1 s1, &'s2 s2); + println!("{}", res); } } } } ``` -XXX: Something is wrong. 
The above program does not compile, so we should be -able to show that the lifetimes are inconsistent. To do so we would to be have -to show that we can't convert `&'b` to `&'d`, but since `&'b` outlives `&'d`, -we can. Hrmm. +Then at the call-site of `shortest()`: + * `&'s1 s1` outlives `&'s2 s2`, so we can replace the first argument with `&'s2 s1`. + * `&'res str` lives shorter than `'&s2`, so the return value lifetime can become `res: &'s2 str` + +Leaving us with: + +```rust,ignore +res: &'s2 str = shortest(&'s2 s1, &'s2 s2); +``` + +Which matches the signature of `shortest()` and thus this compiles. +Intuitively, the return reference can't point to a freed value as the values +live strictly longer than the return reference. From 37fe162af107ece3a4ce0a50977d5e245b40d263 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Fri, 17 Mar 2017 12:16:06 +0000 Subject: [PATCH 08/12] Drop parentheses when referring to functions inline. Fix a typo. --- src/lifetimes.md | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index c7af936..8112792 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -234,9 +234,9 @@ fn shortest<'k>(x: &'k str, y: &'k str) { } ``` -`print_shortest()` simply prints the shorter of its two pass-by-reference +`print_shortest` simply prints the shorter of its two pass-by-reference string arguments. In Rust, each let binding has its own scope. Let's make the -scopes introduced to `main()` explicit: +scopes introduced to `main` explicit: ```rust,ignore fn main() { @@ -262,23 +262,23 @@ fn main() { } ``` -Now we see that the references passed as arguments to `print_shortest()` +Now we see that the references passed as arguments to `print_shortest` actually have different lifetimes (and thus a different type!) since the values they refer to were introduced in different scopes. 
At the call site of -`print_shortest()` the compiler must now check that the lifetimes in the -*caller* (`main()`) are consistent with the lifetimes in the signature of the -*callee* (`print_shortest()`). +`print_shortest` the compiler must now check that the lifetimes in the +*caller* (`main`) are consistent with the lifetimes in the signature of the +*callee* (`print_shortest`). -The signature of `print_shortest()` simply requires that both of it's arguments +The signature of `print_shortest` simply requires that both of it's arguments have the same lifetime (because both arguments are marked with the same -lifetime identifier in the signature). If in `main()` we had done: +lifetime identifier in the signature). If in `main` we had done: ```rust,ignore print_shortest(&s1, &s1); ``` -Then this the consistency is trivially proven, since both arguments would have +Then consistency is trivially proven, since both arguments would have the same lifetime `&'s1` at the call-site. However, for our example, the arguments have different lifetimes. We don't want Rust to reject the program because it actually is safe. Instead the compiler uses some rules for @@ -291,7 +291,7 @@ such rule is as follows: At our call site, the type of the arguments are `&'s1 str` and `&'s2 str`, and we know that a `&'s1 str' outlives an `&'s2 str`, so we can substitute `&'s1 s1` with `&'s2 s2`. After this both arguments are of lifetime `&'s2` and the -call-site is consistent with the signature of `print_shortest()`. +call-site is consistent with the signature of `print_shortest`. [More formally, the basis for the above rule is in *type variance*. 
Under this model, you would consider a longer lifetime a sub-type of a shorter lifetime, @@ -321,12 +321,12 @@ fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { } ``` -`print_shortest()` has been renamed to `shortest()`, which instead of printing, +`print_shortest` has been renamed to `shortest`, which instead of printing, now returns the shorter of the two strings. It does this using only references for efficiency, avoiding the need to re-allocate a new string to pass back to -`main()`. The responsibility of printing the result has been shifted to `main()`. +`main`. The responsibility of printing the result has been shifted to `main`. -Let's again de-sugar `main()` by adding explicit scopes and lifetimes: +Let's again de-sugar `main` by adding explicit scopes and lifetimes: ```rust,ignore fn main() { @@ -344,9 +344,9 @@ fn main() { } ``` -Again, at the call-site of `shortest()` the comipiler needs to check the +Again, at the call-site of `shortest` the comipiler needs to check the consistency of the arguments in the caller with the signature of the callee. -The signature of `shortest()` fisrt says that the two reference arguments have +The signature of `shortest` fisrt says that the two reference arguments have the same lifetime, which can be prove ok in the same way as before, thus giving us: @@ -401,7 +401,7 @@ fn main() { } ``` -Then at the call-site of `shortest()`: +Then at the call-site of `shortest`: * `&'s1 s1` outlives `&'s2 s2`, so we can replace the first argument with `&'s2 s1`. * `&'res str` lives shorter than `'&s2`, so the return value lifetime can become `res: &'s2 str` @@ -411,6 +411,6 @@ Leaving us with: res: &'s2 str = shortest(&'s2 s1, &'s2 s2); ``` -Which matches the signature of `shortest()` and thus this compiles. +Which matches the signature of `shortest` and thus this compiles. Intuitively, the return reference can't point to a freed value as the values live strictly longer than the return reference. 
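
The reordered `main` that this patch ends on can be checked directly. The following is a minimal sketch, assuming the `shortest` signature quoted in the hunks above; the body is written with expression returns rather than `return`, which is a stylistic assumption and not part of the patch. Note that compilers with non-lexical lifetimes may also accept the earlier ordering, where `res` is declared before `s2`, because the borrow of `s2` ends at the last use of `res`; the scope-based reasoning in the text describes the older lexical model.

```rust
// Returns a reference to the shorter of the two strings without allocating.
// Both arguments and the result share the single lifetime parameter 'k.
fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str {
    if x.len() < y.len() { x } else { y }
}

fn main() {
    let s1 = String::from("short");
    let s2 = String::from("a long long long string");
    // `res` is declared after both strings, so it cannot outlive either one.
    let res;
    res = shortest(&s1, &s2);
    println!("{}", res);
}
```

Declaring `res` last keeps the returned reference inside the region where both `s1` and `s2` are alive, which is the condition the text derives by hand.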
From 09274f379d334b86c5c3e7732535d8fdf9a383f8 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Fri, 17 Mar 2017 12:26:14 +0000 Subject: [PATCH 09/12] Small fixes. --- src/lifetimes.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 8112792..de4a8fe 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -289,7 +289,7 @@ such rule is as follows: > `&'q T` if the lifetime of `&'p T` is equal or longer than `&'q T`. At our call site, the type of the arguments are `&'s1 str` and `&'s2 str`, and -we know that a `&'s1 str' outlives an `&'s2 str`, so we can substitute `&'s1 +we know that a `&'s1 str'` outlives an `&'s2 str`, so we can substitute `&'s1 s1` with `&'s2 s2`. After this both arguments are of lifetime `&'s2` and the call-site is consistent with the signature of `print_shortest`. @@ -356,16 +356,16 @@ res: &'res = shortest(&'s2 s1, &'s2 s2); But we now have the additional reference to check. We must now prove that the returned reference can have the same lifetime as the arguments of lifetime -'&'s2'. This brings us to a second rule: +`&'s2`. This brings us to a second rule: > The return value of a function `&'r T` can be converted to an argument `&'s T` > if the lifetime of `&'r T` is equal or shorter than `&'s T`. To make our program compile, we would have to subsitute `res: &'res` for `res: -&'s2`, but we can't since `&'res` in fact out-lives `&'s2`. This program is in -fact inconsistent and the compiler rightfully rejects the program because we -try make a reference (`&'res`) which outlives one of the values it refer to -(`&'s2`). +&'s2`, but we can't since `&'res` in fact out-lives `&'s2`. This program is +inconsistent and the compiler rightfully rejects the program because we +try make a reference (`res`) which outlives one of the values it may refer to +(`s2`). [Formally, function return values are said to be *contravariant*, the opposite of *covariant*.] 
From f6b3f2adb9fe7da009248950a8970398e0557b2b Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Wed, 17 May 2017 10:40:36 +0100 Subject: [PATCH 10/12] Fix outstanding comments from the Rust devs. --- src/lifetimes.md | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index de4a8fe..b274c67 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -218,7 +218,7 @@ are too dumb. Consider the following program: -```rust,ignore +```rust fn main() { let s1 = String::from("short"); let s2 = String::from("a long long long string"); @@ -234,7 +234,7 @@ fn shortest<'k>(x: &'k str, y: &'k str) { } ``` -`print_shortest` simply prints the shorter of its two pass-by-reference +`print_shortest` prints the shorter of its two pass-by-reference string arguments. In Rust, each let binding has its own scope. Let's make the scopes introduced to `main` explicit: @@ -246,6 +246,7 @@ fn main() { let s2 = String::from("a long long long string"); print_shortest(&s1, &s2); } + } } ``` @@ -259,6 +260,7 @@ fn main() { let s2 = String::from("a long long long string"); print_shortest(&'s1 s1, &'s2 s2); } + } } ``` @@ -269,13 +271,12 @@ they refer to were introduced in different scopes. At the call site of *caller* (`main`) are consistent with the lifetimes in the signature of the *callee* (`print_shortest`). -The signature of `print_shortest` simply requires that both of it's arguments +The signature of `print_shortest` requires that both of it's arguments have the same lifetime (because both arguments are marked with the same lifetime identifier in the signature). If in `main` we had done: ```rust,ignore print_shortest(&s1, &s1); - ``` Then consistency is trivially proven, since both arguments would have @@ -293,17 +294,17 @@ we know that a `&'s1 str'` outlives an `&'s2 str`, so we can substitute `&'s1 s1` with `&'s2 s2`. 
After this both arguments are of lifetime `&'s2` and the call-site is consistent with the signature of `print_shortest`. -[More formally, the basis for the above rule is in *type variance*. Under this -model, you would consider a longer lifetime a sub-type of a shorter lifetime, -and for function arguments to be *co-variant*. However, an understanding of -variance isn't strictly required to understand the Rust borrow checker. We've -tried here to instead to explain using intuitive terminlolgy.] +> More formally, the basis for the above rule is in *type variance*. Under this +> model, you would consider a longer lifetime a sub-type of a shorter lifetime, +> and for function arguments to be *co-variant*. However, an understanding of +> variance isn't strictly required to understand the Rust borrow checker. We've +> tried here to instead to explain using intuitive terminlolgy. -# Inter-procedural Borrow Checking of Function Return Values +## Inter-procedural Borrow Checking of Function Return Values Now consider a slight variation of this example: -```rust,ignore +```rust fn main() { let s1 = String::from("short"); let res; @@ -344,9 +345,9 @@ fn main() { } ``` -Again, at the call-site of `shortest` the comipiler needs to check the +Again, at the call-site of `shortest` the compiler needs to check the consistency of the arguments in the caller with the signature of the callee. -The signature of `shortest` fisrt says that the two reference arguments have +The signature of `shortest` first says that the two reference arguments have the same lifetime, which can be prove ok in the same way as before, thus giving us: @@ -367,13 +368,13 @@ inconsistent and the compiler rightfully rejects the program because we try make a reference (`res`) which outlives one of the values it may refer to (`s2`). -[Formally, function return values are said to be *contravariant*, the opposite -of *covariant*.] 
+> Formally, function return values are said to be *contravariant*, the opposite +> of *covariant*. How can we fix this porgram? Well if you were to swap the `let s2 = ...` with the `res = ...` line, you would have: -```rust,ignore +```rust fn main() { let s1 = String::from("short"); let s2 = String::from("a long long long string"); From 48f7e914ffe10e7015134fa1e885d4cbdbc63f90 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Wed, 17 May 2017 10:55:59 +0100 Subject: [PATCH 11/12] Re-add some `ignore`s, fix one code example. --- src/lifetimes.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index b274c67..7d0f788 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -225,7 +225,7 @@ fn main() { print_shortest(&s1, &s2); } -fn shortest<'k>(x: &'k str, y: &'k str) { +fn print_shortest<'k>(x: &'k str, y: &'k str) { if x.len() < y.len() { println!("{}", x); } else { @@ -304,7 +304,7 @@ call-site is consistent with the signature of `print_shortest`. Now consider a slight variation of this example: -```rust +```rust,ignore fn main() { let s1 = String::from("short"); let res; @@ -374,7 +374,7 @@ try make a reference (`res`) which outlives one of the values it may refer to How can we fix this porgram? Well if you were to swap the `let s2 = ...` with the `res = ...` line, you would have: -```rust +```rust,ignore fn main() { let s1 = String::from("short"); let s2 = String::from("a long long long string"); From e4c2e8b8aac737baee1b60b4bd9f23f7afd137a5 Mon Sep 17 00:00:00 2001 From: Edd Barrett Date: Mon, 27 Nov 2017 17:26:51 +0000 Subject: [PATCH 12/12] Address comments and small improvements. --- src/lifetimes.md | 135 +++++++++++++---------------------------------- 1 file changed, 36 insertions(+), 99 deletions(-) diff --git a/src/lifetimes.md b/src/lifetimes.md index 7d0f788..d6d74c8 100644 --- a/src/lifetimes.md +++ b/src/lifetimes.md @@ -1,6 +1,6 @@ # Lifetimes -Rust enforces these rules through *lifetimes*. 
Lifetimes are effectively +Rust ensures memory safety through *lifetimes*. Lifetimes are effectively just names for scopes somewhere in the program. Each reference, and anything that contains a reference, is tagged with a lifetime specifying the scope it's valid for. @@ -14,7 +14,7 @@ make your code Just Work. However once you cross the function boundary, you need to start talking about lifetimes. Lifetimes are denoted with an apostrophe: `'a`, `'static`. To dip -our toes with lifetimes, we're going to pretend that we're actually allowed +our toes into lifetimes, we're going to pretend that we're actually allowed to label scopes with lifetimes, and desugar the examples from the start of this chapter. @@ -81,7 +81,7 @@ z = y; -# Example: references that outlive referents +# Example: References that Outlive Referents Alright, let's look at some of those examples from before: @@ -165,7 +165,7 @@ our implementation *just a bit*.) -# Example: aliasing a mutable reference +# Example: Aliasing a Mutable Reference How about the other example: @@ -214,9 +214,10 @@ to the compiler. However it does mean that several programs that are totally correct with respect to Rust's *true* semantics are rejected because lifetimes are too dumb. -# Inter-procedural Borrow Checking of Function Arguments +# Inter-procedural Borrow Checking -Consider the following program: +Earlier we discussed lifetime constraints within a single function. Now let's +talk about the constraints *between* functions. Consider the following program: ```rust fn main() { @@ -235,22 +236,8 @@ fn print_shortest<'k>(x: &'k str, y: &'k str) { ``` `print_shortest` prints the shorter of its two pass-by-reference -string arguments. In Rust, each let binding has its own scope. 
Let's make the -scopes introduced to `main` explicit: - -```rust,ignore -fn main() { - 's1 { - let s1 = String::from("short"); - 's2 { - let s2 = String::from("a long long long string"); - print_shortest(&s1, &s2); - } - } -} -``` - -And now let's explicitly mark the lifetimes of each reference's referent too: +string arguments. Let's first de-sugar to make the reference lifetimes of +`main` explicit: ```rust,ignore fn main() { @@ -264,6 +251,9 @@ fn main() { } ``` +(For brevity, we don't show the third implicit scope that would be introduced +to limit the lifetimes of the borrows in the call to `print_shortest`.) + Now we see that the references passed as arguments to `print_shortest` actually have different lifetimes (and thus a different type!) since the values they refer to were introduced in different scopes. At the call site of @@ -286,23 +276,18 @@ because it actually is safe. Instead the compiler uses some rules for converting between lifetimes whilst retaining referential safety. The first such rule is as follows: -> A function argument of type `&'p T` can be coerced with an argument of type -> `&'q T` if the lifetime of `&'p T` is equal or longer than `&'q T`. +> A reference can always be shrunk to one of a shorter lifetime. In other +> words, `&'a T` can be implicitly converted to `&'b T` as long as `'a` +> outlives `'b`. At our call site, the type of the arguments are `&'s1 str` and `&'s2 str`, and -we know that a `&'s1 str'` outlives an `&'s2 str`, so we can substitute `&'s1 -s1` with `&'s2 s2`. After this both arguments are of lifetime `&'s2` and the +we know that a `&'s1 str` outlives an `&'s2 str`, so we can shrink `&'s1 +s1` to `&'s2 s1`. After this both arguments are of lifetime `&'s2` and the call-site is consistent with the signature of `print_shortest`. -> More formally, the basis for the above rule is in *type variance*. 
Under this -> model, you would consider a longer lifetime a sub-type of a shorter lifetime, -> and for function arguments to be *co-variant*. However, an understanding of -> variance isn't strictly required to understand the Rust borrow checker. We've -> tried here to instead to explain using intuitive terminlolgy. - ## Inter-procedural Borrow Checking of Function Return Values -Now consider a slight variation of this example: +Now consider a slight variation of the previous example: ```rust,ignore fn main() { @@ -324,7 +309,7 @@ fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str { `print_shortest` has been renamed to `shortest`, which instead of printing, now returns the shorter of the two strings. It does this using only references -for efficiency, avoiding the need to re-allocate a new string to pass back to +for efficiency, avoiding the need to allocate a new string to pass back to `main`. The responsibility of printing the result has been shifted to `main`. Let's again de-sugar `main` by adding explicit scopes and lifetimes: @@ -345,73 +330,25 @@ fn main() { } ``` -Again, at the call-site of `shortest` the compiler needs to check the -consistency of the arguments in the caller with the signature of the callee. -The signature of `shortest` first says that the two reference arguments have -the same lifetime, which can be prove ok in the same way as before, thus giving -us: - -```rust,ignore -res: &'res = shortest(&'s2 s1, &'s2 s2); -``` - -But we now have the additional reference to check. We must now prove that the -returned reference can have the same lifetime as the arguments of lifetime -`&'s2`. This brings us to a second rule: +Again, at the call-site of `shortest` the compiler needs to check the consistency +of the arguments in the caller with the signature of the callee. 
The signature +of `shortest` says that all three references must have the same lifetime `'k`, so +we have to find a lifetime `'k` such that: -> The return value of a function `&'r T` can be converted to an argument `&'s T` -> if the lifetime of `&'r T` is equal or shorter than `&'s T`. + * `&s1`: `&'s1 str` can be converted to the first argument `&'k str` + * `&s2`: `&'s2 str` can be converted to the second argument `&'k str` + * The return value `&'k str` can be converted to `res: &'res str` -To make our program compile, we would have to subsitute `res: &'res` for `res: -&'s2`, but we can't since `&'res` in fact out-lives `&'s2`. This program is -inconsistent and the compiler rightfully rejects the program because we -try make a reference (`res`) which outlives one of the values it may refer to -(`s2`). +This leads to three requirements: -> Formally, function return values are said to be *contravariant*, the opposite -> of *covariant*. + * `'s1` must outlive `'k` + * `'s2` must outlive `'k` + * `'k` must outlive `'res` -How can we fix this porgram? Well if you were to swap the `let s2 = ...` with -the `res = ...` line, you would have: - -```rust,ignore -fn main() { - let s1 = String::from("short"); - let s2 = String::from("a long long long string"); - let res; - res = shortest(&s1, &s2); - println!("{}", res); -} -``` - -Which de-sugars to: - -```rust,ignore -fn main() { - 's1 { - let s1 = String::from("short"); - 's2 { - let s2 = String::from("a long long long string"); - 'res { - let res: &'res str; - res: &'res str = shortest(&'s1 s1, &'s2 s2); - println!("{}", res); - } - } - } -} -``` - -Then at the call-site of `shortest`: - * `&'s1 s1` outlives `&'s2 s2`, so we can replace the first argument with `&'s2 s1`. 
- * `&'res str` lives shorter than `'&s2`, so the return value lifetime can become `res: &'s2 str` - -Leaving us with: - -```rust,ignore -res: &'s2 str = shortest(&'s2 s1, &'s2 s2); -``` +So by transitivity (`'s2` outlives `'k` outlives `'res`), we also require `'s2` +to outlive `'res`, which is not the case. The borrow checker rightfully +rejects our program because we are making a reference (`res`) which outlives +one of the values it may refer to (`s2`). -Which matches the signature of `shortest` and thus this compiles. -Intuitively, the return reference can't point to a freed value as the values -live strictly longer than the return reference. +The program is fixed by swapping the definition order of `res` and `s2`. Then +both `s1` and `s2` outlive `res`.
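
For reference, the reordering described in the final paragraph of this patch can be sketched as follows. This is a minimal illustration rather than text taken from the chapter: `shortest` is copied from the patch in expression form, while `fixed_order` is a hypothetical helper that stands in for the body of the fixed `main` and returns an owned `String` so the result can be inspected.

```rust
// Returns the shorter of the two string slices without allocating.
// All three references share the single lifetime `'k`.
fn shortest<'k>(x: &'k str, y: &'k str) -> &'k str {
    if x.len() < y.len() {
        x
    } else {
        y
    }
}

// Hypothetical stand-in for the fixed `main`: `res` is declared after
// `s2`, so both `s1` and `s2` outlive `res` and a suitable `'k` exists.
fn fixed_order() -> String {
    let s1 = String::from("short");
    let s2 = String::from("a long long long string");
    let res = shortest(&s1, &s2);
    // Copy the result out before `s1` and `s2` are dropped.
    res.to_string()
}
```

Calling `println!("{}", fixed_order());` from a `main` function would print the shorter string. Declaring `res` before an inner scope that holds `s2`, as in the rejected version, brings back the unsatisfiable requirement that `'s2` outlive `'res`.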