Rust Needs Metaphors, Part 1: Lifetimes
I recently had a conversation with a friend who said: “I don’t trust most of what I hear about Rust, because it sounds like agitprop from the Rust Evangelism Strike Force, but people like you whom I trust and respect seem to like it. Can you help me understand why?”
I tried to. I said things that make sense, in the abstract: it offers a very powerful type system, it allows me to feel more confident about my code in surprising ways, and so on. As I went on, though, I realized something: while that’s all true, it is inadequate as an explanation for the experience that Rust provides.
A few days later, another conversation on lobste.rs led me to another realization: I deeply disagree with the comment that Rust implies more “cognitive overhead” than Go, but I still lacked the language to explain why, so I didn’t respond.
So here we are. This may be messy, and it’s certainly going to be incomplete, but I hope to lay out a few foundational metaphors that help to explain the learning curve in concrete and practical terms, hopefully helping more people enjoy the language.
Understanding the Impact of Lifetimes
The obvious initial learning burden, the first thing you hit, is the “lifetime” problem. It’s Rust’s novelty, of course, the ticket to memory safety without garbage collection. It’s also what Rust complains about most when you first start trying to write it. It can feel almost abusive. “Temporary value dropped while borrowed, you idiot.” “Closure may outlive borrowed value. Can’t you do anything right?”
An important switch flipped for me when I realized that these errors are learning opportunities. They are illustrating places where potential memory errors would have occurred in less “assertive” languages, and often tell you how to correct them (complete with ASCII art and ANSI coloration). The problem is that there’s very little practical literature about these corrections, no accepted metaphors to use in relation to the changes you need to make, and the changes can feel unnatural at first.
It’s as if you’ve gone to the doctor, said: “It hurts when I do this,” and he helpfully tells you “well, don’t do that”—but it feels as if he’s telling you not to blink your eyes.
There are a few basic rules of thumb that I have arrived at, though, that I think can help make the jump:
- Bindings aren’t a matter of style.
- Data must always be managed.
- If you have to type 'a, you might be doing it wrong.
- When in doubt, make an army of clones.
Each of these could probably be a blog post on its own, but I’m going to try to treat them briefly here and may revisit them in future posts.
Bindings aren’t a matter of style
If you’re used to writing a typical garbage-collected language, like Python or Ruby, you probably think of local variables as a convenience. You extract them to make things easier to read, or to break up a long chain of functions into a few statements. This is perfectly valid, and similar approaches can be used in Rust.
In Rust, though, bindings matter. Many lifetime-related errors can be alleviated by simply binding the value to a local before using it. This is because a let binding ties the value to the scope in which the let occurs, whereas a value produced in the middle of a chained expression (particularly if you’re using closures) is a temporary. A temporary is usually dropped at the end of the statement that created it, which may be sooner than you expect.
Consider the following:
fn main() {
    let x = "foo".to_string().as_mut_str();
    x.make_ascii_uppercase();
    println!("{}", x);
}
Here, the compiler will raise an error because "foo" is converted to a String, but the next call creates a mutable reference to it that would last longer than the string itself. The String is a temporary that is dropped at the end of the let statement, while x is still used on the following lines. This error can be fixed by making sure that the string lives long enough:
fn main() {
    let mut foo = "foo".to_string();
    let x = foo.as_mut_str();
    x.make_ascii_uppercase();
    println!("{}", x);
}
You can try it in the playground.
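To make the distinction concrete, here is a minimal sketch of when a chain is fine and when a binding is needed. The helper names shout and shout_in_place are hypothetical, invented for this illustration. The first chain works because its final call returns a new owned value; the second keeps a borrow around, so the owner has to be bound first.

```rust
// Hypothetical helper: chaining is fine here because `to_uppercase()`
// returns a brand-new owned String, not a borrow of the temporary.
fn shout(input: &str) -> String {
    // The temporary created by `to_string()` lives until the end of this
    // statement, which is all that `to_uppercase()` needs.
    input.to_string().to_uppercase()
}

// Hypothetical helper: here we hold on to a borrow, so the String it
// points into must be bound to a local that outlives the borrow.
fn shout_in_place(input: &str) -> String {
    let mut owned = input.to_string(); // the binding keeps the String alive
    let view = owned.as_mut_str(); // the borrow cannot outlive `owned`
    view.make_ascii_uppercase();
    owned
}

fn main() {
    println!("{}", shout("foo")); // prints "FOO"
    println!("{}", shout_in_place("foo")); // prints "FOO"
}
```

The rule of thumb I take from this: if the last call in a chain hands back a reference, the thing it refers to probably deserves its own let.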
Data must always be managed
The implication of the previous rule permeates the language. Every piece of data that you need to use needs to be pulled “up” to a common ancestor. If you envision an application as a tree, Rust expects that all data must exist at or before the “fork” of all of the branches that make use of that data. This can feel really unnatural at first. We’re used to having data implicitly created at a distance, with no real ceremony around its management or eventual demise.
If we apply this logic to the erroneous code above, we can see that the reference (produced by as_mut_str) refers to data that we haven’t pulled “up” to the outer scope:
fn main() {
    let x = ("foo".to_string()).as_mut_str();
    println!("{}", x);
}
Similar issues arise when you attempt to return a reference from a function, although the error message is more friendly:
fn get_int<'a>() -> &'a u8 {
    let x = 3;
    &x
}
This will produce an error that tells you that the function cannot return a reference to data that it owns. After all, that local, x, will “die” when the lifetime to which it is bound ends. The reference to it, which is being returned from the function, will not.
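Two common ways out, sketched under my own assumptions (first_int is a hypothetical helper, not part of the example above): return the value itself, or return a reference into data that the caller already owns.

```rust
// Returning the value itself moves (or, for a u8, copies) it out to the
// caller, so nothing is left dangling when the function's scope ends.
fn get_int() -> u8 {
    let x = 3;
    x
}

// Hypothetical helper: returning a reference is fine when it points into
// data the caller owns. The elided lifetime ties the output to `values`.
fn first_int(values: &[u8]) -> Option<&u8> {
    values.first()
}

fn main() {
    let owned = get_int();
    let data: Vec<u8> = vec![7, 8, 9];
    let borrowed = first_int(&data);
    println!("{} {:?}", owned, borrowed); // prints "3 Some(7)"
}
```

Notice that neither version needs an explicit 'a. In the second one the data lives in main, "above" the function that borrows it.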
Things like shared configuration must be treated similarly. Traditional languages often use module-level variables and singletons to contain these sorts of things; Rust typically doesn’t.
fn main() {
    let config = get_config(); // returns a config object.
    first(config);
    second(config); // this will raise an error!
}
In this case, first takes ownership of config. In order to use it in second, you need to either pass it by reference or clone it. In either case, the main function will retain ownership of the original object, and only ownership of the reference or clone will pass into the functions. You can try it out in the playground.
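Here is a rough sketch of both options side by side. The Config struct, its fields, and the bodies of get_config, first, and second are all made up for illustration, since the snippet above leaves them undefined.

```rust
// Hypothetical stand-in for the config object in the example above.
#[derive(Clone, Debug)]
struct Config {
    name: String,
    retries: u32,
}

// Hypothetical constructor; the original does not show `get_config`.
fn get_config() -> Config {
    Config { name: "demo".to_string(), retries: 3 }
}

// Option 1: borrow the config. The caller keeps ownership.
fn first(config: &Config) -> String {
    format!("first sees {}", config.name)
}

// Option 2: take ownership of whatever is passed in, here a clone.
fn second(config: Config) -> u32 {
    config.retries + 1
}

fn main() {
    let config = get_config();
    println!("{}", first(&config)); // lend it out
    println!("{}", second(config.clone())); // hand over a copy
    println!("{:?}", config); // main still owns the original
}
```

Borrowing is usually the cheaper choice. Cloning is the easier one when a function genuinely needs its own copy to keep or modify.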
If you have to type 'a, you might be doing it wrong
Introductions to Rust often seem to leap into “implementing data structures.” The problem is that this is exactly where Rust’s ergonomics are the worst. Self-referential data structures can be hard, or in some cases impossible, to implement in Rust. Even “plain references” can be confusing at first, because you are required to tie them to a lifetime.
So don’t write them.
You can get surprisingly far without using references in a struct. You might need to accept references to a string (&str), slice (&[u8]), or other value (e.g. &Config) as arguments to a function, but the compiler is good at inferring lifetimes in most cases. If you aren’t consciously choosing to write an explicit lifetime, ignore the compiler’s admonition that you need one, back up, and think about why it thinks you do. You probably made a mistake.
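As a small sketch of what that looks like in practice (the User type here is hypothetical, invented for this example): the struct owns its data, the functions accept and return references, and not a single 'a appears because lifetime elision covers both signatures.

```rust
// Hypothetical example type: it owns its data, so the struct needs no
// lifetime parameter.
struct User {
    name: String,
}

impl User {
    // Accepts a borrowed &str but stores an owned copy of it.
    fn new(name: &str) -> User {
        User { name: name.to_string() }
    }

    // Lifetime elision: the returned &str is tied to &self automatically.
    fn first_word(&self) -> &str {
        self.name.split_whitespace().next().unwrap_or("")
    }
}

fn main() {
    let user = User::new("Ada Lovelace");
    println!("{}", user.first_word()); // prints "Ada"
}
```

Had the struct stored a &str instead of a String, it would have needed a lifetime parameter, and every function touching it would have inherited the problem.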
If you’d like to see why I advise this technique instead of implementing data structures, check out “Learn Rust With Entirely Too Many Linked Lists,” which I think demonstrates why this is a bad path for the true beginner to take. Also, check out the various structures available in std::collections and third-party crates.
When in doubt, make an army of clones
It’s not bad form to use .clone(), especially not when you’re working on a proof of concept or initial implementation. Don’t get tangled up in a maze of lifetimes: just sidestep the issue by passing in a clone when you need to do so. If the implementation ends up working out well, you can come back and see whether you can refactor to use a simple reference (&T) or a more specialized alternative like Arc<T>.
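One possible progression, sketched with hypothetical function names of my own: start with the clone-happy version, refactor to a borrowed slice once the shape settles, and reach for Arc<T> only when the data genuinely has to be shared, for instance with another thread.

```rust
use std::sync::Arc;
use std::thread;

// First pass: take ownership and let callers clone as needed.
fn total_len_owned(words: Vec<String>) -> usize {
    words.iter().map(|w| w.len()).sum()
}

// Later refactor: borrow a slice, so the call site needs no clone.
fn total_len_borrowed(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

// Specialized alternative: Arc<T> lets another thread share the same data.
fn total_len_shared(words: Arc<Vec<String>>) -> usize {
    let worker_copy = Arc::clone(&words); // clones the pointer, not the Vec
    let handle = thread::spawn(move || total_len_borrowed(&worker_copy));
    handle.join().expect("worker thread panicked")
}

fn main() {
    let words = vec!["make".to_string(), "clones".to_string()];
    println!("{}", total_len_owned(words.clone())); // prints "10"
    println!("{}", total_len_borrowed(&words)); // prints "10"
    println!("{}", total_len_shared(Arc::new(words))); // prints "10"
}
```

All three produce the same answer. What changes is who owns the data and how much copying happens along the way.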
Conclusions
This is not a complete treatment of any of these concepts. It’s also not the last blog post in this series. I intend this to be very practical and concrete, a way to help people overcome those first faltering steps so that they can run into more interesting compiler errors. If you have questions, tweet them at me or ask wherever blog posts are sold.