I don't like most lifetime tutorials, so this is a tutorial that actually teaches them (at least in my own weird way). But before talking about lifetimes we have to talk about what it means to live in Rust.
We start with a definition you may be familiar with, or may be hiding from:
$$\text{Life} = \text{Birth} + \text{Death}$$
where $\text{birth}$ = variable declaration/binding.
Scopes are delimited by `{}`; this includes functions of course. And $\text{death}$ = memory is automatically dropped/deallocated when it reaches the end of its scope. The following death situations outline all the variations of variables approaching end of scope, i.e. dying memory:
Death Situation #1: Natural end of scope
{ // start block
let x = String::from("hello"); // scope of x starts here
} // scope of x ends here; x dropped here
Death Situation #2: The value gets moved into a function’s scope, and dies at the end of such function’s scope
fn take_ownership(s: String) {
// s is alive in this scope
} // s dies here
let x = String::from("hello");
take_ownership(x); // x dies here (moved into function; moved because only one variable can own a value, from Rust's ownership laws)
Death Situation #3: reassignment (move of one variable to another; ownership transfer)
let mut x = String::from("jozef");
x = String::from("lumaj"); // old String value of x (i.e. String:"jozef") dropped here
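To actually watch death situation #3 happen, here is a minimal runnable sketch. The `Noisy` type and the `DROPS` counter are my own illustration helpers, not anything from the text above; they just let us count destructor calls:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical helper: counts how many values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        let _who = self.0; // the value that is dying
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn reassignment_demo() -> usize {
    let mut x = Noisy("jozef");
    x = Noisy("lumaj"); // the old value ("jozef") is dropped right here
    assert_eq!(x.0, "lumaj");
    DROPS.load(Ordering::SeqCst) // only the replaced value has died so far
} // x ("lumaj") dies here, at the natural end of scope

fn main() {
    let before = DROPS.load(Ordering::SeqCst);
    let mid = reassignment_demo();
    assert_eq!(mid - before, 1); // one drop happened at the reassignment
    assert_eq!(DROPS.load(Ordering::SeqCst) - before, 2); // second drop at end of scope
    println!("drop counts check out");
}
```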
Tangent: Move Types vs Copy Types
Marker Traits:
- Before talking about Copy types, understand this:
- Marker traits are just ways to classify a type into a certain category, i.e. to mark behaviour.
- This sounds like just a plain old trait, but it's not: it is actually a trait with no methods. Marker traits exist purely to signal something to the compiler about a type's properties or capabilities.
- A simple example of practically using a marker trait would be some arbitrary core function with a trait bound on a marker trait, `fn core_function<T: MarkerTrait>(...) {...}`, so that this functionality is only accessible to types marked by `MarkerTrait`.

Clone Types:
- A Clone type is any type that implements the `Clone` trait.
- The `Clone` trait gives us the behaviour of duplicating a given value: when you implement it on a type, calling `clone` on a value of that type will duplicate that value in memory (clones are always explicitly called).
- Clone can be implemented on both small and heavy memory objects.
- The usage of this method is just `InstanceOfTypeThatCloneIsImplementedOn.clone()`, for example `String::from("yo").clone()` or `1i32.clone()`.

Copy Types:
- A Copy type is any type that implements the `Copy` trait.
- Any type that implements `Copy` needs to implement `Clone` first, as `Copy` is essentially the express/quick-use version of `Clone`.
- The `Copy` trait is just a marker trait, which allows automatic/implicit cloning during actions like variable assignment, passing parameters to functions, getting return values from functions, struct construction, etc. It's a marker trait because the compiler is handling its implicit usage.
- Below in Move Types we will discuss what the default behaviour is without the `Copy` trait.
- From the Rust documentation we see "If a type is `Copy` then its `Clone` implementation only needs to return `*self`", which shows us that Copy is an inherently simple copy of bits; you're literally just copying the dereferenced raw data.

pub trait Copy: Clone { // needs to implement the Clone trait
    // Empty, that's it, because it's a marker trait
}

Move Types
- A Move type in Rust is any type that doesn't implement the `Copy` trait. This is actually the default behaviour in Rust.
- And to be clear, the `Clone` trait doesn't affect this, since implementing `Clone` adds an explicit implementation which has to be explicitly called by you. We are worried about how the compiler is implicitly (not) copying data.
- This means when you assign or pass a Move type, ownership transfers instead of copying.
- Core types like `i32`, `f64`, etc. all implement the `Copy` trait out of the box, since they are relatively lightweight and it's more convenient to assume that they are duplicated rather than moved.

let y: i32 = 10; // integers are Copy by default
let x = y; // value is copied here, therefore y is still alive in this scope
let z = String::from("z");
let x = z; // value is moved here; z is invalid from now on since it no longer owns this value (not a valid pointer anymore)
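As a quick runnable illustration (these toy types are my own, not from the tutorial): deriving `Copy` keeps the original binding alive after assignment, while a type holding a `String` can only be duplicated with an explicit `clone`:

```rust
// Point can be Copy: it's just a few contiguous bytes of plain data.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Label cannot be Copy, because String manages heap memory (it implements Drop).
#[derive(Clone, Debug, PartialEq)]
struct Label {
    text: String,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // implicit bitwise copy; p is still alive
    assert_eq!(p, q);

    let a = Label { text: String::from("hi") };
    let b = a.clone(); // explicit duplication; `let b = a;` would move instead
    assert_eq!(a, b);
}
```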
Tangent$^2$: Does Stack vs Heap Matter?
Associating Stack data with `Copy` types and Heap data with `Move` types is a good mental model, though it doesn't give exactly the true reason why this correlation exists:
- Copy types = simple bit-by-bit duplication; essentially just a `mov` in assembly.
- There is no other context or associated functionality needed to encapsulate these values; for example, an `i32`'s value is entirely encapsulated in the contiguous 32 bits that represent the integer.
- The Stack is a perfect place for these fixed-size value types that can easily be pushed and popped anywhere on the stack without worrying about anything else outside of these bits.
- From Rust's documentation, "any type implementing `Drop` can't be `Copy`, because it's managing some resource besides its own `size_of::<T>` bytes".
- The `Drop` trait is essentially a necessity for any dynamically sized or complex data type, as such types come with additional overhead and data-management needs outside of the raw data bits that are stored. For example, a `String` needs to know its length and location in memory in order to manage itself, therefore it implements `Drop` to handle such complexity.
- The `Drop` trait is discussed in detail below.
- Since anything with `Drop` logic is inherently more complex than a contiguous string of bits that can fully encapsulate a type (which you could just 'drop' by overwriting on the stack), Rust's designers decided not to have implicit logic (since `Copy` is essentially an implicitly called `Clone` method) being committed for potentially expensive operations, which could have you pulling out your hair if your program ran slowly and you wouldn't know why unless you pulled up the assembly representation of your program.

Here are some typical values, showing the correlation (but not causation) between Stack/Copy and Heap/Move types:

| | Copy | Move |
|---|---|---|
| Stack-allocated | `i32`, `bool`, `(i32, f64)`, `&T` (pointers to heap-allocated data) | `[String; 2]`, closure capturing a `String` |
| Heap-allocated | N/A, since needs `Drop`-implemented logic | `String`, `Vec<T>`, `Box<T>` |
Death Situation #4: call drop() manually
let x = String::from("hello");
drop(x); // x dropped immediately
core::mem::drop
- This is the source code of the mentioned drop method:
pub fn drop<T>(_x: T) {}.
- That’s it. Just that. 👌🏻
- It destroys your value by moving it into an empty function's scope (i.e. death situation #2), where Rust automatically invokes destruction at the end of the scope.
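Here is a tiny sketch of death situation #4 in action: after `drop(x)` the value is gone, and any later use of `x` would be a compile error (shown commented out):

```rust
fn main() {
    let x = String::from("hello");
    let len_before = x.len(); // usize is Copy, so we can keep this past x's death
    drop(x); // x is moved into drop's empty scope and destroyed immediately
    // println!("{x}"); // error[E0382]: borrow of moved value: `x`
    assert_eq!(len_before, 5);
    println!("x died early, but its length survived: {len_before}");
}
```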
Tangent: The `Drop` trait (`core::ops::Drop`)

The `Drop` trait is used to customize the behaviour of how values are dropped out of memory, used for more complex use cases and nested types.

pub trait Drop {
    // The only method; here for you to customize the behaviour of drop
    fn drop(&mut self);
}
- You cannot implement the `Drop` trait and the `Copy` trait on the same type; they are mutually exclusive (see the Copy vs Move types tangent above).
- You can't call `Drop::drop()` directly on a value you own (to prevent misuse; double drop calls would be very possible in that scenario). You must either use `core::mem::drop` or let the value go out of scope naturally to make `Drop::drop()` run.

An example of implementing the `Drop` trait would be in the compiler, where threads (created from the `rayon` crate) need termination flags to be sent to them so they end and have any associated memory destroyed. This is not just a basic scope-ending scenario; it's more complicated and therefore needs its own custom logic:

impl Drop for ThreadPool {
    fn drop(&mut self) {
        self.registry.terminate(); // this essentially just sends termination flags to the threads in the pool
    }
}

Tangent$^2$: Destructors & Drop Glue
The previous `core::mem::drop()` and `core::ops::Drop::drop()` are both still high-level abstractions. How does the compiler see a scope ending and decide to run a destructor? What really even is a destructor??

(1) Destructors for `Copy` Types: A destructor for a `Copy` type is not really a destructor:

- The registers storing the actual data, or the stack slots holding the `Copy`-able data, are simply now able to be overwritten.
- And the stack, if the `Copy`-able data lives on it, has its own natural scoping, as is the nature of a LIFO (Last In First Out) mechanism.
- You push data onto the stack, it's able to be referenced by the CPU while it's on top of the stack, then we pop that value before leaving a function/block: the stack pointer moves back one location in the stack. Now from the stack's point of view, whatever is at that old address doesn't matter; it's empty in the stack's view and will just be overwritten by whatever eventually gets pushed onto it. This is natural destruction.
(2) Destructors for types with `Drop` implemented:

The documentation for `Drop` states that this destructor consists of two components:

- A call to `Drop::drop` for that value, if this special `Drop` trait is implemented for its type.
- The automatically generated "drop glue" which recursively calls the destructors of all the fields of this value.
Such “Drop Glue” is extra code that is injected into your compiled program, done by the compiler, in order to recursively destroy all memory objects going out of scope.
- (a) The base case is using the implementation of `drop` in the `Drop` trait.
  - aside: this is not exactly a base case in your traditional recursion sense, as it runs at all levels, even on the parent type's `Drop` method immediately, but it gets the idea across.
- (b) If the current type being dropped contains other types, the compiler checks `needs_drop_components_with_async()`, which lets it know if it needs to drill down to see if the type has components that need to call their own drop implementations.

// for example, you can see the logic here.
// This function, needs_drop_components_with_async, goes through all of the potential types it can be
// called on and recursively collects the components which need to be dropped into a vector.
// The following code is in the middle of the needs_drop_components_with_async function,
// in the midst of a match statement:

// If any field needs drop, then the whole tuple does.
ty::Tuple(fields) => fields.iter().try_fold(SmallVec::new(), move |mut acc, elem| {
    acc.extend(needs_drop_components_with_async(tcx, elem, asyncness)?); // call itself since this tuple is still too high level; we need to get to the base case of dropping
    Ok(acc)
}),
- The actual process of dropping memory through this is expressed at the MIR level, which involves a whole cascade of calls to various drop methods until you hit `__rust_dealloc`, which is just a system library call, like C's `free()`.
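You can observe the two destructor components from above, the type's own `Drop::drop` followed by the recursive drop glue for its fields, by recording the order in which destructors run. The `Outer`/`Inner` types and `ORDER` log here are my own illustration, not compiler internals:

```rust
use std::sync::Mutex;

// Records which destructor ran, in order.
static ORDER: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

struct Inner;
struct Outer {
    _inner: Inner,
}

impl Drop for Inner {
    fn drop(&mut self) {
        ORDER.lock().unwrap().push("inner");
    }
}

impl Drop for Outer {
    fn drop(&mut self) {
        ORDER.lock().unwrap().push("outer");
    }
}

fn observed_drop_order() -> Vec<&'static str> {
    ORDER.lock().unwrap().clear();
    {
        let _o = Outer { _inner: Inner };
    } // destructor: Outer::drop runs first, then drop glue calls the field's Inner::drop
    ORDER.lock().unwrap().clone()
}

fn main() {
    assert_eq!(observed_drop_order(), vec!["outer", "inner"]);
    println!("drop order: outer, then inner (drop glue recursion)");
}
```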
Death Situation #5: Temporary expressions dropped immediately
String::from("temp").len();
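A runnable sketch of death situation #5: the temporary `String` dies at the end of the statement, but a `Copy` result extracted from it survives; trying to hold a reference into the temporary would not compile (shown commented out):

```rust
fn main() {
    // The String temporary lives only for this one statement.
    let n = String::from("temp").len(); // usize is Copy, so it outlives the temporary
    assert_eq!(n, 4);

    // Keeping a reference into the temporary is rejected:
    // let r: &str = String::from("temp").as_str();
    // error[E0716]: temporary value dropped while borrowed
    println!("length extracted before the temporary died: {n}");
}
```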
When a reference is annotated with a lifetime (e.g. `'jozefs_life`), I am guaranteeing to God that I won't die and will be available to be used in his grand plan during this period!
Named-Annotations Syntax: `'place_your_name_here`, e.g. `'a`, `'b`, `'my_lifetime`, etc.

Lifetime annotations are used in the following constructs, each discussed below: function signatures, structs, enums, impl blocks, and traits.
The reason why lifetimes are needed for these constructs specifically is because they:
(1) Introduce some form of abstraction/generalization over data, as they all use some sort of parameters in their definitions
- You have to think like a mathematician or logician to get the subtlety of this statement! For example, in Gödel's 1930 first incompleteness theorem (you can find the following in many major logicians' work), I can paraphrase that he writes out something like:
  - "for all $x$, $P(x)$", where $P$ is some statement like "The dog is $x$" ... we call such a statement the generalization of $P$.
  - We can see that having these parameters creates a more abstract and general statement about dogs in general, rather than just a single concrete statement like "the dog is white".
This abstraction/generalization naturally introduces ambiguity to the compiler, as we will see soon.
(2) Introduce their own scopes, where there is a barrier between their internal and external access to values. (Most) Inner scopes only know their own world and cannot guarantee that the outside world is not deleting data that our above data structures are referencing.
- Therefore, explicit lifetimes will only come up in writing Rust when dealing with references. As a reminder, references come in the two following flavours:
  - `&T`: a shared reference/borrow of any given type `T`, which is essentially a pointer to some data, where you cannot change the value being pointed to.
  - `&mut T`: a mutable reference/borrow of any given type `T`, which is a pointer as well, but now we are able to edit the data being referenced.

And so, you will see explicit lifetimes always associated with some reference, such as `&'a T`.
The two points above are interrelated because parameters are (usually) the only values that can traverse such scopes, making their life-status a potential source of ambiguity in the eyes of a given scope.
The lifetime system exists because references can outlive the values they point to (i.e. the actual value dies before the reference does, this is common in nested scopes), which would create dangling pointers (i.e. referencing something that doesn’t exist).
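Here is the core scenario the lifetime system guards against, as a minimal sketch: a reference is only usable while the value it points to is alive. Uncomment the last `println!` and the borrow checker rejects the program:

```rust
fn main() {
    let r;
    {
        let x = 5;
        r = &x; // r borrows x
        assert_eq!(*r, 5); // fine: x is still alive inside this scope
    } // x dies here
    // println!("{r}"); // error[E0597]: `x` does not live long enough
    println!("this compiles only because r is never used after x dies");
}
```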
Let’s look at how the syntax actually plays out:
Function Signatures + Lifetime Annotations:
Here is a simple example of how we would specify lifetime annotations for functions:
fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) {
    println!("x is {} and y is {}", x, y);
}
- To be up-front, this function does not actually need explicit lifetime parameters written by us, the programmer (not until we add a return value that is itself a reference); I am showing them here to demonstrate the syntax. I will discuss implicit lifetimes later.
- In general, the syntax `some_object_in_rust<'a, 'b>` is read "some_object_in_rust's own lifetime cannot exceed that of `'a` or `'b`", because otherwise there would be a state where `some_object_in_rust` could try to access a non-existent value with lifetime `'a` (or `'b`, etc.), which would be a use-after-free error.
Important Tangent: Generics in Rust
- We can see from above that we have to declare our lifetimes `'a` and `'b` in angle brackets, e.g. `<'a>` ... why?
- I don't know if you've studied Type Theory at all, but here is the gist of the two fundamental type theories:
  - Typed Lambda Calculus: It's lambda calculus, so we study functions and how a bunch of functions can create any algorithm you can think of, and it's typed, meaning we impose restrictions on the set of values a parameter can actually take for a given function.
  - Second-Order Typed Lambda Calculus: Not only do we have the above, but you know those types that were concrete restrictions on our parameters? Yeah, now they are also parameters to our functions too, which is called polymorphism.
A function in second-order lambda calculus would be something like: $\lambda \alpha: *.\ \lambda x: \alpha.\ x$, where if I had to do a crude 1-to-1 translation into some Rust pseudo-code, it would look something like:
fn function_outside(param_alpha: any_type_you_want) {
    fn function_inside(param_x: param_alpha) {
        return param_x
    }
}

This is ugly, therefore my assumption is that the Rust developers said we need something like `fn function_inside(params_for_types)(params_for_values)`, but this is weird to read and makes parsing a harder job for the compiler too. So why not use angle brackets? Our Rust pseudo-code will look something like this now:

fn function_inside<param_alpha: any_type_you_want>(param_x: param_alpha) {...}

Actually, why even have a generic name for a type like `param_alpha: any_type_you_want`? Types are already intrinsically generic names for things!

aside; tiny tangent: ... maybe we shouldn't just throw out this colon-type-specification thing (i.e. `param_alpha: any_type_you_want`), maybe it's useful for type bounds 😳. It's just another, higher-order, abstraction on types, like how the lambda calculus defines it, but no need to get deep into it here.

Let's simplify it again with this knowledge:

fn function_inside<param_alpha>(param_x: param_alpha)
// or even simpler:
fn func<T>(x: T)
'apostrophe, and semantically differs from regular generics by that they are specifying how long a type lives rather than what a sequence of bits is supposed to represent to us, as a class of objects generally.
- Now something like `x: &'a i32` is binding the parameter `x` to be in the class of objects that are both references to `i32` and live for the lifetime `'a`.

We use generics for lifetimes because all we need are simple relative comparisons between lifetimes. The most basic showcase of such comparisons: `'a = 'a` and `'a != 'b` (`'a` and `'b` can coincidentally be the same concrete lifetime, but we are not constraining them to be equal in our lifetime annotation). This way we can easily tell the compiler the relative differences between lifetimes, since that is all we need to ensure that one value does not outlive another. To specify that `'b` outlives `'a` we use a colon syntax, `'b: 'a`, when declaring `'b` as a generic.

- This colon syntax is an "outlives bound"; it is sometimes described in terms of lifetime coercion, because you're requiring one lifetime to be at least as long as another (so the longer one can be coerced to the shorter where needed).
The actual concrete values of these generic lifetimes are the variables' inherent lifetimes, which the generics become associated with in, say, a function signature; I will show this off below in the greater context around where lifetime annotations are specified.
What's going on in our `print_refs` function?

- Remember, lifetime annotations exist to ensure that one reference does not outlive what it points to, i.e. a value outside the function's scope.
- (1) We have declared that `x` is a reference (`&`) to some value and has a lifetime annotation of `'a`. The actual concrete lifetime of `'a` is the lifetime of `x` itself.
- (2) We have declared that `y` is a reference (`&`) to some value and has a lifetime annotation of `'b`, which is important because trivially `'a != 'b`, so we have explicitly stated that `x` and `y` do not necessarily have to live for the same amount of time.
- (3) Wait, we have done something so simple: just call `println!` and return nothing! Just looking at the definition of this function sucks ... it needs more context ...

aside: And in general, when dealing with lifetimes we have to be good Hegelians and understand that the tension between the inside of a function and the outside is interdependent through the variables we pass through it. My gripe with many lifetime tutorials stems from the isolation they explain lifetimes through, like just looking at a function's signature. It's simplification for simplification's sake, and leads to confusion since we don't have all the context.
// define the same function
fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) {
    println!("x is {} and y is {}", x, y);
}

// We need to actually use the function in action to understand the lifetime annotations:
{ // scope 1
    let w: i32 = 1;
    { // scope 2
        let u: i32 = 2;
        print_refs(&w, &u);
    } // u dies here, this is the end of its lifetime
} // w dies here, this is the end of its lifetime
- As we can see from the above, now it makes sense why it was okay that `x` and `y` could have the different lifetimes `'a` and `'b` respectively: a simple function just takes both references at the moment of execution and runs with them as parameters. It's so simple that the outer scope is almost irrelevant; as long as both values live for the entirety of the `print_refs` call (which they will, since this is a synchronous function), we know that `print_refs` will have access to these references. Here are some other scope combinations that would work:

{
    let w: i32 = 1;
    let u: i32 = 2;
    print_refs(&w, &u);
} // w and u and the function have the same scope

{
    let w: i32 = 1;
    let u: i32 = 2;
    {
        print_refs(&w, &u);
    } // the function is in a nested scope, so w and u definitely still live after this nested scope
} // w and u have the same scope
Now let’s spice up this example by adding in a return value which is a reference:
// define the same function but with a return value
fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) -> &'a i32 {
    println!("x is {} and y is {}", x, y);
    x
}

{
    let w: i32 = 1;
    let output; // defining it here so we can reference it in this scope later
    {
        let u: i32 = 2;
        output = print_refs(&w, &u);
        println!("{}", output);
    }
    println!("{}", output); // also valid
} // since 'a was assigned the life of x, and we passed &w in x's place, output will actually die here, the same lifetime as w!!!!

// if our return value was instead -> &'b i32, then we would get a compiler error:
// error: lifetime may not live long enough
// and if you ran this with the signature fn print_refs(x: &i32, y: &i32) -> &i32, you would get the error:
// missing lifetime specifier
- By necessity, the reference in the return value has to be one of the parameters' given lifetimes.
- This is because a function doesn't have access to any other references in its scope by definition, as internally created references will be killed at the end of the function body, since that's the end of the function's scope.
- Rust checks each function's signature independently, without looking at the implementation of the function, so when we write the function's signature above we need to tell it what the output lifetime will be; it can't just look inside and see that we returned `x`. Therefore we specify `fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) -> &'a i32` instead of `fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) -> &i32`.
- There is no set algorithm that could efficiently just look into the body and determine the lifetime (we are specifying the algorithm to begin with! That's the type of self-reference the Halting Problem is made of!); the following example would need to compute all code paths to understand whether the output lifetime will be `'a` or `'b`.
While the previous example was ambiguous for the compiler, the following example is also ambiguous for the developer (without understanding the broader scope, of course):
fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) -> &??? {
    if *x > *y {
        x // Lifetime 'a
    } else {
        y // Lifetime 'b
    }
}
// the compiler needs to know: after the function returns, how long can I safely use this reference?
// therefore in this case we need a definitive answer, and we have two choices here:

// (1) we can make both lifetimes the same:
fn print_refs<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {...}

// (2) we can use the shorter of the two lifetimes:
// 'b: 'a means 'b outlives 'a; we require 'b to live at least as long as 'a. Since 'a is for sure
// the shorter of the two, a reference returned with lifetime 'a is known to be alive when returning.
fn print_refs<'a, 'b: 'a>(x: &'a i32, y: &'b i32) -> &'a i32 {...}
- We are again making it easy for the borrow checker to know what should and shouldn't be valid at any given point in our program. Rust wants to ensure that use-after-free errors are impossible, and handing it these explicit contracts of what should be available when makes this goal possible. This is the crux of why we need to explicitly state lifetimes to the compiler for functions: to know the actual life of a return value, as the function body essentially hides this from the compiler.
- The compiler now has the tools/info to know if there are memory issues with my program. Just adding the lifetime annotations doesn't magically make my program run; rather, it allows the borrow checker to do its job without ambiguity.
- A good rule of thumb for knowing that an explicit lifetime annotation is needed for functions: If you’re returning a reference and there are multiple input references, you need explicit lifetimes. Otherwise, Rust handles it by just being able to determine scope for all referenced values (see implicit lifetimes below).
- A discussion about Async functions and how they relate to lifetimes is in my post “Learning Async Rust Sucks”
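Pulling the function discussion together, here is a complete, runnable sketch of choice (1) from above: unify both input lifetimes into one `'a` (the function name `larger_ref` is my own):

```rust
// Both inputs share 'a, so the returned reference is valid only while
// BOTH inputs are alive (the compiler uses the shorter of the two scopes).
fn larger_ref<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if *x > *y { x } else { y }
}

fn main() {
    let a = 10;
    let b = 7;
    let biggest = larger_ref(&a, &b);
    assert_eq!(*biggest, 10);
    println!("larger value: {biggest}");
}
```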
Structs + Lifetime Annotations:
- A struct needs a lifetime if and only if it contains a reference. This is because we need to ensure that the data the struct is referencing is not invalidated, so the referenced data needs to live at least as long as the struct.
- The below showcases all the general moments when you would need to add lifetime annotations to structs.

// (1) single reference, i.e. one lifetime parameter
struct Article<'a> {
    title: &'a str,
    content: &'a str,
}

// (2) multiple references with the same lifetime
struct BlogPost<'a> {
    author: &'a str,
    article: &'a Article<'a>, // we had to specify the nested lifetime too
}

// (3) multiple references with different lifetimes
struct Comment<'a, 'b> {
    text: &'a str,
    post: &'b BlogPost<'b>,
}

// (4) mix of owned data and references
struct User<'a, T> {
    id: T,        // owned - no lifetime needed
    name: String, // owned - no lifetime needed
    bio: &'a str, // borrowed - needs lifetime
}

// (5) using a struct that has explicit lifetime annotations as a parameter to a function, where we
// are using a reference to this struct: a reference to a struct that has references for its interior
// data. This ensures that during the function call, whatever is referenced in the struct is still valid.
fn dummy_func<'b>(person: &SomeStruct<'b>) {
    println!("This is a struct with an interior reference: {}", person.data_field);
}

The above patterns highlight how you'll use lifetimes syntactically, but the semantics become lost when we are just looking at these blocks.
Lets look at the below example:
{
    let v: i32 = 999;
    let user_concrete: User;
    {
        let w: i64 = 888888;
        let user_concrete = User {
            name: String::from("Lumaj"),
            thirty_two: &v,
            sixty_four: &w,
        };
        println!("{:#?}", user_concrete); // works
    }
    println!("{:#?}", user_concrete); // WON'T WORK: 'b = lifetime of w, and that's dead!
    // error[E0381]: used binding `user_concrete` isn't initialized
    // i.e. the inner user_concrete was already dropped, due to the compiler listening to your
    // annotations and choosing to drop the value at the end of the inner scope
}
Enums + Lifetime Annotations:
- These work in pretty much the exact same way as structs' lifetime annotations.

enum Either<'a> {
    Num(i32),
    Ref(&'a i32),
}

enum Content<'a> {
    Article { title: &'a str, body: &'a str },
    Image { url: &'a str, alt: &'a str },
    Empty,
}

enum Reference<'a, 'b> {
    First(&'a str),
    Second(&'b str),
    Both(&'a str, &'b str),
}
Impls + Lifetime Annotations:
`impl` blocks are for implementing functionality for types in Rust, and usually it's for our own types defined with structs and enums.

In the simplest case you just repeat the lifetimes that are added to the struct or enum you are implementing functionality on:

struct Article<'a> {
    title: &'a str,
}

impl<'a> Article<'a> {
    fn new(title: &'a str) -> Self {
        Article { title }
    }

    fn title(&self) -> &'a str {
        self.title
    }
}

Though any given method within an impl can introduce its own lifetimes, just like any other function; these do not have to be specified at the impl block's signature (there, all you have to specify is the given type's own lifetimes):
impl<'a> Article<'a> {
    fn compare<'b>(&self, other: &'b str) -> bool {
        self.title == other
    }
}
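Putting the two impl snippets above to work, here's a minimal usage sketch showing the struct lifetime `'a` getting bound to a real borrow:

```rust
struct Article<'a> {
    title: &'a str,
}

impl<'a> Article<'a> {
    fn new(title: &'a str) -> Self {
        Article { title }
    }

    fn title(&self) -> &'a str {
        self.title
    }

    // method-local lifetime 'b, independent of the struct's 'a
    fn compare<'b>(&self, other: &'b str) -> bool {
        self.title == other
    }
}

fn main() {
    let owned = String::from("Lifetimes");
    let article = Article::new(&owned); // 'a becomes the lifetime of the borrow of `owned`
    assert_eq!(article.title(), "Lifetimes");
    assert!(article.compare("Lifetimes"));
    assert!(!article.compare("Ownership"));
    println!("title: {}", article.title());
}
```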
Traits + Lifetime Annotations:
- Traits are essentially a blueprint for implementations; it's like grouping types by a set list of functionalities.
- The documentation has a good definition as well: "A trait is a collection of methods defined for an unknown type: `Self`. They can access other methods declared in the same trait."
- Therefore they act similar to impls, where in the simplest case you just repeat the type's lifetime annotations in their definition.

// this example is taken directly from Rust's documentation
struct Borrowed<'a> {
    x: &'a i32,
}

// Annotate lifetimes to impl.
// The Default trait is in the standard library; it's for easily giving a fallback default value.
// As you can see, the Default trait itself didn't need to be defined with the lifetime;
// only when it came to implementing its behaviour did we have to specify it.
impl<'a> Default for Borrowed<'a> {
    fn default() -> Self {
        Self {
            x: &10,
        }
    }
}

But we can also have traits where the definition itself requires that we are using a reference with an explicit lifetime:

trait Summarizable<'a> {
    fn summary(&self) -> &'a str;
}

So if we double up on both the struct and the trait needing an explicit lifetime annotation, it would look something like:

struct Article<'a> {
    title: &'a str,
}

// the impl + trait + struct all defining explicit lifetime annotations
impl<'a> Summarizable<'a> for Article<'a> { // 3 levels of abstraction, 3 lifetime annotations
    fn summary(&self) -> &'a str {
        self.title
    }
}

Generics can have trait bounds (e.g. `T: Copy + Clone`, i.e. `T` must implement `Copy` and `Clone`), but lifetimes can be part of those bounds too:
- It's important to note that when applying bounds to lifetimes, the colon `:` carries the previously discussed relationship of one lifetime outliving another (e.g. `'b: 'a` means `'b` necessarily outlives `'a`), while for regular generics the colon still means that the type must implement some trait.
- So, as discussed in the documentation, we see:
  - (1) "`T: 'a`: All references in `T` must outlive lifetime `'a`."
  - (2) "`T: Trait + 'a`: Type `T` must implement trait `Trait` and all references in `T` must outlive `'a`."
- And if the `Trait` here has its own lifetimes specified in its definition, we have to specify those lifetimes as well, as `Trait<'lifetime>`; so in (2) we would have `T: Trait<'lifetime> + 'a`.

fn print_summary<'a, T: Summarizable<'a>>(item: &T) {
    // ...
}

fn process<'a, 'b, T>(item: &'a T, context: &'b str) -> &'a str
where
    T: Display + 'a, // T must implement Display and outlive 'a
    'b: 'a,          // 'b must outlive 'a
{
    // ...
}
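Here is a compilable sketch of these bound forms; the function names are my own invention. `describe` combines `Display + 'a`, and `pick_context` uses the `'b: 'a` outlives bound to return the longer-lived reference as a `&'a str`:

```rust
use std::fmt::Display;

// 'b: 'a lets a reference with the longer lifetime 'b be returned as &'a str.
fn pick_context<'a, 'b: 'a>(item: &'a str, context: &'b str) -> &'a str {
    if item.is_empty() { context } else { item }
}

// T must implement Display, and any references inside T must outlive 'a.
fn describe<'a, T>(value: &'a T) -> String
where
    T: Display + 'a,
{
    format!("value = {value}")
}

fn main() {
    let long_lived = String::from("default context");
    assert_eq!(pick_context("", &long_lived), "default context");
    assert_eq!(pick_context("item", &long_lived), "item");
    assert_eq!(describe(&42), "value = 42");
    println!("bounds check out");
}
```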
Lifetimes can be named (almost) anything: `'a`, `'b`, `'c`, etc. But there are a few names reserved for special use cases:

`'_` (the anonymous lifetime) is a reserved name for lifetimes that you don't need to reference by name. It is usually used to clarify that a lifetime relationship exists without naming it; it's really a stylistic choice to signify a simple lifetime relationship. Calling it "reserved" is a stretch, as it's functionally similar to a regular lifetime name. Note that `'_` can only be used where the compiler can infer the lifetime (impl blocks, function signatures), not in a struct definition's generic parameter list:

struct Article<'a> {
    title: &'a str,
}

impl Article<'_> { // matches the struct definition without naming the lifetime
    fn new(title: &str) -> Article<'_> { ... }
}
`'static` Lifetimes: `'static` is a reserved name for lifetimes. It designates that the value lives for the remaining length of the entire program being run; it's the longest possible lifetime for a value. The name `'static` comes from this being the lifetime assigned to variables declared with the `static` keyword (and also to owned values declared with the `const` keyword). See "const vs let vs static in rust" below for more on this. The `'static` lifetime can also be explicitly written in the type of a variable declared with `let`, `const`, or `static`, so long as the borrow checker can prove the value really lives that long.

// an example of a value that has a `'static` lifetime is the `str` literal type. These values get
// stored in the program's binary and are therefore always available.
fn get_greeting() -> &'static str {
    "Hello" // string literals are 'static by default; the signature lets the borrow checker know our function returns a reference of that lifetime
}
Tangent: `const` vs `let` vs `static` in Rust

const

- `const` values are evaluated at compile time; they are actually just stored in the binary that you compiled, so their location in memory is just within the binary's location in memory, i.e. compiled inline with the other code that you wrote.
- e.g.
const PI: f64 = 3.14159;- it always has a
'staticlifetime, since it never gets dropped (its part of the binary not allocated to memory!)- must be explicitly typed on definition
- essentially treated as a macro expansion / code replacement. Therefore the compiler produces a copy of that value for everywhere it’s referenced in the code.
- This is useful when a value is truly fixed and never needs to be referenced as a value at runtime. e.g. $\pi$ in a single equation.
- the RFC for const declares: "constants declare constant values. These represent a value, not a memory address. This is the most common thing one would reach for and would replace static as we know it today in almost all cases."
- pros:
  - No runtime overhead
  - Simple, and you know for sure it is memory safe since it doesn't allocate memory
- cons:
  - Repeated duplication leads to bigger binary size
  - No stable, fixed memory address for the value. (If you reference a constant value, &const_value, you actually get a temporary value in memory, living as long as the reference needs to be valid, at some location the compiler determines; it's not really a const at that point, it's more like a static whose memory address you cannot rely on.)
  - Cannot be passed to FFI expecting a real global symbol.
  - cannot call functions that are not const fn
const fn:

- a function that can be evaluated at compile time
- Can use these functions in places that require compile-time constants, such as const or static.

const fn square(x: i32) -> i32 { x * x }
const VALUE: i32 = square(5);
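For instance, here is a small sketch (names are illustrative) of a const fn result feeding a place where Rust demands a compile-time constant, such as an array length:

```rust
// A const fn can be evaluated at compile time...
const fn area(w: usize, h: usize) -> usize {
    w * h
}

// ...so its result can initialize a const, and a const can size an array
// (a runtime `let` value could not appear in an array length).
const CELLS: usize = area(3, 4);

fn main() {
    let grid = [0u8; CELLS];
    println!("{}", grid.len()); // prints 12
}
```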
let

let values are placed in memory by the binary itself during runtime (on the stack: either the data itself or a pointer to data allocated on the heap), which is the behaviour most programmers are used to.

- types can be inferred by the compiler
- automatically cleaned up when out of scope (drops inserted by the compiler)
- Mutable variables are local only (cannot be shared safely across threads without Mutex/Atomic).
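To see the let bullets in one place, a minimal sketch (variable names are just illustrative) of inference and scope-end drops:

```rust
fn main() {
    let name = String::from("jozef"); // type `String` inferred by the compiler
    {
        let inner = String::from("temporary");
        println!("{inner}"); // `inner` is alive here...
    } // ...and dropped here: the compiler inserts the drop at the end of its scope
    println!("{name}"); // `name` is still alive; it is dropped at the end of main
}
```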
static

static kind of combines const and let, in the sense that it is an explicitly typed, fixed value with a 'static lifetime, but it is an accessible memory location (because it's initialized/placed in memory when your binary runs) rather than being buried inline in the binary itself; its memory location is also fixed. The value of a static must be computable at compile time (i.e. it cannot be computed dynamically at runtime), but the allocation and initialization happen at runtime, before main() runs.

- For static GREETING: &str = "hello world"; the compiler puts "hello world" into the data section of the assembly/binary to be loaded at runtime.
- Can be mutable, but only in unsafe rust, because you can potentially create data races.
- This is a truer version of a "global" variable: a central location that is available in memory at any point in the program (though it may not be accessible at every point; if you defined it in a nested scope, it's only accessible in that nested scope to other pieces of code, though it still technically sits at that pinned location in memory).
- the RFC for static declares: "statics declare global variables. These represent a memory address. They would be rarely used: the primary use cases are global locks, global atomic counters, and interfacing with legacy C libraries."
- static values can be accessed from any function in any scope (given that you defined the static variable in a location reachable from that function's scope); you don't have to pass them as parameters.
- also, you may not have a main() at all, say in library crates, so having (possibly mutable) globals that are not let variables is key in that case.
- pros:
  - no duplication of the value, and therefore no binary-size overhead; in memory there is exactly one copy of this data (in .rodata for immutable statics), so less memory usage overall.
  - Can hold large objects without duplicating them.
  - provides a fixed location in memory for pointing to
- cons:
  - some rust expressions only allow const values (array sizes, enum discriminants, generic const parameters, const fn)
  - Runtime overhead is introduced, with the size of the data determining the amount.
  - Mutability of statics is unsafe. You should instead use smart pointers and wrapping types like Arc and Mutex to manage shared mutable access.

// Globals are declared outside all other scopes. let variables cannot exist outside fn main()
static DUMMY: &str = "Dummy";
const NUM: i32 = 10;

fn main() { ...
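As a sketch of the Mutex point above: since Mutex::new is a const fn (stable since Rust 1.63), a static can hold a Mutex directly, giving safe shared mutation without static mut (names here are illustrative):

```rust
use std::sync::Mutex;

// A safe global: the Mutex guards all mutation, so no `unsafe` is needed.
static COUNTER: Mutex<i32> = Mutex::new(0);

fn increment() {
    // Locking gives exclusive access to the inner value, even across threads.
    let mut guard = COUNTER.lock().unwrap();
    *guard += 1;
}

fn main() {
    increment();
    increment();
    println!("{}", *COUNTER.lock().unwrap()); // prints 2
}
```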
Lifetime Elision:
Elision rules for functions and methods:
- We discussed previously the rule of thumb for explicit lifetime annotations: if you're returning a reference and there are multiple input references, you need explicit lifetimes. The ambiguity arises because a function's internal logic hides from the compiler, which only looks at the signature, what the lifetime of an output reference will be. But in some simple cases there is no ambiguity, some of which we saw previously:
(1) No output references:

fn print(s: &str)
// the compiler automatically expands this to:
fn print<'a>(s: &'a str)

(2) One input reference, one output reference:

fn first_word(s: &str) -> &str
// the compiler automatically expands this to:
fn first_word<'a>(s: &'a str) -> &'a str

(3) &self or &mut self for methods in impl blocks:

impl Article {
    fn title(&self) -> &str // self's lifetime used for output
    // the compiler automatically expands this to:
    fn title<'a>(&'a self) -> &'a str
}

The actual elision rules exist for the compiler; their consequences, showcased above, are the more practical thing to know, but it's worth detailing them regardless to understand why the above applies. You can think of these rules as fancy autocomplete: you only have to step in once things get too ambiguous:
- The compiler automatically assigns a lifetime to each reference parameter: 'a to the first, 'b to the second, and so on.
- If there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters: fn func_name<'a, T>(param_1: &'a T) -> &'a T, fn func_name<'a, T>(param_1: &'a T) -> (&'a T, &'a T), etc.
- If there's more than one input parameter, but one of them is &self or &mut self, then all output parameters get the lifetime of the &self or &mut self value. This rule makes explicit lifetime annotations in methods very uncommon.

Some people use anonymous lifetimes to make it clear that elision is happening. You may find this in some documentation:
// Without '_': completely implicit
fn process(s: &str) -> &str { s }

// With '_': says "yes, I know there are lifetimes here"
fn process(s: &'_ str) -> &'_ str { s }
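To close the loop on the rule of thumb from the start of this section, here is the classic case elision cannot handle: two input references and a returned reference, so the relationship must be annotated by hand (the function name is just illustrative):

```rust
// The elided version `fn longest(x: &str, y: &str) -> &str` does NOT compile:
// rule 1 assigns 'a to x and 'b to y, and neither rule 2 nor rule 3 applies,
// so the output lifetime stays ambiguous. We annotate it ourselves, telling
// the borrow checker the result lives at most as long as the shorter input.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let a = String::from("lifetime");
    let b = String::from("elision");
    println!("{}", longest(&a, &b)); // prints "lifetime"
}
```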
And that’s all you need to know about Lifetime annotations! Hope that clears everything up and gives you a good guide to reference.
The compiler’s error messages will be your best friend, as in a lot of cases it will point you in the right direction for what explicit lifetimes to use.
Thanks for reading my 7,500-word essay on Lifetimes!!
