GAT emulation

Generic Associated Types have been the dream of many a Rust programmer for years now. Unfortunately, they have not landed in stable yet. This article presents a pretty good workaround I just learned that's too good not to share.

Motivation

Imagine some code like:

impl File {
    fn stream_data<'a>(&'a self) -> FileStream<'a> {
        ...
    }
}

impl Socket {
    fn stream_data<'a>(&'a self) -> SocketStream<'a> {
        ...
    }
}

We have two types, File and Socket, which have a very similar method: stream_data.

They both return some FooStream<'a> type that presumably borrows some fields from the underlying struct.

The question is, how could we make a trait that abstracted across these? Our first attempt might look something like:

trait StreamData {
    type Stream;

    fn stream_data<'a>(&'a self) -> &'a Self::Stream;
}

But there is a subtle difference between this method signature and the two methods we want to abstract. Those two methods both return an owned stream that happens to contain some references bound by lifetime 'a.

But this trait method returns a borrowed stream. This is a totally different thing. It would require the type implementing StreamData to contain an owned Self::Stream inside of it. This may not be feasible in many cases.
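To make that concrete, here is a rough sketch of what an implementor would be forced into. Cursor and the fields of File are hypothetical names invented for illustration; they are not from the code above.

```rust
trait StreamData {
    type Stream;

    fn stream_data<'a>(&'a self) -> &'a Self::Stream;
}

// Hypothetical stream type. It cannot borrow from `File`, because
// `type Stream` has no lifetime to attach a borrow to.
struct Cursor {
    position: usize,
}

// Hypothetical file type. The `cursor` field exists only so that
// `stream_data` has something to hand out a reference to.
struct File {
    bytes: Vec<u8>,
    cursor: Cursor,
}

impl StreamData for File {
    type Stream = Cursor;

    fn stream_data<'a>(&'a self) -> &'a Cursor {
        &self.cursor
    }
}

fn main() {
    let file = File {
        bytes: vec![1, 2, 3],
        cursor: Cursor { position: 0 },
    };
    let cursor = file.stream_data();
    println!("{} of {}", cursor.position, file.bytes.len());
}
```

The stream has to live inside the File ahead of time, and it can only be handed out behind a shared reference, which is rarely what a streaming API wants.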

You might also be tempted to write:

trait StreamData<'a> {
    type Stream;

    fn stream_data(&'a self) -> Self::Stream;
}

Which would give you access to a lifetime parameter when declaring the associated type, but this lifetime parameter also infects any code that wants to be generic over your trait.
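As a small sketch of that infection, assume a hypothetical File with a bytes field and a FileStream that borrows it (both invented here). Every generic caller has to introduce and thread the lifetime, even when it does not care about it:

```rust
// Hypothetical stand-ins for the File and FileStream types.
struct File {
    bytes: Vec<u8>,
}

struct FileStream<'a> {
    remaining: &'a [u8],
}

trait StreamData<'a> {
    type Stream;

    fn stream_data(&'a self) -> Self::Stream;
}

impl<'a> StreamData<'a> for File {
    type Stream = FileStream<'a>;

    fn stream_data(&'a self) -> FileStream<'a> {
        FileStream { remaining: &self.bytes }
    }
}

// The generic function must declare 'a and repeat it in the bound and in
// the argument type, although it only forwards the call.
fn open_stream<'a, S: StreamData<'a>>(source: &'a S) -> S::Stream {
    source.stream_data()
}

fn main() {
    let file = File { bytes: vec![1, 2, 3] };
    let stream = open_stream(&file);
    println!("{} bytes to stream", stream.remaining.len());
}
```

This works, but any struct or function that stores or accepts an S: StreamData<'a> now needs its own 'a as well.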

What we really want is to say the associated type is allowed to carry references with the lifetime 'a. We want to write some code like:

trait StreamData {
    type Stream<'a>;

    fn stream_data<'a>(&'a self) -> Self::Stream<'a>;
}

We declare the associated type Stream to have a generic lifetime parameter, and then in the method declaration, we say that lifetime is the same as the lifetime of the &self in the method receiver.

Unfortunately, this is not currently legal in stable Rust. The feature enabling it is called Generic Associated Types and as of mid-2022, it is nearly implemented, but the stabilization phase is dragging on due to some unanswered design questions. It's good that the Rust compiler people are taking the time needed to get it right, but it's a shame for those who want the feature now!

During my time in Rust at S3, there were many instances in which I wanted GATs. One example was pretty similar to the code above — we had a trait that abstracted some data store, and we wanted to get a stream to some data in it, but the stream shouldn't be allowed to outlive the underlying data store.

Workaround

I learned this trick by examining the output of the excellent nougat crate, which automates what I am going to present using procedural macros. I think showing how it's done manually is super valuable, and the manual version is easier to debug than the macro soup that gets emitted.

The basic idea is to move your associated type to a helper trait, and then make your main trait depend on the helper trait with higher-ranked trait bounds syntax.

trait StreamDataHelper<'a, LifetimeBounds = &'a Self> {
    type Stream;
}

trait StreamData : for<'a> StreamDataHelper<'a> {
    fn stream_data<'a>(&'a self) -> <Self as StreamDataHelper<'a>>::Stream;
}

A few things to note here.

The first is the weird LifetimeBounds = &'a Self type parameter. For the reference type &'a Self to be well-formed, Self must outlive 'a, so this gives the compiler the implied bound Self: 'a. In other words, Self does not contain any references that live shorter than the 'a lifetime. There isn't anything special about the name LifetimeBounds, it's just descriptive; we could have used any valid type parameter name here.

An alternative would be to use where Self: 'a syntax, which is more explicit, but also pushes the requirement to be explicit onto the user of this trait, whereas the LifetimeBounds = &'a Self syntax is (ab)using default generic type parameters so that the caller doesn't need to specify it.

The second thing is to note that all of the 'a lifetimes in the stream_data method signature can be elided, but I left them in for clarity.

What's going on here is we are putting a lifetime on the trait like I mentioned we might be tempted to above, but we avoid that lifetime infecting any code that wants to be generic over our trait with HRTB syntax. Other code can still just be generic over StreamData. But when they call some_stream.stream_data() they get back a Stream object that is bound to the lifetime of that method call.

The one annoyance is you can't refer to the associated type as easily. If you have a S: StreamData, you can't just refer to S::Stream. You have to refer to <S as StreamDataHelper<'a>>::Stream, naming the lifetime you want.

So this is still clunkier than real GATs, but at least it compiles and works!
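Here is a minimal end-to-end sketch of the workaround. File, FileStream, and the total function are hypothetical names made up for this example, and the Iterator<Item = u8> bound on Stream is my own assumption, added only so that generic code has something to call on the stream.

```rust
// Helper trait carrying the lifetime. The defaulted parameter supplies the
// implied bound `Self: 'a` without callers having to write it.
trait StreamDataHelper<'a, LifetimeBounds = &'a Self> {
    // Assumption for this sketch: every stream yields bytes.
    type Stream: Iterator<Item = u8>;
}

trait StreamData: for<'a> StreamDataHelper<'a> {
    fn stream_data<'a>(&'a self) -> <Self as StreamDataHelper<'a>>::Stream;
}

// Hypothetical data source and a stream that borrows from it.
struct File {
    bytes: Vec<u8>,
}

struct FileStream<'a> {
    remaining: &'a [u8],
}

impl<'a> Iterator for FileStream<'a> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        let (first, rest) = self.remaining.split_first()?;
        self.remaining = rest;
        Some(*first)
    }
}

impl<'a> StreamDataHelper<'a> for File {
    type Stream = FileStream<'a>;
}

impl StreamData for File {
    fn stream_data<'a>(&'a self) -> <Self as StreamDataHelper<'a>>::Stream {
        FileStream { remaining: &self.bytes }
    }
}

// Generic code names only `StreamData`; no lifetime parameter is needed.
fn total<S: StreamData>(source: &S) -> u64 {
    source.stream_data().map(u64::from).sum()
}

fn main() {
    let file = File { bytes: vec![1, 2, 3] };
    println!("{}", total(&file));
}
```

Note that total is generic over plain StreamData, yet the stream it gets back is an owned FileStream that borrows from the File for the duration of the call.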

A worked example

The classic motivating example for GATs is to create a LendingIterator trait. The standard library iterator trait yields owned values. The type signature is:

trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

It's still possible for the owned Self::Item to be a reference if the implementing struct has a lifetime:

struct Iter<'a, T> {
    slice: &'a [T],
    i: usize,
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;
    fn next(&mut self) -> Option<Self::Item> {
        let item = self.slice.get(self.i)?;
        self.i += 1;
        Some(item)
    }
}

But we see this breaks down if we try to yield mutable references:

struct IterMut<'a, T> {
    slice: &'a mut [T],
    i: usize,
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<Self::Item> {
        let item = self.slice.get_mut(self.i)?;
        self.i += 1;
        Some(item)
    }
}

We get the compiler error:

error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
   |
   |         let item = self.slice.get_mut(self.i)?;
   |                               ^^^^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime defined here...
   |
   |     fn next(&mut self) -> Option<Self::Item> {
   |             ^^^^^^^^^
note: ...so that reference does not outlive borrowed content
   |
   |         let item = self.slice.get_mut(self.i)?;
   |                    ^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'a` as defined here...
   |
   | impl<'a, T> Iterator for Iter<'a, T> {

The ultimate cause here is that the compiler cannot know that we aren't going to yield two mutable references to the same thing, which is illegal.

The standard library implements things like [T]::iter_mut using the unsafe keyword. That lets it hand out mutable references to distinct, unaliased elements, which the borrow checker cannot verify for index-based code like ours.

But another workaround would be to throw out the need to hold multiple mutable references at once. There are many use cases where you only need one reference at a time, such as processing items from an iterator in a streaming fashion. That is, the lifetime of the reference could be bound to the next method call rather than the whole struct. This is exactly what we covered above.

So we want:

trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

impl<T> LendingIterator for IterMut<'_, T> {
    type Item<'a> = &'a mut T where Self: 'a;
    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        let item = self.slice.get_mut(self.i)?;
        self.i += 1;
        Some(item)
    }
}

Unfortunately, this requires GATs. But now we know how to work around this!

trait LendingIteratorItem<'a, LifetimeBounds = &'a Self> {
    type T;
}

trait LendingIterator: for<'a> LendingIteratorItem<'a> {
    fn next(&mut self) -> Option<<Self as LendingIteratorItem<'_>>::T>;
}

struct IterMut<'a, T> {
    data: &'a mut [T],
    i: usize,
}

impl<'a, T> LendingIteratorItem<'a> for IterMut<'_, T> {
    type T = &'a mut T;
}

impl<T> LendingIterator for IterMut<'_, T> {
    fn next(&mut self) -> Option<<Self as LendingIteratorItem<'_>>::T> {
        let result = self.data.get_mut(self.i)?;
        self.i += 1;
        Some(result)
    }
}

Now we can write code like:

let mut data = vec![1, 2, 3];

let mut iter = IterMut {
    data: &mut data,
    i: 0,
};

while let Some(x) = iter.next() {
    *x += 1;
}

println!("{:?}", data);

And we get the expected [2, 3, 4] output.
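The same trait also covers something a regular Iterator cannot express at all: overlapping mutable windows, where each item must be released before the next one is produced. The sketch below reuses the two traits from above; WindowsMut, its fields, and prefix_sums are hypothetical names invented for this example.

```rust
trait LendingIteratorItem<'a, LifetimeBounds = &'a Self> {
    type T;
}

trait LendingIterator: for<'a> LendingIteratorItem<'a> {
    fn next(&mut self) -> Option<<Self as LendingIteratorItem<'_>>::T>;
}

// Hypothetical lending iterator over overlapping mutable windows.
struct WindowsMut<'s, E> {
    slice: &'s mut [E],
    start: usize,
    size: usize,
}

impl<'a, E> LendingIteratorItem<'a> for WindowsMut<'_, E> {
    type T = &'a mut [E];
}

impl<E> LendingIterator for WindowsMut<'_, E> {
    fn next(&mut self) -> Option<<Self as LendingIteratorItem<'_>>::T> {
        let end = self.start.checked_add(self.size)?;
        // `get_mut` returns None once the window would run past the end.
        let window = self.slice.get_mut(self.start..end)?;
        self.start += 1;
        Some(window)
    }
}

// Turns a slice into its running totals by sliding a window of two
// elements and adding the left element into the right one.
fn prefix_sums(data: &mut [i32]) {
    let mut windows = WindowsMut { slice: data, start: 0, size: 2 };
    while let Some(w) = windows.next() {
        let left = w[0];
        w[1] += left;
    }
}

fn main() {
    let mut data = vec![1, 2, 3, 4];
    prefix_sums(&mut data);
    println!("{:?}", data);
}
```

Each window borrows from the iterator only until the next call to next, so the overlap between consecutive windows never results in two live mutable references to the same element.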

Conclusion

Yes, this method is clunky, but it does seem to fully work around the lack of GATs on stable Rust.

It can be made less clunky with the excellent nougat crate, which allows you to write code like:

#[gat]
trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

struct IterMut<'a, T> {
    data: &'a mut [T],
    i: usize,
}

#[gat]
impl<T> LendingIterator for IterMut<'_, T> {
    type Item<'a> = &'a mut T where T: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>> {
        let result = self.data.get_mut(self.i)?;
        self.i += 1;
        Some(result)
    }
}

And it will spit out a somewhat more mechanical and hard to read version of the code we produced above (but you don't have to see it!). Another advantage to using the crate is once GATs land on stable Rust, you should just be able to delete the #[gat] annotations and it'll all Just Work™.

I find doing it manually helpful, and it makes error messages easier to reason about when debugging, but I could not have learned this technique without the nougat crate, so I am very thankful to its author Daniel Henry-Mantilla.