Rust: Looping on a member variable without mutably borrowing self

The Story

Late last night, I stumbled upon a rather clever hack in one of my Rust projects.

I’d been working on an iterator which implements grapheme-aware CamelCase word-splitting when I decided to do some cleanup and ran cargo clippy rather than the much quicker cargo test I’d been using to iterate on it.

Not surprisingly, given how much I’ve been letting myself get carried away with my hobby projects and “coding while half-asleep”, it popped up some warnings, but this is the one which started things off:

warning: this loop could be written as a `for` loop
   --> src/util/naming/camelcase.rs:260:9
    |
260 | /         while let Some((byte_offset, grapheme)) = self.in_iter.next() {
261 | |             // Extract the base `char` so `classify_char` can call things like `is_uppercase`
262 | |             let base = grapheme.chars().nth(0).expect("non-empty grapheme cluster");
263 | |
...   |
280 | |             }
281 | |         }
    | |_________^ help: try `for (byte_offset, grapheme) in self.in_iter { .. }`
    |
    = note: #[warn(while_let_on_iterator)] on by default
    = help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#while_let_on_iterator

Sounds reasonable, so I tried the suggested syntax and got a new error:

error[E0507]: cannot move out of borrowed content
   --> src/util/naming/camelcase.rs:260:40
    |
260 |         for (byte_offset, grapheme) in self.in_iter {
    |                                        ^^^^ cannot move out of borrowed content

Well, I may be tired out of my mind, but I recognized what that error meant in theory, so I tried adding &.

error[E0277]: the trait bound `&unicode_segmentation::GraphemeIndices<'_>: std::iter::Iterator` is not satisfied
   --> game_launcher_core/src/util/naming/camelcase.rs:262:9
    |
262 | /         for (byte_offset, grapheme) in &(self.in_iter) {
263 | |             // Extract the base `char` so `classify_char` can call things like `is_uppercase`
264 | |             let base = grapheme.chars().nth(0).expect("non-empty grapheme cluster");
265 | |
...   |
282 | |             }
283 | |         }
    | |_________^ the trait `std::iter::Iterator` is not implemented for `&unicode_segmentation::GraphemeIndices<'_>`
    |
    = note: `&unicode_segmentation::GraphemeIndices<'_>` is not an iterator; maybe try calling `.iter()` or a similar method
    = note: required by `std::iter::IntoIterator::into_iter`

Now, my tiredness bit me. It never occurred to me that a unicode_segmentation::GraphemeIndices needs to be mutably bound to work, nor that “not implemented for &” said nothing about whether it was implemented for &mut.

Completely stumped, I popped over to #rust where, after blindly trying several helpful suggestions, I finally tried &mut self.in_iter.

That would normally have worked… except for one small problem:

error[E0499]: cannot borrow `*self` as mutable more than once at a time
   --> game_launcher_core/src/util/naming/camelcase.rs:277:40
    |
262 |         for (byte_offset, grapheme) in &mut self.in_iter {
    |                                             ------------ first mutable borrow occurs here
...
277 |                 CCaseAction::Skip => { self._next_word(byte_offset, true) },
    |                                        ^^^^ second mutable borrow occurs here
...
283 |         }
    |         - first borrow ends here

error[E0499]: cannot borrow `*self` as mutable more than once at a time
   --> game_launcher_core/src/util/naming/camelcase.rs:278:45
    |
262 |         for (byte_offset, grapheme) in &mut self.in_iter {
    |                                             ------------ first mutable borrow occurs here
...
278 |                 CCaseAction::StartWord => { self._next_word(byte_offset, false) },
    |                                             ^^^^ second mutable borrow occurs here
...
283 |         }
    |         - first borrow ends here

error[E0499]: cannot borrow `*self` as mutable more than once at a time
   --> game_launcher_core/src/util/naming/camelcase.rs:279:54
    |
262 |         for (byte_offset, grapheme) in &mut self.in_iter {
    |                                             ------------ first mutable borrow occurs here
...
279 |                 CCaseAction::AlreadyStartedWord => { self._next_word(prev_offset, false) },
    |                                                      ^^^^ second mutable borrow occurs here
...
283 |         }
    |         - first borrow ends here

The code calls &mut self methods inside the loop body and, because the loop returns before exhausting the iterator, I can’t mem::replace it out of the binding.

In short, I had stumbled upon the one way to do this, on my first try, completely by accident, and, were it not for a naïve Clippy lint, I would never have realized how special this syntax is.

It was around that point that the wheel-bound hamster powering my brain woke up long enough for me to start making the connections and ask the last few questions necessary for it to all make sense…

The Reasoning

The problem here is a collision between two characteristics of Rust’s design:

  1. for works via the IntoIterator trait, which means that, as far as the compiler knows, releasing and re-borrowing the resulting iterator would discard the iteration state and start over. (i.e. There’s no magic in the compiler to recognize when it’s already got an iterator.)
  2. My self._next_word takes &mut self, a mutable borrow over all of self, which will fail if for is still holding a reference to one of its members.
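To make the first point more concrete, here is a simplified sketch of roughly what a for loop expands to. The real expansion differs in detail, and the function names sum_with_for and sum_desugared are hypothetical, made up for this illustration rather than taken from my project. The thing to notice is that the result of into_iter is kept in a binding which stays alive until the loop ends, so whatever it borrows stays borrowed for the whole loop body.

```rust
/// The loop as you would normally write it.
fn sum_with_for<I: Iterator<Item = u32>>(it: &mut I) -> u32 {
    let mut total = 0;
    for x in &mut *it {
        total += x;
    }
    total
}

/// Roughly what the compiler turns the loop above into (simplified).
fn sum_desugared<I: Iterator<Item = u32>>(it: &mut I) -> u32 {
    let mut total = 0;
    // `into_iter` is called exactly once, before the loop starts...
    let mut iter = IntoIterator::into_iter(&mut *it);
    // ...and `iter` (holding the mutable borrow) lives until the loop ends.
    loop {
        match iter.next() {
            Some(x) => total += x,
            None => break,
        }
    }
    total
}
```

Both functions walk the iterator the same way. The difference from while let is only in how long the borrow is held, not in what gets iterated.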

The clever trick behind while let Some(...) = self.in_iter.next() is that it bypasses IntoIterator. Each call to next() borrows self.in_iter only for the duration of that call, and the iterator itself stays put in self between iterations, so the compiler can release the borrow before the loop body runs.

As a result, you’re left with something that functions like a for loop, but only holds onto the item which the iterator returned, leaving self free to be borrowed, in its entirety, by all and sundry.
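Here is a minimal sketch of the pattern. It is not my actual CamelCase splitter: Words and take_word are hypothetical stand-ins (take_word plays the role of _next_word), and it splits on whitespace using std::str::Chars instead of grapheme clusters to keep the example dependency-free.

```rust
use std::str::Chars;

/// Hypothetical stand-in for the word splitter: yields runs of
/// non-whitespace characters from the wrapped iterator.
struct Words<'a> {
    in_iter: Chars<'a>,
    current: String,
}

impl<'a> Words<'a> {
    fn new(text: &'a str) -> Self {
        Words { in_iter: text.chars(), current: String::new() }
    }

    /// Takes `&mut self`, like `_next_word` in the real code.
    fn take_word(&mut self) -> Option<String> {
        if self.current.is_empty() {
            None
        } else {
            // Hand back the accumulated word and leave an empty String behind.
            Some(std::mem::take(&mut self.current))
        }
    }
}

impl<'a> Iterator for Words<'a> {
    type Item = String;

    #[allow(clippy::while_let_on_iterator)]
    fn next(&mut self) -> Option<String> {
        // Writing `for ch in &mut self.in_iter` here would keep `self.in_iter`
        // mutably borrowed for the whole loop, so the `self.take_word()` call
        // in the body would be rejected with E0499.
        while let Some(ch) = self.in_iter.next() {
            if ch.is_whitespace() {
                // Returning early leaves the rest of `in_iter` for the next call.
                if let Some(word) = self.take_word() {
                    return Some(word);
                }
            } else {
                self.current.push(ch);
            }
        }
        // Input exhausted: flush whatever is left.
        self.take_word()
    }
}
```

Collecting Words::new("foo  bar baz") into a Vec gives the three words in order, with the repeated space skipped.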

So, there you have it. If you’re writing a struct which holds onto an iterator and your for loop is making things difficult by blocking method calls, try bypassing IntoIterator. Get an iterator manually, then reformulate your loop to use while let Some(...) instead.

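If Clippy complains, add #[allow(while_let_on_iterator)] (spelled #[allow(clippy::while_let_on_iterator)] on newer toolchains, where Clippy lints are namespaced) and get on with your day.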

P.S. Don’t worry about the “coding while half-asleep” part. I write my unit test suites and audit/refactor the previous day’s work while wide-awake and alert. 😉

Rust: Looping on a member variable without mutably borrowing self by Stephan Sokolow is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


1 Response to Rust: Looping on a member variable without mutably borrowing self

  1. James Crooks says:

    This is a nice little explanation of an important difference between the for and while loop constructs in Rust which is absent in other languages like C.

    The for loop executes a block of code once for each element it pulls from an iterator (absent additional control flow such as return or break). In order to make this assurance, it holds ownership over the iterator and manages the state of the iterator for you.

    In contrast, a while loop just checks a condition at the beginning of each iteration (in terms of control flow, separate from destructuring and name binding). When the condition check completes, the while loop is no longer involved until the next iteration begins, and it leaves state management up to the programmer. Because you’re responsible for state management instead of the looping construct, you get back the ability to make borrows, but lose the contract that your code block executes once for each iteration through the loop. This can be clearly demonstrated by calling .next() on the iterator inside of the while loop.

    As such the post isn’t describing a syntactical quirk of the language, but rather a semantic difference in guarantees made by the different control flow constructs. This contrasts with C-descended languages, where for and while are essentially isomorphic under some rearrangement of code.
