Stabilize `<[T]>::get_many_mut()` #128318

ChayimFriedman2 · 2024-07-28T20:34:26Z

Tracking issue: #104642.

Best reviewed commit by commit.

Stabilization Report

Implementation History

Implemented in #83608. Originally included an immutable version too, but that was deemed unnecessary. Also, at first the indices array was required to be sorted, but this was changed.

It was suggested that a DisjointIndices type will be added, that will encapsulate the invariant and allow you to index with O(1). This has the benefits that it allows reuse of the same keys, and possibly to add a constructor that allows O(N) instead of O(NlgN) check by requiring the indices array to be sorted. has I decided to not go for this route, since (1) it is unclear how useful those benefits are (I've never needed them personally), (2) about the second benefit, we can always add a get_many_sorted_mut() method, and that will have roughly the same implementation and API complexity as DisjointIndices with two constructors, if not better, and (3) we provide the get_many_unchecked_mut() primitive, so anyone can build on top of that while the tricky unsafe details of how to soundly implement get_many_unchecked_mut() are hidden from them. Also, it will make the call significantly more convoluted (get_many_mut(&DisjointIndices::new([...])?)? instead of get_many_mut(&[...])?, and while it is possible to simplify that with a trait, that will imply more implementation complexity and weaker inference.

It was also suggested that we add the ability to index with ranges. I decided to not do that either. The reasons being unclear benefit and greater implementation and API complexity.

Another suggestion was to add a index_many_mut() method that will panic on out of bounds/overlapping indices, possibly with a better panic message than get_many_mut().expect() (due to the ability to differentiate between the two error conditions). While not in this stabilization request, we can always add it later if deemed worthy.

Yet another request that came is the ability to tell from the error whether it was out of bounds or overlapping indices. I reserved the right to do this (by having a private field on the error type), but didn't actually do it now because it leads to worse codegen on common cases (#128214).

In this PR, I:

Changed the implementation to switch to sort when N>=9. I don't know how likely this case is, but it doesn't add a lot of code, and it seems to be a footgun if we don't add it.
Changed the API to take a reference to the array of indices, instead of an owned array. The reason, as per the commit message:

We only need an immutable access to the indices array, and it we take it owned we may force a copy for big arrays. For small arrays, LLVM will fully inline and unroll the body anyway, so this doesn't matter.

In simple cases, LLVM is able to avoid the copy, but it has troubles in more complex cases (https://godbolt.org/z/s371G4r9e). Given that I expect get_many_mut() to be used most of the times with few, probably only two, indices, this seems unlikely to matter and adds one character to type, but it isn't that bad and avoiding a copy seems nice.
Re-exported GetManyMutError from std and alloc, which was probably just forgotten.

API Summary

impl [T] {
    pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: &[usize; N]) -> [&mut T; N];
    pub fn get_many_mut<const N: usize>(&mut self, indices: &[usize; N]) -> Result<[&mut T; N], GetManyMutError<N>>;
}

pub struct GetManyMutError<const N: usize> { /* private */ }

Experience Report

I used it in Advent of Code 2022 day 5, and also suggested it here. It worked great, but in this case I didn't have to handle the error so I can't tell if knowing the kind of error is really needed.

There are few polyfill crates (https://crates.io/crates/indices, https://crates.io/crates/index_many, https://crates.io/crates/get_many_mut). They don't seem to get much usage. There is also a highly upvoted question on Stack Overflow, and I see this question popping up there sometimes.

… of O(N^2) by sorting first

We only need an immutable access to the indices array, and it we take it owned we may force a copy for big arrays. For small arrays, LLVM will fully inline and unroll the body anyway, so this doesn't matter. In simple cases, LLVM is able to avoid the copy, but it has troubles in more complex cases (https://godbolt.org/z/s371G4r9e). Given that I expect `get_many_mut()` to be used most of the times with few, probably only two, indices, this seems unlikely to matter and adds one character to type, but it isn't that bad and avoiding a copy seems nice.

rustbot · 2024-07-28T20:34:34Z

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

tgross35 · 2024-07-28T20:36:50Z

This hasn't yet had an FCP so needs libs-api to take a look

r? libs-api

tgross35 · 2024-07-28T20:54:18Z

library/core/src/slice/mod.rs

+    // Based on benchmarks, it is faster to sort starting with 9 indices.
+    if N >= 9 {
+        let mut sorted_indices = *indices;


The commit message states:

We only need an immutable access to the indices array, and it we take it owned we may force a copy for big arrays. For small arrays, LLVM will fully inline and unroll the body anyway, so this doesn't matter.

But the array is copied here, so it seems any gains there are unfortunately immediately lost for anything N >= 9. Depending on the exact performance it may be worth checking indices.is_sorted() before duplicating and sorting - do you have the results of those benchmarks?

This is not true, as we need both the unsorted and sorted array. So if we take by value we would copy twice.

About checking is_sorted(), I can do that, but is it worth it? How common is sorted slice that you don't know is sorted? After all, if you know it's sorted you can use a guaranteed O(N) API (not in std) - and I don't know how common is this alone, yet alone by accidence.

However you made me realize that my assembly inspection was wrong - LLVM did copy the array, but only once. So it seems it is good enough to know when the copy can be elided. I wonder if that means we can trust it?

I meant that if copying is showing up to be significant on benchmarks (enough to justify the by-val to by-ref change), then maybe we should try to avoid the copy whenever possible. But LLVM is usually pretty good about figuring out when it makes sense to pass larger types as a pointer, so I am curious how different the benchmark results turned out.

The DisjointIndices type would make this a lot cleaner since it gives a way to move copying and sorting out of a potentially hot loop.

I didn't benchmark passing by ref/value, all I benchmarked was sorting versus O(N^2) check. Without a concrete example where LLVM does not elide the copy, of course benchmarks won't show any difference.

The DisjointIndices type would make this a lot cleaner since it gives a way to move copying and sorting out of a potentially hot loop.

This is the "reusing indices" benefit, but as I said, I doubt how useful this is.

bors · 2024-07-30T01:05:09Z

☔ The latest upstream changes (presumably #128360) made this pull request unmergeable. Please resolve the merge conflicts.

Amanieu · 2024-10-02T21:21:42Z

Could you please split the implementation changes into a separate PR from stabilization? Also we discussed the signature in the libs-api meeting yesterday and would prefer that the array of indices be taken by value instead of by reference. This is in line with other APIs that take and return arrays.

ChayimFriedman2 · 2024-10-05T17:43:15Z

@Amanieu Will do, but this may take me some days.

jdahlstrom · 2024-10-10T11:38:00Z

Hmm, libs-api requested that HashMap::get_many_mut return [Option<&mut V>; N], a return type quite inconsistent with what’s proposed to be stabilized here. Should probably be discussed and decided what to do before stabilizing either method.

cc @joshtriplett

ChayimFriedman2 · 2024-11-16T21:04:24Z

@jdahlstrom Making it return [Option<&mut T>; N] will affect performance badly, so I wouldn't want this. I also think slice's get_many_mut() is more performance-sensitive than hashmap's (that does more work anyway), so I think the difference is acceptabke.

ChayimFriedman2 · 2024-11-17T19:16:44Z

I'm closing this and will open new PRs with the requested changes.

ChayimFriedman2 added 4 commits July 28, 2024 20:25

In <[T]>::get_many_mut(), where N>=9, do a O(N*lg(N)) check instead…

4c3bd81

… of O(N^2) by sorting first

Re-export GetManyMutError from alloc::slice and std::slice

d618ebc

Stabilize <[T]>::get_many_mut()

3b9517d

rustbot assigned tgross35 Jul 28, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 28, 2024

rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 28, 2024

rustbot assigned m-ou-se and unassigned tgross35 Jul 28, 2024

tgross35 added needs-fcp This change is insta-stable, so needs a completed FCP to proceed. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 28, 2024

tgross35 reviewed Jul 28, 2024

View reviewed changes

tgross35 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 5, 2024

jdahlstrom mentioned this pull request Oct 15, 2024

Tracking Issue for map_many_mut #97601

Open

3 tasks

ChayimFriedman2 closed this Nov 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stabilize `<[T]>::get_many_mut()` #128318

Stabilize `<[T]>::get_many_mut()` #128318

ChayimFriedman2 commented Jul 28, 2024

rustbot commented Jul 28, 2024

tgross35 commented Jul 28, 2024

tgross35 Jul 28, 2024 •

edited

Loading

ChayimFriedman2 Jul 28, 2024

tgross35 Jul 28, 2024

ChayimFriedman2 Jul 28, 2024

bors commented Jul 30, 2024

Amanieu commented Oct 2, 2024

ChayimFriedman2 commented Oct 5, 2024

jdahlstrom commented Oct 10, 2024 •

edited

Loading

ChayimFriedman2 commented Nov 16, 2024

ChayimFriedman2 commented Nov 17, 2024

Stabilize <[T]>::get_many_mut() #128318

Stabilize <[T]>::get_many_mut() #128318

Conversation

ChayimFriedman2 commented Jul 28, 2024

Stabilization Report

Implementation History

API Summary

Experience Report

rustbot commented Jul 28, 2024

tgross35 commented Jul 28, 2024

tgross35 Jul 28, 2024 • edited Loading

Choose a reason for hiding this comment

ChayimFriedman2 Jul 28, 2024

Choose a reason for hiding this comment

tgross35 Jul 28, 2024

Choose a reason for hiding this comment

ChayimFriedman2 Jul 28, 2024

Choose a reason for hiding this comment

bors commented Jul 30, 2024

Amanieu commented Oct 2, 2024

ChayimFriedman2 commented Oct 5, 2024

jdahlstrom commented Oct 10, 2024 • edited Loading

ChayimFriedman2 commented Nov 16, 2024

ChayimFriedman2 commented Nov 17, 2024

Stabilize `<[T]>::get_many_mut()` #128318

Stabilize `<[T]>::get_many_mut()` #128318

tgross35 Jul 28, 2024 •

edited

Loading

jdahlstrom commented Oct 10, 2024 •

edited

Loading