Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize <[T]>::get_many_mut() #128318

Closed
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 23 additions & 13 deletions library/core/src/slice/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4851,20 +4851,30 @@ impl<T, const N: usize> SlicePattern for [T; N] {
}

/// This checks every index against each other, and against `len`.
///
/// This will do `binomial(N + 1, 2) = N * (N + 1) / 2 = 0, 1, 3, 6, 10, ..`
/// comparison operations.
fn get_many_check_valid<const N: usize>(indices: &[usize; N], len: usize) -> bool {
// NB: The optimizer should inline the loops into a sequence
// of instructions without additional branching.
let mut valid = true;
for (i, &idx) in indices.iter().enumerate() {
valid &= idx < len;
for &idx2 in &indices[..i] {
valid &= idx != idx2;
// Based on benchmarks, it is faster to sort starting with 9 indices.
if N >= 9 {
let mut sorted_indices = *indices;
Comment on lines +4851 to +4853
Copy link
Contributor

@tgross35 tgross35 Jul 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message states:

We only need an immutable access to the indices array, and it we take it owned we may force a copy for big arrays. For small arrays, LLVM will fully inline and unroll the body anyway, so this doesn't matter.

But the array is copied here, so it seems any gains there are unfortunately immediately lost for anything N >= 9. Depending on the exact performance it may be worth checking indices.is_sorted() before duplicating and sorting - do you have the results of those benchmarks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true, as we need both the unsorted and sorted array. So if we take by value we would copy twice.

About checking is_sorted(), I can do that, but is it worth it? How common is sorted slice that you don't know is sorted? After all, if you know it's sorted you can use a guaranteed O(N) API (not in std) - and I don't know how common is this alone, yet alone by accidence.

However you made me realize that my assembly inspection was wrong - LLVM did copy the array, but only once. So it seems it is good enough to know when the copy can be elided. I wonder if that means we can trust it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that if copying is showing up to be significant on benchmarks (enough to justify the by-val to by-ref change), then maybe we should try to avoid the copy whenever possible. But LLVM is usually pretty good about figuring out when it makes sense to pass larger types as a pointer, so I am curious how different the benchmark results turned out.

The DisjointIndices type would make this a lot cleaner since it gives a way to move copying and sorting out of a potentially hot loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't benchmark passing by ref/value, all I benchmarked was sorting versus O(N^2) check. Without a concrete example where LLVM does not elide the copy, of course benchmarks won't show any difference.

The DisjointIndices type would make this a lot cleaner since it gives a way to move copying and sorting out of a potentially hot loop.

This is the "reusing indices" benefit, but as I said, I doubt how useful this is.

sorted_indices.sort_unstable();
for &[i, j] in sorted_indices.array_windows() {
if i == j {
return false;
}
}
let biggest_index = *sorted_indices.last().expect("indices array should not be empty");
biggest_index < len
} else {
// NB: The optimizer should inline the loops into a sequence
// of instructions without additional branching.
let mut valid = true;
for (i, &idx) in indices.iter().enumerate() {
valid &= idx < len;
for &idx2 in &indices[..i] {
valid &= idx != idx2;
}
}
valid
}
valid
}

/// The error type returned by [`get_many_mut<N>`][`slice::get_many_mut`].
Expand All @@ -4883,8 +4893,8 @@ fn get_many_check_valid<const N: usize>(indices: &[usize; N], len: usize) -> boo
/// assert!(v.get_many_mut([1, 1]).is_err());
/// ```
#[unstable(feature = "get_many_mut", issue = "104642")]
// NB: The N here is there to be forward-compatible with adding more details
// to the error type at a later point
// NB: The N and the private field here is there to be forward-compatible with
// adding more details to the error type at a later point
pub struct GetManyMutError<const N: usize> {
_private: (),
}
Expand Down