Avoid stack allocation in xxhash64 #547

Closed
andygrove opened this issue Jun 10, 2024 · 7 comments · Fixed by #575
Labels: enhancement, performance

Comments

@andygrove
Member

What is the problem the feature request solves?

Here is the current xxhash64 implementation:

use std::hash::Hasher;
use twox_hash::XxHash64;

pub(crate) fn spark_compatible_xxhash64<T: AsRef<[u8]>>(data: T, seed: u64) -> u64 {
    // TODO: Rewrite with a stateless hasher to reduce stack allocation?
    let mut hasher = XxHash64::with_seed(seed);
    hasher.write(data.as_ref());
    hasher.finish()
}

As the TODO comment states, it may be better to implement this with a stateless hasher and avoid the stack allocation.

Here is Spark's version for reference:

  public static long hashInt(int input, long seed) {
    long hash = seed + PRIME64_5 + 4L;
    hash ^= (input & 0xFFFFFFFFL) * PRIME64_1;
    hash = Long.rotateLeft(hash, 23) * PRIME64_2 + PRIME64_3;
    return fmix(hash);
  }

Describe the potential solution

No response

Additional context

No response

@parthchandra
Contributor

Are we expecting a performance improvement as a result? (I doubt stack allocations are a performance bottleneck)

@advancedxy
Contributor

Linking this to #517 too.

Are we expecting a performance improvement as a result? (I doubt stack allocations are a performance bottleneck)

+1. I think we may need to figure out the root cause first. It might be that the implementation of XxHash64 introduces a performance bottleneck, rather than the stack allocation. I can try to reproduce and check this issue this weekend if I have some spare time.

@andygrove
Member Author

Thanks @advancedxy. Yes, it definitely needs more investigation.

@andygrove
Member Author

Here are benchmark results comparing murmur3 to xxhash64. xxhash64 is 3x slower (not sure if that is expected), but what is more interesting is that there is a warning during warm-up (Unable to complete 100 samples in 5.0s).

     Running benches/hash.rs (target/release/deps/hash-3beecac83c235dce)
Gnuplot not found, using plotters backend
hash/murmur3/8192       time:   [342.78 µs 345.32 µs 349.49 µs]
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
Benchmarking hash/xxhash64/8192: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.0s, enable flat sampling, or reduce sample count to 70.
hash/xxhash64/8192      time:   [992.21 µs 994.27 µs 996.88 µs]
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe
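
For context, a Criterion harness along these lines can produce this kind of comparison; spark_compatible_xxhash64 is taken from the snippet above, while the murmur3 helper name (spark_compatible_murmur3_hash), its seed value, and the import path are assumptions for illustration.

use criterion::{black_box, criterion_group, criterion_main, Criterion};
// Hypothetical import path; adjust to wherever the hash kernels actually live.
use comet::execution::kernels::hash::{spark_compatible_murmur3_hash, spark_compatible_xxhash64};

fn bench_hash(c: &mut Criterion) {
    // 8192 fixed-size keys, matching the hash/<algo>/8192 benchmark IDs above.
    let inputs: Vec<[u8; 8]> = (0..8192u64).map(|i| i.to_le_bytes()).collect();

    let mut group = c.benchmark_group("hash");
    group.bench_function("murmur3/8192", |b| {
        b.iter(|| {
            for key in &inputs {
                black_box(spark_compatible_murmur3_hash(black_box(key), 42));
            }
        })
    });
    group.bench_function("xxhash64/8192", |b| {
        b.iter(|| {
            for key in &inputs {
                black_box(spark_compatible_xxhash64(black_box(key), 42));
            }
        })
    });
    group.finish();
}

criterion_group!(benches, bench_hash);
criterion_main!(benches);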

@andygrove
Member Author

@advancedxy I am working on re-implementing this now in a simpler way as an experiment ... I will aim to have a PR up by end of day

@parthchandra
Contributor

xxhash64 is 3x slower (not sure if that is the expectation)

xxhash64 is by no means slower than murmur3. Here's a sample benchmark: https://lz4.github.io/lz4-java/1.3.0/xxhash-benchmark/

@advancedxy
Contributor

I am working on re-implementing this now in a simpler way as an experiement ... I will aim to have a PR up by end of day

Thanks for your effort.

Here are benchmark results comparing murmur3 to xxhash64. xxhash64 is 3x slower (not sure if that is the expectation)

I don't think this is expected; as @parthchandra already pointed out, xxhash64 should be much faster than murmur3. Even if TPC-H q8 is not related to XxHash64, we should still investigate further and improve its performance. I can help with that, though in that case it's not urgent.
