Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

foldhash::fast::RandomState is very bad #577

Open
turalcar opened this issue Oct 14, 2024 · 15 comments
Open

foldhash::fast::RandomState is very bad #577

turalcar opened this issue Oct 14, 2024 · 15 comments

Comments

@turalcar
Copy link

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8d9004344a0ddd7799b90e24f5866e75

This counts collisions. The number fluctuates wildly from 1.5% to 7% and mostly depends on global seed (if one measurement is bad then all of them are). Both foldhash::quality::RandomState and ahash::RandomState remain at around 2.9% and would be acceptable as default.

@Amanieu
Copy link
Member

Amanieu commented Oct 14, 2024

What kind of numbers do you get from ahash or fxhash?

cc @orlp

@orlp
Copy link

orlp commented Oct 14, 2024

For such small sets and such a small input domain you are testing very specific relations between input bits and output bits (only the top 7 output bits for the tag and the bottom log2(n / 16) output bits for the bucket, if I'm not mistaken). So only 13 of the output bits are being focused on in this test (even less since there's no pre-reserve call), with only 10 input bits.

If an unlucky seed is chosen, you can hit an unlucky avalanche probability, as is documented in the readme for foldhash-f.

In the initialization procedure of foldhash I already force some bits on in the global seed, perhaps I can extend this a bit more to force a few more bits on/off such that the small-input case always at least avalanches into the top 7 bits.

@turalcar
Copy link
Author

1000 was chosen arbitrarily. I reran this with 0x3800 items for maximum density and averaged over 32768 runs.
fxhash is an interesting one - it seems to always use the same seed: 0.226 collisions per insertion.
ahash: 0.0637 collisions per insertion
std: 0.0637 collisions per insertion
foldhash::quality: 0.0637 collisions per insertion.
foldhash::fast: I ran this 10 times, avg: 0.0699, min: 0.0422, max: 0.213
The result is similar for 0x3801 (minimum density): all hashes provide 0.0273 collision rate, except fxhash which has exactly 0, and foldhash::fast which gave avg: 0.0292, min: 0.0201, max: 0.0458.

@turalcar
Copy link
Author

The motivating issue for me was actually small strings. Using to_string() before hashing doesn't change collision rates for most hash functions. For foldhash::fast it becomes consistent but still a bit worse (0.0715 at max density, 0.0349 at min density). For fxhash it becomes 0.133 at max density, 0.0503 at min density.

@orlp
Copy link

orlp commented Oct 14, 2024

Upon further investigation it appears that the issue isn't (just) simply that the 7-bit tag has a bad distribution, but also that for small number inputs it can be correlated with the bottom bits for some seeds. I think I can also resolve this problem by choosing seeds more carefully though.

@arthurprs
Copy link

arthurprs commented Oct 26, 2024

@orlp, I was just wondering if you have further thoughts on this edge case? There are various "low u32" keyed hashtables in https://github.com/arthurprs/canopydb that could be affected by this, and I'm not sure if I need to switch to the "quality" hash to prevent such degenerate cases.

@orlp
Copy link

orlp commented Oct 26, 2024

@arthurprs I'm working on analyzing it, at this point I don't have a concrete answer yet. I've collected 1GB worth of seeds along with a performance score on the average number of equality checks performed when inserting the numbers 0..1<<16 into an hashbrown set, and I'm trying to establish a concrete pattern for what determines whether a seed is good or not, and why. Then I can hopefully update the seed selection to only generate good seeds.

Foldhash-q is rock-solid regardless of seed, so if you want to be 100% certain performance is predictable, it's a good choice, especially in the meantime.

@Greatness7
Copy link

I'm seeing a massive decrease in performance when upgrading hashbrown from 0.14 to 0.15 and I think the issue describe here may be the problem.

My application went from well over 60fps to below 10fps after the update. I only realized hashbrown was the cause after attaching a profiler and seeing an outsized impact in the flame graph.

Flame graph when using hashbrown v1.14:
Image

Flame graph when using hashbrown v1.15:
Image

With v1.14 hashbrowns functions aren't significant enough to be noticed in the graph, but with v1.15 they dominate most of the cpu time.

@orlp
Copy link

orlp commented Dec 5, 2024

@Greatness7 Does the problem resolve if you switch to the old hasher, ahash, with v1.15? If so I'm very curious as to what you're hashing.

@Greatness7
Copy link

Greatness7 commented Dec 5, 2024

@Greatness7 Does the problem resolve if you switch to the old hasher, ahash, with v1.15? If so I'm very curious as to what you're hashing.

Yes I have confirmed that (in v1.15) swapping out

use hashbrown::DefaultHashBuilder;

with

type DefaultHashBuilder = core::hash::BuildHasherDefault<ahash::AHasher>;

solves the performance issues that I'm seeing.

For context this is a GUI application using the vizia framework. I am not the original author of the framework so I don't know exactly how it all works or what is being hashed, but DefaultHashBuilder is only used in one place here. Which I think means it's hashing TypeIds.

@orlp
Copy link

orlp commented Dec 6, 2024

@Greatness7 One more question, does the problem also resolve if you use the quality version of foldhash? So

type DefaultHashBuilder = foldhash::quality::RandomState;

@orlp
Copy link

orlp commented Dec 6, 2024

@Greatness7 Actually looking at it a bit more closely I don't think this is the same issue as the one being raised here. This is simply incorrect:

pub(crate) fn get_storeid<T: 'static + Hash>(t: &T) -> StoreId {
    let mut s = DefaultHashBuilder::default().build_hasher();
    TypeId::of::<T>().hash(&mut s);
    t.hash(&mut s);
    StoreId(s.finish())
}

It assumes DefaultHashBuilder::default() is always the same, but that's not true for foldhash, it randomly initializes. I think the performance issues are due to having, essentially, random StoreIds.

For foldhash you would need to use foldhash::fast::FixedState, rather than foldhash::fast::RandomState which is what hashbrown defaults to.

I don't believe hashbrown has ever guaranteed that hashbrown::DefaultHashBuilder::default always returns the same hasher, even though it did in the past.

@Greatness7
Copy link

@orlp

I tested foldhash::quality::RandomState and it did not resolve the performance issue. Trying with foldhash::fast::FixedState causes the program to no longer work correctly so I cannot easily judge performance there.

I believe the function in question originally used DefaultHasher::new from std, and when later switching to hashbrown to get some performance gains started using DefaultHashBuilder ( link to the diff ). I guess it may have behaved similarly by coincidence? Do you know what the hashbrown equivalent of this std function would be?

@orlp
Copy link

orlp commented Dec 6, 2024

@Greatness7

Trying with foldhash::fast::FixedState causes the program to no longer work correctly so I cannot easily judge performance there.

I strongly suspect the program is bugged regardless then, and that the old hasher just didn't expose the bug. Does the program assume that get_storeid returns unique ids without any collisions? Because no hash guarantees that.

Do you know what the hashbrown equivalent of this std function would be?

There is no equivalent in hashbrown, hashbrown isn't a hashing library, and that get_storeid function is just a hash.

Note that if all you did was change back to

type DefaultHashBuilder = core::hash::BuildHasherDefault<ahash::AHasher>;

and this is truly only used in that get_storeid function, the performance issues aren't due to the hasher used in hashbrown, because in that scenario hashbrown is stilll using foldhash.

@Greatness7
Copy link

Greatness7 commented Dec 7, 2024

@Greatness7

Trying with foldhash::fast::FixedState causes the program to no longer work correctly so I cannot easily judge performance there.

I strongly suspect the program is bugged regardless then, and that the old hasher just didn't expose the bug. Does the program assume that get_storeid returns unique ids without any collisions? Because no hash guarantees that.

Do you know what the hashbrown equivalent of this std function would be?

There is no equivalent in hashbrown, hashbrown isn't a hashing library, and that get_storeid function is just a hash.

Note that if all you did was change back to

type DefaultHashBuilder = core::hash::BuildHasherDefaultahash::AHasher;

and this is truly only used in that get_storeid function, the performance issues aren't due to the hasher used in hashbrown, because in that scenario hashbrown is stilll using foldhash.

You may be right that it could've been bugged before and it was just some property of the old DefaultHashBuilder hiding the problems. Interesting though that it was like that for years without showing any signs and the v1.15 update caused things to degrade immediately.

I'll investigate further with the original author and try to figure out if it was incorrect usage all along.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants