Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic HashSet implementation #19

Merged
merged 13 commits into from
Mar 2, 2020
Merged

Add basic HashSet implementation #19

merged 13 commits into from
Mar 2, 2020

Conversation

twe4ked
Copy link
Contributor

@twe4ked twe4ked commented Jan 24, 2020

Add a very basic implementation of HashSet as mentioned in #17.


This change is Reviewable

src/set/mod.rs Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Jan 24, 2020

Codecov Report

Merging #19 into master will increase coverage by 0.86%.
The diff coverage is 100%.

Impacted Files Coverage Δ
src/set.rs 100% <100%> (ø)
src/map.rs 87.55% <0%> (+1.01%) ⬆️

@jonhoo
Copy link
Owner

jonhoo commented Jan 24, 2020

This looks great! Three comments:

  • I'd like to see the docs slightly expanded to be more similar to those in the standard library. You don't need to go into quite as much detail, but some of it would be useful.
  • There are some additional methods that would be good to forward to the inner map. Off the top of my head, .remove() and .iter() (which'll forward to .keys()) are the most obvious candidates.
  • We should port some of the Java tests that are related to sets. From a quick glance, those seem to be these and these. They are technically for a set view of a map, but should be straightforward enough to adapt to this a standalone set I think.

@twe4ked
Copy link
Contributor Author

twe4ked commented Jan 24, 2020

I haven't added any additional tests yet (aside from the doc tests). I haven't had a chance to take a look at the examples you linked but I'm not sure how much they're needed yet as this is only using the public API which is already tested.

I'll still take a look at adding some of the ones you linked.

@twe4ked
Copy link
Contributor Author

twe4ked commented Jan 24, 2020

codecov/patch — 57.14% of diff hit (target 82.7%)

Oh interesting, it doesn't count doc tests.

@jonhoo
Copy link
Owner

jonhoo commented Jan 24, 2020

I think the reason to add tests to this is more in case we decide to make more optimizations specifically to the set version later.

Not sure what you mean by doctests not being covered? They should be. There was an issue about it in the past, but that appears to have been solved.

src/set/mod.rs Outdated Show resolved Hide resolved
src/set/mod.rs Outdated Show resolved Hide resolved
@jonhoo
Copy link
Owner

jonhoo commented Jan 24, 2020

Just a heads up that I moved a bunch of code around (though it is mostly unchanged) and renamed FlurryHashMap to just HashMap. Shouldn't be too hard to merge with, but just so you are prepared for git complaining 😅

@twe4ked
Copy link
Contributor Author

twe4ked commented Jan 24, 2020

Not sure what you mean by doctests not being covered? They should be. There was an issue about it in the past, but that appears to have been solved.

I guess I got confused by the diff code cov dropping. I might have been the changes I rebased in. I still don't know why the coverage isn't 100%. Should be fixed when I add in the other tests.

Just a heads up that I moved a bunch of code around

Thanks for the heads up!

@cuviper
Copy link
Collaborator

cuviper commented Jan 31, 2020

When you get back to this, it would be nice to also have a HashSetRef like #45 did for maps. That could be a separate PR though.

@twe4ked
Copy link
Contributor Author

twe4ked commented Feb 2, 2020

This branch is out-of-date with the base branch

Haha, I just rebased this branch!

All good, taking a look at the tests now :)

src/set/mod.rs Outdated
map: HashMap<T, (), S>,
}

impl<T> HashSet<T, crate::DefaultHashBuilder>
Copy link
Collaborator

@cuviper cuviper Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DefaultHashBuilder is correct for fn new, but the other methods should allow a generic S: BuildHasher. Consider adding the other constructors too: with_capacity using the default, with_hasher and with_capacity_and_hasher using S.

@twe4ked
Copy link
Contributor Author

twe4ked commented Feb 17, 2020

All good, taking a look at the tests now :)

I had a look at the relevant tests from the Java codebase and honestly it looked like everything is covered by the doc tests. They were all pretty simple tests that didn't seem hugely relevant to how this is implemented here.

I'm happy to “finish” this PR off by adding the missing constructor methods and getting it back up to date with master but I don't think I'll have time for much further than that. Also happy to close as I haven't done anything groundbreaking here and it should be easy for someone else to pick up if they want to :)

HashSetRef also sounds like a great idea but, as mentioned, probably shouldn't block this PR being merged.

Thoughts?

@jonhoo
Copy link
Owner

jonhoo commented Feb 18, 2020

If you bring it up to speed with master, I'd be happy to merge without a HashSetRef :)

Sticking to just the doc tests seems fine by me!

@twe4ked
Copy link
Contributor Author

twe4ked commented Feb 19, 2020

Tests are failing with:

/bin/bash --noprofile --norc /Users/runner/runners/2.164.8/work/_temp/b4288b92-8bed-41ec-ad74-9fcb74730d03.sh
    Updating git repository `https://github.com/cynecx/crossbeam.git`
    Updating crates.io index
error: failed to select a version for the requirement `crossbeam-epoch = "^0.9"`
  candidate versions found which didn't match: 0.8.0, 0.7.2, 0.7.1, ...
  location searched: crates.io index

Looks like crossbeam-rs/crossbeam#458 has been merged which may have broken something?

The branch is still around though, so not sure why it can't use it.

@twe4ked twe4ked changed the title Implement FlurryHashSet insert() and contains() Add basic HashSet implementation Feb 19, 2020
@jonhoo
Copy link
Owner

jonhoo commented Feb 19, 2020

Ah, sorry, should be fixed on master now!

@twe4ked
Copy link
Contributor Author

twe4ked commented Feb 19, 2020

@jonhoo
Copy link
Owner

jonhoo commented Feb 20, 2020

Ah, yes. I have a fix pending for that in #63.

@jonhoo
Copy link
Owner

jonhoo commented Feb 20, 2020

master should now include doctests in coverage

@twe4ked
Copy link
Contributor Author

twe4ked commented Feb 22, 2020

/bin/bash --noprofile --norc /home/vsts/work/_temp/ea0e731b-71c1-4fb0-bdd9-a3623a442bb5.sh
error: component 'miri' for target 'x86_64-unknown-linux-gnu' is unavailable for download for channel nightly
Sometimes not all components are available in any given nightly.

I think miri might be unavailable on nightly at the moment?

https://rust-lang.github.io/rustup-components-history/

@jonhoo
Copy link
Owner

jonhoo commented Feb 22, 2020

Yup, it is — see also #61 (comment). Should hopefully be fixed soon-ish.

@twe4ked
Copy link
Contributor Author

twe4ked commented Feb 25, 2020

It finally passed! 🎉

Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @twe4ked)


src/set.rs, line 44 at r2 (raw file):

#[derive(Debug)]
pub struct HashSet<T: 'static, S = crate::DefaultHashBuilder>
where

I'm trying out a new review tool, so if this turns out super weird, I apologize in advance :p
We don't need to require all these bounds here. Instead, we should only specify those bounds further down on the impl blocks that require it. You can probably just use the same bounds that HashMap uses for its impls.


src/set.rs, line 117 at r2 (raw file):

}

impl<T> HashSet<T, crate::DefaultHashBuilder>

Minor: It makes more sense to me for this impl block to come before the with_hasher methods.


src/set.rs, line 167 at r2 (raw file):

    /// Adds a value to the set.
    ///
    /// If the set did not have this value present, true is returned.

Nit: Should false and true here be in backticks? Same in a couple of other places in the docs.


src/set.rs, line 182 at r2 (raw file):

    /// assert!(set.contains(&2));
    /// ```
    pub fn insert(&self, value: T) -> bool {

Should we also add HashSet::replace while we're at it? It'd have to return an Option<&'guard T> for us, but should be a straightforward near-copy of insert.


src/set.rs, line 183 at r2 (raw file):

    /// ```
    pub fn insert(&self, value: T) -> bool {
        let guard = epoch::pin();

Reading through this again, I think we should switch all of the HashSet methods to taking a &Guard, just like the HashMap APIs. And then probably also provide a HashSet::guard that calls into the underlying HashMap::guard. That way it will be much easier to add HashSetRef later.


src/set.rs, line 234 at r2 (raw file):

    /// assert_eq!(set.remove(&2), false);
    /// ```
    pub fn remove<Q>(&self, value: &Q) -> bool

Should we add HashSet::take while we're at it? It'd have to return an Option<&'guard T> for us, but beyond that the functionality should be easy-enough to add.

@twe4ked
Copy link
Contributor Author

twe4ked commented Mar 2, 2020

I've actioned your feedback except for adding HashSet::take and HashSet::replace. I had a quick look into them and it wasn't immediately clear to me how to go about implementing it and I don't have time to dig into it right now.

Looks like miri is unavailable again.

Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 3 files reviewed, 3 unresolved discussions (waiting on @jonhoo and @twe4ked)

@jonhoo
Copy link
Owner

jonhoo commented Mar 2, 2020

Thanks! I'm going to go ahead and merge this, and then I'll do some tweaks after the fact since you're short on time :)

@jonhoo jonhoo merged commit f0e8bee into jonhoo:master Mar 2, 2020
@jonhoo
Copy link
Owner

jonhoo commented Mar 2, 2020

See #68

@twe4ked twe4ked deleted the set branch March 2, 2020 23:24
@domenicquirl domenicquirl mentioned this pull request Mar 16, 2020
jonhoo pushed a commit that referenced this pull request Feb 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants