Support rayon parallel iterators #14

Open
jonhoo opened this issue Jan 23, 2020 · 10 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

@jonhoo
Owner

jonhoo commented Jan 23, 2020

We should implement rayon parallel iterators. The notes and code in iter/plumbing may be helpful, and the hashbrown implementation too.

@jonhoo jonhoo added enhancement New feature or request help wanted Extra attention is needed good first issue Good for newcomers labels Jan 23, 2020
@cuviper
Collaborator

cuviper commented Jan 23, 2020

I'm happy to answer rayon questions on this.

@jonhoo
Owner Author

jonhoo commented Jan 23, 2020

@cuviper Since you're offering, one thing I was wondering is whether we can somehow take advantage of the fact that the map supports fully concurrent access. Since non-concurrent maps like hashbrown are also able to implement the parallel traits, it wasn't immediately clear to me how you get a win from flurry in rayon, even though it really feels like it should be possible.

@jonhoo
Owner Author

jonhoo commented Jan 23, 2020

I suspect the answer has to do with some of the traits in iter/plumbing, but I'm not sure.

@cuviper
Collaborator

cuviper commented Jan 23, 2020

Concurrent insert will make a naive ParallelExtend and FromParallelIterator trivial, just something like par_iter.for_each(|(key, value)| { map.insert(key, value); }). Maybe there's a more advanced approach that could improve performance, like folding into separate bins and then reducing into the final map, but I don't know your data structure enough to evaluate that.
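As a rough illustration of the naive approach described above, here is a minimal sketch that uses only the standard library. It makes two loud assumptions: `ShardedMap` is a hypothetical stand-in for a concurrent map whose `insert` takes `&self` (it is not flurry's API), and scoped threads stand in for rayon's `for_each`. The point is only that every worker inserts straight into the shared map, with no per-worker collection step.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a concurrent map (NOT flurry's API):
/// `insert` takes `&self`, so many threads can call it at the same time.
struct ShardedMap<K, V> {
    shards: Vec<Mutex<HashMap<K, V>>>,
}

impl<K: Hash + Eq, V> ShardedMap<K, V> {
    fn new(shards: usize) -> Self {
        assert!(shards > 0, "need at least one shard");
        Self {
            shards: (0..shards).map(|_| Mutex::new(HashMap::new())).collect(),
        }
    }

    /// Pick the shard responsible for `key` from its hash.
    fn shard(&self, key: &K) -> &Mutex<HashMap<K, V>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    fn insert(&self, key: K, value: V) -> Option<V> {
        self.shard(&key).lock().unwrap().insert(key, value)
    }

    fn get(&self, key: &K) -> Option<V>
    where
        V: Clone,
    {
        self.shard(key).lock().unwrap().get(key).cloned()
    }

    fn len(&self) -> usize {
        self.shards.iter().map(|s| s.lock().unwrap().len()).sum()
    }
}

/// Naive parallel extend: hand each worker a chunk of the input and let it
/// insert directly into the shared map.
fn par_extend<K, V>(map: &ShardedMap<K, V>, items: Vec<(K, V)>, workers: usize)
where
    K: Hash + Eq + Send,
    V: Send,
{
    let workers = workers.max(1);
    let chunk = ((items.len() + workers - 1) / workers).max(1);
    let mut items = items.into_iter();
    thread::scope(|s| loop {
        // Take the next chunk by value so the worker owns its entries.
        let part: Vec<(K, V)> = items.by_ref().take(chunk).collect();
        if part.is_empty() {
            break;
        }
        s.spawn(move || {
            for (key, value) in part {
                map.insert(key, value);
            }
        });
    });
}
```

With rayon and a truly concurrent map, the body of `par_extend` would collapse to the `for_each` one-liner quoted above; the chunking here exists only because the sketch avoids external crates.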

Parallel iterators are a bit harder -- you need a strategy for splitting the map into separate "slices"/"views" of some sort. The hashbrown implementation should be a good reference if you look at how they use RawIterRange::split internally.

@jonhoo
Owner Author

jonhoo commented Jan 23, 2020

Ah, I see, hashbrown is forced to first reduce and then use one core to build the map, whereas we don't have to do that. Neat!
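For contrast, the "reduce first, then build on one core" pattern that a non-concurrent map is limited to might look roughly like the sketch below. This is an illustration of the pattern only, not hashbrown's or rayon's actual code: `collect_then_build` is a hypothetical name, and scoped threads from the standard library stand in for rayon.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::thread;

/// Hypothetical helper: workers produce entries in parallel into private
/// buffers, then a single thread builds the map from those buffers.
fn collect_then_build<T, K, V, F>(inputs: &[T], workers: usize, f: F) -> HashMap<K, V>
where
    T: Sync,
    K: Hash + Eq + Send,
    V: Send,
    F: Fn(&T) -> (K, V) + Sync,
{
    let workers = workers.max(1);
    let chunk = ((inputs.len() + workers - 1) / workers).max(1);
    let f = &f; // shared reference that every worker can copy

    // Parallel phase: each worker fills its own Vec, so no map is shared.
    let parts: Vec<Vec<(K, V)>> = thread::scope(|s| {
        let handles: Vec<_> = inputs
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().map(f).collect::<Vec<_>>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Sequential phase: only one thread may touch a non-concurrent map.
    let mut map = HashMap::with_capacity(inputs.len());
    for part in parts {
        map.extend(part);
    }
    map
}
```

The sequential phase at the end is the step a concurrent map can skip, since its workers can insert into the final map directly.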

@jonhoo jonhoo added this to the 1.0 milestone Jan 31, 2020
@Stupremee
Contributor

I would like to give this a try.

@jonhoo
Owner Author

jonhoo commented Feb 4, 2020

All yours!

@Stupremee
Contributor

I need some help. What's the best way to split a Table into two halves?

@jonhoo
Owner Author

jonhoo commented Feb 10, 2020

The best way is probably to just split the list of bins in two: the "high" bins and the "low" bins.
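One way to picture the low/high split is the sketch below. Everything in it is an assumption made for illustration: `BinRange` is a hypothetical view type, bins are modelled as plain `Vec`s rather than flurry's real bin structure, and scoped threads replace rayon's scheduler. The `split` method deliberately has the same shape as rayon's `UnindexedProducer::split`, which returns `(Self, Option<Self>)`.

```rust
use std::thread;

/// Hypothetical view over a contiguous run of bins. Each bin is modelled as
/// a Vec of entries; a real table's bins would look different.
struct BinRange<'a, K, V> {
    bins: &'a [Vec<(K, V)>],
}

impl<'a, K, V> BinRange<'a, K, V> {
    /// Split into the "low" bins and the "high" bins. The second half is
    /// `None` once the range is too small to split further.
    fn split(self) -> (Self, Option<Self>) {
        if self.bins.len() < 2 {
            return (self, None);
        }
        let (low, high) = self.bins.split_at(self.bins.len() / 2);
        (BinRange { bins: low }, Some(BinRange { bins: high }))
    }
}

/// Sum `f(key, value)` over all entries, splitting the bin range in two at
/// most `splits_left` times along each path before going sequential.
fn par_sum<K, V, F>(range: BinRange<'_, K, V>, splits_left: u32, f: &F) -> u64
where
    K: Sync,
    V: Sync,
    F: Fn(&K, &V) -> u64 + Sync,
{
    let halves = if splits_left == 0 {
        (range, None)
    } else {
        range.split()
    };
    match halves {
        // No further split: walk the remaining bins on this thread.
        (rest, None) => rest
            .bins
            .iter()
            .flatten()
            .map(|(k, v)| f(k, v))
            .sum::<u64>(),
        // Process the high bins on another thread, the low bins here.
        (low, Some(high)) => thread::scope(|s| {
            let high_sum = s.spawn(move || par_sum(high, splits_left - 1, f));
            let low_sum = par_sum(low, splits_left - 1, f);
            low_sum + high_sum.join().unwrap()
        }),
    }
}
```

In a real rayon integration the scheduler decides how often to call `split`, so the `splits_left` counter here is only a stand-in for that decision.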

@Stupremee
Contributor

I have no idea how to approach this. I think it's better if I let someone else do it.

Others added a commit to Others/flurry that referenced this issue May 26, 2020
jonhoo pushed a commit that referenced this issue Jun 15, 2020
3 participants