Add benchmarks for nubEq, union, intersect, and difference #206

Closed
wants to merge 6 commits

Conversation

JordanMartinez
Contributor

No description provided.

@JordanMartinez
Contributor Author

Benchmark results:

```
Array
===
mapMaybe
---------------
mapMaybe (101)
mean   = 14.41 μs
stddev = 32.39 μs
min    = 4.48 μs
max    = 551.21 μs
mapMaybe (10001)
mean   = 489.23 μs
stddev = 661.88 μs
min    = 336.78 μs
max    = 6.35 ms

nubEq
---------------
nubEq (101)
mean   = 85.85 μs
stddev = 75.58 μs
min    = 60.53 μs
max    = 1.11 ms
nubEq (10001)
mean   = 420.09 ms
stddev = 46.84 ms
min    = 370.88 ms
max    = 577.13 ms

union
---------------
union (101)
mean   = 127.70 μs
stddev = 101.86 μs
min    = 85.04 μs
max    = 1.71 ms
union (10001)
mean   = 476.17 ms
stddev = 48.16 ms
min    = 437.69 ms
max    = 779.99 ms

intersect
---------------
intersectBy (101)
mean   = 49.29 μs
stddev = 35.80 μs
min    = 41.31 μs
max    = 595.78 μs
intersectBy (10001)
mean   = 373.92 ms
stddev = 19.77 ms
min    = 357.94 ms
max    = 459.93 ms

difference
---------------
difference (101)
mean   = 70.58 μs
stddev = 52.80 μs
min    = 54.25 μs
max    = 642.70 μs
difference (10001)
mean   = 443.65 ms
stddev = 16.29 ms
min    = 423.03 ms
max    = 539.40 ms
```

@hdgarrood
Contributor

Since these functions only behave interestingly on arrays that contain duplicates, we should probably ensure that the arrays we benchmark with include some duplicates. If we only test arrays in which every element is unique, and there's a performance issue that makes these functions especially slow on arrays with many duplicates, these benchmarks won't catch it.
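
For illustration, a minimal sketch of an input generator along these lines (a hypothetical helper, not code from this PR), where every value appears exactly twice:

```purescript
module Bench.Input where

import Prelude
import Data.Array (range)

-- Every value k in 0 .. (n / 2 - 1) appears twice, once in each half,
-- so the Eq-based functions have real duplicate work to do.
-- Assumes n is even and at least 2.
natsWithDups :: Int -> Array Int
natsWithDups n = range 0 (n / 2 - 1) <> range 0 (n / 2 - 1)
```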

@JordanMartinez
Contributor Author

I've updated the test arrays so that half of their elements are unique and the other half contain duplicates keyed by the value (e.g. the value 3 appears 3 times). I'm not sure whether the second half of the array should instead be nothing but copies of a single number (e.g. 50 copies of 3) for shortNatsDup.
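
Roughly, that scheme looks like the following sketch (hypothetical names; the actual PR code differs):

```purescript
module Bench.HalfDups where

import Prelude
import Data.Array (concatMap, range, replicate, take)

-- First half: unique values that never repeat. Second half: k copies of
-- each small k (1 once, 2 twice, 3 three times, ...), truncated to fit.
halfUniqueHalfDups :: Int -> Array Int
halfUniqueHalfDups n =
  let half = n / 2
  in range (half + 1) n
       <> take half (concatMap (\k -> replicate k k) (range 1 half))
```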

@milesfrain
Contributor

milesfrain commented Jan 8, 2021

The array creation in this PR currently involves a lot of magic numbers that will be tricky to edit if we want to change the sizes. It would also be good to shuffle the data more.

Here's another way to generate the input data:
https://github.com/milesfrain/bench-array-demo/blob/main/test/Main.purs

Unfortunately, this uses quickcheck's shuffle, which won't work here due to a circular dependency. It might be possible to break this circular dependency with the following steps:


Edit: On second thought, maybe just reversing the first half of each input array is a reasonable enough approximation of a shuffle for most sorting algorithms to deal with.
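
As a sketch, that pseudo-shuffle is a one-liner over Data.Array (hypothetical helper name):

```purescript
module Bench.PseudoShuffle where

import Prelude
import Data.Array (drop, length, reverse, take)

-- Reverse the first half and leave the second half in order, e.g.
-- [1,2,3,4,5,6] -> [3,2,1,4,5,6].
reverseFirstHalf :: forall a. Array a -> Array a
reverseFirstHalf xs =
  let half = length xs / 2
  in reverse (take half xs) <> drop half xs
```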


Edit2: Here's a version that shuffles by interleaving the first half of the array with the reversed other half. https://github.com/milesfrain/bench-array-demo/blob/no-quickcheck/test/Main.purs
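
For reference, a sketch of that interleaving approach (hypothetical name, assuming the Data.Array API; see the linked branch for the actual code):

```purescript
module Bench.Interleave where

import Prelude
import Data.Array (concat, drop, length, reverse, take, zipWith)

-- Interleave the first half with the reversed second half, e.g.
-- [1,2,3,4,5,6] -> [1,6,2,5,3,4]. zipWith truncates to the shorter
-- side, so the middle element of an odd-length array is dropped;
-- even-sized benchmark inputs avoid that.
interleaveHalves :: forall a. Array a -> Array a
interleaveHalves xs =
  let half = length xs / 2
      front = take half xs
      back = reverse (drop half xs)
  in concat (zipWith (\a b -> [ a, b ]) front back)
```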

@milesfrain
Contributor

Since there's no rush on #203, it might make sense to tackle things in this order:
