Make `union*`, `difference*`, and `intersect*` run linearly by sorting elements first #203
I've updated the |
`union*`, `difference*`, and `intersect*` run linearly by sorting elements first
This PR is ready for review |
I haven't reviewed properly yet but just so you know, sorting based on a comparison function which only compares two elements at a time is always at least O(n log n), which is beyond linear. Not by that much, admittedly, since |
Hmm..... Besides the
Assume the following values for each benchmark below:
let
  shortNats = Array.range 0 100
  longNats = Array.range 0 10000
  mod3Eq x y = (x `mod` 3) == (y `mod` 3)
  mod3Cmp x y = compare (x `mod` 3) (y `mod` 3)

-- `master` uses `mod3Eq` whereas this PR uses `mod3Cmp`
benchUnionBy = do
  log $ "unionBy (" <> show (Array.length shortNats) <> ")"
  benchWith 1000 \_ -> Array.unionBy mod3Eq shortNats shortNats
  log $ "unionBy (" <> show (Array.length longNats) <> ")"
  benchWith 100 \_ -> Array.unionBy mod3Eq longNats longNats

benchIntersectBy = do
  log $ "intersectBy (" <> show (Array.length shortNats) <> ")"
  benchWith 1000 \_ -> Array.intersectBy mod3Eq shortNats shortNats
  log $ "intersectBy (" <> show (Array.length longNats) <> ")"
  benchWith 100 \_ -> Array.intersectBy mod3Eq longNats longNats

benchDifference = do
  log $ "difference (" <> show (Array.length shortNats) <> ")"
  benchWith 1000 \_ -> Array.difference shortNats shortNats
  log $ "difference (" <> show (Array.length longNats) <> ")"
  benchWith 100 \_ -> Array.difference longNats longNats
src/Data/Array.purs
Outdated
-- combineIndex compare [0] [0, 1]
-- == [t3 true 0 0, t3 false 1 0, t3 false 2 1]
-- combineIndex compare [0] [0, 1]
-- == [t3 true 0 0, t3 false 1 0, t3 false 2 1]
-- combineIndex compare [7] [9, 5]
-- == [t3 true 0 7, t3 false 1 9, t3 false 2 5]
The example is easier to follow with values that are distinct from the indices.
But then there aren't values that exist in both arrays.
Since `combineIndex` doesn't care about duplicates, I don't think we need the example to involve duplicates.
This would be like saying that `[1,2] <> [3,4] == [1,2,3,4]` is a bad example for append because it doesn't clarify that duplicates are preserved.
Could compromise with something like `[7] [7, 5]`.
> Since `combineIndex` doesn't care about duplicates, I don't think we need the example to involve duplicates.
🤦♂️ You're right! I forgot that this function doesn't need to care, unlike the ones I'm working on. I'll make that fix.
-- ```
combineIndex :: forall a. Array a -> Array a -> Array (Tuple Boolean (Tuple Int a))
combineIndex left right = ST.run do
  out <- STA.new
Are there any benefits to preallocating the array to a known size? I tried to find a clear answer for this in JS. If there is a benefit that we want to take advantage of, we'd need to expand the `Array.ST` API.
I've wondered about that myself...
let rightLen = length right
ST.for 0 rightLen \idx -> do
  let val = unsafePartial $ unsafeIndex right idx
  void $ STA.push (Tuple false (Tuple (leftLen + idx) val)) out
Are there any performance penalties to using a record instead of a tuple? Luckily, this tuple happens to be type-safe because of the distinct types (`Boolean`, `Int`, `a`).
Not sure. I also tried it out with `data Tuple3 a b c = Tuple3 a b c` and that sped things up only slightly.
unionBy :: forall a. (a -> a -> Boolean) -> Array a -> Array a -> Array a
unionBy eq xs ys = xs <> foldl (flip (deleteBy eq)) (nubByEq eq ys) xs
unionBy :: forall a. (a -> a -> Ordering) -> Array a -> Array a -> Array a
unionBy cmp left right = map snd $ ST.run do
It may be more efficient to take advantage of the fact that the output is always the first array plus some other stuff appended to it. Here's some non-ST pseudo-ish code describing that option:
union left right = left <> other where
  other =
    combineIndex right left -- intentionally putting right first and assuming stable sort
      # sortBy valueFunc
      # groupBy valueFunc
      # map head
      # filter keepRightOnly
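A minimal, non-ST sketch of the "first array plus extras" idea, not the PR's implementation. The names `unionBySorted` and `Tagged` are hypothetical, and the sketch assumes `Data.Array.sortBy` is stable. Unlike the pseudo-code above, it tags and places `left` first, so that after the stable sort a group's head is a `left` element whenever the value occurs in `left`. It then re-sorts the surviving `right` elements by index to restore their original order.

```purescript
module UnionSketch where

import Prelude

import Data.Array (groupBy, mapMaybe, mapWithIndex, sortBy, sortWith)
import Data.Array.NonEmpty as NEA
import Data.Maybe (Maybe(..))

-- Hypothetical tagged element: source array, original position, and value.
type Tagged a = { fromLeft :: Boolean, index :: Int, value :: a }

-- Hypothetical name; a sketch of `left <> extras`, where `extras` are the
-- deduplicated elements of `right` that match nothing in `left`.
unionBySorted :: forall a. (a -> a -> Ordering) -> Array a -> Array a -> Array a
unionBySorted cmp left right = left <> extras
  where
  tag :: Boolean -> Array a -> Array (Tagged a)
  tag fromLeft = mapWithIndex \index value -> { fromLeft, index, value }

  -- Left elements come first, so a stable sort keeps them ahead of equal
  -- right elements (assumption: `sortBy` is stable).
  tagged = tag true left <> tag false right

  -- Keep a group only if its first element came from `right`.
  keepRightOnly group =
    if (NEA.head group).fromLeft then Nothing else Just (NEA.head group)

  extras =
    tagged
      # sortBy (\x y -> cmp x.value y.value)
      # groupBy (\x y -> cmp x.value y.value == EQ)
      # mapMaybe keepRightOnly
      # sortWith _.index -- restore the original order of `right`
      # map _.value
```

The two sorts dominate the cost, so this is still O(n log n) in the combined length; whether it beats the ST version would need benchmarking.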
What would be the runtime cost of this approach versus mine?
I made the rather stupid assumption that mine would be `n+m` because when Harry said the current version was `n*m`, I thought he was implying we should make the code linear. I didn't actually analyze my code using Big O notation to see how many steps it takes.
I believe all reasonable approaches are `O(n log n)` (where n is the sum of array sizes). But benchmarking could still reveal a 3x speedup with a different approach, which would still be `O(n log n)`. I'll play around with this and report back with findings. The choice of input data could also significantly change the relative performance of different algorithms.
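For approaches that order elements only through a two-argument comparison function, the `n log n` figure follows from the standard decision-tree argument, sketched here for reference: a comparison sort must distinguish all n! input orderings, so its worst-case number of comparisons is at least log2(n!).

```math
\log_2(n!) \;=\; \sum_{k=1}^{n} \log_2 k \;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \log_2 k \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2} \;=\; \Omega(n \log n)
```

This bounds only the sorting step; constant factors, which the benchmarks measure, are not captured by it.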
I think these code changes should wait until we have reached a consensus on how these functions should actually work - there are still unknowns to do with things like the handling of duplicates. I’m not sure how much it helps to change |
So, will this change be merged before |
Minor optimization by only working with indices and copying values only once to the final array. |
I’m tempted to leave it for 0.15.0 personally. |
I think this shouldn't be implemented until we have |
I'm going to mark |
Fixes #192