Pattern De-Duplication based on Subsequence Detection #1031
What do we think of this possible cleanup? Approving because this does work as-is!!
.filter((pattern) => {
  // Compare to all other patterns TODO: make this beat O(n^2)
  return !patternsSortedByLength.find((p) => {
    // Don't compare against ourself
    if (p.id === pattern.id) return false

    // If our pattern is longer, it's not a subset
    if (p.stops.length < pattern.stops.length) return false

    return isValidSubsequence(
      p.stops.map((s) => s.id),
      pattern.stops.map((s) => s.id)
    )
  })
})
What about
Current:
.filter((pattern) => {
  // Compare to all other patterns TODO: make this beat O(n^2)
  return !patternsSortedByLength.find((p) => {
    // Don't compare against ourself
    if (p.id === pattern.id) return false
    // If our pattern is longer, it's not a subset
    if (p.stops.length < pattern.stops.length) return false
    return isValidSubsequence(
      p.stops.map((s) => s.id),
      pattern.stops.map((s) => s.id)
    )
  })
})

Suggested:
.filter((pattern, patternIndex) => {
  // Compare to all other patterns larger than the current pattern
  return !patternsSortedByLength.find((p, pIndex) => {
    if (pIndex >= patternIndex) return false
    return isValidSubsequence(
      p.stops.map((s) => s.id),
      pattern.stops.map((s) => s.id)
    )
  })
})
You're already sorting by length, so checking anything above a certain index number feels like a waste of time, and you could remove the if (p.id === pattern.id) block because p and pattern should have the same index. This gets the same result for 1-line patterns but might need more testing!!
I think we need to check the pattern id because the indexes don't always line up. I agree the length check is not required, but to drop it we need to start checking the second array at the right index, and that logic is currently not present.
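One way to sidestep the index mismatch discussed above is to sort a copy of the patterns inside the helper and compare each candidate only against patterns that have already been kept. The sketch below is a hypothetical helper, not code from this PR: `dedupePatterns` and its `isSubsequence` parameter are assumed names, and the predicate is assumed to take (longer array, shorter sequence) like the `isValidSubsequence` calls above.

```javascript
// Hypothetical helper, not part of the PR. Keeps only patterns whose stop ids
// are not a subsequence of an already-kept pattern of equal or greater length.
// `isSubsequence(array, sequence)` is injected; the (longer, shorter) argument
// order is an assumption based on the calls in the suggestion above.
function dedupePatterns(patterns, isSubsequence) {
  // Compute each id list once and sort a copy, longest first.
  const sorted = patterns
    .map((pattern) => ({ pattern, ids: pattern.stops.map((s) => s.id) }))
    .sort((a, b) => b.ids.length - a.ids.length)

  const kept = []
  for (const candidate of sorted) {
    // Anything that could contain the candidate was sorted ahead of it. If it
    // was dropped, the kept pattern that covered it covers the candidate too.
    const covered = kept.some((k) => isSubsequence(k.ids, candidate.ids))
    if (!covered) kept.push(candidate)
  }
  return kept.map((k) => k.pattern)
}
```

Because a pattern is only ever compared with entries sorted ahead of it, neither the id check nor the length check is needed. This is still O(n^2) in the worst case. It also keeps one copy when two patterns have identical stops, whereas the filter above appears to drop both, since each is a valid subsequence of the other.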
Neat algorithm, I don't see a way to make it faster immediately. Good start!
Description:
Possibly a slightly overbuilt way to use subsequence detection to remove patterns that are simply a subsequence of another, larger pattern.
There are definitely a few optimizations possible here. I am looking for feedback on some of these algorithms and whether there's a way to make them more efficient.
I believe this should be possible in O(n) time. Right now it's O(n^2), although reversing the sorted array should help remove the most useless of these comparisons.
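For reference, the subsequence check itself can be done in a single pass with two pointers. The PR's actual `isValidSubsequence` is not shown in this thread, so the following is only a sketch of what such a helper could look like, assuming the first argument is the longer array and the second is the candidate sequence.

```javascript
// Sketch only: an assumed implementation, not the PR's code.
// Returns true when every element of `sequence` appears in `array`
// in the same relative order (not necessarily contiguously).
function isValidSubsequence(array, sequence) {
  let matched = 0 // index of the next element of `sequence` to find
  for (const item of array) {
    if (matched === sequence.length) break // everything already matched
    if (item === sequence[matched]) matched++
  }
  return matched === sequence.length
}
```

Each call is linear in the length of `array`, so the overall cost is dominated by how many pattern pairs get compared.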