-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
model: Fix the Set
contract for data classes that implement Comparable
#6229
Conversation
…able` Kotlin's data classes have auto-generated `equals()` functions that take all primary properties into account. When implementing the `Comparable` interface for those classes, i.e. to give them a defined order in a `SortedSet`, special attention needs to be put on `compareTo()` to return 0 if and only if `equals()` returns true. This is required because "a sorted set performs all element comparisons using its compareTo method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal" [1]. This means that sorted sets are considered equal if for all their respective elements `compareTo()` returns 0. So if `compareTo()` is implemented to not consider all properties for an equality check, this can lead to unexpected results, for example when using Kotest's data class matchers. Fix this and obey the general contract of the `Set` interface by taking all properties into account if the properties to sort on are equal. As the ordering does not matter if not all properties are equal but the properties to sort on are, arbitrarily return 1 from `compareTo()` in that case. [1]: https://docs.oracle.com/javase/8/docs/api/java/util/SortedSet.html Signed-off-by: Sebastian Schuberth <[email protected]>
when { | ||
id < other.id -> -1 | ||
id > other.id -> 1 | ||
else -> if (this == other) 0 else 1 // Return arbitrary inequality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this commit do impact the performance negatively.
In this particular case it leads to recursion: the whole subtree rooted at this
and other
is compared, which I doubt is a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the whole subtree rooted at
this
andother
is compared, which I doubt is a good idea.
Actually, that's exactly the point in order to be able to use Kotest's data class matchers e.g. on whole OrtResults. This would allow to get rid of the fragile text line comparison which is really hard to grasp on CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the text comparison also makes it very easy to see the effect of changes during development and to update the expected results files, if the plan is to change the matching for the functional analyzer tests we need to consider this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the text comparison also makes it very easy to see the effect of changes
True, and it makes it easy to take over new expected results from actual results. My main point is that I believe Kotest's data class matchers should work with out code base in case someone decides to use them in tests, but currently they don't as soon as a SortedSet
is involved anywhere in the hierarchy.
metadata.id < other.metadata.id -> -1 | ||
metadata.id > other.metadata.id -> 1 | ||
else -> if (this == other) 0 else 1 // Return arbitrary inequality. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this commit, I would again like to bring up the question whether
using a combination of the following two for the task of sorting things in serialization is the right approach:
- Sorted collections
- Implementing Comparable
E.g.: If implementing Comparable was avoided (for ordering things in the serializer), would these kind of problems then be not present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If implementing Comparable was avoided (for ordering things in the serializer), would these kind of problems then be not present?
Yes. I was also thinking about not using any sorted sets in the model classes at all, and only sort upon serialization, as sorting should only matter for human readability anyway. Doing this should also improve insertion performance into the (now non-sorted) sets. However, I did not find an elegant way to sort collections on serialization without writing custom serializers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we agree that this could be good to do. I've tried to write this down in an issue, see #6235.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you prefer to move this PR forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know yet. This is just a draft to play with (I wanted to see what tests it breaks, if any).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I did not find an elegant way to sort collections on serialization without writing custom serializers.
There's actually an existing issue in Jackson for making this easily configurable, see FasterXML/jackson-databind#3166.
I'm closing this in favor of #6244. |
Kotlin's data classes have auto-generated
equals()
functions that take all primary properties into account. When implementing theComparable
interface for those classes, i.e. to give them a defined order in aSortedSet
, special attention needs to be put oncompareTo()
to return 0 if and only ifequals()
returns true. This is required because "a sorted set performs all element comparisons using its compareTo method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal" 1. This means that sorted sets are considered equal if for all their respective elementscompareTo()
returns 0. So ifcompareTo()
is implemented to not consider all properties for an equality check, this can lead to unexpected results, for example when using Kotest's data class matchers.Fix this and obey the general contract of the
Set
interface by taking all properties into account if the properties to sort on are equal. As the ordering does not matter if not all properties are equal but the properties to sort on are, arbitrarily return 1 fromcompareTo()
in that case.Signed-off-by: Sebastian Schuberth [email protected]