Cluster sharding delivers message to the wrong entity #1463
Comments
@jchapuis @Roiocam I looked at the PR that seems to be the issue and I think if we try to change the @jchapuis would you have any idea how we could minimise a reproducible test that could be used for regression purposes?
@pjfanning I haven't yet had the time to look into this. I decided to report the problem as soon as I had a certain degree of confidence that there was a change of behavior; with the release still being young, I got worried. I would suggest some tests that send commands to multiple sharded entities in quick succession and verify that each command gets delivered to the proper destination.
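For illustration, the shape of such a check can be sketched without a cluster at all. The snippet below is only a rough sketch under stated assumptions: `DeliveryCheckSketch`, `Command` and `misrouted` are hypothetical names, and it exercises a shared extractor function from several threads rather than real Pekko sharding, so it is not a substitute for a proper multi-entity test.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, Executors, TimeUnit}
import scala.jdk.CollectionConverters._

// Hypothetical harness, not part of Pekko or endless4s.
object DeliveryCheckSketch {
  final case class Command(targetEntityId: String, seqNr: Int)

  // Sends `perEntity` commands to each of `entityCount` entities from parallel
  // threads through ONE shared extractor, and returns every command whose
  // extracted entity id differs from the id it was addressed to.
  def misrouted(
      extract: PartialFunction[Command, (String, Command)],
      entityCount: Int,
      perEntity: Int): List[(Command, String)] = {
    val pool = Executors.newFixedThreadPool(entityCount)
    val start = new CountDownLatch(1)
    val wrong = new ConcurrentLinkedQueue[(Command, String)]()

    (1 to entityCount).foreach { e =>
      pool.execute(() => {
        start.await() // release all senders at once to maximise contention
        (1 to perEntity).foreach { n =>
          val cmd = Command(s"entity-$e", n)
          // Mirror the isDefinedAt-then-apply call pattern of a partial function.
          if (extract.isDefinedAt(cmd)) {
            val (routedTo, _) = extract(cmd)
            if (routedTo != cmd.targetEntityId) wrong.add((cmd, routedTo))
          }
        }
      })
    }

    start.countDown()
    pool.shutdown()
    pool.awaitTermination(30, TimeUnit.SECONDS)
    wrong.asScala.toList
  }
}
```

With a stateless extractor such as `{ case c => (c.targetEntityId, c) }` the returned list should be empty; a regression test would assert exactly that for the extractor under test.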
@raboof it would be nice to be able to start on the 1.0.1 RC in the next few days. What do you think of this course of action?
@pjfanning sure, happy to run the endless4s tests as soon as the revert is merged
After investigating, I think because the extractEntityId instance was shared by both
Thanks @jchapuis and @pjfanning
@pjfanning @Roiocam I can confirm my tests are now passing with the revert
#1467 merged
* add unit test protect ExtractEntityId can be shared safely
  Related with #1463
* chore: avoid the double evaluation of entityId in ClusterSharding (#1304)
* chore: avoid the double evaluation of entityId in ClusterSharding
* new cacheable partial function
* optimized for review
* fix the right type
* Revert "chore: avoid the double evaluation of entityId in ClusterSharding (#1…" (#1464)
  This reverts commit b0e9886.
* grammar fix
* sort imports
---------
Co-authored-by: PJ Fanning <[email protected]>
Sorry to bring some bad news: I have been investigating failing tests in endless4s/endless-transaction#48, a PR that upgrades Pekko from 1.0.3 to 1.1.0, and I think I found a serious issue.
The failing test suite is stress-testing event-sourced entities using the persistence test toolkit, and I have identified that a command sometimes gets delivered to the wrong entity.
I have bisected the problem to this optimization that was introduced after the 1.1.0-M1 release. That new code makes use of a `var cache` and it doesn't seem thread-safe. Could it be that we introduced races?
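To make the suspicion concrete, here is a minimal sketch of how a caching partial function with an unsynchronized `var` could hand one caller another caller's result. This is not the Pekko code: `CachingExtractor`, `Envelope` and `CacheRaceSketch` are hypothetical names, and the interleaving of two callers is simulated sequentially so that the outcome is deterministic.

```scala
// Hypothetical sketch, NOT the actual Pekko implementation.
// Caches the result computed in isDefinedAt so that apply need not recompute it.
final class CachingExtractor[A, B](underlying: PartialFunction[A, B])
    extends PartialFunction[A, B] {

  // Unsynchronized mutable state, shared by every caller of this instance.
  private var cache: Option[B] = None

  override def isDefinedAt(a: A): Boolean = {
    cache = underlying.lift(a)
    cache.isDefined
  }

  override def apply(a: A): B = {
    // Trusts that the cached value belongs to `a`, which only holds
    // if no other caller touched the cache in between.
    val result = cache.getOrElse(underlying(a))
    cache = None
    result
  }
}

object CacheRaceSketch {
  final case class Envelope(entityId: String, payload: String)

  val extractEntityId: PartialFunction[Envelope, (String, String)] = {
    case Envelope(id, payload) => (id, payload)
  }

  // Simulates two callers sharing one instance, interleaved as
  // isDefinedAt(A), isDefinedAt(B), apply(A).
  def interleaved(): (String, String) = {
    val shared = new CachingExtractor(extractEntityId)
    val forA = Envelope("entity-A", "debit")
    val forB = Envelope("entity-B", "credit")

    shared.isDefinedAt(forA) // caller 1 fills the cache with A's result
    shared.isDefinedAt(forB) // caller 2 overwrites it with B's result
    shared(forA)             // caller 1 now reads B's result for A's message
  }
}
```

In this simulated interleaving, `CacheRaceSketch.interleaved()` returns the entity id and payload of the second message even though the first message was passed to `apply`, which is the kind of misdelivery described above. Whether the real code fails in exactly this way is an assumption.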