Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: rocksmq data race on consumers modification #29114

Closed
wants to merge 1 commit into from

Conversation

chyezh
Copy link
Contributor

@chyezh chyezh commented Dec 11, 2023

Related Issue: #29101

  • Fix data race by add new lock and CopyOnWrite
  • Add new unittest to verify it

@sre-ci-robot sre-ci-robot added the size/M Denotes a PR that changes 30-99 lines. label Dec 11, 2023
@sre-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: chyezh
To complete the pull request process, please assign congqixia after the PR has been reviewed.
You can assign the PR to them by writing /assign @congqixia in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify mergify bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels Dec 11, 2023
@chyezh
Copy link
Contributor Author

chyezh commented Dec 11, 2023

Data race detected on old implementation by unittest

Write at 0x00c0005e5470 by goroutine 141:
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.(*rocksmq).RegisterConsumer()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl.go:550 +0x54c
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.TestRocksmq_RegisterConsumer.func2()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl_test.go:246 +0x1c8
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.TestRocksmq_RegisterConsumer.func10()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl_test.go:249 +0x44

Previous write at 0x00c0005e5470 by goroutine 143:
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.(*rocksmq).RegisterConsumer()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl.go:550 +0x54c
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.TestRocksmq_RegisterConsumer.func2()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl_test.go:246 +0x1c8
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.TestRocksmq_RegisterConsumer.func10()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl_test.go:249 +0x44

Goroutine 141 (running) created at:
  github.com/milvus-io/milvus/internal/mq/mqimpl/rocksmq/server.TestRocksmq_RegisterConsumer()
      /Users/zilliz/repo/github/chyezh/milvus/internal/mq/mqimpl/rocksmq/server/rocksmq_impl_test.go:238 +0xa84
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.5/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.5/libexec/src/testing/testing.go:1648 +0x40

@chyezh chyezh force-pushed the fixup_rocksmq_panic branch from e835fa9 to 602b742 Compare December 15, 2023 08:00
@sre-ci-robot sre-ci-robot added size/L Denotes a PR that changes 100-499 lines. and removed size/M Denotes a PR that changes 30-99 lines. labels Dec 15, 2023
@chyezh
Copy link
Contributor Author

chyezh commented Dec 16, 2023

rerun ut

Related Issue: milvus-io#29101

- Fix data race by add new lock and CopyOnWrite

- Add new unittest to verify it

Signed-off-by: chyezh <[email protected]>
@chyezh chyezh force-pushed the fixup_rocksmq_panic branch from 602b742 to bad3b45 Compare December 18, 2023 07:10
Copy link
Contributor

mergify bot commented Dec 18, 2023

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh
Copy link
Contributor Author

chyezh commented Dec 18, 2023

/run-cpu-e2e

1 similar comment
@chyezh
Copy link
Contributor Author

chyezh commented Dec 18, 2023

/run-cpu-e2e

@chyezh
Copy link
Contributor Author

chyezh commented Dec 18, 2023

rerun ut

Copy link

stale bot commented Jan 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Jan 17, 2024
@stale stale bot closed this Jan 26, 2024
@mergify mergify bot added needs-dco DCO is missing in this pull request. and removed dco-passed DCO check passed. labels Apr 24, 2024
Copy link
Contributor

mergify bot commented Apr 24, 2024

@chyezh Thanks for your contribution. Please submit with DCO, see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Issues or changes related a bug needs-dco DCO is missing in this pull request. size/L Denotes a PR that changes 100-499 lines. stale indicates no udpates for 30 days
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants