Commit

fix: Balance channel may get stuck when increasing replica number
Cause: balance channel waits until the new delegator becomes serviceable,
but the new delegator must sync the target version before it becomes
serviceable, and syncing the target version has to wait until all replicas
finish loading. So if the replica number is increased while a channel
balance is in progress, a logical deadlock occurs.

Signed-off-by: Wei Liu <[email protected]>
weiliu1031 committed Nov 13, 2024
1 parent 3389a6b commit 589dfda
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions internal/querycoordv2/observers/target_observer.go
@@ -388,9 +388,8 @@ func (ob *TargetObserver) shouldUpdateCurrentTarget(ctx context.Context, collect
 		})
 		collectionReadyLeaders = append(collectionReadyLeaders, channelReadyLeaders...)
 
-		nodes := lo.Map(channelReadyLeaders, func(view *meta.LeaderView, _ int) int64 { return view.ID })
-		group := utils.GroupNodesByReplica(ob.meta.ReplicaManager, collectionID, nodes)
-		if int32(len(group)) < replicaNum {
+		// to avoid stuck here in dynamic increase replica case, we just check available delegator number
+		if int32(len(collectionReadyLeaders)) < replicaNum {
 			log.RatedInfo(10, "channel not ready",
 				zap.Int("readyReplicaNum", len(channelReadyLeaders)),
 				zap.String("channelName", channel),
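To make the difference between the two checks concrete, here is a minimal, self-contained sketch. It is not the Milvus implementation: `leaderView`, `nodeToReplica`, `readyByReplicaGroups`, and `readyByLeaderCount` are hypothetical stand-ins for `meta.LeaderView`, the `ReplicaManager` lookup, and the old and new conditions in the hunk above. The scenario is also an assumption made for illustration: the replica number has just been raised to 2, the new replica has no ready delegator yet, and a channel balance has left two ready delegators inside the existing replica.

```go
package main

import "fmt"

// leaderView is a hypothetical stand-in for meta.LeaderView:
// one ready delegator for a channel, hosted on a query node.
type leaderView struct {
	NodeID  int64
	Channel string
}

// readyByReplicaGroups mirrors the removed check: ready delegator nodes are
// grouped by the replica they belong to, and the channel counts as ready only
// when at least replicaNum distinct replicas are covered.
// nodeToReplica is an assumed node -> replica ID lookup.
func readyByReplicaGroups(leaders []leaderView, nodeToReplica map[int64]int64, replicaNum int32) bool {
	groups := make(map[int64][]int64)
	for _, l := range leaders {
		if replicaID, ok := nodeToReplica[l.NodeID]; ok {
			groups[replicaID] = append(groups[replicaID], l.NodeID)
		}
	}
	return int32(len(groups)) >= replicaNum
}

// readyByLeaderCount mirrors the added check: only the number of ready
// delegators is compared against replicaNum, regardless of replica.
func readyByLeaderCount(leaders []leaderView, replicaNum int32) bool {
	return int32(len(leaders)) >= replicaNum
}

func main() {
	// Nodes 1 and 2 belong to replica 100; node 3 belongs to the newly added
	// replica 200, which has not produced a ready delegator yet.
	nodeToReplica := map[int64]int64{1: 100, 2: 100, 3: 200}

	// A channel balance inside replica 100 leaves two ready delegators for
	// the same channel: the old one on node 1 and the new one on node 2.
	leaders := []leaderView{
		{NodeID: 1, Channel: "ch-0"},
		{NodeID: 2, Channel: "ch-0"},
	}

	fmt.Println("replica-group check ready:", readyByReplicaGroups(leaders, nodeToReplica, 2))
	fmt.Println("leader-count check ready: ", readyByLeaderCount(leaders, 2))
}
```

Under these assumptions the replica-group check reports not ready, because only one distinct replica is covered, while the leader-count check reports ready. Note that the committed condition compares `len(collectionReadyLeaders)`, which accumulates ready leaders across channels, so the count in the real code is wider than the single-channel slice used here.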
