High Availability with Multi-AZ Deployments #488
mattisonchao
started this conversation in
Proposal
Perhaps rename |
Motivation
As a distributed storage system, Oxia must support AZ-aware ensemble placement to improve fault tolerance and disaster recovery.
We can already deploy Oxia servers into different AZs, but the coordinator might still assign all replicas of a shard to servers in the same zone. Therefore, the primary purpose of this proposal is to let the Oxia coordinator place shard replicas across different availability zones so that the AZ ensemble placement policy is satisfied.
Goal
zonal or regional.
Out of scope
I understand this is a big topic; there are many possible improvements to the load balancer and shard assignment. However, we should focus only on the main purpose of this proposal and discuss the others in the future. So, there are several topics we will not discuss in the current proposal:
High-Level Design
The above diagram illustrates the primary purpose of AZ-aware placement.
Zonal ensemble placement
This rule lets the coordinator assign all replicas of a shard to servers in the same zone (green colour in the diagram). It helps namespaces that want lower latency and no cross-zone traffic cost while still keeping multiple replicas for disaster recovery.
Regional ensemble placement
This is the default rule for ensemble placement; it assigns the replicas of each shard to different zones. Regional placement gives the Oxia cluster a very high level of disaster recovery ability, which is very important for a metadata (coordination) service.
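To make the two rules concrete, here is a minimal sketch in Go of a check that tells whether a given ensemble satisfies a rule. The names Rule, Zonal, Regional and Satisfies are hypothetical and exist only for this illustration; the ensemble is reduced to the zone label of each replica.

```go
package main

// Rule is a hypothetical type naming the two placement policies.
type Rule string

const (
	Zonal    Rule = "zonal"    // all replicas of a shard live in one zone
	Regional Rule = "regional" // every replica of a shard lives in a different zone
)

// Satisfies reports whether an ensemble, given as the zone of each replica,
// meets the rule. This is an illustrative check, not Oxia's implementation.
func Satisfies(rule Rule, replicaZones []string) bool {
	// Collect the distinct zones used by the ensemble.
	seen := make(map[string]bool, len(replicaZones))
	for _, z := range replicaZones {
		seen[z] = true
	}
	switch rule {
	case Zonal:
		// At most one distinct zone across all replicas.
		return len(seen) <= 1
	case Regional:
		// As many distinct zones as there are replicas.
		return len(seen) == len(replicaZones)
	default:
		// Unknown rules are never satisfied in this sketch.
		return false
	}
}
```

For example, an ensemble placed in az-1, az-1, az-1 satisfies the zonal rule but not the regional one, while az-1, az-2, az-3 satisfies only the regional rule.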
Detailed Design
Model
AvailableZone field for the ServerAddress struct.
AvailableZoneRule field for the NamespaceConfig struct.
Component
This proposal will introduce a new component, AZRankClusterRebalancer, with a ClusterRebalancer interface to support calculating ensemble with AZ awareness and server shards ranking. The interface definition is as follows:
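The sketch below is not the proposal's actual definition; it is a minimal, hypothetical illustration in Go of how the two new model fields and the component might fit together. Only the names AvailableZone, AvailableZoneRule, ServerAddress, NamespaceConfig, ClusterRebalancer and AZRankClusterRebalancer come from the proposal. The other struct fields, the SelectEnsemble method and its signature, the rule constants, and the error value are assumptions. It also assumes the server list arrives already ordered by the existing ranking, best first.

```go
package main

import "errors"

// Hypothetical rule values for NamespaceConfig.AvailableZoneRule.
const (
	RuleZonal    = "zonal"
	RuleRegional = "regional"
)

// ServerAddress: Public and Internal are assumed fields.
type ServerAddress struct {
	Public        string
	Internal      string
	AvailableZone string // new field from this proposal
}

// NamespaceConfig: Name and ReplicationFactor are assumed fields.
type NamespaceConfig struct {
	Name              string
	ReplicationFactor uint32
	AvailableZoneRule string // new field; empty is treated as regional here
}

// ErrUnsatisfiable is a hypothetical error for this sketch.
var ErrUnsatisfiable = errors.New("placement rule cannot be satisfied")

// ClusterRebalancer: the method set shown here is an assumption.
type ClusterRebalancer interface {
	SelectEnsemble(ns NamespaceConfig, ranked []ServerAddress) ([]ServerAddress, error)
}

// AZRankClusterRebalancer picks an ensemble with AZ awareness.
type AZRankClusterRebalancer struct{}

// Compile-time check that the struct implements the interface.
var _ ClusterRebalancer = AZRankClusterRebalancer{}

func (AZRankClusterRebalancer) SelectEnsemble(ns NamespaceConfig, ranked []ServerAddress) ([]ServerAddress, error) {
	n := int(ns.ReplicationFactor)
	if n == 0 {
		return nil, ErrUnsatisfiable
	}
	// Group servers by zone, keeping both zones and servers in rank order.
	byZone := make(map[string][]ServerAddress)
	var zones []string
	for _, s := range ranked {
		if _, ok := byZone[s.AvailableZone]; !ok {
			zones = append(zones, s.AvailableZone)
		}
		byZone[s.AvailableZone] = append(byZone[s.AvailableZone], s)
	}
	switch ns.AvailableZoneRule {
	case RuleZonal:
		// First zone (by best-ranked server) that can hold every replica.
		for _, z := range zones {
			if len(byZone[z]) >= n {
				return byZone[z][:n], nil
			}
		}
	case RuleRegional, "":
		// One replica per zone: best-ranked server of each of the first n zones.
		if len(zones) >= n {
			out := make([]ServerAddress, 0, n)
			for _, z := range zones[:n] {
				out = append(out, byZone[z][0])
			}
			return out, nil
		}
	}
	return nil, ErrUnsatisfiable
}
```

In this sketch a regional request with fewer zones than replicas simply fails; whether the real component should instead relax the rule is left open.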
Algorithm
This proposal focuses only on the AZ-aware ensemble placement algorithm. We will reuse, and stay compatible with, the existing ranking algorithm.
Therefore, the primary point here is that:
Disaster recovery
We have several points to consider in disaster recovery. For example, if a server in a zone crashes or is deleted, the rule for reassigning its shards is as follows:
eligibleServer = !highestRank && server.contains(terminatedServerShard)
If the zone doesn't have an eligible server, move to regional assigning.
Others
none.