Commit

wording
sbernauer committed Sep 21, 2023
1 parent a4ba0a2 commit 871c2c1
Showing 1 changed file with 3 additions and 3 deletions.
@@ -15,12 +15,12 @@ We only allow a single namenode to be offline at a given time, regardless of the
For datanodes the questions how many instances can be offline at the same time is a bit harder:
HDFS stores your blocks on the datanodes.
Every block can be replicated multiple times (to multiple datanodes) to ensure maximum availability.
-The default setting is a replication of `3` - which can be configured using `spec.clusterConfig.dfsReplication`. However, it is also possible to change the replication factor for a specific file or directory to something else than the cluster default.
+The default setting is a replication factor of `3` - which can be configured using `spec.clusterConfig.dfsReplication`. However, it is also possible to change the replication factor for a specific file or directory to something else than the cluster default.

-When you have a replication of `3`, you can safely can take down 2 datanodes, as there will always be a third one holding the blocks of the two down datanodes.
+When you have a replication of `3`, you can safely can take down 2 datanodes, as there will always be a third datanode one holding a copy of the block on the two unavailable datanodes.
However, you need to be aware that you are now down to a single point of failure - the last of the three replicas!

-Taking this into consideration, our operator uses the following algorithm to determine the maximum number of datanodes allowed to be offline.
+Taking this into consideration, our operator uses the following algorithm to determine the maximum number of datanodes allowed to be offline:

`num_datanodes` is the number of datanodes in the HDFS cluster, summed over all roleGroups.
`dfs_replication` is default replication factor of the cluster.
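
Review note: the replication reasoning in the changed lines can be sketched in Python. This is only an illustration and not the operator's algorithm, which is not visible in this hunk. The function names and the block-to-datanode mapping are hypothetical, and the sketch assumes every block uses the cluster default replication factor.

```python
from typing import Dict, Iterable, List


def max_offline_keeping_one_replica(dfs_replication: int) -> int:
    """Upper bound from the text: with N replicas of every block,
    N - 1 datanodes can be down while one copy is still reachable."""
    return max(dfs_replication - 1, 0)


def blocks_still_available(
    block_locations: Dict[str, List[str]], offline: Iterable[str]
) -> bool:
    """Return True if every block keeps at least one replica on an online datanode.

    block_locations is a hypothetical mapping of block id -> datanodes holding a replica.
    """
    offline_set = set(offline)
    return all(set(replicas) - offline_set for replicas in block_locations.values())


# Hypothetical cluster with four datanodes and a replication factor of 3.
block_locations = {
    "blk_1": ["dn-0", "dn-1", "dn-2"],
    "blk_2": ["dn-1", "dn-2", "dn-3"],
}

two_down = blocks_still_available(block_locations, ["dn-0", "dn-1"])
three_down = blocks_still_available(block_locations, ["dn-0", "dn-1", "dn-2"])
```

With two datanodes offline every block still has a reachable copy, but each affected block may be down to its last replica, which is the single point of failure the text warns about. Files or directories with a lower per-file replication factor would tolerate fewer offline datanodes than this sketch suggests.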