feat: Support graceful shutdown #407

Merged
merged 34 commits into from
Nov 2, 2023
Changes from 18 commits
6f6f954
feat: Support graceful shutdown
sbernauer Oct 10, 2023
7635754
update docs
sbernauer Oct 10, 2023
129a7f1
docs
sbernauer Oct 10, 2023
b5869c5
changelog
sbernauer Oct 11, 2023
5dd4c04
link code in docs
sbernauer Oct 11, 2023
bcb3ad4
increase default of datanodes to 30 min
sbernauer Oct 12, 2023
ce72ff8
move into constants
sbernauer Oct 12, 2023
abe0f23
use new operator-rs
sbernauer Oct 12, 2023
692dace
docs: Format 15 minutes
sbernauer Oct 12, 2023
dcb6bbb
Use new operator-rs
sbernauer Oct 16, 2023
2f7e46f
improve docs
sbernauer Oct 16, 2023
de3c4bb
Merge branch 'main' into feat/graceful-shutdown-2
sbernauer Oct 16, 2023
cc23be8
fix link
sbernauer Oct 16, 2023
5fbceee
use operator-rs 0.55.0
sbernauer Oct 18, 2023
765a4bd
Merge branch 'main' into feat/graceful-shutdown-2
sbernauer Oct 18, 2023
9a07707
fixup
sbernauer Oct 18, 2023
58dbdc7
improve docs
sbernauer Oct 18, 2023
b78d559
set error context
sbernauer Oct 18, 2023
7733ec1
Added a high level description of graceful shutdown
Jimvin Oct 18, 2023
2ed5795
Revert "Added a high level description of graceful shutdown"
sbernauer Oct 19, 2023
0d154da
Merge remote-tracking branch 'origin/main' into feat/graceful-shutdown-2
sbernauer Oct 19, 2023
6b0b883
Move rustdoc above field attributes
sbernauer Oct 19, 2023
752b699
Avoid snafu context(false)
sbernauer Oct 19, 2023
153b2c2
docs wording
sbernauer Oct 19, 2023
7bd7d3e
newline
sbernauer Oct 19, 2023
862cd3a
fix: Vector graceful shutdown
sbernauer Oct 27, 2023
65415c4
downgrade ring again
sbernauer Oct 27, 2023
23d52c9
Merge remote-tracking branch 'origin/main' into feat/graceful-shutdown-2
sbernauer Oct 27, 2023
f3aa0f8
fix links
sbernauer Oct 30, 2023
ab32002
use new operator-rs
sbernauer Oct 31, 2023
4e14d57
chore: Bump operator-rs to 0.56.0
sbernauer Oct 31, 2023
f15b73a
Revert "chore: Bump operator-rs to 0.56.0"
sbernauer Oct 31, 2023
37077f7
Merge remote-tracking branch 'origin/main' into feat/graceful-shutdown-2
sbernauer Oct 31, 2023
85e67df
Update docs/modules/hdfs/pages/usage-guide/operations/graceful-shutdo…
sbernauer Nov 2, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
Expand Up @@ -9,6 +9,7 @@ All notable changes to this project will be documented in this file.
- Default stackableVersion to operator version ([#381]).
- Configuration overrides for the JVM security properties, such as DNS caching ([#384]).
- Support PodDisruptionBudgets ([#394]).
- Support graceful shutdown ([#407]).

### Changed

Expand All @@ -28,6 +29,7 @@ All notable changes to this project will be documented in this file.
[#402]: https://github.com/stackabletech/hdfs-operator/pull/402
[#404]: https://github.com/stackabletech/hdfs-operator/pull/404
[#405]: https://github.com/stackabletech/hdfs-operator/pull/405
[#407]: https://github.com/stackabletech/hdfs-operator/pull/407

## [23.7.0] - 2023-07-14

Expand Down
8 changes: 4 additions & 4 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Expand Up @@ -23,7 +23,7 @@ serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
serde_yaml = "0.9"
snafu = "0.7"
stackable-operator = { git = "https://github.com/stackabletech/operator-rs.git", tag = "0.52.1" }
stackable-operator = { git = "https://github.com/stackabletech/operator-rs.git", tag = "0.55.0" }
strum = { version = "0.25", features = ["derive"] }
tokio = { version = "1.29", features = ["full"] }
tracing = "0.1"
Expand Down
120 changes: 96 additions & 24 deletions deploy/helm/hdfs-operator/crds/crds.yaml

Large diffs are not rendered by default.

docs/modules/hdfs/pages/usage-guide/operations/graceful-shutdown.adoc
@@ -1,6 +1,36 @@
= Graceful shutdown

Graceful shutdown of HDFS nodes is either not supported by the product itself
or we have not implemented it yet.
You can configure the graceful shutdown as described in xref:concepts:operations/graceful_shutdown.adoc[].

Outstanding implementation work for the graceful shutdowns of all products where this functionality is relevant is tracked in https://github.com/stackabletech/issues/issues/357
== JournalNodes

As a default, JournalNodes have `15 minutes` to terminate gracefully.

The JournalNode process always runs as PID `1` and receives a `SIGTERM` once Kubernetes wants to terminate the Pod.
It logs the received signal as shown in the log below and initiates a graceful shutdown.
If the process has still not exited after the graceful shutdown timeout has passed, Kubernetes issues a `SIGKILL` to force-kill the process.

https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java#L272[This] is the relevant code that gets executed in the JournalNodes as of HDFS version `3.3.4`.

[source,text]
----
2023-10-10 13:37:41,525 ERROR server.JournalNode (LogAdapter.java:error(75)) - RECEIVED SIGNAL 15: SIGTERM
2023-10-10 13:37:41,526 INFO server.JournalNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:

/************************************************************
SHUTDOWN_MSG: Shutting down JournalNode at hdfs-journalnode-default-0/10.244.0.38
************************************************************/
----

== NameNodes

As a default, NameNodes have `15 minutes` to terminate gracefully.
They go through the same mechanism as documented for the <<_journalnodes>> above.

https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java#L1080[This] is the relevant code that gets executed in the NameNodes as of HDFS version `3.3.4`.

== DataNodes

As a default, DataNodes have `30 minutes` to terminate gracefully.
They go through the same mechanism as documented for the <<_journalnodes>> above.

https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L2004[This] is the relevant code that gets executed in the DataNodes as of HDFS version `3.3.4`.
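
The termination sequence the docs describe for the JournalNodes (a `SIGTERM`, a grace period, then a `SIGKILL` if the process is still running) can be modelled with a small standard-library sketch. No real signals are sent here; `terminate`, `Outcome`, and the durations are hypothetical names and values chosen for illustration only.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// How the simulated process ended (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The process finished its shutdown work within the grace period.
    ExitedGracefully,
    /// The grace period elapsed first, which corresponds to a SIGKILL.
    ForceKilled,
}

/// Simulates "SIGTERM, wait up to `grace_period`, then SIGKILL".
/// `shutdown_work` is how long the process needs to shut down cleanly.
fn terminate(shutdown_work: Duration, grace_period: Duration) -> Outcome {
    let (done_tx, done_rx) = mpsc::channel();

    // The "process": starts its shutdown work as soon as it is asked to stop.
    thread::spawn(move || {
        thread::sleep(shutdown_work);
        // Report completion; the receiver may already have given up.
        let _ = done_tx.send(());
    });

    // The "kubelet": waits for completion, but no longer than the grace period.
    match done_rx.recv_timeout(grace_period) {
        Ok(()) => Outcome::ExitedGracefully,
        Err(_) => Outcome::ForceKilled,
    }
}
```

With a shutdown that takes 10 ms and a grace period of one second, `terminate` returns `Outcome::ExitedGracefully`; swapping the two values yields `Outcome::ForceKilled`.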
9 changes: 9 additions & 0 deletions rust/crd/src/constants.rs
@@ -1,3 +1,5 @@
use stackable_operator::time::Duration;

pub const DEFAULT_DFS_REPLICATION_FACTOR: u8 = 3;

pub const CONTROLLER_NAME: &str = "hdfsclusters.hdfs.stackable.tech";
Expand Down Expand Up @@ -41,6 +43,13 @@ pub const DEFAULT_JOURNAL_NODE_HTTP_PORT: u16 = 8480;
pub const DEFAULT_JOURNAL_NODE_HTTPS_PORT: u16 = 8481;
pub const DEFAULT_JOURNAL_NODE_RPC_PORT: u16 = 8485;

pub const DEFAULT_JOURNAL_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration =
Duration::from_minutes_unchecked(15);
pub const DEFAULT_NAME_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration =
Duration::from_minutes_unchecked(15);
pub const DEFAULT_DATA_NODE_GRACEFUL_SHUTDOWN_TIMEOUT: Duration =
Duration::from_minutes_unchecked(30);

// hdfs-site.xml
pub const DFS_NAMENODE_NAME_DIR: &str = "dfs.namenode.name.dir";
pub const DFS_NAMENODE_SHARED_EDITS_DIR: &str = "dfs.namenode.shared.edits.dir";
Expand Down
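
As a quick cross-check of the defaults above against the `terminationGracePeriodSeconds` values asserted in the smoke test (900 and 1800), here is a minimal sketch. It uses `std::time::Duration` as a stand-in for `stackable_operator::time::Duration`; the `Role` enum and `default_graceful_shutdown_timeout` function are hypothetical and not part of the PR.

```rust
use std::time::Duration;

/// The three HDFS roles that receive a default timeout (hypothetical enum).
#[derive(Clone, Copy, Debug)]
enum Role {
    JournalNode,
    NameNode,
    DataNode,
}

/// Returns the default graceful shutdown timeout per role,
/// mirroring the 15/15/30 minute constants in the diff.
fn default_graceful_shutdown_timeout(role: Role) -> Duration {
    let minutes = match role {
        Role::JournalNode | Role::NameNode => 15,
        Role::DataNode => 30,
    };
    Duration::from_secs(minutes * 60)
}
```

Converting these durations to whole seconds gives 900 for JournalNodes and NameNodes and 1800 for DataNodes.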
26 changes: 26 additions & 0 deletions rust/crd/src/lib.rs
Expand Up @@ -36,6 +36,7 @@ use stackable_operator::{
role_utils::{GenericRoleConfig, Role, RoleGroup, RoleGroupRef},
schemars::{self, JsonSchema},
status::condition::{ClusterCondition, HasStatusCondition},
time::Duration,
};
use std::collections::{BTreeMap, HashMap};
use storage::{
Expand Down Expand Up @@ -156,6 +157,7 @@ pub trait MergedConfig {
None
}
fn affinity(&self) -> &StackableAffinity;
fn graceful_shutdown_timeout(&self) -> Option<&Duration>;
/// Main container shared by all roles
fn hdfs_logging(&self) -> ContainerLogConfig;
/// Vector container shared by all roles
Expand Down Expand Up @@ -841,6 +843,9 @@ pub struct NameNodeConfig {
pub logging: Logging<NameNodeContainer>,
#[fragment_attrs(serde(default))]
pub affinity: StackableAffinity,
/// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
#[fragment_attrs(serde(default))]
pub graceful_shutdown_timeout: Option<Duration>,
}

impl MergedConfig for NameNodeConfig {
Expand All @@ -852,6 +857,10 @@ impl MergedConfig for NameNodeConfig {
&self.affinity
}

fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
self.graceful_shutdown_timeout.as_ref()
}

fn hdfs_logging(&self) -> ContainerLogConfig {
self.logging
.containers
Expand Down Expand Up @@ -916,6 +925,7 @@ impl NameNodeConfigFragment {
},
logging: product_logging::spec::default_logging(),
affinity: get_affinity(cluster_name, role),
graceful_shutdown_timeout: Some(DEFAULT_NAME_NODE_GRACEFUL_SHUTDOWN_TIMEOUT),
}
}
}
Expand Down Expand Up @@ -1001,6 +1011,9 @@ pub struct DataNodeConfig {
pub logging: Logging<DataNodeContainer>,
#[fragment_attrs(serde(default))]
pub affinity: StackableAffinity,
/// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
#[fragment_attrs(serde(default))]
pub graceful_shutdown_timeout: Option<Duration>,
}

impl MergedConfig for DataNodeConfig {
Expand All @@ -1014,6 +1027,10 @@ impl MergedConfig for DataNodeConfig {
&self.affinity
}

fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
self.graceful_shutdown_timeout.as_ref()
}

fn hdfs_logging(&self) -> ContainerLogConfig {
self.logging
.containers
Expand Down Expand Up @@ -1069,6 +1086,7 @@ impl DataNodeConfigFragment {
},
logging: product_logging::spec::default_logging(),
affinity: get_affinity(cluster_name, role),
graceful_shutdown_timeout: Some(DEFAULT_DATA_NODE_GRACEFUL_SHUTDOWN_TIMEOUT),
}
}
}
Expand Down Expand Up @@ -1152,6 +1170,9 @@ pub struct JournalNodeConfig {
pub logging: Logging<JournalNodeContainer>,
#[fragment_attrs(serde(default))]
pub affinity: StackableAffinity,
/// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
#[fragment_attrs(serde(default))]
pub graceful_shutdown_timeout: Option<Duration>,
}

impl MergedConfig for JournalNodeConfig {
Expand All @@ -1163,6 +1184,10 @@ impl MergedConfig for JournalNodeConfig {
&self.affinity
}

fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
self.graceful_shutdown_timeout.as_ref()
}

fn hdfs_logging(&self) -> ContainerLogConfig {
self.logging
.containers
Expand Down Expand Up @@ -1206,6 +1231,7 @@ impl JournalNodeConfigFragment {
},
logging: product_logging::spec::default_logging(),
affinity: get_affinity(cluster_name, role),
graceful_shutdown_timeout: Some(DEFAULT_JOURNAL_NODE_GRACEFUL_SHUTDOWN_TIMEOUT),
}
}
}
Expand Down
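
The pattern in this file (an optional field on each role config, a default set in the fragment's `default_config`, and an accessor returning `Option<&Duration>`) can be sketched roughly as follows. This is a simplification under stated assumptions: `ConfigFragment` and `merge` are hypothetical stand-ins for the operator framework's fragment merging, which this diff does not show, and `std::time::Duration` replaces `stackable_operator::time::Duration`.

```rust
use std::time::Duration;

const DEFAULT_NAME_NODE_TIMEOUT: Duration = Duration::from_secs(15 * 60);

/// Hypothetical, simplified config fragment: every field is optional.
#[derive(Clone, Debug, Default)]
struct ConfigFragment {
    graceful_shutdown_timeout: Option<Duration>,
}

/// Simplified merged config, keeping only the field added in this PR.
#[derive(Debug)]
struct NameNodeConfig {
    graceful_shutdown_timeout: Option<Duration>,
}

trait MergedConfig {
    fn graceful_shutdown_timeout(&self) -> Option<&Duration>;
}

impl MergedConfig for NameNodeConfig {
    fn graceful_shutdown_timeout(&self) -> Option<&Duration> {
        // Borrow the inner value instead of copying it out of the Option.
        self.graceful_shutdown_timeout.as_ref()
    }
}

/// Hypothetical merge: a user-supplied value wins, otherwise the default applies.
fn merge(user: &ConfigFragment, defaults: &ConfigFragment) -> NameNodeConfig {
    NameNodeConfig {
        graceful_shutdown_timeout: user
            .graceful_shutdown_timeout
            .or(defaults.graceful_shutdown_timeout),
    }
}
```

Because the defaults always carry `Some(..)`, the merged value is only `None` if the merge logic itself changes, which matches the comment in `graceful_shutdown.rs` about the value always being set.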
19 changes: 11 additions & 8 deletions rust/operator/src/hdfs_controller.rs
Expand Up @@ -6,7 +6,7 @@ use crate::{
discovery::build_discovery_configmap,
event::{build_invalid_replica_message, publish_event},
kerberos,
operations::pdb::add_pdbs,
operations::{self, graceful_shutdown::add_graceful_shutdown_config, pdb::add_pdbs},
product_logging::{extend_role_group_config_map, resolve_vector_aggregator_address},
OPERATOR_NAME,
};
Expand All @@ -23,7 +23,6 @@ use stackable_operator::{
product_image_selection::ResolvedProductImage,
rbac::{build_rbac_resources, service_account_name},
},
duration::Duration,
k8s_openapi::{
api::{
apps::v1::{StatefulSet, StatefulSetSpec},
Expand All @@ -48,6 +47,7 @@ use stackable_operator::{
compute_conditions, operations::ClusterOperationsConditionBuilder,
statefulset::StatefulSetConditionBuilder,
},
time::Duration,
};
use std::{
collections::{BTreeMap, HashMap},
Expand Down Expand Up @@ -166,14 +166,15 @@ pub enum Error {
"kerberos not supported for HDFS versions < 3.3.x. Please use at least version 3.3.x"
))]
KerberosNotSupported {},
#[snafu(display(
"failed to serialize [{JVM_SECURITY_PROPERTIES_FILE}] for {}",
rolegroup
))]
JvmSecurityPoperties {
#[snafu(display("failed to serialize [{JVM_SECURITY_PROPERTIES_FILE}] for {rolegroup}",))]
JvmSecurityProperties {
source: stackable_operator::product_config::writer::PropertiesWriterError,
rolegroup: String,
},
#[snafu(display("failed to configure graceful shutdown"), context(false))]
GracefulShutdown {
source: operations::graceful_shutdown::Error,
},
}

impl ReconcilerError for Error {
Expand Down Expand Up @@ -599,7 +600,7 @@ fn rolegroup_config_map(
.add_data(
JVM_SECURITY_PROPERTIES_FILE,
to_java_properties_string(jvm_sec_props.iter()).with_context(|_| {
JvmSecurityPopertiesSnafu {
JvmSecurityPropertiesSnafu {
rolegroup: rolegroup_ref.role_group.clone(),
}
})?,
Expand Down Expand Up @@ -667,6 +668,8 @@ fn rolegroup_statefulset(
)
.context(FailedToCreateContainerAndVolumeConfigurationSnafu)?;

add_graceful_shutdown_config(merged_config, &mut pb)?;

let mut pod_template = pb.build_template();
if let Some(pod_overrides) = hdfs.pod_overrides_for_role(role) {
pod_template.merge_from(pod_overrides.clone());
Expand Down
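
The bare `?` on `add_graceful_shutdown_config(merged_config, &mut pb)?` compiles because `context(false)` makes snafu provide a `From` conversion from the source error instead of a context selector. A hand-written, standard-library-only equivalent might look like the sketch below; all type and function names are hypothetical, and this is not the code snafu generates.

```rust
use std::fmt;

/// Stand-in for `operations::graceful_shutdown::Error` (hypothetical).
#[derive(Debug, PartialEq)]
struct GracefulShutdownError(String);

/// Stand-in for the controller's error enum (hypothetical).
#[derive(Debug, PartialEq)]
enum ControllerError {
    GracefulShutdown { source: GracefulShutdownError },
}

impl fmt::Display for ControllerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ControllerError::GracefulShutdown { .. } => {
                write!(f, "failed to configure graceful shutdown")
            }
        }
    }
}

/// This conversion is what lets `?` turn the inner error into the outer one.
impl From<GracefulShutdownError> for ControllerError {
    fn from(source: GracefulShutdownError) -> Self {
        ControllerError::GracefulShutdown { source }
    }
}

/// Pretend configuration step that can be made to fail for demonstration.
fn configure_graceful_shutdown(fail: bool) -> Result<(), GracefulShutdownError> {
    if fail {
        Err(GracefulShutdownError("grace period too long".to_string()))
    } else {
        Ok(())
    }
}

fn build_statefulset(fail: bool) -> Result<&'static str, ControllerError> {
    // No `.context(...)` call needed: `?` uses the `From` impl above.
    configure_graceful_shutdown(fail)?;
    Ok("statefulset")
}
```

The trade-off is that an implicit conversion carries no call-site context, which is presumably why a later commit in this PR is titled "Avoid snafu context(false)".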
26 changes: 26 additions & 0 deletions rust/operator/src/operations/graceful_shutdown.rs
@@ -0,0 +1,26 @@
use snafu::{ResultExt, Snafu};
use stackable_hdfs_crd::MergedConfig;
use stackable_operator::builder::PodBuilder;

#[derive(Debug, Snafu)]
pub enum Error {
#[snafu(display("Failed to set terminationGracePeriod"))]
SetTerminationGracePeriod {
source: stackable_operator::builder::pod::Error,
},
}

pub fn add_graceful_shutdown_config(
merged_config: &(dyn MergedConfig + Send + 'static),
pod_builder: &mut PodBuilder,
) -> Result<(), Error> {
// The merge mechanism must always set this, because we provide a default value.
// Users therefore cannot disable graceful shutdown.
if let Some(graceful_shutdown_timeout) = merged_config.graceful_shutdown_timeout() {
pod_builder
.termination_grace_period(graceful_shutdown_timeout)
.context(SetTerminationGracePeriodSnafu)?;
}

Ok(())
}
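
To illustrate how the helper above behaves, here is a self-contained sketch with a mock builder. The `PodBuilder` and `Error` types below are hypothetical stand-ins, not the `stackable_operator` ones; the assumption that the setter fails when the duration does not fit into a signed 64-bit number of seconds is made for illustration and should be checked against the operator-rs source.

```rust
use std::time::Duration;

/// Hypothetical error for a grace period that does not fit into an i64.
#[derive(Debug, PartialEq)]
enum Error {
    TerminationGracePeriodTooLong { seconds: u64 },
}

/// Mock pod builder that only tracks the grace period (hypothetical).
#[derive(Debug, Default)]
struct PodBuilder {
    termination_grace_period_seconds: Option<i64>,
}

impl PodBuilder {
    /// Stores the timeout as whole seconds, rejecting values above i64::MAX.
    fn termination_grace_period(&mut self, timeout: &Duration) -> Result<&mut Self, Error> {
        let seconds = timeout.as_secs();
        if seconds > i64::MAX as u64 {
            return Err(Error::TerminationGracePeriodTooLong { seconds });
        }
        self.termination_grace_period_seconds = Some(seconds as i64);
        Ok(self)
    }
}

/// Mirrors the shape of the helper in the diff: only touch the builder
/// when a timeout is present, and propagate any builder error.
fn add_graceful_shutdown_config(
    timeout: Option<&Duration>,
    pod_builder: &mut PodBuilder,
) -> Result<(), Error> {
    if let Some(timeout) = timeout {
        pod_builder.termination_grace_period(timeout)?;
    }
    Ok(())
}
```

Passing the DataNode default of 30 minutes sets the mock's field to `Some(1800)`, passing `None` leaves it unset, and `Duration::MAX` is rejected with an error.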
1 change: 1 addition & 0 deletions rust/operator/src/operations/mod.rs
@@ -1 +1,2 @@
pub mod graceful_shutdown;
pub mod pdb;
2 changes: 1 addition & 1 deletion rust/operator/src/pod_svc_controller.rs
Expand Up @@ -6,10 +6,10 @@ use stackable_hdfs_crd::constants::*;
use stackable_hdfs_crd::HdfsRole;
use stackable_operator::{
builder::ObjectMetaBuilder,
duration::Duration,
k8s_openapi::api::core::v1::{Pod, Service, ServicePort, ServiceSpec},
kube::runtime::controller::Action,
logging::controller::ReconcilerError,
time::Duration,
};
use std::sync::Arc;
use strum::{EnumDiscriminants, IntoStaticStr};
Expand Down
3 changes: 3 additions & 0 deletions tests/templates/kuttl/smoke/30-assert.yaml.j2
Expand Up @@ -23,6 +23,7 @@ spec:
- name: vector
{% endif %}
- name: zkfc
terminationGracePeriodSeconds: 900
status:
readyReplicas: 2
replicas: 2
Expand All @@ -46,6 +47,7 @@ spec:
{% if lookup('env', 'VECTOR_AGGREGATOR') %}
- name: vector
{% endif %}
terminationGracePeriodSeconds: 900
status:
readyReplicas: 1
replicas: 1
Expand All @@ -69,6 +71,7 @@ spec:
{% if lookup('env', 'VECTOR_AGGREGATOR') %}
- name: vector
{% endif %}
terminationGracePeriodSeconds: 1800
status:
readyReplicas: {{ test_scenario['values']['number-of-datanodes'] }}
replicas: {{ test_scenario['values']['number-of-datanodes'] }}
Expand Down