
feat: Make spark-env.sh configurable #473

Merged · 13 commits · Oct 7, 2024
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,10 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

### Added

- Make spark-env.sh configurable via `configOverrides` ([#473]).

### Changed

- Reduce CRD size from `1.2MB` to `103KB` by accepting arbitrary YAML input instead of the underlying schema for the following fields ([#450]):
@@ -28,6 +32,7 @@ All notable changes to this project will be documented in this file.
[#459]: https://github.com/stackabletech/spark-k8s-operator/pull/459
[#460]: https://github.com/stackabletech/spark-k8s-operator/pull/460
[#472]: https://github.com/stackabletech/spark-k8s-operator/pull/472
[#473]: https://github.com/stackabletech/spark-k8s-operator/pull/473

## [24.7.0] - 2024-07-24

28 changes: 14 additions & 14 deletions Cargo.lock


150 changes: 150 additions & 0 deletions docs/modules/spark-k8s/pages/usage-guide/configuration-environment-overrides.adoc

@@ -0,0 +1,150 @@
= Configuration & Environment Overrides

The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).

IMPORTANT: Overriding operator-set properties (such as the ports) can interfere with the operator and can lead to problems.


== Configuration Properties

For a role or role group, at the same level as `config`, you can specify `configOverrides` for the following files:

* `spark-env.sh`
* `security.properties`

NOTE: `spark-defaults.conf` is not required here, because the properties defined in {crd-docs}/spark.stackable.tech/sparkhistoryserver/v1alpha1/#spec-sparkConf[`sparkConf` (SparkHistoryServer)] and {crd-docs}/spark.stackable.tech/sparkapplication/v1alpha1/#spec-sparkConf[`sparkConf` (SparkApplication)] are already added to this file.
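
A property destined for `spark-defaults.conf` therefore goes into `sparkConf` instead, as in this minimal sketch (the cleaner property shown is only illustrative):

[source,yaml]
----
spec:
  sparkConf:
    spark.history.fs.cleaner.enabled: "true"
----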

For example, to set `networkaddress.cache.ttl`, configure it in the SparkHistoryServer resource like so:

[source,yaml]
----
nodes:
  roleGroups:
    default:
      configOverrides:
        security.properties:
          networkaddress.cache.ttl: "30"
      replicas: 1
----

Just as for the `config`, it is possible to specify this at the role level as well:

[source,yaml]
----
nodes:
  configOverrides:
    security.properties:
      networkaddress.cache.ttl: "30"
  roleGroups:
    default:
      replicas: 1
----

All override property values must be strings.

The same applies to the `job`, `driver` and `executor` roles of the SparkApplication.
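
For instance, a minimal sketch of such an override on the `driver` role (property and value are illustrative):

[source,yaml]
----
spec:
  driver:
    configOverrides:
      security.properties:
        networkaddress.cache.ttl: "30"
----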

=== The spark-env.sh file

The `spark-env.sh` file is used to set environment variables.
Usually, environment variables are configured in `envOverrides` or {crd-docs}/spark.stackable.tech/sparkapplication/v1alpha1/#spec-env[`env` (SparkApplication)], but both options only allow static values to be set.
The values in `spark-env.sh` are evaluated by the shell.
For instance, if a SAS token is stored in a Secret and should be used for the Spark History Server, the token can first be exposed as an environment variable via `podOverrides` and then appended to `SPARK_HISTORY_OPTS`:

[source,yaml]
----
podOverrides:
  spec:
    containers:
      - name: spark-history
        env:
          - name: SAS_TOKEN
            valueFrom:
              secretKeyRef:
                name: adls-spark-credentials
                key: sas-token
configOverrides:
  spark-env.sh:
    SPARK_HISTORY_OPTS: >-
      $SPARK_HISTORY_OPTS
      -Dspark.hadoop.fs.azure.sas.fixed.token.mystorageaccount.dfs.core.windows.net=$SAS_TOKEN
----

NOTE: The given properties are written to `spark-env.sh` in the form `export KEY="VALUE"`.
Make sure the values are already escaped as needed in the specification.
Be aware that some environment variables may already be set, so prepend or append a reference to them in the value, as shown in the example above.
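
With the override above, the rendered `spark-env.sh` would contain a line roughly like the following (the folded YAML scalar joins its lines with single spaces; shown for illustration):

[source,bash]
----
export SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dspark.hadoop.fs.azure.sas.fixed.token.mystorageaccount.dfs.core.windows.net=$SAS_TOKEN"
----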

=== The security.properties file

The `security.properties` file is used to configure JVM security properties.
It is very seldom that users need to tweak any of these, but one use case stands out that users should be aware of: the JVM DNS cache.

The JVM manages its own cache of successfully resolved host names as well as a cache of host names that cannot be resolved.
Some products of the Stackable platform are very sensitive to the contents of these caches and their performance is heavily affected by them.
As of version 3.4.0, Apache Spark may perform poorly if the positive cache is disabled.
To cache resolved host names, and thus speed up queries, you can configure the TTL of entries in the positive cache like this:

[source,yaml]
----
spec:
  nodes:
    configOverrides:
      security.properties:
        networkaddress.cache.ttl: "30"
        networkaddress.cache.negative.ttl: "0"
----

NOTE: The operator configures DNS caching by default as shown in the example above.
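
These settings end up as plain key/value lines in the rendered `security.properties` file, roughly:

[source,properties]
----
networkaddress.cache.ttl=30
networkaddress.cache.negative.ttl=0
----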

For details on JVM security, see https://docs.oracle.com/en/java/javase/11/security/java-security-overview1.html


== Environment Variables

Similarly, environment variables can be (over)written, for example per role group:

[source,yaml]
----
nodes:
  roleGroups:
    default:
      envOverrides:
        MY_ENV_VAR: "MY_VALUE"
      replicas: 1
----

or per role:

[source,yaml]
----
nodes:
  envOverrides:
    MY_ENV_VAR: "MY_VALUE"
  roleGroups:
    default:
      replicas: 1
----

In a SparkApplication, environment variables can also be defined with the {crd-docs}/spark.stackable.tech/sparkapplication/v1alpha1/#spec-env[`env`] property for the job, driver and executor pods at once.
The result is basically the same as with `envOverrides`, but `env` additionally allows referencing Secrets and the like:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
spec:
  env:
    - name: SAS_TOKEN
      valueFrom:
        secretKeyRef:
          name: adls-spark-credentials
          key: sas-token
...
----


== Pod overrides

The Spark operator also supports Pod overrides, allowing you to override any property that you can set on a Kubernetes Pod.
Read the xref:concepts:overrides.adoc#pod-overrides[Pod overrides documentation] to learn more about this feature.
29 changes: 0 additions & 29 deletions docs/modules/spark-k8s/pages/usage-guide/history-server.adoc
@@ -74,32 +74,3 @@ spark-history-node-cleaner NodePort 10.96.203.43 <none> 18080:325
By setting up port forwarding on 18080 the UI can be opened by pointing your browser to `http://localhost:18080`:

image::history-server-ui.png[History Server Console]

== Configuration Properties

For a role group of the Spark history server, you can specify: `configOverrides` for the following files:

* `security.properties`

=== The security.properties file

The `security.properties` file is used to configure JVM security properties.
It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.

The JVM manages its own cache of successfully resolved host names as well as a cache of host names that cannot be resolved.
Some products of the Stackable platform are very sensible to the contents of these caches and their performance is heavily affected by them.
As of version 3.4.0, Apache Spark may perform poorly if the positive cache is disabled.
To cache resolved host names, and thus speeding up queries you can configure the TTL of entries in the positive cache like this:

[source,yaml]
----
nodes:
configOverrides:
security.properties:
networkaddress.cache.ttl: "30"
networkaddress.cache.negative.ttl: "0"
----

NOTE: The operator configures DNS caching by default as shown in the example above.

For details on the JVM security see https://docs.oracle.com/en/java/javase/11/security/java-security-overview1.html
1 change: 1 addition & 0 deletions docs/modules/spark-k8s/partials/nav.adoc
@@ -10,6 +10,7 @@
** xref:spark-k8s:usage-guide/logging.adoc[]
** xref:spark-k8s:usage-guide/history-server.adoc[]
** xref:spark-k8s:usage-guide/examples.adoc[]
** xref:spark-k8s:usage-guide/configuration-environment-overrides.adoc[]
** xref:spark-k8s:usage-guide/operations/index.adoc[]
*** xref:spark-k8s:usage-guide/operations/applications.adoc[]
*** xref:spark-k8s:usage-guide/operations/pod-placement.adoc[]
1 change: 1 addition & 0 deletions rust/crd/src/constants.rs
@@ -74,6 +74,7 @@ pub const HISTORY_ROLE_NAME: &str = "node";
pub const SPARK_IMAGE_BASE_NAME: &str = "spark-k8s";

pub const SPARK_DEFAULTS_FILE_NAME: &str = "spark-defaults.conf";
pub const SPARK_ENV_SH_FILE_NAME: &str = "spark-env.sh";

pub const SPARK_CLUSTER_ROLE: &str = "spark-k8s-clusterrole";
pub const SPARK_UID: i64 = 1000;
1 change: 1 addition & 0 deletions rust/crd/src/history.rs
@@ -212,6 +212,7 @@ impl SparkHistoryServer {
(
vec![
PropertyNameKind::File(SPARK_DEFAULTS_FILE_NAME.to_string()),
PropertyNameKind::File(SPARK_ENV_SH_FILE_NAME.to_string()),
PropertyNameKind::File(JVM_SECURITY_PROPERTIES_FILE.to_string()),
],
self.spec.nodes.clone(),
21 changes: 21 additions & 0 deletions rust/crd/src/lib.rs
@@ -819,6 +819,7 @@ impl SparkApplication {
(
vec![
PropertyNameKind::Env,
PropertyNameKind::File(SPARK_ENV_SH_FILE_NAME.to_string()),
PropertyNameKind::File(JVM_SECURITY_PROPERTIES_FILE.to_string()),
],
Role {
@@ -841,6 +842,7 @@
(
vec![
PropertyNameKind::Env,
PropertyNameKind::File(SPARK_ENV_SH_FILE_NAME.to_string()),
PropertyNameKind::File(JVM_SECURITY_PROPERTIES_FILE.to_string()),
],
Role {
@@ -863,6 +865,7 @@
(
vec![
PropertyNameKind::Env,
PropertyNameKind::File(SPARK_ENV_SH_FILE_NAME.to_string()),
PropertyNameKind::File(JVM_SECURITY_PROPERTIES_FILE.to_string()),
],
Role {
@@ -1037,6 +1040,20 @@ fn resources_to_executor_props(
Ok(())
}

/// Create the content of the file spark-env.sh.
/// The properties are serialized in the form 'export {k}="{v}"',
/// escaping neither the key nor the value. The user is responsible for
/// providing escaped values.
pub fn to_spark_env_sh_string<'a, T>(properties: T) -> String
where
T: Iterator<Item = (&'a String, &'a String)>,
{
properties
.map(|(k, v)| format!("export {k}=\"{v}\""))
.collect::<Vec<String>>()
.join("\n")
}
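
// For illustration: feeding the single pair
// ("SPARK_HISTORY_OPTS", "$SPARK_HISTORY_OPTS -Dfoo=bar")
// through `to_spark_env_sh_string` yields:
//   export SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dfoo=bar"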

#[cfg(test)]
mod tests {

@@ -1296,6 +1313,10 @@ mod tests {
"default".into(),
vec![
(PropertyNameKind::Env, BTreeMap::new()),
(
PropertyNameKind::File("spark-env.sh".into()),
BTreeMap::new(),
),
(
PropertyNameKind::File("security.properties".into()),
vec![
Expand Down
14 changes: 12 additions & 2 deletions rust/operator-binary/src/history/history_controller.rs
@@ -36,7 +36,7 @@ use stackable_operator::{
role_utils::RoleGroupRef,
time::Duration,
};
use stackable_spark_k8s_crd::constants::METRICS_PORT;
use stackable_spark_k8s_crd::constants::{METRICS_PORT, SPARK_ENV_SH_FILE_NAME};
use stackable_spark_k8s_crd::{
constants::{
ACCESS_KEY_ID, APP_NAME, HISTORY_CONTROLLER_NAME, HISTORY_ROLE_NAME,
Expand All @@ -49,7 +49,7 @@ use stackable_spark_k8s_crd::{
history,
history::{HistoryConfig, SparkHistoryServer, SparkHistoryServerContainer},
s3logdir::S3LogDir,
tlscerts,
tlscerts, to_spark_env_sh_string,
};
use std::collections::HashMap;
use std::{collections::BTreeMap, sync::Arc};
@@ -382,6 +382,16 @@ fn build_config_map(
.build(),
)
.add_data(SPARK_DEFAULTS_FILE_NAME, spark_defaults)
.add_data(
SPARK_ENV_SH_FILE_NAME,
to_spark_env_sh_string(
config
.get(&PropertyNameKind::File(SPARK_ENV_SH_FILE_NAME.to_string()))
.cloned()
.unwrap_or_default()
.iter(),
),
)
.add_data(
JVM_SECURITY_PROPERTIES_FILE,
to_java_properties_string(jvm_sec_props.iter()).with_context(|_| {