feat: add probes, graceful shutdown and fix logdir
philipsens committed Oct 9, 2023
1 parent 3bf9971 commit 9602c91
Showing 5 changed files with 170 additions and 53 deletions.
5 changes: 4 additions & 1 deletion charts/drill/Chart.yaml
@@ -1,12 +1,15 @@
apiVersion: v2
name: drill
version: 1.2.4
version: 1.2.5
appVersion: 1.21.1
description: Helm Charts for deploying Apache Drill Clusters on Kubernetes
icon: https://raw.githubusercontent.com/wearefrank/charts/master/charts/drill/icon.svg
type: application
keywords:
- apache
- drill
- data
- query
home: 'https://drill.apache.org/'
sources:
- 'https://github.com/wearefrank/charts'
72 changes: 44 additions & 28 deletions charts/drill/README.md
@@ -34,19 +34,19 @@ helm repo add wearefrank https://wearefrank.github.io/charts
```

If you had already added this repo earlier, run `helm repo update` to retrieve
the latest versions of the packages. You can then run `helm search repo
the latest versions of the packages. You can then run `helm search repo
wearefrank` to see the charts.

To install the ZaakBrug chart:
To install the Drill chart:

```shell
helm install zaakbrug wearefrank/zaakbrug
helm install drill wearefrank/drill
```

To uninstall the chart:

```shell
helm delete zaakbrug
helm delete drill
```

### Values
@@ -59,12 +59,14 @@ Please refer to the [values.yaml](values.yaml) file for details on default value

### Access Drill Web UI

There is a service that can be used, but this one will jump from pod, which isn't very friendly. Depending on ingress class you can make this sticky with annotations. You could also change the
There is a service that can be used, but it will jump from pod to pod, which isn't very friendly. Depending on the ingress
class you can make this sticky with annotations. You could also change the

## Chart Structure

Drill Helm charts are organized as a collection of files inside the `drill` directory. As Drill depends on Zookeeper for
cluster co-ordination, a zookeeper chart added as dependency in the [chart definition](Chart.yaml). The Zookeeper chart is maintained by Bitnami.
cluster co-ordination, a ZooKeeper chart is added as a dependency in the [chart definition](Chart.yaml). The ZooKeeper chart
is maintained by Bitnami.

```shell
drill/
@@ -87,12 +89,16 @@ Drill Helm Charts contain the following templates:
## Autoscaling Drill Clusters

The size of the Drill cluster (number of Drill Pod replicas / number of drill-bits) can not only be manually scaled up
or down as shown above, but can also be autoscaled to simplify cluster management. When enabled, with a higher CPU
or down, but can also be autoscaled to simplify cluster management. When enabled, more drill-bits are added automatically as CPU
utilization rises, and as the cluster load goes down, so does the number of drill-bits
in the Drill Cluster. The drill-bits deemed
excessive [gracefully shut down](https://drill.apache.org/docs/stopping-drill/#gracefully-shutting-down-the-drill-process),
by going into quiescent mode to permit running queries to complete.

> [!IMPORTANT]
> For the graceful shutdown to succeed, a sigfile is created in the `$DRILL_HOME` folder. This requires running as `root`
> (uid 0). If the application runs as `drilluser`, the `stop` command is used instead.

Enable autoscaling by editing the autoscale section in `drill/values.yaml` file.
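
The root / non-root choice described in the note above can be sketched as a small shell function. This is a minimal illustration, not part of the chart: `choose_stop_action` is a hypothetical helper name, and the only assumption is that `drillbit.sh` accepts the `stop` and `graceful_stop` actions used in the chart's `preStop` hook.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the chart): pick the drillbit.sh
# action from a numeric uid, mirroring the preStop hook's condition.
choose_stop_action() {
  local uid="$1"
  if [ "$uid" -ne 0 ]; then
    # Non-root users cannot create the sigfile in $DRILL_HOME.
    echo "stop"
  else
    # Root can create the sigfile, so running queries may finish first.
    echo "graceful_stop"
  fi
}

# Show the decision for root and for an arbitrary non-root uid.
choose_stop_action 0
choose_stop_action 1000
```

Inside the container, the result would be passed on as `"${DRILL_HOME}/bin/drillbit.sh" "$(choose_stop_action "$(id -u)")"`.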

## Parameters
@@ -116,24 +122,34 @@ Enable autoscaling by editing the autoscale section in `drill/values.yaml` file.

### Drill deployment parameters

| Name | Description | Value |
| ----------------------------------- | -------------------------------------------------------- | --------- |
| `replicaCount` | Number of Drill replicas to deploy | `3` |
| `livenessProbe.initialDelaySeconds` | Initial delay seconds for livenessProbe | `40` |
| `livenessProbe.periodSeconds` | Period seconds for livenessProbe | `10` |
| `livenessProbe.timeoutSeconds` | Timeout seconds for livenessProbe | `1` |
| `livenessProbe.failureThreshold` | Failure threshold for livenessProbe | `6` |
| `livenessProbe.successThreshold` | Success threshold for livenessProbe | `1` |
| `resources` | Set the resources for the Drill containers | `{}` |
| `resources.limits` | The resources limits for the Drill containers | `""` |
| `resources.requests.memory` | The requested memory for the Drill containers | `""` |
| `resources.requests.cpu` | The requested cpu for the Drill containers | `""` |
| `terminationGracePeriodSeconds` | Number of seconds after which pods are forcefully killed | `25` |
| `terminationGracePeriodSeconds` | Note: Lower values may cause running queries to fail | |
| `nodeSelector` | Node labels for pod assignment | `{}` |
| `tolerations` | Set tolerations for pod assignment | `[]` |
| `affinity` | Set affinity for pod assignment | `{}` |
| `timeZone` | used for database connection and log timestamps | `Etc/UTC` |
| Name | Description | Value |
| ------------------------------------ | -------------------------------------------------------- | --------- |
| `replicaCount` | Number of Drill replicas to deploy | `3` |
| `startupProbe.initialDelaySeconds` | Initial delay seconds for startupProbe | `10` |
| `startupProbe.periodSeconds` | Period seconds for startupProbe | `10` |
| `startupProbe.timeoutSeconds` | Timeout seconds for startupProbe | `1` |
| `startupProbe.failureThreshold` | Failure threshold for startupProbe | `6` |
| `startupProbe.successThreshold` | Success threshold for startupProbe | `1` |
| `readinessProbe.initialDelaySeconds` | Initial delay seconds for readinessProbe | `0` |
| `readinessProbe.periodSeconds` | Period seconds for readinessProbe | `5` |
| `readinessProbe.timeoutSeconds` | Timeout seconds for readinessProbe | `1` |
| `readinessProbe.failureThreshold` | Failure threshold for readinessProbe | `3` |
| `readinessProbe.successThreshold` | Success threshold for readinessProbe | `1` |
| `livenessProbe.initialDelaySeconds` | Initial delay seconds for livenessProbe | `0` |
| `livenessProbe.periodSeconds` | Period seconds for livenessProbe | `10` |
| `livenessProbe.timeoutSeconds` | Timeout seconds for livenessProbe | `1` |
| `livenessProbe.failureThreshold` | Failure threshold for livenessProbe | `6` |
| `livenessProbe.successThreshold` | Success threshold for livenessProbe | `1` |
| `resources` | Set the resources for the Drill containers | `{}` |
| `resources.limits` | The resources limits for the Drill containers | `""` |
| `resources.requests.memory` | The requested memory for the Drill containers | `""` |
| `resources.requests.cpu` | The requested cpu for the Drill containers | `""` |
| `terminationGracePeriodSeconds` | Number of seconds after which pods are forcefully killed | `25` |
| `terminationGracePeriodSeconds` | Note: Lower values may cause running queries to fail | |
| `nodeSelector` | Node labels for pod assignment | `{}` |
| `tolerations` | Set tolerations for pod assignment | `[]` |
| `affinity` | Set affinity for pod assignment | `{}` |
| `timeZone` | used for database connection and log timestamps | `Etc/UTC` |
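
As a rough guide to what the probe defaults above mean in practice, the time a probe tolerates before Kubernetes acts on it can be approximated as `initialDelaySeconds + periodSeconds * failureThreshold`. The sketch below is only an approximation under that assumption: it ignores `timeoutSeconds` and kubelet scheduling jitter, and `probe_budget_seconds` is a hypothetical helper, not something the chart provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximate seconds before a probe is considered
# failed, given initialDelaySeconds, periodSeconds and failureThreshold.
probe_budget_seconds() {
  local initial_delay="$1" period="$2" failure_threshold="$3"
  echo $(( initial_delay + period * failure_threshold ))
}

# Defaults from the table above.
echo "startupProbe:   $(probe_budget_seconds 10 10 6)s"
echo "readinessProbe: $(probe_budget_seconds 0 5 3)s"
echo "livenessProbe:  $(probe_budget_seconds 0 10 6)s"
```

If a Drillbit needs longer than the startup budget to come up, raising `startupProbe.failureThreshold` is usually the gentler knob, since it leaves the probing interval unchanged.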

### Traffic Exposure Parameters

@@ -168,12 +184,12 @@ Enable autoscaling by editing the autoscale section in `drill/values.yaml` file.

### Drill configuration

Configuring Drill can be done with override files or in the web ui, altough some properties can only be set in the override file.
When using the web ui, ZooKeeper will be used to store the values. Make sure that the storage of ZooKeeper is persistent if you intent to configure this way.
Configuring Drill can be done with override files or in the web ui, although some properties can only be set in the override file.
When using the web ui, ZooKeeper will be used to store the values. Make sure that the storage of ZooKeeper is persistent if you intend to configure this way.

This is an example where the web ui and authentication for local (plain) users is enabled.

```json
```hocon
drill.exec: {
http.enabled: true,
impersonation: {
48 changes: 30 additions & 18 deletions charts/drill/templates/statefulset.yaml
@@ -40,14 +40,14 @@ spec:
containers:
- name: {{ .Chart.Name }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
{{- toYaml .Values.podSecurityContext | nindent 12 }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
image: "{{ .Values.image.registry }}{{ if .Values.image.registry }}/{{ end }}{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
stdin: true
command:
- /bin/bash
- -c
- "/opt/drill/bin/drillbit.sh start; tail -f /var/lib/drill/log/drillbit.out"
- "${DRILL_HOME}/bin/drillbit.sh start && tail -f /var/lib/drill/log/drillbit.out"
ports:
- containerPort: 8047
name: web
@@ -57,24 +57,34 @@ spec:
name: control
- containerPort: 31012
name: data
{{/* livenessProbe:*/}}
{{/* {{- toYaml .Values.livenessProbe | nindent 12 }}*/}}
{{/* exec:*/}}
{{/* command:*/}}
{{/* - /opt/drill/bin/drillbit.sh*/}}
{{/* - status*/}}
{{/* readinessProbe:*/}}
{{/* {{- toYaml .Values.livenessProbe | nindent 12 }}*/}}
{{/* exec:*/}}
{{/* command:*/}}
{{/* - /opt/drill/bin/drillbit.sh*/}}
{{/* - status*/}}
startupProbe:
{{- toYaml .Values.startupProbe | nindent 12 }}
exec:
command:
- /bin/bash
- -c
- "${DRILL_HOME}/bin/drillbit.sh status"
readinessProbe:
{{- toYaml .Values.readinessProbe | nindent 12 }}
exec:
command:
- /bin/bash
- -c
- "${DRILL_HOME}/bin/drillbit.sh status"
livenessProbe:
{{- toYaml .Values.livenessProbe | nindent 12 }}
exec:
command:
- /bin/bash
- -c
- "${DRILL_HOME}/bin/drillbit.sh status"
lifecycle:
preStop:
exec:
command:
- /opt/drill/bin/drillbit.sh
- stop # TODO: implement graceful_stop
- /bin/bash
- -c
- "if [ \"$(id -u)\" -ne 0 ]; then ${DRILL_HOME}/bin/drillbit.sh stop; else ${DRILL_HOME}/bin/drillbit.sh graceful_stop; fi"
{{- with .Values.resources }}
resources:
{{- toYaml . | nindent 12 }}
@@ -86,8 +96,10 @@ spec:
name: {{ template "drill.fullname" . }}-users
subPath: passwd
- name: {{ template "drill.fullname" . }}-logs
mountPath: /opt/drill/logs
mountPath: /var/lib/drill/log
terminationGracePeriodSeconds: {{ .Values.terminationGracePeriodSeconds }}
securityContext:
{{- toYaml .Values.securityContext | nindent 8 }}
volumes:
- name: {{ template "drill.fullname" . }}-override
configMap:
@@ -110,7 +122,7 @@ spec:
{{- else if (not .Values.persistence.enabled) }}
- name: {{ template "drill.fullname" . }}-logs
emptyDir: {}
{{- end}}
{{- end }}
initContainers:
- name: zookeeper-available
image: busybox
62 changes: 61 additions & 1 deletion charts/drill/values.schema.json
@@ -48,13 +48,73 @@
"description": "Number of Drill replicas to deploy",
"default": 3
},
"startupProbe": {
"type": "object",
"properties": {
"initialDelaySeconds": {
"type": "number",
"description": "Initial delay seconds for startupProbe",
"default": 10
},
"periodSeconds": {
"type": "number",
"description": "Period seconds for startupProbe",
"default": 10
},
"timeoutSeconds": {
"type": "number",
"description": "Timeout seconds for startupProbe",
"default": 1
},
"failureThreshold": {
"type": "number",
"description": "Failure threshold for startupProbe",
"default": 6
},
"successThreshold": {
"type": "number",
"description": "Success threshold for startupProbe",
"default": 1
}
}
},
"readinessProbe": {
"type": "object",
"properties": {
"initialDelaySeconds": {
"type": "number",
"description": "Initial delay seconds for readinessProbe",
"default": 0
},
"periodSeconds": {
"type": "number",
"description": "Period seconds for readinessProbe",
"default": 5
},
"timeoutSeconds": {
"type": "number",
"description": "Timeout seconds for readinessProbe",
"default": 1
},
"failureThreshold": {
"type": "number",
"description": "Failure threshold for readinessProbe",
"default": 3
},
"successThreshold": {
"type": "number",
"description": "Success threshold for readinessProbe",
"default": 1
}
}
},
"livenessProbe": {
"type": "object",
"properties": {
"initialDelaySeconds": {
"type": "number",
"description": "Initial delay seconds for livenessProbe",
"default": 40
"default": 0
},
"periodSeconds": {
"type": "number",
36 changes: 31 additions & 5 deletions charts/drill/values.yaml
@@ -45,16 +45,42 @@ image:
##
replicaCount: 3

## Configure extra options for Drill containers' liveness, readiness and startup probes
## Configure extra options for Drill containers' startup probe
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes
## @param startupProbe.initialDelaySeconds Initial delay seconds for startupProbe
## @param startupProbe.periodSeconds Period seconds for startupProbe
## @param startupProbe.timeoutSeconds Timeout seconds for startupProbe
## @param startupProbe.failureThreshold Failure threshold for startupProbe
## @param startupProbe.successThreshold Success threshold for startupProbe
##
startupProbe:
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 1
failureThreshold: 6
successThreshold: 1

## @param readinessProbe.initialDelaySeconds Initial delay seconds for readinessProbe
## @param readinessProbe.periodSeconds Period seconds for readinessProbe
## @param readinessProbe.timeoutSeconds Timeout seconds for readinessProbe
## @param readinessProbe.failureThreshold Failure threshold for readinessProbe
## @param readinessProbe.successThreshold Success threshold for readinessProbe
##
readinessProbe:
initialDelaySeconds: 0
periodSeconds: 5
timeoutSeconds: 1
failureThreshold: 3
successThreshold: 1

## @param livenessProbe.initialDelaySeconds Initial delay seconds for livenessProbe
## @param livenessProbe.periodSeconds Period seconds for livenessProbe
## @param livenessProbe.timeoutSeconds Timeout seconds for livenessProbe
## @param livenessProbe.failureThreshold Failure threshold for livenessProbe
## @param livenessProbe.successThreshold Success threshold for livenessProbe
##
livenessProbe:
initialDelaySeconds: 40
initialDelaySeconds: 0
periodSeconds: 10
timeoutSeconds: 1
failureThreshold: 6
@@ -225,12 +251,12 @@ securityContext: {}

## @section Drill configuration
## @descriptionStart
## Configuring Drill can be done with override files or in the web ui, altough some properties can only be set in the override file.
## When using the web ui, ZooKeeper will be used to store the values. Make sure that the storage of ZooKeeper is persistent if you intent to configure this way.
## Configuring Drill can be done with override files or in the web ui, although some properties can only be set in the override file.
## When using the web ui, ZooKeeper will be used to store the values. Make sure that the storage of ZooKeeper is persistent if you intend to configure this way.
##
## This is an example where the web ui and authentication for local (plain) users is enabled.
##
## ```json
## ```hocon
## drill.exec: {
## http.enabled: true,
## impersonation: {
