From 137286f6c8c322b2c612f0d8355e656a12f6895c Mon Sep 17 00:00:00 2001
From: Pritesh Lahoti
Date: Fri, 20 Dec 2024 16:01:29 +0530
Subject: [PATCH] [CRDB-44997] docs: update docs to upgrade chart involving new PVCs

The upgrade process for a chart involving new PVCs is manual and needs to be well documented. We will be working on automating it soon.
This PR also fixes other minor docs and updates the release.
---
 README.md                 | 31 ++++++++------
 build/templates/README.md | 87 +++++++++++++++++++++------------------
 cockroachdb/Chart.yaml    |  2 +-
 cockroachdb/README.md     | 87 +++++++++++++++++++++------------------
 4 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/README.md b/README.md
index f9535f7c..0f95c7ee 100644
--- a/README.md
+++ b/README.md
@@ -50,7 +50,7 @@ rotation with following setting:
 tls.certs.selfSigner.rotateCerts: true
 ```

-## Certificate managed by cockroachdb && CA provided by user
+## Certificate managed by cockroachdb and CA provided by user

 If user has a custom CA which they already use for certificate signing in their organisation, this utility provides a way for user to provide the custom CA. All the node and client certificates are signed by this user provided CA.
@@ -84,12 +84,13 @@ tls.certs.selfSigner.nodeCertExpiryWindow: 168h
 This utility will only handle the rotation of client and node certificates, the rotation of custom CA should be done by user.

-## Installation of Helm Chart
+## Installation of Helm chart

 When user install cockroachdb cluster with self-signer enabled, you will see the self-signer job.
 ```
-kubectl get pods
+$ kubectl get pods
+
 NAME READY STATUS RESTARTS AGE
 crdb-cockroachdb-self-signer-mmxp8 1/1 Running 0 15s
 ```
@@ -98,7 +99,8 @@ This job will generate CA, client and node certificates based on the user input
 see the following secrets representing each certificates:

 ```
-kubectl get secrets
+$ kubectl get secrets
+
 NAME TYPE DATA AGE
 crdb-cockroachdb-ca-secret Opaque 2 3m10s
 crdb-cockroachdb-client-secret kubernetes.io/tls 3 3m9s
@@ -112,7 +114,8 @@ sh.helm.release.v1.crdb.v1 helm.sh/release.v1
 After this, the cockroachdb init jobs starts and copies this certificate to each nodes:

 ```
-prafull@EMPID18004:helm-charts$ kubectl get pods
+$ kubectl get pods
+
 NAME READY STATUS RESTARTS AGE
 crdb-cockroachdb-0 0/1 Init:0/1 0 18s
 crdb-cockroachdb-1 0/1 Init:0/1 0 18s
@@ -122,7 +125,8 @@ crdb-cockroachdb-init-fclbb 1/1 Running 0 16s
 At last, the cockroach db cluster comes into running state with following output:

 ```
-helm install crdb ./cockroachdb/
+$ helm install crdb ./cockroachdb
+
 NAME: crdb
 LAST DEPLOYED: Thu Aug 19 18:03:37 2021
 NAMESPACE: crdb
@@ -152,17 +156,19 @@ For more information on using CockroachDB, please see the project's docs at:
 https://www.cockroachlabs.com/docs/
 ```

-## Upgrade of cockroachdb Cluster
+## Upgrade of cockroachdb cluster

 Kick off the upgrade process by changing the new Docker image, where `$new_version` is the CockroachDB version to which you are upgrading:

 ```shell
-helm upgrade my-release cockroachdb/cockroachdb \
+$ helm upgrade crdb ./cockroachdb \
 --set image.tag=$new_version \
 --reuse-values --timeout=20m
 ```

-Kubernetes will carry out a safe [rolling upgrade](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets) of your CockroachDB nodes one-by-one. Monitor the cluster's pods until all have been successfully restarted:
+Kubernetes will carry out a safe [rolling upgrade](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets) of your CockroachDB nodes one-by-one. Monitor the cluster's pods until all have been successfully restarted.
+
+In case the upgrade involves adding new Persistent Volume Claim to the existing pods (e.g. enabling WAL Failover, pushing logs to a separate volume, etc.), then kindly refer to the documentation in [this](https://github.com/cockroachdb/helm-charts/tree/master/cockroachdb#chart-version-300-and-after) section.

 ## Migration from Kubernetes Signed Certificates to Self-Signer Certificates

@@ -174,13 +180,13 @@ User can move from old kubernetes signing certificates by performing following s
 Run the upgrade command with upgrade strategy set as "onDelete" which only upgrades the pods when deleted by the user.

 ```shell
-helm upgrade crdb-test cockroachdb --set statefulset.updateStrategy.type="OnDelete" --timeout=20m
+$ helm upgrade crdb cockroachdb --set statefulset.updateStrategy.type="OnDelete" --timeout=20m
 ```

 While monitor all the pods, once the init-job is created, you can delete all the cockroachdb pods with following command:

 ```shell
-kubectl delete pods -l app.kubernetes.io/component=cockroachdb
+$ kubectl delete pods -l app.kubernetes.io/component=cockroachdb
 ```

 This will delete all the cockroachdb pods and restart the cluster with new certificates generated by the self-signer utility.
@@ -215,7 +221,8 @@ tls.certs.certManagerIssuer.name: cockroachdb
 ```

 ```shell
-% helm install crdb ./cockroachdb
+$ helm install crdb ./cockroachdb
+
 NAME: crdb
 LAST DEPLOYED: Fri Aug 4 14:42:11 2023
 NAMESPACE: crdb
diff --git a/build/templates/README.md b/build/templates/README.md
index 8050d8bc..2a7801e4 100644
--- a/build/templates/README.md
+++ b/build/templates/README.md
@@ -32,7 +32,7 @@ This chart will do the following:
 ## Add the CockroachDB Repository

 ```shell
-helm repo add cockroachdb https://charts.cockroachdb.com/
+$ helm repo add cockroachdb https://charts.cockroachdb.com/
 ```

 ## Installing the Chart
@@ -40,7 +40,7 @@ helm repo add cockroachdb https://charts.cockroachdb.com/
 To install the chart with the release name `my-release`:

 ```shell
-helm install my-release cockroachdb/cockroachdb
+$ helm install my-release cockroachdb/cockroachdb
 ```

 Note that for a production cluster, you will likely want to override the following parameters in [`values.yaml`](values.yaml) with your own values.
@@ -57,10 +57,8 @@ For more information on overriding the `values.yaml` parameters, please see:
 Confirm that all pods are `Running` successfully and init has been completed:

 ```shell
-kubectl get pods
-```
+$ kubectl get pods

-```
 NAME READY STATUS RESTARTS AGE
 my-release-cockroachdb-0 1/1 Running 0 1m
 my-release-cockroachdb-1 1/1 Running 0 1m
@@ -71,10 +69,8 @@ my-release-cockroachdb-init-k6jcr 0/1 Completed 0 1m
 Confirm that persistent volumes are created and claimed for each pod:

 ```shell
-kubectl get pv
-```
+$ kubectl get pv

-```
 NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
 pvc-64878ebf-f3f0-11e8-ab5b-42010a8e0035 100Gi RWO Delete Bound default/datadir-my-release-cockroachdb-0 standard 51s
 pvc-64945b4f-f3f0-11e8-ab5b-42010a8e0035 100Gi RWO Delete Bound default/datadir-my-release-cockroachdb-1 standard 51s
@@ -98,10 +94,8 @@ This is the default behaviour, and requires no configuration beyond setting cert
 If you are running in this mode, self-signed certificates are created by self-signed utility for the nodes and root client and are stored in a secret. You can look for the certificates created:

 ```shell
-kubectl get secrets
-```
+$ kubectl get secrets

-```shell
 crdb-cockroachdb-ca-secret Opaque 2 23s
 crdb-cockroachdb-client-secret kubernetes.io/tls 3 22s
 crdb-cockroachdb-node-secret kubernetes.io/tls 3 23s
@@ -179,7 +173,7 @@ spec:
 Launch a temporary interactive pod and start the built-in SQL client:

 ```shell
-kubectl run cockroachdb --rm -it \
+$ kubectl run cockroachdb --rm -it \
 --image=cockroachdb/cockroach \
 --restart=Never \
 -- sql --insecure --host=my-release-cockroachdb-public
@@ -202,18 +196,39 @@ Exit the shell and delete the temporary pod:
 Kick off the upgrade process by changing the new Docker image, where `$new_version` is the CockroachDB version to which you are upgrading:

 ```shell
-helm upgrade my-release cockroachdb/cockroachdb \
+$ helm upgrade my-release cockroachdb/cockroachdb \
 --set image.tag=$new_version \
 --reuse-values
 ```

-Kubernetes will carry out a safe [rolling upgrade](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets) of your CockroachDB nodes one-by-one. Monitor the cluster's pods until all have been successfully restarted:
+Kubernetes will carry out a safe [rolling upgrade](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets) of your CockroachDB nodes one-by-one.
+However, the upgrade will fail if it involves adding new Persistent Volume Claim (PVC) to the existing pods (e.g. enabling WAL Failover, pushing logs to a separate volume, etc.). In such cases, kindly repeat the following steps for each pod:

+1. Delete the statefulset
 ```shell
-kubectl get pods
+$ kubectl delete sts my-release-cockroachdb --cascade=orphan
 ```
+The statefulset name can be found by running `kubectl get sts`. Note the `--cascade=orphan` flag used to prevent the deletion of pods.

+2. Delete the pod
+```shell
+$ kubectl delete pod my-release-cockroachdb-
 ```
+
+3. Upgrade Helm chart
+```shell
+$ helm upgrade my-release cockroachdb/cockroachdb
+```
+Kindly update the values.yaml file or provide the necessary flags to the `helm upgrade` command. This step will recreate the pod with the new PVCs.
+
+Note that the above steps need to be repeated for each pod in the CockroachDB cluster. This will ensure that the cluster is upgraded without any downtime.
+Given the manual process involved, it is likely to cause network churn as cockroachdb will try to rebalance data across the other nodes. We are working on an automated solution to handle this scenario.
+
+Monitor the cluster's pods until all have been successfully restarted:
+
+```shell
+$ kubectl get pods
+
 NAME READY STATUS RESTARTS AGE
 my-release-cockroachdb-0 1/1 Running 0 2m
 my-release-cockroachdb-1 1/1 Running 0 3m
@@ -223,11 +238,9 @@ my-release-cockroachdb-init-nwjkh 0/1 ContainerCreating 0 6s
 ```

 ```shell
-kubectl get pods \
+$ kubectl get pods \
 -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}'
-```

-```
 my-release-cockroachdb-0 cockroachdb/cockroach:v{{ .AppVersion }}
 my-release-cockroachdb-1 cockroachdb/cockroach:v{{ .AppVersion }}
 my-release-cockroachdb-2 cockroachdb/cockroach:v{{ .AppVersion }}
@@ -237,7 +250,7 @@ my-release-cockroachdb-3 cockroachdb/cockroach:v{{ .AppVersion }}
 Resume normal operations. Once you are comfortable that the stability and performance of the cluster is what you'd expect post-upgrade, finalize the upgrade:

 ```shell
-kubectl run cockroachdb --rm -it \
+$ kubectl run cockroachdb --rm -it \
 --image=cockroachdb/cockroach \
 --restart=Never \
 -- sql --insecure --host=my-release-cockroachdb-public
@@ -255,11 +268,9 @@ Due to a change in the label format in version 3.0.0 of this chart, upgrading re
 Get the new labels from the specs rendered by Helm:

 ```shell
-helm template -f deploy.vals.yml cockroachdb/cockroachdb -x templates/statefulset.yaml \
+$ helm template -f deploy.vals.yml cockroachdb/cockroachdb -x templates/statefulset.yaml \
 | yq r - spec.template.metadata.labels
-```

-```
 app.kubernetes.io/name: cockroachdb
 app.kubernetes.io/instance: my-release
 app.kubernetes.io/component: cockroachdb
@@ -268,7 +279,7 @@ app.kubernetes.io/component: cockroachdb
 Place the new labels on all pods of the StatefulSet (change `my-release-cockroachdb-0` to the name of each pod):

 ```shell
-kubectl label pods my-release-cockroachdb-0 \
+$ kubectl label pods my-release-cockroachdb-0 \
 app.kubernetes.io/name=cockroachdb \
 app.kubernetes.io/instance=my-release \
 app.kubernetes.io/component=cockroachdb
@@ -277,7 +288,7 @@ app.kubernetes.io/component=cockroachdb
 Delete the StatefulSet without deleting pods:

 ```shell
-kubectl delete statefulset my-release-cockroachdb --cascade=false
+$ kubectl delete statefulset my-release-cockroachdb --cascade=false
 ```

 Verify that no pod is deleted and then upgrade as normal. A new StatefulSet will be created, taking over the management of the existing pods and upgrading them if needed.
@@ -301,6 +312,7 @@ For details see the [`values.yaml`](values.yaml) file.
 | `conf.cluster-name` | Name of CockroachDB cluster | `""` |
 | `conf.disable-cluster-name-verification` | Disable CockroachDB cluster name verification | `no` |
 | `conf.join` | List of already-existing CockroachDB instances | `[]` |
+| `conf.log` | Logging configuration | `{}` |
 | `conf.max-disk-temp-storage` | Max storage capacity for temp data | `0` |
 | `conf.max-offset` | Max allowed clock offset for CockroachDB cluster | `500ms` |
 | `conf.max-sql-memory` | Max memory to use processing SQL querie | `25%` |
@@ -311,9 +323,11 @@ For details see the [`values.yaml`](values.yaml) file.
 | `conf.http-port` | WARNING this parameter is deprecated and will be removed in future version. Use `service.ports.http.port` instead | `""` |
 | `conf.path` | CockroachDB data directory mount path | `cockroach-data` |
 | `conf.store.enabled` | Enable store configuration for CockroachDB | `false` |
+| `conf.store.count` | Number of data stores per node | `1` |
 | `conf.store.type` | CockroachDB storage type | `""` |
 | `conf.store.size` | CockroachDB storage size | `""` |
 | `conf.store.attrs` | CockroachDB storage attributes | `""` |
+| `conf.wal-failover` | CockroachDB WAL Failover configuration | `{}` |
 | `image.repository` | Container image name | `cockroachdb/cockroach` |
 | `image.tag` | Container image tag | `v{{ .AppVersion }}` |
 | `image.pullPolicy` | Container pull policy | `IfNotPresent` |
@@ -431,7 +445,7 @@ Override the default parameters using the `--set key=value[,key=value]` argument
 Alternatively, a YAML file that specifies custom values for the parameters can be provided while installing the chart. For example:

 ```shell
-helm install my-release -f my-values.yaml cockroachdb/cockroachdb
+$ helm install my-release -f my-values.yaml cockroachdb/cockroachdb
 ```

 > **Tip**: You can use the default [values.yaml](values.yaml)
@@ -443,12 +457,11 @@ helm install my-release -f my-values.yaml cockroachdb/cockroachdb
 Once you've created the cluster, you can start talking to it by connecting to its `-public` Service. CockroachDB is PostgreSQL wire protocol compatible, so there's a [wide variety of supported clients](https://www.cockroachlabs.com/docs/install-client-drivers.html). As an example, we'll open up a SQL shell using CockroachDB's built-in shell and play around with it a bit, like this (likely needing to replace `my-release-cockroachdb-public` with the name of the `-public` Service that was created with your installed chart):

 ```shell
-kubectl run cockroach-client --rm -it \
+$ kubectl run cockroach-client --rm -it \
 --image=cockroachdb/cockroach \
 --restart=Never \
 -- sql --insecure --host my-release-cockroachdb-public
 ```
-
 ```
 Waiting for pod default/cockroach-client to be running, status is Pending, pod ready: false
@@ -495,7 +508,7 @@ If you want more detailed information about the cluster, the best place to look
 If you want to see information about how the cluster is doing, you can try pulling up the CockroachDB Admin UI by port-forwarding from your local machine to one of the pods (replacing `my-release-cockroachdb-0` with the name of one of your pods:

 ```shell
-kubectl port-forward my-release-cockroachdb-0 8080
+$ kubectl port-forward my-release-cockroachdb-0 8080
 ```

 You should then be able to access the Admin UI by visiting in your web browser.
@@ -505,14 +518,12 @@ You should then be able to access the Admin UI by visiting
 ```
+
+3. Upgrade Helm chart
+```shell
+$ helm upgrade my-release cockroachdb/cockroachdb
+```
+Kindly update the values.yaml file or provide the necessary flags to the `helm upgrade` command. This step will recreate the pod with the new PVCs.
+
+Note that the above steps need to be repeated for each pod in the CockroachDB cluster. This will ensure that the cluster is upgraded without any downtime.
+Given the manual process involved, it is likely to cause network churn as cockroachdb will try to rebalance data across the other nodes. We are working on an automated solution to handle this scenario.
+
+Monitor the cluster's pods until all have been successfully restarted:
+
+```shell
+$ kubectl get pods
+
 NAME READY STATUS RESTARTS AGE
 my-release-cockroachdb-0 1/1 Running 0 2m
 my-release-cockroachdb-1 1/1 Running 0 3m
@@ -224,11 +239,9 @@ my-release-cockroachdb-init-nwjkh 0/1 ContainerCreating 0 6s
 ```

 ```shell
-kubectl get pods \
+$ kubectl get pods \
 -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}'
-```

-```
 my-release-cockroachdb-0 cockroachdb/cockroach:v24.3.1
 my-release-cockroachdb-1 cockroachdb/cockroach:v24.3.1
 my-release-cockroachdb-2 cockroachdb/cockroach:v24.3.1
@@ -238,7 +251,7 @@ my-release-cockroachdb-3 cockroachdb/cockroach:v24.3.1
 Resume normal operations. Once you are comfortable that the stability and performance of the cluster is what you'd expect post-upgrade, finalize the upgrade:

 ```shell
-kubectl run cockroachdb --rm -it \
+$ kubectl run cockroachdb --rm -it \
 --image=cockroachdb/cockroach \
 --restart=Never \
 -- sql --insecure --host=my-release-cockroachdb-public
@@ -256,11 +269,9 @@ Due to a change in the label format in version 3.0.0 of this chart, upgrading re
 Get the new labels from the specs rendered by Helm:

 ```shell
-helm template -f deploy.vals.yml cockroachdb/cockroachdb -x templates/statefulset.yaml \
+$ helm template -f deploy.vals.yml cockroachdb/cockroachdb -x templates/statefulset.yaml \
 | yq r - spec.template.metadata.labels
-```

-```
 app.kubernetes.io/name: cockroachdb
 app.kubernetes.io/instance: my-release
 app.kubernetes.io/component: cockroachdb
@@ -269,7 +280,7 @@ app.kubernetes.io/component: cockroachdb
 Place the new labels on all pods of the StatefulSet (change `my-release-cockroachdb-0` to the name of each pod):

 ```shell
-kubectl label pods my-release-cockroachdb-0 \
+$ kubectl label pods my-release-cockroachdb-0 \
 app.kubernetes.io/name=cockroachdb \
 app.kubernetes.io/instance=my-release \
 app.kubernetes.io/component=cockroachdb
@@ -278,7 +289,7 @@ app.kubernetes.io/component=cockroachdb
 Delete the StatefulSet without deleting pods:

 ```shell
-kubectl delete statefulset my-release-cockroachdb --cascade=false
+$ kubectl delete statefulset my-release-cockroachdb --cascade=false
 ```

 Verify that no pod is deleted and then upgrade as normal. A new StatefulSet will be created, taking over the management of the existing pods and upgrading them if needed.
@@ -302,6 +313,7 @@ For details see the [`values.yaml`](values.yaml) file.
 | `conf.cluster-name` | Name of CockroachDB cluster | `""` |
 | `conf.disable-cluster-name-verification` | Disable CockroachDB cluster name verification | `no` |
 | `conf.join` | List of already-existing CockroachDB instances | `[]` |
+| `conf.log` | Logging configuration | `{}` |
 | `conf.max-disk-temp-storage` | Max storage capacity for temp data | `0` |
 | `conf.max-offset` | Max allowed clock offset for CockroachDB cluster | `500ms` |
 | `conf.max-sql-memory` | Max memory to use processing SQL querie | `25%` |
@@ -312,9 +324,11 @@ For details see the [`values.yaml`](values.yaml) file.
 | `conf.http-port` | WARNING this parameter is deprecated and will be removed in future version. Use `service.ports.http.port` instead | `""` |
 | `conf.path` | CockroachDB data directory mount path | `cockroach-data` |
 | `conf.store.enabled` | Enable store configuration for CockroachDB | `false` |
+| `conf.store.count` | Number of data stores per node | `1` |
 | `conf.store.type` | CockroachDB storage type | `""` |
 | `conf.store.size` | CockroachDB storage size | `""` |
 | `conf.store.attrs` | CockroachDB storage attributes | `""` |
+| `conf.wal-failover` | CockroachDB WAL Failover configuration | `{}` |
 | `image.repository` | Container image name | `cockroachdb/cockroach` |
 | `image.tag` | Container image tag | `v24.3.1` |
 | `image.pullPolicy` | Container pull policy | `IfNotPresent` |
@@ -432,7 +446,7 @@ Override the default parameters using the `--set key=value[,key=value]` argument
 Alternatively, a YAML file that specifies custom values for the parameters can be provided while installing the chart. For example:

 ```shell
-helm install my-release -f my-values.yaml cockroachdb/cockroachdb
+$ helm install my-release -f my-values.yaml cockroachdb/cockroachdb
 ```

 > **Tip**: You can use the default [values.yaml](values.yaml)
@@ -444,12 +458,11 @@ helm install my-release -f my-values.yaml cockroachdb/cockroachdb
 Once you've created the cluster, you can start talking to it by connecting to its `-public` Service. CockroachDB is PostgreSQL wire protocol compatible, so there's a [wide variety of supported clients](https://www.cockroachlabs.com/docs/install-client-drivers.html). As an example, we'll open up a SQL shell using CockroachDB's built-in shell and play around with it a bit, like this (likely needing to replace `my-release-cockroachdb-public` with the name of the `-public` Service that was created with your installed chart):

 ```shell
-kubectl run cockroach-client --rm -it \
+$ kubectl run cockroach-client --rm -it \
 --image=cockroachdb/cockroach \
 --restart=Never \
 -- sql --insecure --host my-release-cockroachdb-public
 ```
-
 ```
 Waiting for pod default/cockroach-client to be running, status is Pending, pod ready: false
@@ -496,7 +509,7 @@ If you want more detailed information about the cluster, the best place to look
 If you want to see information about how the cluster is doing, you can try pulling up the CockroachDB Admin UI by port-forwarding from your local machine to one of the pods (replacing `my-release-cockroachdb-0` with the name of one of your pods:

 ```shell
-kubectl port-forward my-release-cockroachdb-0 8080
+$ kubectl port-forward my-release-cockroachdb-0 8080
 ```

 You should then be able to access the Admin UI by visiting in your web browser.
@@ -506,14 +519,12 @@ You should then be able to access the Admin UI by visiting