Kubernetes Operators extend the Kubernetes API to manage third-party software running on Kubernetes. Operators are being built today for a wide variety of software, such as MySQL, Postgres, Airflow, Redis, MongoDB, Kafka, Prometheus, Logstash, Moodle, WordPress, Odoo, etc. We are seeing a new trend in which Custom Resources introduced by different Operators are used to create custom application platforms as Code. Such platforms are Kubernetes-native, have no platform vendor lock-in, and can be created and re-created on any Kubernetes cluster.
To build such platforms, the various Operators and their Custom Resources need to provide a consistent usage experience. Below we present guidelines that ensure such an experience for end users (Cluster Administrators and Application Developers). Operators developed following these guidelines are easy to install, manage, and discover for end users who are creating application platforms with them.
Check out this post about our analysis of more than 100 open source Operators for their conformance to some of these guidelines.
1) Design your Operator with declarative APIs and avoid imperative actions as inputs
2) Consider using kubectl as the primary interaction mechanism
3) Decide on a Custom Resource metrics collection strategy
4) Register CRDs as YAML Spec rather than in Operator code
5) Make Operator namespace aware
6) Make Custom Resource Type definitions compliant with Kube OpenAPI
7) Define Custom Resource Spec Validation rules as part of Custom Resource Definition YAML
8) Set OwnerReferences for underlying resources owned by your Custom Resource
9) Use Helm chart or ConfigMap for Operator configurables
10) Use ConfigMap or Annotation or Spec definition for Custom Resource configurables
11) Generate Kube OpenAPI Spec for your Custom Resources
12) Add Platform-as-Code annotations on your CRD YAML
13) Package Operator as Helm Chart
14) Document how your Operator uses namespaces
15) Document Service Account needs of your Operator
16) Document naming convention and labels to be used with your Custom Resources
A declarative API allows you to declare or specify the desired state of your custom resource. Prefer declarative state over imperative actions in the Custom Resource Spec Type definition. Write the custom controller so that it reconciles the current state with the desired state by computing the diff between the two. This enables end users to use your custom resources just like any other Kubernetes resource, with declarative, state-based inputs. For example, in a Postgres Operator, the custom controller should diff the existing value of 'users' against the desired value of 'users' from the received desired state and perform the required actions (such as adding new users, deleting existing users, etc.).
Note that the diff-based implementation approach for custom controllers is essentially an extension of the level-triggered approach recommended in the general guidelines for developing Kubernetes controllers.
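The diff step described above can be sketched as follows. This is a minimal illustration, not the code of any particular Operator: the function name `diffUsers` and the sample user names are hypothetical, and a real controller would read the current users from the database and the desired users from the Custom Resource Spec.

```go
package main

import (
	"fmt"
	"sort"
)

// diffUsers (hypothetical helper) compares the users that currently exist
// with the users listed in the desired state and returns the users that
// must be added and the users that must be deleted.
func diffUsers(current, desired []string) (toAdd, toDelete []string) {
	currentSet := make(map[string]bool, len(current))
	for _, u := range current {
		currentSet[u] = true
	}
	desiredSet := make(map[string]bool, len(desired))
	for _, u := range desired {
		desiredSet[u] = true
	}
	for u := range desiredSet {
		if !currentSet[u] {
			toAdd = append(toAdd, u)
		}
	}
	for u := range currentSet {
		if !desiredSet[u] {
			toDelete = append(toDelete, u)
		}
	}
	// Sort so that the result does not depend on map iteration order.
	sort.Strings(toAdd)
	sort.Strings(toDelete)
	return toAdd, toDelete
}

func main() {
	current := []string{"alice", "bob"} // assumed: users found in the database
	desired := []string{"bob", "carol"} // assumed: 'users' from the Spec
	toAdd, toDelete := diffUsers(current, desired)
	fmt.Println("add:", toAdd, "delete:", toDelete)
}
```

Because the function looks only at the two states and not at the event that triggered it, running it again after a partial failure yields the remaining work, which is the property the level-triggered approach relies on.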
An example where underlying imperative actions are exposed in the Spec is this MySQL Backup Custom Resource Spec. Here the fact that the MySQL backup is done using the mysqldump tool is exposed in the Spec. In our view such internal details should not be exposed in the Spec, because doing so prevents the Custom Resource Type definition from evolving independently without affecting its users.
Custom resources introduced by your Operator will naturally work with kubectl. However, there might be operations that you want to support for which the declarative nature of custom resources is not appropriate. An example is retrieving the historical record of how a Postgres Custom Resource has evolved over time, which a Postgres Operator might support. Such an action does not fit naturally into the declarative format of a custom resource definition. For such actions, we encourage you to consider the Kubernetes extension mechanisms of Aggregated API servers and Custom Sub-resources. These mechanisms allow you to continue using kubectl as the primary interaction point for your Operator. Refer to this blog post to learn more about them. Before introducing a new CLI for your Operator, check whether you can use these mechanisms instead.
Plan for metrics collection of the custom resources managed by your Operator. This information is useful for understanding the effect of various actions on your custom resources over time and for improving traceability. For example, this MySQL Operator collects metrics such as how many clusters were created. One option is to build the metrics collection into your custom controller, as the MySQL Operator does. Another option is to leverage Kubernetes Audit Logs and then use external tooling like kubeprovenance to build the required metrics. Separately, you can consider exposing the collected metrics in Prometheus format as well.
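As a rough sketch of the first option, a custom controller could keep simple counters and render them in the Prometheus text exposition format. The type `crMetrics` and the metric name `mysql_clusters_created_total` are hypothetical, and the sketch uses only the standard library. A production Operator would more likely use a Prometheus client library.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// crMetrics is a hypothetical in-memory set of counters kept by a
// custom controller.
type crMetrics struct {
	mu     sync.Mutex
	counts map[string]int
}

func newCRMetrics() *crMetrics {
	return &crMetrics{counts: map[string]int{}}
}

// Inc increments the named counter, for example after a cluster is created.
func (m *crMetrics) Inc(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counts[name]++
}

// Render returns all counters in the Prometheus text exposition format,
// sorted by metric name so that the output is stable.
func (m *crMetrics) Render() string {
	m.mu.Lock()
	defer m.mu.Unlock()
	names := make([]string, 0, len(m.counts))
	for n := range m.counts {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "# TYPE %s counter\n%s %d\n", n, n, m.counts[n])
	}
	return b.String()
}

func main() {
	metrics := newCRMetrics()
	metrics.Inc("mysql_clusters_created_total") // hypothetical metric name
	metrics.Inc("mysql_clusters_created_total")
	fmt.Print(metrics.Render())
}
```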
Registering CRDs as YAML specs rather than in your Operator code has the following advantage. Installing a CRD requires cluster-scope permission. If the CRD is registered through a YAML manifest, CRD registration can be separated from the Operator Pod deployment: the Cluster Administrator registers the CRD, while a non-admin user deploys the Operator Pod. If, on the other hand, CRD registration is done in your Operator code, the Operator Pod must be given cluster-scope permissions.
Your Operator should support creating resources within different namespaces rather than just in the default namespace. This will allow your Operator to support use-cases where multitenancy through namespaces is sufficient (these are called 'soft' multitenancy use-cases).
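One way to read this guideline is that the controller should create underlying resources in the namespace of the Custom Resource instance instead of hard-coding one. The sketch below illustrates that idea under stated assumptions: `objectMeta` is a local stand-in for Kubernetes object metadata, and `childNamespace` is a hypothetical helper, not part of any Kubernetes library.

```go
package main

import "fmt"

// objectMeta is a local stand-in for the name and namespace carried in the
// metadata of a Kubernetes object.
type objectMeta struct {
	Name      string
	Namespace string
}

// childNamespace (hypothetical helper) returns the namespace in which the
// Operator should create the underlying resources of a Custom Resource
// instance: the instance's own namespace, or "default" when none is set.
func childNamespace(cr objectMeta) string {
	if cr.Namespace == "" {
		return "default"
	}
	return cr.Namespace
}

func main() {
	teamA := objectMeta{Name: "postgres1", Namespace: "team-a"}
	unset := objectMeta{Name: "postgres2"}
	fmt.Println(childNamespace(teamA))
	fmt.Println(childNamespace(unset))
}
```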
Kubernetes API details are documented using Swagger v1.2 and OpenAPI. Kube OpenAPI supports a subset of OpenAPI features to satisfy Kubernetes use-cases. As Operators extend the Kubernetes API, it is important to follow Kube OpenAPI features to provide a consistent user experience. The following actions are required to comply with Kube OpenAPI.
Add documentation on your custom resource Type definition and on the various fields in it. Field names need to follow this pattern: Kube OpenAPI name validation rules expect the field name in Go code and the field name in JSON to be exactly the same, differing only in the case of the first letter (Go code requires CamelCase, JSON requires camelCase).
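The naming rule can be checked mechanically. The sketch below is a simplified illustration of the rule as stated above, not the kube-openapi implementation: `misnamedFields` and both Spec types are hypothetical, and special tags such as `json:"-"` or inlined fields are not handled.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"unicode"
	"unicode/utf8"
)

// conformingSpec is a hypothetical type whose JSON names follow the rule.
type conformingSpec struct {
	Databases []string `json:"databases"`
	Users     []string `json:"users"`
}

// nonConformingSpec is a hypothetical type with one field that breaks the rule.
type nonConformingSpec struct {
	Users          []string `json:"users"`
	DeploymentName string   `json:"deployment_name"`
}

// misnamedFields returns the Go names of the struct fields whose JSON name is
// not the Go name with only its first letter lower-cased. The argument must
// be a struct value.
func misnamedFields(v interface{}) []string {
	t := reflect.TypeOf(v)
	var bad []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// Drop tag options such as ",omitempty".
		jsonName := strings.Split(f.Tag.Get("json"), ",")[0]
		first, size := utf8.DecodeRuneInString(f.Name)
		want := string(unicode.ToLower(first)) + f.Name[size:]
		if jsonName != want {
			bad = append(bad, f.Name)
		}
	}
	return bad
}

func main() {
	fmt.Println(misnamedFields(conformingSpec{}))
	fmt.Println(misnamedFields(nonConformingSpec{}))
}
```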
When defining the types corresponding to your custom resources, use the kube-openapi annotation "+k8s:openapi-gen=true" in the type definition to enable generating OpenAPI Spec documentation for your custom resources. An example of this annotation on the type definition of the CloudARK sample Postgres custom resource is as follows:
// +k8s:openapi-gen=true
type Postgres struct {
	// ... (fields elided)
}
Your Custom Resource Spec definitions will contain different properties, and these may have domain-specific validation requirements. From Kubernetes 1.13 onwards you can use an OpenAPI v3 schema to define validation requirements for your Custom Resource Spec. For instance, below is an example of adding validation rules for our sample Postgres CRD. The rules define that the Postgres Custom Resource Spec properties 'databases' and 'users' should be of type Array and that every element of each array should be of type String. Once such validation rules are defined, Kubernetes will reject the creation of any Custom Resource instance whose Spec does not satisfy these requirements.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: postgreses.postgrescontroller.kubeplus
  annotations:
    composition: Deployment, Service
spec:
  group: postgrescontroller.kubeplus
  version: v1
  names:
    kind: Postgres
    plural: postgreses
  scope: Namespaced
  validation:
    # openAPIV3Schema is the schema for validating custom objects.
    openAPIV3Schema:
      properties:
        spec:
          properties:
            databases:
              type: array
              items:
                type: string
            users:
              type: array
              items:
                type: string
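To make the meaning of these rules concrete, the following sketch applies the same two checks to a JSON-encoded Custom Resource instance. It only illustrates what the schema accepts and rejects. It is not how the Kubernetes API server performs validation, and the function name `validatePostgresSpec` is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validatePostgresSpec (hypothetical) checks that spec.databases and
// spec.users, when present, are arrays whose elements are all strings.
func validatePostgresSpec(raw []byte) error {
	var obj struct {
		Spec map[string]interface{} `json:"spec"`
	}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return err
	}
	for _, prop := range []string{"databases", "users"} {
		val, present := obj.Spec[prop]
		if !present {
			continue // the schema does not mark these properties as required
		}
		items, ok := val.([]interface{})
		if !ok {
			return fmt.Errorf("spec.%s: expected an array", prop)
		}
		for i, item := range items {
			if _, ok := item.(string); !ok {
				return fmt.Errorf("spec.%s[%d]: expected a string", prop, i)
			}
		}
	}
	return nil
}

func main() {
	valid := []byte(`{"spec": {"databases": ["moodle"], "users": ["devdattakulkarni"]}}`)
	invalid := []byte(`{"spec": {"databases": ["moodle"], "users": "devdattakulkarni"}}`)
	fmt.Println(validatePostgresSpec(valid))
	fmt.Println(validatePostgresSpec(invalid))
}
```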
An Operator will typically create one or more native Kubernetes resources, such as Deployment, Service, Secret, etc., as part of instantiating a Custom Resource instance. The custom resource is the owner of the underlying native resources that the Operator created for it, so set OwnerReferences on those resources that point back to the custom resource instance. OwnerReferences are key for correct garbage collection of custom resources. They also help with finding the runtime composition tree of your custom resource instances.
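The following sketch shows the shape of the data involved when a controller records ownership. To stay self-contained it uses local stand-in types: `ownerReference` mirrors a subset of the fields of `metav1.OwnerReference` from k8s.io/apimachinery, while `object` and `setOwner` are hypothetical. The namespace check reflects the Kubernetes rule that a namespaced owner must be in the same namespace as its dependents.

```go
package main

import "fmt"

// ownerReference is a local stand-in for a subset of metav1.OwnerReference.
type ownerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
	Controller *bool
}

// object is a hypothetical, minimal representation of a Kubernetes object.
type object struct {
	APIVersion      string
	Kind            string
	Name            string
	Namespace       string
	UID             string
	OwnerReferences []ownerReference
}

// setOwner (hypothetical helper) marks owner as the controlling owner of
// child. It refuses cross-namespace ownership.
func setOwner(child *object, owner object) error {
	if child.Namespace != owner.Namespace {
		return fmt.Errorf("owner %s/%s and dependent %s/%s are in different namespaces",
			owner.Namespace, owner.Name, child.Namespace, child.Name)
	}
	isController := true
	child.OwnerReferences = append(child.OwnerReferences, ownerReference{
		APIVersion: owner.APIVersion,
		Kind:       owner.Kind,
		Name:       owner.Name,
		UID:        owner.UID,
		Controller: &isController,
	})
	return nil
}

func main() {
	// Assumed sample objects; the UID value is a placeholder.
	pg := object{APIVersion: "postgrescontroller.kubeplus/v1", Kind: "Postgres",
		Name: "postgres1", Namespace: "dev", UID: "placeholder-uid"}
	dep := object{APIVersion: "apps/v1", Kind: "Deployment",
		Name: "postgres1", Namespace: "dev"}
	if err := setOwner(&dep, pg); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dep.OwnerReferences[0].Kind, dep.OwnerReferences[0].Name)
}
```

With such a reference in place, deleting the Postgres instance lets the Kubernetes garbage collector remove the Deployment as well.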
Typically Operators will need to support some form of customization. For example, this MySQL Operator supports the following customization settings: whether to deploy the Operator cluster-wide or within a particular namespace, which version of MySQL should be installed, etc. If you have created a Helm chart for your Operator, use its values YAML file to specify such parameters. If not, use a ConfigMap for this purpose. This guideline ensures that Kubernetes Administrators can interact with and use the Operator through Kubernetes-native interfaces.
An Operator generally needs to take configuration parameters as inputs for the underlying resource, such as a database, that it manages through its custom resource. We have seen three different approaches used for this in the community: ConfigMaps, Annotations, or the Spec definition itself. Any of these approaches is fine, depending on your Operator design.
Nginx Custom Controller supports both ConfigMap and Annotation. Oracle MySQL Operator uses ConfigMap. PressLabs MySQL Operator uses Custom Resource Spec definition.
We have developed a tool that you can use to generate the Kube OpenAPI Spec for your custom resources. It wraps the code available in the kube-openapi repository in an easy-to-use script. The generated Kube OpenAPI Spec documentation for the sample Postgres custom resource is here. The OpenAPI Spec for your Operator provides a single place where documentation is available for the entire Type definition hierarchy of the custom resources defined by your Operator.
Platform-as-Code annotations are a standard way to package information about Custom Resources. The 'usage' annotation should be used to define the usage guide for a Custom Resource. The 'constants' annotation should be used to define the Operator's implementation choices and assumptions. The 'openapispec' annotation should be used to define the OpenAPI Spec schema for a Custom Resource. The 'composition' annotation should be used to specify the underlying Kubernetes resources that will be created as part of managing a Custom Resource instance. The values of the first three annotations are names of ConfigMaps with appropriate data. An example can be seen in our sample Moodle Custom Resource Definition below:
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: moodles.moodlecontroller.kubeplus
  annotations:
    platform-as-code/usage: moodle-operator-usage.usage
    platform-as-code/constants: moodle-operator-implementation-details.implementation_choices
    platform-as-code/openapispec: moodle-openapispec.openapispec
    platform-as-code/composition: Deployment, Service, PersistentVolume, PersistentVolumeClaim, Secret, Ingress
This information is useful for application developers when figuring out how to use your Operator and its Custom Resources. Externalizing information such as that in the 'composition' annotation makes it possible to build tools like kubediscovery, which use it to show the Object composition tree for custom resource instances.
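As an illustration of how a tool might consume the 'composition' annotation, the sketch below splits its value into a list of resource kinds. It is an assumption about how such a tool could work, not the implementation of kubediscovery, and the function name `compositionKinds` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// compositionKinds (hypothetical helper) reads the
// 'platform-as-code/composition' annotation from a CRD's annotations and
// returns the listed resource kinds with surrounding whitespace removed.
func compositionKinds(annotations map[string]string) []string {
	raw, ok := annotations["platform-as-code/composition"]
	if !ok {
		return nil
	}
	var kinds []string
	for _, part := range strings.Split(raw, ",") {
		if kind := strings.TrimSpace(part); kind != "" {
			kinds = append(kinds, kind)
		}
	}
	return kinds
}

func main() {
	// Annotation value taken from the Moodle CRD example above.
	annotations := map[string]string{
		"platform-as-code/composition": "Deployment, Service, PersistentVolume, PersistentVolumeClaim, Secret, Ingress",
	}
	for _, kind := range compositionKinds(annotations) {
		fmt.Println(kind)
	}
}
```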
Create a Helm chart for your Operator. The chart should include two things:
- All Custom Resource Definitions for Custom Resources managed by the Operator. Examples of this can be seen in CloudARK sample Postgres Operator and in this MySQL Operator.
- ConfigMaps corresponding to Platform-as-Code annotations that you have added on your Custom Resource Definition (CRD).
For Operator developers it is critical to consider how their Operator works with namespaces. Typically, an Operator can be installed in one of the following configurations:
- Operator runs in the default namespace and Custom Resource instances are created in the default namespace.
- Operator runs in the default namespace but Custom Resource instances can be created in non-default namespaces.
- Operator runs in a non-default namespace and Custom Resource instances can be created in that namespace.
Given these options, it helps consumers of your Operator if there is clear documentation of how your Operator uses namespaces. Include this information in the ConfigMap that you add for the 'usage' annotation on the CRD.
Your Operator may use the default service account or a specific service account. Moreover, the service account may need to be granted specific permissions. Clearly document the service account needs of your Operator. Include this information in the ConfigMap that you add for the 'usage' annotation on the CRD.
You may have special requirements for naming your custom resource instances or some of their Spec properties. Similarly, you may have requirements for the labels that need to be added to them. Document this information in the ConfigMap corresponding to the 'usage' annotation on the CRD.