How not to: K8S Operators
When running systems on Kubernetes, we often have to deploy not only application containers but also more complex systems like databases, messaging systems, etc. These systems come with a number of idiosyncrasies that are unbeknownst to you at first. Operators have stepped in to address some of these operational challenges. But beware: there are really bad operators out there, and here is how to tell them apart.
Running distributed systems like Kafka or MongoDB is hard. Kubernetes, while being one of the most awesome developments in the cloud-native orchestration realm, does not reduce this complexity. Operators help us run these complex systems by capturing years of operational knowledge from running them in production in the code of an operator, a pattern initially outlined by CoreOS.
The concept of most operators is fairly simple: first we deploy the operator itself. Let's use Kafka and the operator from Banzai Cloud as an example here. Once it is installed, we push a custom resource to create a cluster:
```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: kafka
spec:
  zkAddresses:
    - "example-zookeepercluster-client.zookeeper:2181"
  clusterImage: "wurstmeister/kafka:2.12-2.3.0"
  brokerConfigGroups:
    default:
      storageConfigs:
        - mountPath: "/kafka-logs"
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
  brokers:
    - id: 0
      brokerConfigGroup: "default"
    - id: 1
      brokerConfigGroup: "default"
    - id: 2
      brokerConfigGroup: "default"
```
Note, this custom resource does not mention deployments, persistent volumes, replica sets, pods, config maps, or secrets. Our operator will take this definition, assume sensible defaults for the values we have omitted, and generate the necessary deployments, configuration, secrets, volumes etc. required for the Kafka cluster to become available. Good operators will even allow us to change the definition of our cluster (and infer the steps necessary to comply with our demand) and manage auxiliary information for the system they have just created, like the users, topics etc. that we need in order to access the cluster. This means that deploying a new service that requires a new Kafka topic is as simple as pushing a `KafkaCluster` and a new `KafkaTopic` custom resource. The Kafka operator will create the new topic for us, and our service will receive assignments as soon as it's ready. No manual intervention necessary.
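As a sketch, such a `KafkaTopic` custom resource for the Banzai Cloud operator might look like this (the exact field names are an assumption based on the operator's v1alpha1 API; verify them against the operator's CRD reference before use):

```yaml
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: orders
spec:
  clusterRef:
    name: kafka                    # the KafkaCluster this topic belongs to
  name: orders
  partitions: 3
  replicationFactor: 2
  config:
    "retention.ms": "604800000"    # keep messages for seven days
```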
What good operators do:
Having worked with quite a few K8S operators from multiple software vendors, among them a few OSS projects, I've observed that there are many really good operators out there along with quite a few bad ones. But what distinguishes the good operators from the bad?
1. Good operators come with their own CRDs
An operator should have ONE and only ONE interface, and that should be one or more custom resources. In the case of MongoDB that would be a `MongoDbCluster`, and in the case of Kafka that could be resources like `KafkaTopic`. Creating a new topic in Kafka should require a developer to push a new `KafkaTopic` custom resource, and the operator should create this topic as soon as the cluster is ready for it. This reduces manual intervention by human operators, thus preventing many classes of mistakes.
2. Good operators do not assume
An operator should at all times know what the desired state of the system under its authority SHOULD BE and what the state actually IS. It cannot assume, just because it created the Kafka topic, that the topic is still there at a later point in time, just as it cannot assume the cluster to be operational because it once was! This is why every K8S resource has a desired and an actual state. While the desired state of a MongoDB cluster might specify three replicas and one master, the actual state may differ: one replica might just be moving from one node to another because it was evicted or its node is going offline for an update. The operator should react to these changes in the actual state and take steps to get back into alignment with the desired state.
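The spec/status split in Kubernetes makes this concrete: the user writes the desired state into `spec`, and the operator records what it has actually observed into `status`. A purely illustrative sketch (not a real CRD schema):

```yaml
# Desired state: what the user asked for
spec:
  replicas: 3
# Actual state: what the operator has most recently observed
status:
  readyReplicas: 2        # one replica is currently being rescheduled
  phase: Reconciling      # the operator is working to close the gap
```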
3. Good operators defer and retry
If a system consists of multiple CRDs, for example a `KafkaCluster` and a `KafkaUser`, the operator cannot assume that these resources are created in the correct sequence, or even that they are present when it expects them to be. When bootstrapping a cluster, the operations team should be able to create the `KafkaCluster` and later apply a new batch of resource files containing a `KafkaUser` for a particular service they are deploying. The case could also be inverted, which makes it more complex: we apply a number of resource files and the `KafkaUser` happens to be created long before the `KafkaCluster` is applied. The worst thing the operator can do at this point is to respond that the cluster does not exist and simply forget about the user. When our cluster becomes available later, we will not have the user our service needs to access Kafka, forcing the service into a crash loop. Manual intervention is required - which is bad.
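For illustration, here is a `KafkaUser` that may well reach the API server before its cluster does (the schema is loosely based on the Banzai Cloud operator and should be treated as an assumption):

```yaml
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaUser
metadata:
  name: order-service
spec:
  clusterRef:
    name: kafka   # may not exist yet - the operator should requeue and retry,
                  # not reject the resource and forget about it
  secretName: order-service-kafka-credentials
```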
4. Good Operators provide us with a ready-to-use system
If an operator provides us with a running Kafka cluster but we then have to create topics manually, this does not help at all; we could just as well have written our own K8S deployments and applied them manually. Our service wants to produce messages to a topic; it does not care whether the cluster is running or not, whether the topic is missing, or whether its user lacks permission to access it. Our service can deal with the fact that the cluster is ready but the topic is not - that's what readiness checks are for. It remains the operator's task to make sure that the topic is created as soon as possible.
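A service can encode "topic not there yet" in its readiness check instead of crash-looping at startup. A minimal pod-template fragment as a sketch (the image name, endpoint, and port are made-up placeholders):

```yaml
containers:
  - name: order-service
    image: example/order-service:1.0    # placeholder image
    readinessProbe:
      httpGet:
        path: /healthz/kafka            # hypothetical endpoint that checks topic access
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10                 # keep the pod out of rotation until Kafka is usable
```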
5. Good operators do not have external state
The nightmare of any operations engineer is deploying something that then goes on to pull in external state over which they have no influence. The official MongoDB operator does exactly that: it stores its state in a proprietary, cloud-hosted application. This means the operator breaks the one-and-only-one-interface rule from the first principle we discussed. What is the worst thing that can happen? Let's assume for a minute that the operations engineer deploys the operator in the testing environment and happens to reference the API key for the production clusters. Once the operator starts running, it will begin deploying a production cluster on your test system and potentially join your test instances to your production instances - good luck fixing that… This could not have happened if they had been required to create a second `MongoDbCluster` custom resource in the staging namespace, which would have given them a new cluster in a new namespace. Also, a teammate could have reviewed the resource files in advance and found the mistake long before the issue hit production.
Having external state also throws a wrench into the best practice of treating infrastructure as code. It is almost impossible to manage state in a proprietary external system and keep it consistent with the definitions in a git source tree.
6. Good operators do only what they are meant to do
We came across a problem when we wanted to use the Kafka operator. This operator created a cluster with TLS client authentication enabled, which is quite fine - we applaud the secure-by-design philosophy. The problem was that the operator created new self-signed CA keys and certs. It also created signed keys and certs for every client we created through its custom resources. These CA keys were not trusted in our service containers, which forced us to disable TLS encryption to interact with Kafka. In defense of the folks at Banzai Cloud, they made use of cert-manager and self-signed issuers, which is good practice (and the exact way we do it as well), but their operator lacked the ability to let us provide keys and certs signed by our own trusted root CA.

The Kafka operator did things well by using an already established and widely used tool to issue certs. It could have gone one step further and offered the ability to provide certs and keys through existing K8S secrets.
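With cert-manager this is straightforward to express: instead of a self-signed issuer, a CA issuer can point at an existing secret holding a key and cert signed by our trusted root. A sketch (the secret name is a placeholder; the secret must contain `tls.crt` and `tls.key`):

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: kafka-ca-issuer
  namespace: kafka
spec:
  ca:
    secretName: our-company-ca-key-pair   # pre-created secret with our own CA key/cert
```

Certificates issued by this issuer chain up to a CA our service containers already trust, so TLS client authentication works without disabling anything.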
7. Good operators are not lordly
While operators should abstract away operational complexity and processes, they should not hinder more experienced users from customizing the system under their management. The ability to configure the Kafka daemon with settings the operator did not set, to define custom resource limits, or to specify a custom storage class for the persistent volumes is a must for production use cases.
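In the `KafkaCluster` example above, that could mean letting the broker config group carry custom resource limits and a storage class. The exact field placement below is a sketch based on the Banzai Cloud operator's schema and should be checked against its CRD reference:

```yaml
brokerConfigGroups:
  default:
    resourceRequirements:              # user-defined limits instead of operator defaults
      limits:
        cpu: "2"
        memory: 4Gi
    storageConfigs:
      - mountPath: "/kafka-logs"
        pvcSpec:
          storageClassName: fast-ssd   # user-chosen storage class
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
```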
Let’s summarize some principles good operators should adhere to:
- Use only CRDs to define the desired state
- Provide a ready-to-use system (no additional interaction for bootstrapping necessary)
- Do not depend on external services or state
- Don’t create resources other components can provide you with (secrets, load balancing, monitoring etc.)
- Don’t limit the user unnecessarily (If the user knows what he’s doing, do not patronize him)