Powering Cluster Management with Kubernetes-Native CI/CD

Written by Luke Addison

Kubernetes-native technologies are those which are specifically designed to run on top of Kubernetes. This often means that their architecture follows the controller pattern, leveraging the Kubernetes API machinery for declarative configuration.

Resilient CI/CD is a critical concern for many organisations, especially at scale. Being able to leverage Kubernetes for CI/CD provides the necessary support for building a resilient, scalable solution and brings all the benefits of declarative configuration.

At Jetstack we regularly receive questions from customers related to managing infrastructure at scale and the CI/CD automation required to power such capabilities. As Kubernetes has matured, so has its ability to handle more of this logic natively, taking advantage of the machinery that drives typical application deployments. Here we describe an opinionated repository layout and Kubernetes-native CI/CD setup for managing a fleet of Kubernetes clusters. We first look at the set of tools used and then describe how they fit together, with a demo repository to show some of the lower-level details.

Tools Used

Here we describe key tooling which we will use to achieve Kubernetes-native CI/CD and declarative cluster management.

Tekton Pipelines

Tekton Pipelines (originally Knative Build) extends the Kubernetes API to support declarative pipelines and comes with 5 new resources: Tasks, Pipelines, TaskRuns, PipelineRuns and PipelineResources.

A Task describes a sequence of steps to be run in order and is implemented as a Kubernetes Pod, where each container in the Pod is a step in the Task. Pipelines define how Tasks are put together and can be run in any order you choose, including concurrently, passing inputs and outputs between them. TaskRuns and PipelineRuns can then reference Tasks and Pipelines respectively to invoke them; this allows Task and Pipeline definitions to be reused across runs. Alternatively, TaskRuns and PipelineRuns can specify Tasks and Pipelines inline. Finally, PipelineResources provide runtime information to those runs, such as the Git repository a run should be executed against, although this information can also be provided using parameters.
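
As a minimal sketch (not taken from the demo repository), a Task with a single step and a TaskRun that invokes it might look as follows; the resource names, image and parameter are purely illustrative:

# Minimal illustrative Task; each step runs as a container in the Task's Pod
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: echo-message
spec:
  params:
    - name: message
      type: string
  steps:
    - name: echo
      image: alpine
      script: |
        echo "$(params.message)"
---
# TaskRun referencing the Task above and supplying the parameter at runtime
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: echo-message-run
spec:
  taskRef:
    name: echo-message
  params:
    - name: message
      value: Hello from Tekton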

The Tekton Pipeline repository has extensive documentation if you would like to know more. In particular, I find Tekton’s capabilities for providing runtime credentials to be especially powerful.

We will use Tekton Pipelines to automate the generation of manifests for cluster addons in response to GitHub events.

We use the term cluster addon loosely to refer to any application deployed to a cluster to support developer workloads, for example the Nginx Ingress Controller or cert-manager.

Lighthouse

Jenkins X is a full Kubernetes-native CI/CD solution that supports an opinionated end-to-end life cycle for building and deploying Kubernetes applications. Jenkins X is a very powerful tool; however, I have found it to be too heavyweight for some use cases. Fortunately, the Jenkins X folks have designed its components in a modular way so that they can be used in isolation.

Lighthouse is a component of Jenkins X and a webhook handler for Git provider events. It is a fork of Prow, the system used to implement CI/CD for the development of Kubernetes itself. Events received by Lighthouse from Git providers (for example PR creation on GitHub) can trigger Tekton Pipelines to automate actions (such as CI tests). Note that Lighthouse also supports triggering Jenkins pipelines.

The main advantage of Lighthouse over Prow is that it uses jenkins-x/go-scm and so supports a large number of Git providers (rather than just GitHub).

We will use Lighthouse to receive GitHub events and invoke the Tekton Pipelines that perform cluster addon manifest generation.

Flux

Flux v2 is a GitOps tool which hydrates and syncs manifests from a Git repository to a Kubernetes cluster. Flux v2 brought a whole host of new capabilities compared to v1, including the ability to declaratively apply Helm charts and Kustomize configuration. Config Sync, a component of Google’s Anthos platform, has similar capabilities to Flux v1.

We will use Flux v2 (from now on just Flux) to sync cluster addons and workload cluster declarations.
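
To give a flavour of Flux's declarative Helm support, here is a hedged sketch (not taken from the demo repository) of a HelmRepository and HelmRelease pair installing the Nginx Ingress Controller; the chart version, Namespaces and values are illustrative:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 1h
  url: https://kubernetes.github.io/ingress-nginx
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  interval: 10m
  chart:
    spec:
      chart: ingress-nginx
      version: "3.x"
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
        namespace: flux-system
  values:
    controller:
      replicaCount: 1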

Cluster API

Cluster API brings declarative, Kubernetes-style APIs to cluster creation, configuration and management. We have covered the power of Cluster API in a previous post and in this post we will be using it to provision our Kubernetes workload clusters. There is already a large number of infrastructure providers to choose from depending on your environment. There are also some alternatives to Cluster API such as Google’s Config Connector, which can provision GKE clusters declared as Kubernetes resources, or Crossplane, which has similar capabilities to Config Connector but supports multiple clouds.

We will use Cluster API to declare workload clusters using an experimental Cluster API infrastructure provider that I wrote to create cluster Nodes as Pods, saving on infrastructure costs.
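
For illustration, a workload cluster declaration might look something like the sketch below; the API versions, Pod network CIDR and referenced resources are assumptions rather than values copied from the demo repository:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: development
  namespace: infrastructure
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: development
  infrastructureRef:
    # Provided by the experimental Kubernetes infrastructure provider
    apiVersion: infrastructure.dippynark.co.uk/v1alpha3
    kind: KubernetesCluster
    name: development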

kfmt

kfmt is a small Go binary I wrote to format collections of manifests into a canonical structure, with one manifest per file. The structure corresponds to the logical resource structure presented by the Kubernetes API, separating namespaced resources from non-namespaced resources and giving all resources in the same Namespace their own directory. kfmt makes it easier to manage and review changes to large collections of manifests, which is particularly useful when upgrading manifests for third-party tools.
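
For illustration, running kfmt over a collection of cert-manager manifests might produce a layout along the following lines; the directory and file names here are indicative of the structure rather than exact kfmt output:

cluster/                            # non-namespaced resources, one file per resource
  namespace-cert-manager.yaml
  clusterrole-cert-manager-view.yaml
namespaces/
  cert-manager/                     # all resources in the cert-manager Namespace
    deployment-cert-manager.yaml
    service-cert-manager.yaml
    serviceaccount-cert-manager.yaml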

We will use kfmt to format cluster addon manifests.

Bringing the Tools Together

Here we will describe how the tools listed above can be used together to manage configuration across a fleet of clusters. We will of course be using other tools, but they play a more minor role compared to the ones listed above. A demo repository has been created here which we will refer to for lower-level details.

The aim of this repository is to facilitate managing a large number of clusters in the simplest and DRYest way possible. In particular, we only use a single repository to manage all clusters. In addition, we want a way to pin clusters to particular Git references and to promote those references to downstream clusters, but have the flexibility to define cluster specific parameters.

Repository Walkthrough

As is typical when using Cluster API, we provision a management cluster (using gcloud in this case) to host our Cluster API controllers and components. Cluster API resources are then synced to this management cluster using Flux to provision workload clusters. Cluster addons are defined as Kustomize bases which are pulled into various cluster flavours to standardise the capabilities of the clusters an organisation may want to support. In this case we only have two cluster flavours, management and workload; however, it is straightforward to define more for specific use cases.

Each cluster syncs configuration from its own directory using Flux. Each directory is created by running flux bootstrap against the corresponding cluster. The main difference between each cluster's directory is the cluster-sync.yaml configuration which targets a particular repository reference (through the spec.ref.branch field of Flux's GitRepository resource) and Kustomize flavour (through the spec.path field of Flux's Kustomization resource) to install addons, providing cluster specific parameters. Note that the name cluster-sync.yaml is not special outside the context of this post; I just picked it to hold the configuration for syncing cluster addons.
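
As a hedged sketch, a cluster's cluster-sync.yaml could contain something like the following pair of resources; the sync intervals and Kustomize path are assumptions rather than values from the demo repository:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: cluster-addons
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/dippynark/cicd-demo
  ref:
    branch: main                 # downstream clusters pin this to a commit hash
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: cluster-addons
  namespace: flux-system
spec:
  interval: 10m
  path: ./kustomize/workload     # selects the cluster flavour (path assumed)
  prune: true
  sourceRef:
    kind: GitRepository
    name: cluster-addons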

To help with managing the Kustomize bases, the generate target in the Makefile runs the procedure to generate much of the addon configuration. When running generation locally, this target can be wrapped in a Docker container, which contains all the required dependencies, by running make docker_generate. Much of the resulting configuration is formatted using kfmt to give consistency and to make it easier to review any changes upon regeneration.

To automate this generation procedure we use Lighthouse and Tekton Pipelines; whenever a PR is created or changed (or a commit added to the main branch) a GitHub webhook event is sent to Lighthouse which spins up a Tekton Pipeline. This Pipeline checks out the PR branch (or main branch), runs make generate (within the same container image as is used locally) and pushes any changes as a new commit. These changes will then be picked up by Flux when they reach the main branch.
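
The following is a hedged sketch of the shape of such a Pipeline rather than the one used in the demo repository; it assumes the git-clone Task from the Tekton Catalog is installed, uses a placeholder generation image and omits the credential handling needed to push back to GitHub:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: generate
spec:
  params:
    - name: branch
      default: main
  workspaces:
    - name: source
  tasks:
    - name: clone
      taskRef:
        name: git-clone              # Tekton Catalog Task (assumed installed)
      params:
        - name: url
          value: https://github.com/dippynark/cicd-demo
        - name: revision
          value: $(params.branch)
      workspaces:
        - name: output
          workspace: source
    - name: generate-and-push
      runAfter:
        - clone
      params:
        - name: branch
          value: $(params.branch)
      workspaces:
        - name: source
          workspace: source
      taskSpec:
        params:
          - name: branch
        workspaces:
          - name: source
        steps:
          - name: generate
            image: golang:1.16       # placeholder for the repository's generation image
            workingDir: $(workspaces.source.path)
            script: |
              make generate
          - name: push
            image: alpine/git        # push credentials are not shown here
            workingDir: $(workspaces.source.path)
            script: |
              git config user.name tekton          # placeholder committer identity
              git config user.email tekton@example.com
              git add --all
              git diff --cached --quiet && exit 0
              git commit --message Generated
              git push origin HEAD:$(params.branch)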

An example PR demonstrating this process by upgrading the Nginx Ingress Controller can be found here. We can see that the only change from a human is the initial commit bumping the version in the Makefile. This triggers a Tekton Pipeline which appends a commit with the message Generated to the PR, containing the manifests regenerated for the new version. We can see the output of this process by visiting the Tekton Dashboard:

Tekton Dashboard PR

A really powerful aspect of Lighthouse is the ability to include the Pipeline configuration within the repository. .lighthouse/triggers.yaml defines the list of presubmit (PR) and postsubmit (merge) jobs to run with the PipelineRun definitions themselves referenced through the source field. Documentation for configuring Pipelines within repositories to be invoked by Lighthouse can be found here.
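
As a hedged sketch of the shape of this file (the job names and the referenced PipelineRun file are illustrative, and the exact fields supported are described in the Lighthouse documentation):

apiVersion: config.lighthouse.jenkins-x.io/v1alpha1
kind: TriggerConfig
spec:
  presubmits:
    - name: generate
      context: generate
      always_run: true
      source: generate.yaml        # PipelineRun definition in the .lighthouse directory
  postsubmits:
    - name: generate
      context: generate
      branches:
        - main
      source: generate.yaml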

To learn more about the capabilities of Lighthouse, see the Lighthouse documentation together with the management cluster’s cluster-sync.yaml configuration which specifies patches for the Lighthouse HelmRelease. In addition, the plugins directory of the Lighthouse repository contains a directory for each in-built plugin, which includes descriptions of the capabilities of each plugin.

Cluster Management

Here we look in more detail at the different aspects of the repository that contribute to cluster management.

The clusters target in the Makefile generates Cluster API configuration to be synced to the management cluster. This is applied to the infrastructure Namespace.

We can use the cluster-api category to see all the resulting resources after they are synced to the management cluster using Flux:

$ kubectl get cluster-api -n infrastructure
NAME                                                         AGE
kubeadmconfig.bootstrap.cluster.x-k8s.io/development-84qd5   3m2s
kubeadmconfig.bootstrap.cluster.x-k8s.io/development-xfxwh   3m42s

NAME                                                           AGE
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/development   3m44s

NAME                                             PHASE     REPLICAS   READY   UPDATED   UNAVAILABLE
machinedeployment.cluster.x-k8s.io/development   Running   1          1       1

NAME                                                    PROVIDERID                                                 PHASE     VERSION
machine.cluster.x-k8s.io/development-6c5fb44c5c-z64cx   kubernetes://infrastructure/development-worker-xfcft       Running   v1.17.17
machine.cluster.x-k8s.io/development-zcpw8              kubernetes://infrastructure/development-controller-vcptf   Running   v1.17.17

NAME                                   PHASE
cluster.cluster.x-k8s.io/development   Provisioned

NAME                                              MAXUNHEALTHY   EXPECTEDMACHINES   CURRENTHEALTHY
machinehealthcheck.cluster.x-k8s.io/development   100%           2                  2

NAME                                                 REPLICAS   AVAILABLE   READY
machineset.cluster.x-k8s.io/development-6c5fb44c5c   1          1           1

NAME                                                                              AGE
kubernetesmachinetemplate.infrastructure.dippynark.co.uk/development-controller   3m43s
kubernetesmachinetemplate.infrastructure.dippynark.co.uk/development-worker       3m43s

NAME                                                           PHASE         HOST             PORT   AGE
kubernetescluster.infrastructure.dippynark.co.uk/development   Provisioned   34.77.174.110    443    3m43s

NAME                                                                            PROVIDERID                                                 PHASE     VERSION    AGE
kubernetesmachine.infrastructure.dippynark.co.uk/development-controller-vcptf   kubernetes://infrastructure/development-controller-vcptf   Running   v1.17.17   3m2s
kubernetesmachine.infrastructure.dippynark.co.uk/development-worker-xfcft       kubernetes://infrastructure/development-worker-xfcft       Running   v1.17.17   3m42s

NAME                                                      AGE
clusterresourceset.addons.cluster.x-k8s.io/calico-addon   6m14s

NAME                                                            AGE
clusterresourcesetbinding.addons.cluster.x-k8s.io/development   2m1s

NAME                                                            INITIALIZED   API SERVER AVAILABLE   VERSION    REPLICAS   READY   UPDATED   UNAVAILABLE
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/development   true          true                   v1.17.17   1          1       1

Once a cluster definition has been applied to the management cluster, flux bootstrap is run through a simple CronJob. In particular, this bootstrap procedure creates the directory within the repository corresponding to the cluster. A ClusterResourceSet handles installing Calico CNI into workload clusters to allow the Flux components to schedule.
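
As a hedged sketch, the calico-addon ClusterResourceSet seen in the output above might look something like this; the apiVersion, Namespace, label selector and ConfigMap name are assumptions:

apiVersion: addons.cluster.x-k8s.io/v1alpha3
kind: ClusterResourceSet
metadata:
  name: calico-addon
  namespace: infrastructure
spec:
  clusterSelector:
    matchLabels:
      cni: calico                  # workload Clusters opt in via this label
  resources:
    - name: calico-addon           # ConfigMap containing the Calico manifests
      kind: ConfigMap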

Note that the original CronJob controller has a bug where, if a CronJob misses too many runs, it will never recover; this has been fixed in the v2 implementation of the controller.

Once a cluster has been bootstrapped it is associated with a flavour so that the corresponding addons can be synced. For development clusters, which sync addons directly from the main branch, a Kustomization resource is used to pick out the flavour (for example the development cluster’s cluster-sync.yaml). For clusters downstream of development, we can promote to them, for example:

make docker_promote SOURCE=development DESTINATION=staging

If the destination cluster does not have its own cluster-sync.yaml, this will be copied from the source. The Git hash of the main branch (or whichever branch is targeted by the source cluster) will then be taken and set in the destination cluster’s cluster-sync.yaml, effectively pinning the destination cluster’s configuration to a particular point in time for a particular branch. This hash can then be propagated further downstream if desired (for example to a production cluster), allowing particular states to be tested and promoted as necessary. Further automation can be used to manage this promotion across a large number of clusters.
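
Concretely, the promotion boils down to a small change to the GitRepository reference in the destination cluster's cluster-sync.yaml, along these lines (the commit hash is illustrative):

# Before promotion: a freshly copied cluster-sync.yaml tracks the source cluster's branch
ref:
  branch: main
# After promotion: the destination cluster is pinned to the source cluster's current commit
ref:
  branch: 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b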

To illustrate this procedure more concretely, we will run through the steps and corresponding PRs for promoting the development cluster to staging, adding a production cluster and then promoting the staging cluster to production:

  • make docker_promote SOURCE=development DESTINATION=staging
    • At the time this command was run the staging cluster had been freshly bootstrapped, so the development cluster’s cluster-sync.yaml was copied in its entirety, but with the target branch set to the hash of the main branch at the time
    • PR: https://github.com/dippynark/cicd-demo/pull/20
    • Once the PR is approved and merged, the Flux instance running on the staging cluster syncs the manifests corresponding to that hash to match the development cluster
    • Note that all Flux instances on all clusters are syncing from the main branch (although using different paths); however, it is the configuration itself (i.e. the GitRepository resources) that provides the redirection to point at a particular Git reference
  • make docker_clusters
    • After adding the production cluster to our list of clusters in the Makefile (within the clusters target), we generate and commit the corresponding manifests
    • PR: https://github.com/dippynark/cicd-demo/pull/22
    • Once the PR is approved and merged, the Cluster API resources are synced to the management cluster and the production cluster is provisioned
$ kubectl get kubernetesclusters -n infrastructure
NAME          PHASE         HOST             PORT   AGE
development   Provisioned   34.77.174.110    443    2h
production    Provisioned   35.189.198.11    443    61s
staging       Provisioned   35.205.108.153   443    2h
  • make docker_promote SOURCE=staging DESTINATION=production
    • Once flux bootstrap has been run against the production cluster, we can pull the main branch (which now contains the production cluster’s corresponding directory) and promote the staging cluster to production
    • PR: https://github.com/dippynark/cicd-demo/pull/23
    • Note that this time we modify the Kustomization patches to increase the number of replicas of an Nginx Deployment in production to 2. These patches are retained throughout subsequent promotions, which allows cluster specific parameters to be set.

Note that subsequent runs of make docker_promote SOURCE=staging DESTINATION=production will do nothing until we promote to staging again, since the hash pinned to the production cluster is now the same as the one used by staging (i.e. the promotion procedure is idempotent). We can also access the production cluster to see that we have 2 replicas of Nginx deployed instead of 1 as in the development and staging clusters:

CLUSTER_NAME="production"
make get_kubeconfig_$CLUSTER_NAME
export KUBECONFIG=kubeconfig
kubectl get pods -n nginx

This gives the following output:

NAME                     READY   STATUS    RESTARTS   AGE
nginx-574b87c764-b6qgp   1/1     Running   0          115s
nginx-574b87c764-fwnc6   1/1     Running   0          115s

For large numbers of clusters, promotion could be managed in groups using repeated runs of make docker_promote, but no matter how this is done, the logic can all be managed through file manipulation within the repository rather than by interacting with the clusters directly.

Architecture

Benefits

One of the main benefits of this setup is that the entire configuration for every cluster is maintained in a single repository. A common requirement we see from customers is ensuring clusters conform to organisational policy (described as Gatekeeper constraints for example) and are managed consistently to reduce cluster sprawl, so being able to bring this common policy and consistency into cluster flavours is a useful capability. Any cluster specific configuration (for example cluster specific parameters and RBAC resources for user access) can be placed in the cluster specific directory without affecting the DRY approach implemented by the flavours.
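
For example, a flavour could include a Gatekeeper constraint enforcing a labelling policy across every cluster that uses it; the sketch below assumes the K8sRequiredLabels ConstraintTemplate from the Gatekeeper documentation is installed, and the label key is illustrative:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespaces-must-have-team
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team"]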

In addition, even though the setup supports pinning clusters to specific Git references, the current configuration for each cluster is visible in the main branch, giving a single view of the desired state of your entire fleet.

Drawbacks

The main drawbacks from this setup in my opinion are around visibility. Firstly, when using Kubernetes resources which in turn hydrate or generate further resources (for example Flux’s HelmRelease or Kustomization resources), it can be tricky to see exactly what you are going to apply or change by looking at a PR. In addition, after changes have merged to the main branch, we do not immediately know if Flux was able to apply them successfully or if resulting application rollouts completed successfully. For this reason it is critical that substantial monitoring is in place for Flux and other applications to help raise any problems that might occur when syncing changes.

This setup also doesn’t have good support for promoting individual addons explicitly. Instead, entire references to cluster flavours are promoted, treating the entire collection of addons as a single unit. This can make changes easier to reason about but sacrifices flexibility.

The GitOps methodology can also be challenging to adhere to for certain applications. To give a specific example, when using Istio, applications with injected sidecars need to be rolled as part of a control plane upgrade. Managing such complex upgrade procedures without tedious ordered commits requires more logic to live within the cluster (for example the Istio Operator) which can come with a certain amount of engineering overhead, particularly for in-house applications. Flagger is a powerful tool that can help implement more complex upgrade procedures in a generic way.

Another drawback is that each Flux instance targets a separate directory. When you get to 100s or 1000s of clusters this may not be the nicest structure to deal with, even with good tooling. The setup could be modified to support directories that are synced to multiple clusters, but at the cost of cluster specific configuration being more complicated to manage.

A final drawback to mention is around safety: if the cluster resources defined in the management cluster are accidentally deleted, then all of the workload clusters will be deleted, which could be catastrophic. It may be prudent then to deploy multiple management clusters that each define a failure domain.

Closing Remarks

In this post we have described one way of managing a fleet of Kubernetes clusters using Kubernetes-native tooling and the GitOps methodology.

There are many more potential use cases for the tools mentioned in this post. For example, the kfmt tool used to format manifests, as described above, uses a Tekton release Pipeline (triggered by Lighthouse) for running CI tests before building and pushing an image in-cluster using kaniko and creating a GitHub release. In addition, an interesting extension to the demo repository would be to trigger a Pipeline to promote the development cluster to staging whenever the repository is tagged with a semantic tag (see an example Lighthouse TriggerConfig here used by kfmt).

More inspiration for how to use Tekton Pipelines can be found by looking through the Tekton Catalog which contains community maintained Tasks that can be used as part of your Pipelines.
