An Internal Kubernetes Platform is a bespoke service that an organisation offers to its employees to give them access to a Kubernetes environment. The main differentiator between an Internal Kubernetes Platform and a public Kubernetes platform such as GKE is that an Internal Kubernetes Platform is designed and built for a specific organisation. However, public platforms often play the crucial role of providing a battle-tested foundation on which an Internal Kubernetes Platform can be built.
An Internal Kubernetes Platform can (and should) be thought of as an internal product with users of the platform (e.g. developer or application teams) being the platform’s customers. A key goal here is to abstract organisational complexities away from customers and provide a well-trodden path for using Kubernetes. For organisations that are committed to using Kubernetes at scale, investment into an Internal Kubernetes Platform is critical to achieving efficient innovation.
At Jetstack we have worked with a number of organisations to design, develop and support Internal Kubernetes Platforms from the ground up, as well as to build key extensions to existing platforms. Because these platforms are bespoke, strategies vary widely and depend heavily on the requirements, goals and resources of the organisation. In this post we present some of our opinions on the benefits, key considerations and pitfalls of designing and building such a platform.
There are a huge number of benefits that can come from building an Internal Kubernetes Platform; here we describe the ones that we have found to be the most impactful.
Streamlined Access to Accredited Infrastructure
Even with the extensive capabilities offered by public Kubernetes platforms, gaining access to accredited Kubernetes infrastructure can be incredibly challenging for developers at some organisations. This could be due to security restrictions requiring that clusters are configured in a very particular way, complex network topologies requiring complicated network or proxy setups, or even more fundamental requirements such as installing and configuring kubectl according to organisational policy.
An internal platform can provide a tested, well-documented, supported and ideally self-service process for gaining access to compliant Kubernetes clusters, taking into account and abstracting away from the particular idiosyncrasies of the organisation.
Consistent Policy Enforcement
Many organisations need to implement large numbers of controls in order to meet their security and compliance requirements. In addition, there are typically organisational policies that need to be enforced to ensure configuration standards. Managing clusters through an Internal Kubernetes Platform provides a powerful opportunity for these policies to be defined centrally and applied consistently, with input from across the organisation and with guidance for customers on how to comply.
We have found that Rego and Gatekeeper coupled with OPA’s testing framework, constraint generation with Konstraint and a GitOps tool such as Flux is a powerful combination for writing, testing, generating, distributing and enforcing policies across a fleet of managed clusters.
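To illustrate the kind of rule such a pipeline enforces, here is a minimal sketch of a "required labels" check together with the sort of assertion a policy test would make. In practice the rule would be written in Rego and enforced by Gatekeeper; Python is used here only to show the shape of the logic. The label names `owner` and `cost-centre` and the function names are hypothetical, not taken from any real policy library.

```python
from typing import Any

# Hypothetical organisational standard: every Namespace must carry these labels.
REQUIRED_LABELS = ("owner", "cost-centre")


def missing_labels(manifest: dict[str, Any], required=REQUIRED_LABELS) -> list[str]:
    """Return the required labels that are absent from a resource manifest."""
    labels = manifest.get("metadata", {}).get("labels") or {}
    return [name for name in required if name not in labels]


def violations(manifest: dict[str, Any]) -> list[str]:
    """Return one message per violation; an empty list means compliant."""
    return [f"missing required label: {name}" for name in missing_labels(manifest)]


# A Namespace that only sets one of the two required labels.
namespace = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {"name": "team-a", "labels": {"owner": "team-a"}},
}

print(violations(namespace))  # ['missing required label: cost-centre']
```

Keeping each rule as a small function with its own test cases mirrors how policies can be tested in isolation before they are distributed to the fleet.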
Guidance, Support and Community
For very large organisations, silos of information are common. Investment into an Internal Kubernetes Platform can help centralise and distribute organisational best-practices and provide a basis for an internal community, providing guidance and support around topics such as:
- Security and compliance
- Application deployment archetypes
- Monitoring and alerting
- Incident response
- Change management
- Cost optimisation
Public Kubernetes platforms aim to cater for a huge range of Kubernetes use cases across many organisations; an Internal Kubernetes Platform has the advantage that it only needs to cater for the needs of a single organisation. This means that managed clusters can bring many extensions out of the box for customers to use that are configured specifically for that organisation. Some powerful examples include:
- Prometheus: Kubernetes-native monitoring and alerting system (supporting custom alerting integrations through Alertmanager webhook receivers)
- Prometheus Custom and External Metrics Adapter: Exposes Prometheus metrics through the custom and external metrics APIs to drive horizontal Pod autoscaling
- ExternalDNS: Configures DNS records
- cert-manager: Provisions and manages TLS certificates (supporting custom integrations through issuers)
- Dex: OIDC identity provider for single sign-on to cluster dashboards or cluster authentication
All of these extensions can be configured or consumed through built-in or custom Kubernetes resources which can be coupled with Gatekeeper policies to allow the platform team to control which functionality to expose and support for customers.
Here we discuss some key considerations when building an Internal Kubernetes Platform.
Building an Internal Kubernetes Platform is a significant investment for any organisation. The main factor that determines whether this investment will pay off is whether the organisation can benefit from the economies of scale that a platform can bring by handling common requirements for many customers; of course, for this to work, the customer base needs to be large enough for these benefits to be realised.
Some rules of thumb for deciding when such an investment would pay off are having more than 15 developers who could onboard onto the platform, or having the potential for five or more customer teams. However, the true turning point depends heavily on the complexity of the environment and so is specific to each organisation. For example, if it is likely to take months to provision a compliant cluster for the average developer team, investment into a centralised solution pays off much more quickly per customer than it would at an organisation with more lenient compliance standards.
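The trade-off can be sketched as simple break-even arithmetic. The model below is a deliberate simplification and every figure in it is hypothetical: it assumes a fixed platform cost and a fixed per-team cost of reaching a compliant cluster, both measured in engineer-months, and ignores ongoing running costs.

```python
import math


def break_even_teams(platform_cost: float,
                     cost_per_team_without: float,
                     cost_per_team_with: float) -> int:
    """Smallest number of customer teams for which the platform pays for itself.

    All inputs are in the same unit (here, engineer-months).
    """
    saving_per_team = cost_per_team_without - cost_per_team_with
    if saving_per_team <= 0:
        raise ValueError("the platform must reduce the per-team cost to break even")
    return math.ceil(platform_cost / saving_per_team)


# Strict compliance: each team would otherwise spend 6 months getting a cluster.
print(break_even_teams(platform_cost=24, cost_per_team_without=6, cost_per_team_with=1))  # 5

# Lenient compliance: each team would otherwise spend only 2 months.
print(break_even_teams(platform_cost=24, cost_per_team_without=2, cost_per_team_with=1))  # 24
```

With these made-up numbers, the stricter environment breaks even at five teams while the more lenient one needs 24, which is why the turning point is so specific to each organisation.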
With these factors in mind, organisations must determine whether an Internal Kubernetes Platform is the right option, or whether it would be more valuable to invest directly in other areas such as developer automation or tooling.
Level of Abstraction
When designing an Internal Kubernetes Platform, perhaps the most technically significant consideration is the level of abstraction you wish to offer, mainly because once you start onboarding customers it can be very disruptive to change. A popular pattern is for the platform team to run large multi-tenanted clusters and provide self-service access to Namespaces. Another option is to offer a more serverless experience using a tool such as Knative.
Yet another model that we have found to work well is to provision a cluster per customer. This level of abstraction comes with a higher overhead per customer and potentially requires greater investment into the automation around cluster lifecycle management. However, it offers much stronger isolation between customers and opens up the opportunity for features that are much more difficult to support on multi-tenanted clusters (e.g. allowing customers to deploy and manage their own CustomResourceDefinitions and make other cluster-scoped configuration changes).
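To make the multi-tenanted Namespace model more concrete, the sketch below shows what a self-service onboarding step might generate for a new customer: a Namespace plus a ResourceQuota that caps what the tenant can request. It is only an assumption about how such automation could look; the function name, the label key `platform.example.com/tenant` and the default quota values are hypothetical.

```python
import json


def tenant_manifests(team: str, cpu: str = "8", memory: str = "16Gi") -> list[dict]:
    """Build the Namespace and ResourceQuota manifests for one tenant."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": team,
            # Hypothetical label used to identify the owning tenant.
            "labels": {"platform.example.com/tenant": team},
        },
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": team},
        # Caps the total CPU and memory that Pods in the Namespace may request.
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }
    return [namespace, quota]


if __name__ == "__main__":
    # Kubernetes accepts manifests in JSON as well as YAML.
    for manifest in tenant_manifests("team-a"):
        print(json.dumps(manifest, indent=2))
```

In the cluster-per-customer model the equivalent step would provision a whole cluster instead, which is where the additional lifecycle automation comes in.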
Here we discuss some common pitfalls that we have helped mitigate when working with customers.
Not Working with Prospective Customers Early
The success of an Internal Kubernetes Platform is typically proportional to the number of customers that are actively using it. Early collaboration with customers is critical to ensuring the platform prioritises features that bring the most value and encourage customers to onboard (and remain onboarded); the earlier these discussions happen the easier it is to build trust and create effective feedback loops.
Platform Team Becoming a General Support Team
The platform team is responsible for supporting the platform but is typically not responsible for supporting customer applications. For this reason, customers should ensure that platform SLOs and guarantees are appropriate for their own requirements, and they should also have their own SLOs and on-call rota for incident response. The messaging around this shared responsibility model needs to be clear from the start to avoid unwelcome surprises.
Failure to be a Product
The platform should be built and managed as an internal product and so should stand up to the same level of scrutiny as public products. This includes the following features:
- Easy onboarding and offboarding process
- Clear, well-written documentation with getting started guides; we have found that a great place to start is to structure the platform documentation in a similar way to open source products such as Kubernetes, Istio and Tekton
- Customer discussion forums
- Support and feature request process
- Announcement channels
- Feature roadmap
Failure to Empower the Platform Team to Maximise Customer Value
There will likely be prospective customers who have very specific feature requirements compared to the requirements of the wider organisation. Such features can require high levels of investment to build and support and this may impede prioritising the features that will bring value to the majority of customers. The platform team needs to have support from management to push back on pressure to implement such features in order to maximise total value.
That being said, this is a very difficult line to tread, as it is important not to deter customers from making feature requests. Feature requests and customer feedback in general are essential for determining pain points and how to prioritise work, so push back sparingly!
An Internal Kubernetes Platform can be an incredibly powerful tool for improving technical innovation and development velocity at an organisation. By pushing common application requirements down to the platform, the cognitive load of developer teams can be significantly reduced and more time can be spent on building and innovating on business logic. However, such an investment is a serious undertaking and options need to be considered carefully to reduce risk.
Jetstack has deep expertise and experience building Internal Kubernetes Platforms for large organisations. You can read our case study with a global bank using GKE, or we can be contacted directly to discuss how we can help you build your own Internal Kubernetes Platform.