This is an excerpt from The Gorilla Guide to Kubernetes in the Enterprise, written by Joep Piscaer.
You can download the full guide here.
Kubernetes Deployment from Scratch
Deploying a Kubernetes cluster from scratch can be a daunting task. It requires knowledge of the core Kubernetes concepts, the ability to make architecture choices, expertise with the deployment tools, and knowledge of the underlying infrastructure, be it on-premises or in the cloud.
Selecting and configuring the right infrastructure is the first challenge. Both on-premises and public cloud infrastructure have their own difficulties, and it’s important to take the Kubernetes architecture into account. You can choose to not run any pods on master nodes, which changes the requirements for those machines. Dedicated master nodes have smaller minimum hardware requirements.
Big clusters put a higher burden on the master nodes, so those nodes need to be sized appropriately. It’s recommended to run at least three nodes for etcd, which tolerates the failure of a single node. While it may be tempting to run etcd on the same nodes as the Kubernetes master nodes, it’s recommended to run etcd as a separate cluster.
Adding more etcd nodes protects against multiple simultaneous node failures (5 nodes tolerate 2 failures and 7 nodes tolerate 3), because etcd needs a majority of its members to remain available. However, each node added can decrease Kubernetes’ performance. For master nodes, running two protects against failure of any one node.
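The failure tolerance of an etcd cluster follows from its majority (quorum) requirement. The short sketch below works through that arithmetic; the function names are illustrative and are not part of etcd or Kubernetes.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority still survives."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5, 7):
        print(f"{n} members: quorum {quorum(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

Note that an even member count adds no tolerance over the next lower odd count (4 members tolerate 1 failure, just like 3), which is why odd cluster sizes are the usual choice.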
For both the etcd cluster and Kubernetes master nodes, designing for availability across multiple physical locations (such as Availability Zones in AWS) protects the Kubernetes environment against physical and geographical failure scenarios.
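To see why the spread across locations matters, consider a minimal sketch that checks whether an etcd cluster keeps its majority when any single zone is lost. The member and zone names are hypothetical, and the check assumes the simple majority rule only; it is not a model of any specific cloud provider.

```python
from collections import Counter
from typing import Mapping


def survives_any_zone_loss(placement: Mapping[str, str]) -> bool:
    """Return True if a majority of members remains after losing any one zone.

    `placement` maps a member name to the zone it runs in.
    """
    total = len(placement)
    quorum = total // 2 + 1
    members_per_zone = Counter(placement.values())
    # Losing a zone removes every member placed in it.
    return all(total - lost >= quorum for lost in members_per_zone.values())


# Hypothetical placements, for illustration only.
three_zones = {"etcd-a": "zone-1", "etcd-b": "zone-2", "etcd-c": "zone-3"}
two_zones = {"etcd-a": "zone-1", "etcd-b": "zone-1", "etcd-c": "zone-2"}

if __name__ == "__main__":
    print(survives_any_zone_loss(three_zones))  # True
    print(survives_any_zone_loss(two_zones))    # False
```

With three members in two zones, the zone holding two members becomes a single point of failure, so three zones are needed to survive the loss of any one of them.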
On-Premises Implementations
Many on-premises environments are repurposed to integrate with Kubernetes (for example, by creating clusters on top of VMs). In some cases, new infrastructure is created for the cluster. Either way, integrating servers, storage and networking into a smoothly running environment is still highly skilled work.
For Kubernetes, planning for the right storage and networking equipment is especially important, because Kubernetes can interact with these resources to provision storage, load balancers and the like. Being able to automate storage and networking components is a critical part of Kubernetes’ value proposition.
Public Cloud
Because building on-premises infrastructure takes this much effort, many organizations spin up clusters in public cloud environments for their first foray into Kubernetes. Kubernetes deployment tools integrate natively with public cloud environments, and are able to spin up the required compute instances, as well as configure storage and networking services for day-to-day operations.
For cloud instances, it’s critically important to select the right instance type. Some instance types are a poor fit (for example, VMs with partial physical CPUs assigned or with CPU oversubscription), while others might be too expensive. An advantage of public clouds is their consumption-based billing, which provides the opportunity to re-evaluate consumption periodically.
Hybrid implementations: On-Premises + Public Clouds and at the Edge
These are the most complex environments of all. For these, you may want to look into a Managed Kubernetes solution, so that you do not have to do the heavy lifting yourself.
See a comparison of leading Enterprise Kubernetes solutions here.
Networking Concerns
The Kubernetes networking model differs slightly from the usual one and requires some planning. The most basic networking pieces are the addresses for the nodes and the public-facing Kubernetes parts. These are part of the regular, existing network. In addition, Kubernetes allocates an IP block for pods and a separate block for services. These ranges must not collide with ranges already in use on the physical network.
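A collision check of this kind can be sketched with Python’s standard `ipaddress` module. The CIDR values and network names below are illustrative assumptions, not values from this guide; substitute the ranges planned for your own cluster and network.

```python
import ipaddress
from typing import Dict, List, Tuple


def find_collisions(cluster_ranges: Dict[str, str],
                    existing_ranges: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (cluster range name, existing range name) pairs that overlap.

    Both arguments map a descriptive name to a CIDR string. All ranges are
    assumed to use the same IP version.
    """
    collisions = []
    for cluster_name, cluster_cidr in cluster_ranges.items():
        cluster_net = ipaddress.ip_network(cluster_cidr)
        for existing_name, existing_cidr in existing_ranges.items():
            if cluster_net.overlaps(ipaddress.ip_network(existing_cidr)):
                collisions.append((cluster_name, existing_name))
    return collisions


# Illustrative ranges only (assumptions for this example).
cluster = {"pods": "10.244.0.0/16", "services": "10.96.0.0/12"}
existing = {"office": "192.168.0.0/24", "datacenter": "10.100.0.0/16"}

if __name__ == "__main__":
    for cluster_name, existing_name in find_collisions(cluster, existing):
        print(f"{cluster_name} range overlaps the {existing_name} network")
```

In this example the services block covers 10.96.0.0 through 10.111.255.255, so it overlaps the hypothetical datacenter range and one of the two would have to be changed before deployment.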
Depending on the pod network type – overlay or routed – additional steps have to be taken to advertise these IP blocks to the network or publish services to the network.
Lifecycle Management
There are various tools to manage the lifecycle of a Kubernetes cluster. Broadly speaking, there are tools for the deployment and lifecycle management of clusters, and there are tools for interacting with a cluster for day-to-day operations.
Let’s walk through a couple of the more popular tools:
Kubeadm
Kubeadm is the default way of bootstrapping a best-practice-compliant cluster on existing infrastructure. Add-ons, networking setup, and provisioning of the underlying infrastructure are all out of scope for Kubeadm.
Kubespray
Kubespray takes the configuration management approach, based on Ansible playbooks. This is ideal for those who already use Ansible and are comfortable with configuration management tooling. Kubespray uses kubeadm under the hood.
MiniKube
MiniKube is one of the more popular ways of trying out Kubernetes locally, and a good starting point for taking first steps with Kubernetes. It launches a single-node cluster inside a VM on your local machine. It runs on Windows, Linux and macOS, and has a dashboard.
Kops
Kops allows you to control the full Kubernetes cluster lifecycle, from infrastructure provisioning to cluster deletion. It’s mainly for deploying on AWS, but support for GCE and VMware vSphere is coming.
Various cloud vendors use their own proprietary tools for deploying clusters and managing their lifecycle. These tools are delivered as part of the managed Kubernetes service, and are usually not exposed to the user.
Configuration and Add-ons
Add-ons extend the functionality of Kubernetes. They fall into three main categories:
Networking and network policy
These include add-ons that create and manage the lifecycle of networks, such as Calico (routed) and Flannel (VXLAN overlay).
Service discovery
While Kube-DNS is still the default for service discovery, CoreDNS will replace it starting with Kubernetes version 1.13.
User interface
The Kubernetes dashboard is an add-on.
While the name add-on suggests some of these are optional, in reality many are required for a production-grade Kubernetes environment. Choosing the most suitable network provider, like Flannel or Calico, is crucial for integrating the cluster into the existing environment, be it on-premises or in the cloud.
Kubernetes Helm
Although technically not an add-on, Helm is considered a vital part of a well-functioning Kubernetes cluster. Helm is the package manager for Kubernetes. Its application packages, called Charts, are used to define, install and upgrade applications. A Chart packages up the configuration of containers, pods, and anything else needed for easy deployment on Kubernetes.
Helm is used to find and install popular software packages and manage the lifecycle of those applications. It’s somewhat comparable to a Docker Registry, only Helm charts might contain multiple containers and Kubernetes-specific information, like pod specifications. Helm includes a default repository for many software packages:
- Monitoring: SignalFX, NewRelic, DataDog, Sysdig, ELK-stack (elasticsearch, logstash, kibana), Jaeger, Grafana, Fluentd, Prometheus, Sensu
- Databases: CockroachDB, MariaDB, CouchDB, InfluxDB, MongoDB, Percona, Microsoft SQL on Linux, PostgreSQL, MySQL
- Key/Value: etcd, Memcached, NATS, Redis
- Message Systems: Kafka, RabbitMQ
- CI/CD: Concourse CI, Artifactory, Jenkins, GitLab, Selenium, SonarQube, Spinnaker
- Ingress and API Gateways: Istio, Traefik, Envoy, Kong
- Application and Web Servers: Nginx, Tomcat
- Content Management: WordPress, Joomla, Ghost, Media-Wiki, Moodle, OwnCloud
- Storage: OpenEBS, Minio
There are unique Charts available for things like the games Minecraft and Factorio, the smart home automation system Home Assistant, and Ubiquiti’s wireless access point controller, UniFi SDN. Helm makes the life of application developers easier by eliminating the toil of finding, installing and managing the lifecycle of many popular software packages.
There’s More:
For a deep-dive into Kubernetes deployment models and technical comparison of the different tools, check out the free guide at the link below.
In the next posts we’ll dive deeper into how you make Kubernetes work: from user access to monitoring, HA, storage, and more. We’ll also explore key Kubernetes use cases and best practices for operating Kubernetes in production, at scale.