
Recommendations for Deploying

Apache Kafka® on Kubernetes

Written by: Gwen Shapira

June 2018

www.confluent.io/contact ©2018 Confluent, Inc.

Table of Contents

Introduction

Is Running Kafka on Kubernetes a Good Idea?

Kubernetes Terms and Concepts

Kubernetes Considerations for Capacity Planning

Helm Charts

Storage: Persistent or Ephemeral?

Storage: Local or Shared?

Traffic: How Do Clients Communicate with Individual Brokers?

Traffic: How Do External Services Communicate with Kafka?

Log Aggregation

Metrics

Role of Operator

Conclusion

References


Introduction

As organizations modernize their data infrastructure and move to a more agile, microservices-based approach, many encounter the need for systems that help automate and manage their microservices deployments. In the last few years, Kubernetes has become the prominent container orchestration framework, with significant adoption in both cutting-edge Silicon Valley startups and traditional enterprises.

When an organization adopts a container orchestration solution, it makes sense to use it for all services that are deployed in
production. This increases manageability since operations of all services are automated in the same way, improves availability and
provides more flexibility around resource management.

So naturally, the question of how to run Apache Kafka and Confluent Platform on Kubernetes is an important one for many organizations.

This document addresses several important questions that software developers and production operations teams will face when
they are planning their deployment of Confluent Platform on Kubernetes. This document does not replace the Confluent Platform
Reference Architecture for Kubernetes where we discuss each specific Confluent Platform component and include detailed
suggestions for deployment and capacity planning.

Kubernetes evolves rapidly. This document applies to Kubernetes 1.8.x and 1.9.x.


Is Running Kafka on Kubernetes a Good Idea?
The main goals of running anything on Kubernetes are:
• Developer and operator productivity via improved workflows
• Better resource efficiency by “bin packing” — running multiple applications on one physical or virtual machine when possible.

One could claim that Kafka isn't very challenging to operate and that it often requires all the resources on a node anyway, so there aren't many benefits to running Kafka on Kubernetes.

Leaving aside the discussion of whether or not Kafka is as easy to operate as it looks, this approach takes a very narrow view. The intention behind Kubernetes is to make it easy to run the entire business infrastructure, not just specific applications, and running entire data centers becomes easier through standardization and automation.

The Kafka ecosystem includes stream processing jobs, event-driven applications in many languages, Confluent REST Proxy, Kafka
Connect, many connectors, KSQL and Confluent Schema Registry. Deploying and managing all those becomes challenging even in
mature organizations. Our customers routinely ask for a single “Deploy Confluent Platform” button. With Kubernetes, we can deliver this.

Don’t ask “Does it make sense to run Kafka on Kubernetes?”, ask “Do I want a standardized and automated way to manage an entire
ecosystem of event-driven stream-processing services and their dependencies?”

Kubernetes Terms and Concepts

If you are relatively new to the world of Kubernetes, the sheer amount of new terminology can be difficult to follow. Here are a few short definitions of the terms that we use throughout this document.
• Pod: Basic unit of resource allocation in Kubernetes. Includes one or more containers, storage volumes, network resources
and more. Pods are assigned to nodes as a single unit.
• StatefulSet: Collection of Pods with specific rules for scaling, ordering and uniqueness.
• Service: Logical set of Pods and a policy by which to access them.
• Headless service: Logical definition that provides a unique network identity for each Pod in a set. By default, all Pods behind a service are considered identical and traffic is load balanced between them; a headless service instead lets clients address each Pod individually.
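As a concrete illustration of the last term, a minimal headless Service manifest might look like the following sketch (the name and labels are hypothetical, not taken from the Confluent charts); setting clusterIP: None is what makes a Service headless:

```yaml
# Hypothetical headless Service: clusterIP: None disables the cluster-wide
# virtual IP, so DNS resolves to the individual Pod addresses instead of
# load balancing across them.
apiVersion: v1
kind: Service
metadata:
  name: cp-kafka-headless
spec:
  clusterIP: None        # this line makes the Service headless
  selector:
    app: cp-kafka        # must match the labels on the Kafka Pods
  ports:
    - name: broker
      port: 9092
```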

Kubernetes Considerations for Capacity Planning

Most Kubernetes clusters are running on standard hardware or standard instance types. Standardizing on the underlying hardware
makes Kubernetes management much easier and lets you focus on allocating the containers that will run on the cluster.

The standard instance must be large enough to accommodate the most resource-demanding component that you'll want to run, since one node can host multiple containers but a single container cannot span multiple nodes. This typically means r4.xlarge on AWS, n1-highmem-4 on GCP, DS5v2 on Azure, or a 12-core, 128GB RAM, 12-disk server on-premises.

When you read the capacity planning section of the Apache Kafka® and Confluent Enterprise Reference Architecture, you'll notice that there are some discrepancies between the resources we recommend allocating for a given component and the hardware or cloud instance recommendations we make for it. The reason for the discrepancies is that there are real-world constraints around the servers sold by hardware vendors and the instances available from specific cloud vendors. For example, we believe that in most cases Apache ZooKeeper™ requires no more than 4GB RAM, yet the hardware recommendation is for 32GB RAM, simply because there are no enterprise-grade servers available with only 4GB RAM.


Deploying components as Pods (one or more containers, deployed as one unit) in a Kubernetes cluster allows you to specify the exact amount of memory, CPU and disk space each component requires — both the required minimum (make sure ZooKeeper has at least 4GB available) and the limit (don't let ZooKeeper use more than 4GB RAM). This means that Kubernetes has more information on where in the cluster to deploy each Pod.
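As a sketch of how these requests and limits are expressed, the ZooKeeper sizing above maps onto a container spec roughly like this (the image tag and names are illustrative):

```yaml
# Illustrative Pod fragment: request a minimum of 4Gi of memory for
# ZooKeeper and cap it at the same value, per the sizing discussion above.
apiVersion: v1
kind: Pod
metadata:
  name: zookeeper
spec:
  containers:
    - name: zookeeper
      image: confluentinc/cp-zookeeper:4.1.0   # example image tag
      resources:
        requests:
          memory: "4Gi"   # required minimum, guaranteed by the scheduler
          cpu: "1"
        limits:
          memory: "4Gi"   # hard cap, the container cannot exceed this
```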

Helm Charts
Helm is the package manager for Kubernetes, making it easy to deploy standard workloads on Kubernetes. You are probably used to package managers like apt, yum or Homebrew, where you can simply type "yum install confluent-enterprise" or "apt-get install confluent-kafka" and have the right components and their dependencies installed and configured automatically. Helm provides similar functionality for Kubernetes deployments.

The vendor or community experts of a specific project will create a Helm Chart, which is essentially a Kubernetes deployment template. It includes things like Pod definitions, StatefulSet definitions, dependencies, resource requests, the number of Pods in a set and more. It also includes default values for requested resources like memory and number of cores. The key is that one Helm Chart contains all the definitions needed for one project, including its dependencies, and that it is a template, so default values can be overridden.

Users can then customize the charts for their environments by overriding the standard values and manage the charts with standard commands. For example, the Confluent Helm Chart for Apache Kafka allows users to override the default number of partitions and heap size, but not the fact that Kafka depends on ZooKeeper and requires a persistent volume.

You can type helm install cp-kafka and this will deploy Kafka, with ZooKeeper as a dependency, using all our default values. Or you can override the defaults with helm install --set heapOptions=-Xmx4G --set cp-zookeeper.servers=5 cp-kafka and deploy a Kafka cluster with a larger-than-default heap size and 5 ZooKeeper servers instead of 3.
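The same overrides can also be kept in a values file instead of --set flags; a sketch (the key names follow the flags above and may differ between chart versions):

```yaml
# my-values.yaml: equivalent to the --set flags shown above
heapOptions: "-Xmx4G"
cp-zookeeper:
  servers: 5
```

You would then run helm install -f my-values.yaml cp-kafka.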

Storage: Persistent or Ephemeral?

Confluent Platform includes many components, and their storage requirements all differ. Most components store their data and configuration in Kafka itself, but some require extra consideration. Let's look at three types of services and how to configure storage for each.
• Stateful components: ZooKeeper and Kafka are essentially databases and need to be treated as such. While Kafka brokers could restore their state by re-replicating data from the surviving brokers, the time and network resources required to do so in most production systems are so high that we recommend avoiding it. Brokers also have persistent identity — when you connect to the broker with a specific ID, you need to reach that specific broker because it is the leader for specific partitions. You need to know that every time you connect to a specific URI, you access the exact same broker with the same partitions. In order to provide persistent storage and identity, we deploy ZooKeeper and Kafka clusters as Kubernetes StatefulSets. This gives them persistent volumes for storage, and it also gives each Pod in the cluster its own DNS entry via headless services.
• Stateless components: Kafka Connect, Confluent Schema Registry and Confluent REST Proxy either have no state or maintain a very small state in Kafka that they reload when started — so there is no need for persistent storage. They also don't have persistent identity. If you have 5 Connect workers and you connect to a URI, you don't need to care which specific worker responded. These services are managed via a Deployment, not a StatefulSet.
• Stream processing components: KSQL, Kafka Streams and Confluent Control Center have recoverable state — they store data in RocksDB, but the RocksDB cache can be recreated from data in Kafka if needed. These caches can be small or large, depending on the specific application, so recovering state from Kafka can take anywhere from a few seconds to many hours. Depending on the cache size and time-to-recovery requirements, these can be deployed with either persistent or ephemeral storage. We recommend persistent storage when the cache is large enough that recovery takes more than a few seconds, since cloud-native applications should have very short recovery times.
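Putting the stateful pattern together, a StatefulSet for Kafka follows this general shape (names, image tag and storage size are illustrative, not the actual Confluent chart output):

```yaml
# Illustrative StatefulSet: serviceName points at a headless Service for
# per-Pod DNS, and volumeClaimTemplates gives each broker its own
# persistent volume that survives Pod rescheduling.
apiVersion: apps/v1beta2        # apps/v1 from Kubernetes 1.9 onward
kind: StatefulSet
metadata:
  name: cp-kafka
spec:
  serviceName: cp-kafka-headless
  replicas: 3
  selector:
    matchLabels:
      app: cp-kafka
  template:
    metadata:
      labels:
        app: cp-kafka
    spec:
      containers:
        - name: broker
          image: confluentinc/cp-kafka:4.1.0   # example image tag
          ports:
            - containerPort: 9092
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 500Gi      # size to your retention requirements
```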


Storage: Local or Shared?
Since Confluent Platform includes StatefulSets with persistent storage, this raises the question of whether one should use local disks
or shared storage.

If you are deploying Kafka and Kubernetes in the cloud, this is an easy decision — use shared storage. It is easier to configure and manage, it makes failure recovery faster and simpler, and shared storage on all the large cloud providers has the throughput and latency required for successful Kafka deployments.

On-premises, this is a more challenging decision. On one hand, shared storage provides better availability and reliability — you don't need to restore Pods to the exact server that holds their storage, and you don't need to re-replicate all the data if something happens to the server. In many cases, it is easier to allocate too. On the other hand, in many organizations, shared storage does not provide the consistent low latency that Kafka and ZooKeeper require. If the storage QoS is not properly configured, a database backup can consume the entire storage bandwidth and cause Kafka availability to drop.

Since Kubernetes support for persistent volumes on local storage only became beta in version 1.10, we recommend using shared storage for now. If you are deploying on-premises, work closely with your storage administrators to ensure consistent low-latency access for Kafka and ZooKeeper.

Traffic: How Do Clients Communicate with Individual Brokers?

Kubernetes has an internal traffic router and internal DNS, so each service can be accessed through its name and port. For example, if you are running inside Kubernetes, you can connect to the Confluent REST Proxy at http://cp-rest-proxy:8080. If you have multiple REST Proxy Pods running, Kubernetes will route the traffic to one of them.

This is great for stateless services, and for the initial connection to Kafka: the bootstrap.servers broker list can be just cp-kafka:9092, because it doesn't matter which broker responds with metadata. But once you start producing and consuming, you need to communicate with specific brokers.

This is where headless services come into the picture. By defining a headless service for each Kafka broker, you define a unique service name for each broker. You then configure that name as the broker's advertised.listeners, and clients discover that broker 1 is cp-kafka-1-headless:9092 and access it when they need to produce to or consume from partitions whose leader is on broker 1.
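Concretely, the per-broker listener configuration described above might look like the following server.properties fragment (the service name follows the example in the text; the port and bind address are illustrative):

```properties
# Broker 1: bind on all interfaces, but advertise the per-broker
# headless service name so clients can reach this specific broker.
broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://cp-kafka-1-headless:9092
```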

Traffic: How Do External Services Communicate with Kafka?

By default, services and Pods only have IPs that are routable within the Kubernetes cluster, and they are therefore inaccessible from outside the cluster. This isn't a problem if you are deploying all your Kafka applications, microservices and stream processing jobs within the Kubernetes cluster, but what if there are services outside Kubernetes that need to produce events to Kafka or consume events from Kafka?

This happens often when Kafka is deployed and managed by a central platform team on their own Kubernetes cluster, alongside other data services. The applications that produce and consume events are owned by different line-of-business teams, which have their own clusters and their own ways of deploying applications.

Kubernetes currently has three approaches to exposing services to external access:

1. NodePort: This exposes a specific port on each node, and traffic sent to that port is forwarded to the Kafka service.
This is the simplest configuration and is great in environments where you have direct access to the nodes in the Kubernetes
cluster and you can assume the IPs of Kubernetes nodes won't change all the time. This is mostly true in on-premises environments.


2. LoadBalancer: If you are running Confluent Platform in an environment where you do not have direct access to the
Kubernetes nodes (cloud environments, for example), or you do not want to assume specific node IPs, then use
LoadBalancer. The LoadBalancer service type creates a single IP that routes traffic for the service, and there is automatic
integration with the load balancer services of the cloud providers and some on-premises load balancers. Since Kafka is stateful,
you'll want to expose an IP for each broker.
3. Ingress: Ingress is a full-fledged proxy service and as such provides the most flexibility in configuration and routing.
Unfortunately, at this time, Ingress only supports HTTP, so it can be used for Confluent Platform components with REST
endpoints (Kafka Connect, Confluent Schema Registry, Confluent REST Proxy, Confluent KSQL), but not for Kafka brokers
themselves, since Kafka uses a different protocol.
In both LoadBalancer and NodePort modes, you'll need to update the broker configuration to route messages correctly. This is traditionally done by overriding the default broker configuration via Docker environment variables, but starting with Confluent Platform 4.1, it can also be done dynamically via the kafka-configs command.

When configuring the brokers, you will want to keep using the internal IP for communication between brokers and for communication with clients on the same Kubernetes cluster, and reserve the external IP for communication with clients outside the Kubernetes cluster. This can be done by adding a new endpoint to the advertised.listeners config, using listener.security.protocol.map to define the security protocol for the new external listener, and setting inter.broker.listener.name to refer to the internal listener. KIP-103 has a good example.
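A sketch of such a configuration, following the KIP-103 style (the listener names, ports and external hostname are assumptions to adapt to your environment):

```properties
# Two listeners: INTERNAL for traffic inside the Kubernetes cluster,
# EXTERNAL for clients coming through the NodePort or LoadBalancer.
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
advertised.listeners=INTERNAL://cp-kafka-1-headless:9092,EXTERNAL://kafka-1.example.com:31090
# Map each listener name to a security protocol
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SSL
# Brokers talk to each other over the internal listener
inter.broker.listener.name=INTERNAL
```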

To summarize, we recommend using LoadBalancer when running in public cloud or if your on-premises load balancer is integrated
with Kubernetes. NodePort approach works well for on-premises deployments where the node IPs are known, directly accessible and
relatively static and when the team deploying Kafka doesn’t necessarily have access to the load balancer.

Log Aggregation
Kubernetes tools and APIs let you view logs from each Pod, but when you are running a large system with many Pods and services, you will want to aggregate the logs into one system, so you have one central place to view and analyze them when needed.
Since you are already running Kubernetes, it is safe to assume that you already have a log aggregation system that works for you. Confluent Platform containers write their log output to stdout/stderr (with the exception of the ksql-cli container, which logs to a file by default), which means they will integrate with your existing log aggregation solution.

If you are still searching for a solution, we recommend using a Kubernetes DaemonSet to run a Pod with a log collection agent (Filebeat and Fluentd are popular) on each node, collecting all the logs from the various Pods running on that node. The collection agent can then forward the logs to Kafka (use a different cluster than the production cluster you are currently monitoring) and from there, use an Elasticsearch connector to forward the logs to Elasticsearch. Using Kafka to aggregate the logs gives you both a reliable buffer and the ability to add subscribers to the log events in addition to Elasticsearch.
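A log-collector DaemonSet follows this general shape (the Fluentd image and mount paths are illustrative; Filebeat would be wired up similarly):

```yaml
# Illustrative DaemonSet: one collector Pod per node, reading container
# logs from the node's filesystem and forwarding them onward.
apiVersion: apps/v1beta2        # apps/v1 from Kubernetes 1.9 onward
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd:v1.2   # example collector image
          volumeMounts:
            - name: varlog
              mountPath: /var/log
      volumes:
        - name: varlog
          hostPath:
            path: /var/log             # container logs live on each node
```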

Metrics

Just as we collect logs from all the containers, we also need to collect their metrics, and there are as many ways to collect metrics as there are to collect logs.

Confluent Platform and Apache Kafka publish their metrics through JMX by default. So if you’ve already selected a monitoring
solution for Kubernetes, you’ll need to find out how to get it to collect JMX metrics.


The most popular way to monitor Kubernetes is Prometheus (with Heapster a close second), and luckily, Prometheus can collect JMX metrics from Kafka and Confluent Platform using its JMX Exporter. You can deploy the JMX Exporter as a container in the same Pod as the containers you are monitoring (the sidecar pattern) and then configure the Java JMX agent in the Kafka container to report its JMX metrics to the JMX Exporter via the KAFKA_OPTS environment variable.

In this setup, the Kafka container reports JMX metrics, which are scraped by the JMX Exporter and sent to Prometheus, where you can visualize them using popular open source tools like Grafana.
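One way to sketch this sidecar wiring in a Pod spec (the port numbers, exporter image and JMX flags are assumptions, not the official chart configuration):

```yaml
# Illustrative Pod fragment: the Kafka container opens a remote JMX port
# via KAFKA_OPTS; the JMX Exporter sidecar reads it and exposes
# Prometheus-format metrics for scraping.
containers:
  - name: broker
    image: confluentinc/cp-kafka:4.1.0     # example image tag
    env:
      - name: KAFKA_OPTS
        value: >-
          -Dcom.sun.management.jmxremote
          -Dcom.sun.management.jmxremote.port=5555
          -Dcom.sun.management.jmxremote.authenticate=false
          -Dcom.sun.management.jmxremote.ssl=false
  - name: jmx-exporter
    image: example/jmx-exporter:latest     # hypothetical exporter image
    ports:
      - containerPort: 5556                # endpoint Prometheus scrapes
```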

In addition to the JMX metrics that Kafka itself exports, we recommend monitoring latency and reliability of end-to-end message
delivery between Kafka producers and consumers. Monitoring services from an external viewpoint lets you catch unexpected issues
and make sure you are meeting SLAs. You can do that by configuring your client applications to load Confluent’s client interceptors
and report delivery and latency metrics to Confluent Control Center, where you can alert on SLA breaches and view the data trends
on the dashboards.

Role of Operator
An Operator is an application-specific extension to the Kubernetes API that allows managing the application (or, in our case, the entire platform) using standard Kubernetes tools such as kubectl. Operators are declarative — you define the desired state of the cluster, and the operator monitors both the cluster and the definitions for changes and reacts accordingly. The idea is that experts on specific services can create operators that are aware of the specifics of the service in a way that the more generic Kubernetes abstractions are not. A generic Kubernetes controller can do a rolling restart, but will it know to wait until the under-replicated-partitions metric is zero before restarting a broker?

The Confluent Operator will be released later this year and will include the following functionality:
• Deployments across multiple racks and availability zones
• Security configuration
• Rolling restarts
• Rolling upgrades
• Scaling Kafka up and down — including automatic balancing of partitions and load across the Pods

A well-written operator is like a personal assistant for the cluster administrator — making sure that repeated tasks are done correctly
every time.

Conclusion

This paper is intended to share some of our best practices around the deployment of Confluent Platform on Kubernetes clusters. Of course, each use case, workload and environment is different, and the best production systems are tailored to the specific requirements of the organizations that run them. Confluent professional services teams have experience helping a wide range of organizations, of all sizes and levels of expertise, successfully deploy Confluent Platform in many different environments under many different constraints and requirements. You can always contact us to schedule a deep dive into how to make the best decisions in your specific scenario.


References

• Introducing the Confluent Operator: Apache Kafka® on Kubernetes Made Simple
• Kubernetes Documentation: Services
• Kubernetes Documentation: StatefulSet Basics
• Kubernetes Documentation: DaemonSet
• Kubernetes Documentation: Exposing an External IP Address to Access an Application in a Cluster
• Exposing an Application to External Traffic
• Kubernetes Documentation: Basic Logging in Kubernetes
• Kubernetes NodePort vs LoadBalancer vs Ingress? When should I use what?
