Production Kafka Settings
The Confluent Developer Training has a lot of great information and examples. The instructor kept saying, "And in production, you'll want to do this..."
This is like the professor saying "This will be on the exam."
The following are some production configurations mentioned in the course.
Confluent also has an in-depth guide to production settings (see the links at the bottom of this post).
Cloudera has also written a very in-depth guide on Kafka setup.
Kafka Brokers:
- Run at least 3 brokers
- 8 gigabytes of RAM to start
- 32 GB on Host (less is counterproductive)
- 5 to 6 GB for the JVM heap (the flags below set 6 GB)
- Run the JVM with G1GC, using flags such as:
  `-Xms6g -Xmx6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80`
ZooKeeper
- Minimum 3 nodes, sometimes 5
- 16 GB RAM on Host
- 1 GB JVM
- 64 GB SSD (1)
Topic Settings
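- Topic Replication: Replication Factor 3 (3 copies: 1 leader plus 2 followers)
- Set min.insync.replicas and produce with acks=all (`acks=-1`, `request.required.acks=-1` in older clients); min.insync.replicas is only enforced when the producer uses acks=all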
- Turn off Auto-Creation in production: `auto.create.topics.enable: False`
- Turn off Topic Deletion: `delete.topic.enable: False`
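The interaction between the replication factor and min.insync.replicas comes down to simple arithmetic. The following is a minimal sketch, assuming producers use acks=all; the function name `tolerated_replica_failures` is made up for illustration and is not part of any Kafka API.

```python
def tolerated_replica_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Replicas of a partition that can be out of sync while acks=all writes still succeed."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")
    return replication_factor - min_insync_replicas


# Compare a few combinations for a topic with 3 replicas.
for rf, min_isr in [(3, 1), (3, 2), (3, 3)]:
    failures = tolerated_replica_failures(rf, min_isr)
    print(f"RF={rf} min.insync.replicas={min_isr}: writes survive {failures} replica failure(s)")
```

With a replication factor of 3, a min.insync.replicas of 2 keeps writes available through one replica failure, while a value of 3 rejects acks=all writes as soon as any replica falls out of sync.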
Altering Topics - Changing the number of partitions in a topic
Option 1: Leave existing topics alone and only create new topics
Option 2: Shut down producers, increase partitions, restart producers
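One likely reason for this caution is that keyed messages are assigned to a partition by hashing the key modulo the partition count, so changing the count changes where existing keys land. The sketch below illustrates the effect with a stand-in hash (CRC32 from the standard library); Kafka's default partitioner uses murmur2, so the exact partition numbers differ, and the helper names and sample keys are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count: int, new_count: int):
    """Keys whose partition changes when the partition count changes."""
    return [k for k in keys
            if partition_for(k, old_count) != partition_for(k, new_count)]


# Hypothetical sample keys, purely for illustration.
keys = [f"customer-{i}" for i in range(1000)]
moved = moved_keys(keys, old_count=6, new_count=8)
print(f"{len(moved)} of {len(keys)} keys map to a different partition after going from 6 to 8 partitions")
```

Any consumer that relies on all messages for a key arriving in order on one partition would see that key split across the old and new partitions, which is why stopping producers (or using a fresh topic) is the safer path.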
Kafka Connect:
- Use distributed mode for fault tolerance and availability
- Set the replication factor to 3 for the Connect internal topics (config, offset, status)
- Set cleanup.policy to compact on those topics
- Set offset.storage.partitions to 50
- Set status.storage.partitions to 10
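As a rough sketch, the Connect settings above could be rendered into a distributed worker properties fragment like this, reading the 50 and 10 as partition counts for the offset and status topics. The property keys are standard Connect worker settings; the function name, the group id, and the topic names (`connect-configs`, `connect-offsets`, `connect-status`) are placeholder assumptions.

```python
def connect_worker_properties(group_id: str,
                              replication_factor: int = 3,
                              offset_partitions: int = 50,
                              status_partitions: int = 10) -> str:
    """Render the internal-topic settings for a distributed Connect worker."""
    props = {
        "group.id": group_id,
        # Topic names below are placeholders, not values from the course.
        "config.storage.topic": "connect-configs",
        "config.storage.replication.factor": replication_factor,
        "offset.storage.topic": "connect-offsets",
        "offset.storage.replication.factor": replication_factor,
        "offset.storage.partitions": offset_partitions,
        "status.storage.topic": "connect-status",
        "status.storage.replication.factor": replication_factor,
        "status.storage.partitions": status_partitions,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


print(connect_worker_properties("connect-cluster"))
```

The config topic has no partition setting here because Connect requires it to have a single partition.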
Kafka Streams:
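- List at least 2 brokers in `bootstrap.servers` in code, preferably 3
- Register a shutdown hook in Streams code (`Runtime.getRuntime().addShutdownHook`) that calls `streams.close()`
- Job scaling is limited by the number of partitions in the input topic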
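The scaling limit can be sketched as simple arithmetic, assuming a simple application with one input topic, where Kafka Streams creates one task per input partition and each task runs on a single thread. The function `busy_stream_threads` is a hypothetical helper, not a Streams API.

```python
def busy_stream_threads(input_partitions: int,
                        instances: int,
                        threads_per_instance: int = 1) -> int:
    """Stream threads that actually receive work; the rest sit idle."""
    total_threads = instances * threads_per_instance
    # One task per input partition, so tasks cap the useful parallelism.
    return min(input_partitions, total_threads)


# A 6-partition input topic: adding threads past 6 adds no throughput.
for instances in (2, 3, 4):
    busy = busy_stream_threads(input_partitions=6, instances=instances, threads_per_instance=2)
    print(f"{instances} instances x 2 threads: {busy} busy thread(s)")
```

If more parallelism is needed later, the input topic has to be created with enough partitions up front, which ties back to the advice on altering topics.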
Links:
- https://docs.confluent.io/current/kafka/deployment.html
- https://www.cloudera.com/documentation/enterprise/6/6.0/topics/kafka.html