Monday, September 24, 2018

Production Kafka Settings

The Confluent Developer Training has a lot of great information and examples.
The instructor kept saying, "And in production, you'll want to do this...". 
This is like the professor saying "This will be on the exam."

The following are some of the production configurations mentioned in the course. 
Confluent also has an in-depth guide to production settings (see the links at the bottom of the post).
Cloudera has also written a very in-depth guide on Kafka setup. 

Kafka Brokers:

- Run at least 3 brokers
- 8 gigabytes of RAM to start
- 32 GB on Host (less is counterproductive)
- 5 GB for the JVM heap (the example flags below set a 6 GB heap)
- Run the JVM with the G1 garbage collector (G1GC)
- `-Xms6g -Xmx6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20`
   `-XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M`
   `-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80`

ZooKeeper

- Minimum 3 nodes, sometimes 5
- 16 GB RAM on Host
- 1 GB JVM
- 64 GB SSD (1)

Topic Settings

- Topic Replication: Replication Factor 3 (3 copies of each partition, counting the leader)
- Set `min.insync.replicas` and have producers use `acks=-1` (the same as `acks=all`; `request.required.acks=-1` in the older producer)
- Turn off Auto-Creation in production: `auto.create.topics.enable=false`  
- Turn off Topic Deletion: `delete.topic.enable=false`
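
To see how the replication factor and `min.insync.replicas` interact, here is a rough sketch of the arithmetic. The post does not give a value for `min.insync.replicas`, so the value 2 below is an assumption (a common pairing with a replication factor of 3), and the function name is made up for illustration. It also simplifies things by assuming each replica lives on a different broker and every surviving replica is in sync.

```python
def can_accept_writes(replication_factor: int, min_insync_replicas: int, brokers_down: int) -> bool:
    """Return True if a producer using acks=all can still write to a partition.

    Simplification: one replica per broker, and every replica on a
    surviving broker counts as in sync.
    """
    in_sync = max(replication_factor - brokers_down, 0)
    return in_sync > 0 and in_sync >= min_insync_replicas


# min_insync_replicas=2 is an assumed value, not one taken from the course.
for down in range(4):
    ok = can_accept_writes(replication_factor=3, min_insync_replicas=2, brokers_down=down)
    print(f"brokers down: {down} -> writes accepted: {ok}")
```

Under these assumptions, losing one broker still leaves enough in-sync replicas for writes, while losing two does not.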

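Altering Topics - Changing the number of partitions in a topic

Adding partitions changes which partition a given key maps to, so per-key ordering can break while producers are running.

Option 1: Only create new topics  
Option 2: Shut down producers, increase partitions, restart producers  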
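
The reason for this caution is that keyed messages are assigned to a partition by hashing the key modulo the partition count. The sketch below illustrates the effect of changing that count. It uses CRC32 as a stand-in hash (Kafka's default partitioner uses murmur2), and the key names and partition counts are made up for the example.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count: int, new_count: int):
    """Keys whose partition changes when the partition count changes."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]


# Hypothetical keys and partition counts, for illustration only.
keys = [f"customer-{i}" for i in range(1000)]
moved = moved_keys(keys, old_count=6, new_count=8)
print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

Any key that moves will have its older messages in one partition and its newer messages in another, which is why both options avoid repartitioning a topic while producers are writing to it.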

Kafka Connect:

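- Use distributed mode for fault tolerance and availability  
- Set the replication factor to 3 for the Connect internal topics
- Set `cleanup.policy` to `compact` on those topics  
- Give the offset storage topic (`offset.storage.topic`) 50 partitions  
- Give the status storage topic (`status.storage.topic`) 10 partitions  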
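
As a sketch, the values 50 and 10 above correspond to the partition-count settings of a distributed Connect worker (`offset.storage.partitions` and `status.storage.partitions`). The snippet below just renders such settings in `.properties` form. The broker addresses, group id, and topic names are hypothetical placeholders, and it assumes the worker is allowed to create its internal topics itself.

```python
# Hypothetical values: broker addresses, group id, and topic names are placeholders.
connect_worker_settings = {
    "bootstrap.servers": "broker1:9092,broker2:9092,broker3:9092",
    "group.id": "connect-cluster",
    "config.storage.topic": "connect-configs",
    "config.storage.replication.factor": 3,
    "offset.storage.topic": "connect-offsets",
    "offset.storage.replication.factor": 3,
    "offset.storage.partitions": 50,
    "status.storage.topic": "connect-status",
    "status.storage.replication.factor": 3,
    "status.storage.partitions": 10,
}


def to_properties(settings: dict) -> str:
    """Render settings as key=value lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


print(to_properties(connect_worker_settings))
```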

Kafka Streams:

- List at least 2 brokers, preferably 3, in `bootstrap.servers` in code
- Use a shutdown hook in Streams code so the application closes cleanly
- Job scaling is limited by the parallelism (number of partitions) of the first topic
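
The scaling limit comes from Kafka Streams creating one task per input partition, so threads beyond that number have nothing to do. Here is a rough sketch of that arithmetic for a topology with a single input topic. The function name and the numbers are made up for illustration.

```python
def active_stream_threads(input_partitions: int, instances: int, threads_per_instance: int) -> int:
    """Threads that can receive work for a topology with a single input topic.

    Simplification: one task per input partition, each task runs on
    exactly one thread, and extra threads stay idle.
    """
    total_threads = instances * threads_per_instance
    return min(input_partitions, total_threads)


# Hypothetical deployment sizes against a 4-partition input topic.
for instances, threads in [(1, 2), (2, 2), (4, 2)]:
    busy = active_stream_threads(input_partitions=4, instances=instances, threads_per_instance=threads)
    idle = instances * threads - busy
    print(f"{instances} instance(s) x {threads} threads: {busy} busy, {idle} idle")
```

If the job needs more parallelism than this allows, the input topic needs more partitions, which is best decided when the topic is created.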

Links:

- https://docs.confluent.io/current/kafka/deployment.html
- https://www.cloudera.com/documentation/enterprise/6/6.0/topics/kafka.html
