📨 Kafka Infrastructure Series

Build a 3-Broker Kafka Cluster
with Kafka Connect + Kafka UI

KRaft mode. No ZooKeeper. Three dedicated VMs as brokers, Kafka Connect and Kafka UI running as Docker containers — with optional Prometheus + Grafana monitoring.

Apache Kafka 4.1 KRaft Mode 3-Broker Cluster Kafka Connect Kafka UI JMX Exporter Docker Compose Prometheus

Architecture at a Glance

Three Ubuntu VMs run Kafka brokers in KRaft mode (no ZooKeeper required). A fourth Docker VM hosts Kafka Connect for source/sink connectors, Kafka UI for a web management panel, Kafka Exporter for broker metrics, and the JMX exporter for connector metrics — all wired into Prometheus + Grafana.

🖥️
Broker 1
node.id=1 · port 9092
+
🖥️
Broker 2
node.id=2 · port 9092
+
🖥️
Broker 3
node.id=3 · port 9092
🐳
Docker VM
Connect · UI · Exporter
📊
Monitoring
Prometheus + Grafana
💡
SECTION 01

Architecture & Core Concepts

This setup uses Kafka 4.1 in KRaft mode — which means no ZooKeeper at all. Each broker also acts as a controller, forming a self-managed quorum of 3 nodes. This is the modern, production-ready way to run Kafka.

⚡ KRaft Mode

Kafka Raft — replaces ZooKeeper entirely. Each broker here also acts as a controller, forming a 3-node quorum. Simpler operations, fewer moving parts.

🖥️ 3 Brokers

Dedicated Ubuntu VMs — one Kafka process per VM. Replication factor 3 means every topic partition has a copy on each broker. No single point of failure.

🔌 Kafka Connect

A distributed integration layer for moving data between Kafka and external systems (PostgreSQL, S3, Elasticsearch, etc.) using pluggable connectors.

🖼️ Kafka UI

Provectus Kafka UI — a web dashboard for browsing topics, consumer groups, connector status, and producing/consuming test messages.

Listener Architecture

Each broker exposes three listeners on different ports with different roles:

ListenerPortRoleUsed By
INTERNAL19092Inter-broker replicationOther brokers, Kafka Connect, Kafka UI (inside network)
EXTERNAL9092External client accessProducers / Consumers outside the broker network
CONTROLLER9093KRaft quorum votesController-to-controller communication only
💡
Why INTERNAL on 19092? Using a separate internal listener (19092) means Kafka Connect and Kafka UI connect via the internal network — faster, no external routing. External clients use 9092. Keeping them separate also lets you apply different security policies per listener in the future.
📋
SECTION 02

Prerequisites — VM Sizing

🖥️
×3
Broker VMs
8 vCPU
16 GB RAM
200 GB NVMe
Ubuntu 22.04 LTS
🐳
×1
Docker VM
8 vCPU
16 GB RAM
100 GB disk
Ubuntu 22.04 LTS
📊
Opt
Monitoring VM
4 vCPU
8 GB RAM
Prometheus + Grafana
Can use Docker VM
  • Network reachability — all broker VMs must reach each other on ports 9092, 9093, and 19092
  • Docker VM must reach all broker VMs on port 19092
  • SSH access to all VMs with sudo privileges
  • Static IPs or DNS names for all broker VMs — these go into controller.quorum.voters and advertised.listeners
🔌
SECTION 03

Port Reference

Port
Service
Purpose
9092
EXTERNAL (Kafka)
External producer/consumer client connections
19092
INTERNAL (Kafka)
Inter-broker replication; Connect & UI bootstrap
9093
CONTROLLER (KRaft)
Controller quorum votes between brokers
8083
Kafka Connect REST
Connector management API
8190
Kafka UI
Web dashboard (mapped from container 8080)
9308
Kafka Exporter
Prometheus scrape: broker-level metrics
7072
JMX Exporter
Prometheus scrape: Kafka Connect JVM metrics

🖥️
PHASE 01 — Step 1

Install Java & Kafka on All 3 Broker VMs

SSH into each broker VM and run the following. Kafka 4.1 requires Java 21.

🔁
Run on all 3 broker VMs Every command in this step is identical across Broker 1, 2, and 3. Open three terminal windows and run them in parallel to save time.
Shell — All 3 Broker VMs
# Update package list and install Java 21
sudo apt update
sudo apt install -y openjdk-21-jdk wget

# Verify Java version
java -version   # should show: openjdk version "21..."

# Download Kafka 4.1.1 into /opt
cd /opt
sudo wget https://downloads.apache.org/kafka/4.1.1/kafka_2.13-4.1.1.tgz

# Extract and create a stable symlink
sudo tar -xzf kafka_2.13-4.1.1.tgz -C /opt
sudo ln -s /opt/kafka_2.13-4.1.1 /opt/kafka

# Verify the directory
ls /opt/kafka/bin/   # should list: kafka-server-start.sh, kafka-topics.sh, etc.
⚙️
PHASE 01 — Step 2

Edit server.properties — Common Settings

Open /opt/kafka/config/server.properties on each VM and update the following settings. The values here are identical on all three brokers — only node.id and advertised.listeners differ (covered in the next step).

Shell — open config on each broker
vim /opt/kafka/config/server.properties

Find and update each of these settings in the file. If a line is missing, add it. If controller.quorum.bootstrap.servers exists, comment it out.

server.properties — Common Section (all 3 brokers)
# ─── KRaft mode: this node is both broker and controller ──────
process.roles=broker,controller
controller.listener.names=CONTROLLER

# ─── Listener definitions ─────────────────────────────────────
listeners=INTERNAL://:19092,EXTERNAL://:9092,CONTROLLER://:9093
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT

# Use INTERNAL listener for broker-to-broker replication
inter.broker.listener.name=INTERNAL

# ─── KRaft quorum: list all 3 broker controller endpoints ─────
# Replace with your actual broker IPs
controller.quorum.voters=1@<Broker1_IP>:9093,2@<Broker2_IP>:9093,3@<Broker3_IP>:9093

# Comment out this line if it exists (bootstrap.servers not used in static KRaft)
#controller.quorum.bootstrap.servers=localhost:9093

# ─── Storage ──────────────────────────────────────────────────
log.dirs=/var/lib/kafka/logs

# ─── Replication & partitioning ───────────────────────────────
num.partitions=3
default.replication.factor=3
min.insync.replicas=2

offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=2

share.coordinator.state.topic.replication.factor=3
share.coordinator.state.topic.min.isr=2
⚠️
min.insync.replicas must be less than replication.factor With replication.factor=3 and min.insync.replicas=2, Kafka requires at least 2 brokers to acknowledge a write. This means you can lose 1 broker and still accept writes. Never set min.insync.replicas equal to or greater than replication.factor — all brokers would need to be up for any write to succeed.
🔢
PHASE 01 — Step 3

Per-Server Config — node.id & advertised.listeners

These two settings are unique per broker. Set them after the common config above. Replace the IP placeholders with each broker's actual IP address.

Broker 1 — server.properties additions
node.id=1
advertised.listeners=INTERNAL://<Broker1_IP>:19092,EXTERNAL://<Broker1_IP>:9092
Broker 2 — server.properties additions
node.id=2
advertised.listeners=INTERNAL://<Broker2_IP>:19092,EXTERNAL://<Broker2_IP>:9092
Broker 3 — server.properties additions
node.id=3
advertised.listeners=INTERNAL://<Broker3_IP>:19092,EXTERNAL://<Broker3_IP>:9092
💡
node.id must match the quorum voters config The IDs in controller.quorum.voters=1@IP:9093,2@IP:9093,3@IP:9093 must exactly match the node.id on each respective broker. Mismatch causes the quorum to fail to form.
📁
PHASE 01 — Step 4

Create Directories & Set Permissions

Run on all 3 broker VMs. Replace $USER with your actual Linux username if the variable doesn't expand correctly.

Shell — All 3 Broker VMs
# Create Kafka log data directory (log.dirs in server.properties)
sudo mkdir -p /var/lib/kafka/logs
sudo chown -R $USER:$USER /var/lib/kafka/logs

# Create Kafka application log directory (for stdout / stderr)
sudo mkdir -p /opt/kafka/logs
sudo chown -R $USER:$USER /opt/kafka/logs

# Give your user ownership of the Kafka installation
sudo chown -R $USER:$USER /opt/kafka
sudo chown -R $USER:$USER /opt/kafka_2.13-4.1.1
🔑
PHASE 01 — Step 5 ⚠️ Critical

KRaft Storage Format — UUID Generation

This is the most critical step. All three brokers must be formatted with the exact same UUID. The UUID ties the cluster together — brokers with different UUIDs will refuse to join the same quorum.

01

Generate UUID on Broker 1 Only

Run this only on Broker 1. Copy the output UUID — you will use it on all three brokers.

Shell — Broker 1 only
cd /opt/kafka/bin
./kafka-storage.sh random-uuid

# Example output — yours will be different:
AbCdEfGhIjKlMnOpQrStUv

# Copy this UUID — you need it in the next step on ALL brokers
02

Format Storage on All 3 Brokers with the Same UUID

Run this command on Broker 1, Broker 2, and Broker 3 — using the same UUID generated above.

Shell — All 3 Broker VMs (same UUID)
cd /opt/kafka/bin

# Replace <UUID> with the value from step 01
./kafka-storage.sh format \
  -t <UUID> \
  -c /opt/kafka/config/server.properties
03

Verify the Format Created the Metadata Files

Shell — All 3 Broker VMs
ls /var/lib/kafka/logs/

# Expected output — you should see:
bootstrap.checkpoint   meta.properties   __cluster_metadata-0/
⚠️
Re-formatting clears all data Running kafka-storage.sh format again on a broker that already has data will wipe all topic data on that broker. Only run it once, during initial setup.
🔥
PHASE 01 — Step 6

Firewall — Open Required Ports

Brokers must be able to reach each other. Run on all 3 broker VMs. Also open ports from the Docker VM IP so Kafka Connect and Kafka UI can connect.

Shell — All 3 Broker VMs
# Check current firewall status
sudo ufw status

# Allow XMPP client connections (external producers/consumers)
sudo ufw allow 9092/tcp

# Allow inter-broker + Connect/UI listener
sudo ufw allow 19092/tcp

# Allow KRaft controller quorum
sudo ufw allow 9093/tcp

# Reload and confirm
sudo ufw reload
sudo ufw status numbered

# Optional: restrict 19092 to Docker VM IP only (more secure)
sudo ufw allow from <DockerVM_IP> to any port 19092
💡
Quick connectivity test between brokers From Broker 1, test reachability to Broker 2: nc -zv <Broker2_IP> 9093. If it hangs, the firewall is blocking the controller port. All 3 brokers must reach all others on 9092, 9093, and 19092.
⚙️
PHASE 01 — Step 7

Set Up as a Systemd Service

Running Kafka as a systemd service means it starts on boot and auto-restarts on failure. Create the same service file on all 3 brokers. Update the User= field to match your Linux username.

Shell — create service file
sudo vim /etc/systemd/system/kafka.service
/etc/systemd/system/kafka.service — All 3 Broker VMs
[Unit]
Description=Apache Kafka (KRaft)
After=network.target

[Service]
Type=simple
User=your_linux_user   # ← replace with your actual username
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
RestartSec=5
LimitNOFILE=100000

# Application logs go here (separate from topic data)
StandardOutput=append:/opt/kafka/logs/kafka.out
StandardError=append:/opt/kafka/logs/kafka.err

[Install]
WantedBy=multi-user.target
Shell — Enable & Start
sudo systemctl daemon-reload
sudo systemctl start kafka
sudo systemctl enable kafka    # auto-start on reboot
sudo systemctl status kafka    # should show: active (running)
⏱️
Start all 3 brokers within a short window The KRaft quorum requires a majority (2 of 3) to elect a leader. Start all three brokers within 30–60 seconds of each other. If one broker is slow to start, the other two will wait for quorum before accepting any requests.
🧪
PHASE 01 — Step 8

Verify the Broker Cluster

Shell — Run from any Broker VM
# Tail the Kafka log — look for "KRaft state is LEADER" or "Quorum leader"
tail -f /opt/kafka/logs/kafka.out

# List all brokers in the cluster metadata
/opt/kafka/bin/kafka-metadata-quorum.sh \
  --bootstrap-server localhost:9092 \
  describe --status

# Create a test topic across all 3 brokers
/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create --topic test-cluster \
  --partitions 3 \
  --replication-factor 3

# Describe the topic — confirm replicas are spread across all 3 nodes
/opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe --topic test-cluster
What healthy output looks like kafka-metadata-quorum.sh describe --status should show LeaderId: 1 (or 2 or 3) and all 3 voters as active. The topic describe should show each partition with Replicas: 1,2,3 and Isr: 1,2,3.

🐳
PHASE 02 — Step 1

Install Docker & Docker Compose on the Docker VM

SSH into the Docker VM (the 4th machine) and run the following to install Docker Engine and the Compose plugin.

Shell — Docker VM
# Remove any old Docker packages
sudo apt remove -y docker docker-engine docker.io containerd runc

# Install dependencies
sudo apt update
sudo apt install -y ca-certificates curl gnupg lsb-release

# Add Docker's official GPG key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add Docker's repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" \
  | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine + Compose plugin
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Add your user to the docker group (avoids needing sudo)
sudo usermod -aG docker $USER
newgrp docker

# Verify
docker --version
docker compose version
📁
PHASE 02 — Step 2

Create Directories & Download Dependencies

All commands run from the Docker VM. Set up the working directory, the JMX exporter, and the Kafka Connect JDBC plugin + PostgreSQL driver.

~/kafka/ # working directory
docker-compose.yaml
jmx-exporter/
jmx_prometheus_javaagent-1.5.0.jar
kafka-connect.yml # JMX scrape rules
connect-plugins/
kafka-connect-jdbc-10.7.4.jar
postgresql-42.7.3.jar
Shell — Docker VM
# Create working directory
mkdir ~/kafka && cd ~/kafka

# ── JMX Exporter ───────────────────────────────────────────
mkdir jmx-exporter && cd jmx-exporter

wget https://github.com/prometheus/jmx_exporter/releases/download/1.5.0/jmx_prometheus_javaagent-1.5.0.jar

# Create the JMX scrape rules config (see next step)
vim kafka-connect.yml

# ── Kafka Connect JDBC Plugin ───────────────────────────────
cd ~/kafka
mkdir connect-plugins && cd connect-plugins

# JDBC connector (for DB source/sink connectors)
wget https://packages.confluent.io/maven/io/confluent/kafka-connect-jdbc/10.7.4/kafka-connect-jdbc-10.7.4.jar

# PostgreSQL JDBC driver
wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar

# Confirm both jars are present
ls ~/kafka/connect-plugins/
💡
Adding more connectors Drop any additional connector JARs into ~/kafka/connect-plugins/ before starting Docker Compose. The directory is mounted to /etc/kafka-connect/jars inside the container. Restart the connect container after adding new JARs: docker compose restart kafka-connect.
📡
PHASE 02 — Step 3

JMX Exporter Config — kafka-connect.yml

This file tells the JMX exporter which Kafka Connect metrics to expose to Prometheus. Place it in ~/kafka/jmx-exporter/kafka-connect.yml.

~/kafka/jmx-exporter/kafka-connect.yml
lowercaseOutputName: true
rules:
  # App info — start time and version
  - pattern: 'kafka.(.+)<type=app-info, client-id=(.+)><>start-time-ms'
    name: kafka_$1_start_time_seconds
    labels:
      clientId: "$2"
    help: "Kafka $1 JMX metric start time seconds"
    type: GAUGE
    valueFactor: 0.001

  - pattern: 'kafka.(.+)<type=app-info, client-id=(.+)><>(commit-id|version): (.+)'
    name: kafka_$1_$3_info
    value: 1
    labels:
      clientId: "$2"
      $3: "$4"
    help: "Kafka $1 JMX metric info version and commit-id"
    type: GAUGE

  # Producer/consumer topic + partition metrics
  - pattern: kafka.(.+)<type=(.+)-metrics, client-id=(.+), topic=(.+), partition=(.+)><>(.+-total|compression-rate|.+-avg|.+-replica|.+-lag|.+-lead)
    name: kafka_$2_$6
    labels:
      clientId: "$3"
      topic: "$4"
      partition: "$5"
    type: GAUGE

  - pattern: kafka.(.+)<type=(.+)-metrics, client-id=(.+), topic=(.+)><>(.+-total|compression-rate|.+-avg)
    name: kafka_$2_$5
    labels:
      clientId: "$3"
      topic: "$4"
    type: GAUGE

  # Node-level metrics
  - pattern: kafka.(.+)<type=(.+)-metrics, client-id=(.+), node-id=(.+)><>(.+-total|.+-avg)
    name: kafka_$2_$5
    labels:
      clientId: "$3"
      nodeId: "$4"
    type: UNTYPED

  # General client metrics
  - pattern: kafka.(.+)<type=(.+)-metrics, client-id=(.*)><>(.+-total|.+-avg|.+-bytes|.+-count|.+-ratio|.+-age|.+-flight|.+-threads|.+-connectors|.+-tasks|.+-ago)
    name: kafka_$2_$4
    labels:
      clientId: "$3"
    type: GAUGE

  # Connector task status
  - pattern: 'kafka.connect<type=connector-task-metrics, connector=(.+), task=(.+)><>status: ([a-z-]+)'
    name: kafka_connect_connector_status
    value: 1
    labels:
      connector: "$1"
      task: "$2"
      status: "$3"
    type: GAUGE

  # Connector task metrics (errors, requests, retries)
  - pattern: kafka.connect<type=(.+)-metrics, connector=(.+), task=(.+)><>(.+-total|.+-count|.+-ms|.+-ratio|.+-avg|.+-failures|.+-requests|.+-timestamp|.+-logged|.+-errors|.+-retries|.+-skipped)
    name: kafka_connect_$1_$4
    labels:
      connector: "$2"
      task: "$3"
    type: GAUGE

  # Worker connector metrics
  - pattern: kafka.connect<type=connect-worker-metrics, connector=(.+)><>([a-z-]+)
    name: kafka_connect_worker_$2
    labels:
      connector: "$1"
    type: GAUGE

  # Worker global metrics
  - pattern: kafka.connect<type=connect-worker-metrics><>([a-z-]+)
    name: kafka_connect_worker_$1
    type: GAUGE

  # Worker rebalance metrics
  - pattern: kafka.connect<type=connect-worker-rebalance-metrics><>([a-z-]+)
    name: kafka_connect_worker_rebalance_$1
    type: GAUGE
🐙
PHASE 02 — Step 4

docker-compose.yaml

Create this file at ~/kafka/docker-compose.yaml. Replace all <BrokerN_IP> placeholders with actual IPs. Three containers: Kafka UI, Kafka Connect, and Kafka Exporter.

~/kafka/docker-compose.yaml
version: "3.8"

services:

  # ── Kafka UI — Web management dashboard ────────────────────
  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    container_name: kafka-ui
    ports:
      - "8190:8080"
    environment:
      KAFKA_CLUSTERS_0_NAME: kafka-cluster
      # Use INTERNAL listener port 19092 for UI ↔ broker traffic
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: <Broker1_IP>:19092,<Broker2_IP>:19092,<Broker3_IP>:19092

      # Login form auth — change password before production use
      AUTH_TYPE: LOGIN_FORM
      SPRING_SECURITY_USER_NAME: admin
      SPRING_SECURITY_USER_PASSWORD: password@123

      KAFKA_CLUSTERS_0_PROPERTIES_ALLOW_DELETE_TOPICS: "true"
      KAFKA_CLUSTERS_0_PROPERTIES_ALLOW_CREATE_TOPICS: "true"
      KAFKA_CLUSTERS_0_PROPERTIES_ALLOW_EDIT_CONFIGS: "true"

      JAVA_OPTS: "-Xmx512m"
    restart: unless-stopped

  # ── Kafka Connect — Connector framework ────────────────────
  kafka-connect:
    image: confluentinc/cp-kafka-connect:7.7.8
    container_name: kafka-connect
    ports:
      - "8083:8083"     # REST API
      - "7072:7072"     # JMX metrics for Prometheus
    restart: unless-stopped
    volumes:
      - ./connect-plugins:/etc/kafka-connect/jars   # JDBC + PG driver
      - ./jmx-exporter:/opt/jmx                     # JMX agent + config
    environment:
      CONNECT_BOOTSTRAP_SERVERS: <Broker1_IP>:19092,<Broker2_IP>:19092,<Broker3_IP>:19092
      CONNECT_GROUP_ID: kafka-connect-group

      # Internal Kafka topics that Connect uses to store its state
      CONNECT_CONFIG_STORAGE_TOPIC: connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: connect-status

      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "3"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "3"

      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_REST_PORT: 8083

      # JSON converters — no schema registry needed
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE: "false"
      CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE: "false"

      # Plugin path — includes mounted jars and built-in plugins
      CONNECT_PLUGIN_PATH: /usr/share/java,/etc/kafka-connect/jars

      # Reduce noisy logs from network and reflection libraries
      CONNECT_LOG4J_LOGGERS: org.apache.kafka.clients.NetworkClient=ERROR,org.reflections=ERROR

      # JMX Prometheus agent — exposes Connect JVM metrics on port 7072
      # Remove KAFKA_OPTS line below if you don't want monitoring
      KAFKA_OPTS: "-javaagent:/opt/jmx/jmx_prometheus_javaagent-1.5.0.jar=7072:/opt/jmx/kafka-connect.yml"

      # Heap: 4GB min/max — adjust based on number of active connectors
      KAFKA_HEAP_OPTS: "-Xms4g -Xmx4g"

  # ── Kafka Exporter — Broker-level Prometheus metrics ───────
  # Optional: remove this service if you don't need broker monitoring
  kafka-exporter:
    image: danielqsj/kafka-exporter
    container_name: kafka-exporter
    ports:
      - "9308:9308"
    command:
      - "--kafka.server=<Broker1_IP>:19092"
      - "--kafka.server=<Broker2_IP>:19092"
      - "--kafka.server=<Broker3_IP>:19092"
    restart: unless-stopped
⚠️
Broker 2 IP appears twice in the original — fix before deploying In the bootstrap servers config, make sure all three IPs are distinct: Broker1_IP:19092, Broker2_IP:19092, Broker3_IP:19092. Using the same IP twice means one broker gets no traffic from Connect / UI.
🚀
PHASE 02 — Step 5

Start & Verify Docker Containers

Shell — from ~/kafka directory
# Start all containers in detached mode
cd ~/kafka
docker compose up -d

# Watch startup logs
docker compose logs -f

# Check all containers are running
docker compose ps

# Test Kafka Connect REST API is responding
curl http://localhost:8083/

# List installed connector plugins
curl http://localhost:8083/connector-plugins | python3 -m json.tool

# Test Kafka UI is accessible
curl -I http://localhost:8190/
💡
Kafka Connect takes 30–60s to start It needs to connect to all brokers, create its internal topics (connect-configs, connect-offsets, connect-status), and load all connector JARs. If curl http://localhost:8083 fails immediately, wait 60 seconds and retry.

Access the Kafka UI

Open a browser and go to http://<DockerVM_IP>:8190. Log in with the credentials set in docker-compose.yaml: username admin, password password@123. You should see all 3 brokers listed and the cluster health overview.


📡
PHASE 03 — Step 1

Prometheus Scrape Endpoints

Two exporters are already running from the Docker VM. Configure Prometheus to scrape both.

ExporterEndpointMetrics
kafka-exporter http://<DockerVM_IP>:9308/metrics Broker-level: topic lag, partition count, consumer group offsets, leader election rate
jmx-exporter http://<DockerVM_IP>:7072/metrics Kafka Connect JVM: connector status, task errors, throughput, rebalances
prometheus.yml — scrape_configs section
scrape_configs:

  # Broker-level metrics (topics, consumer lag, partition leadership)
  - job_name: 'kafka-brokers'
    static_configs:
      - targets: ['<DockerVM_IP>:9308']
    scrape_interval: 15s

  # Kafka Connect JVM metrics (connector tasks, errors, throughput)
  - job_name: 'kafka-connect'
    static_configs:
      - targets: ['<DockerVM_IP>:7072']
    scrape_interval: 15s
📊
PHASE 03 — Step 2

Grafana Dashboards

Import these community dashboards directly in Grafana (Dashboards → Import → Grafana.com ID):

DashboardGrafana IDData Source
Kafka Exporter Overview 7589 Prometheus → kafka-brokers job
Kafka Consumer Lag 12460 Prometheus → kafka-brokers job
Kafka Connect Dashboard 11962 Prometheus → kafka-connect job
📊
Key metrics to watch kafka_consumergroup_current_offset_sum vs kafka_topic_partition_current_offset gives consumer lag per group. kafka_connect_connector_status (value 1 = running, 0 = failed) is the most important Connect metric to alert on.

SECTION FINAL

What You Have Now

📨 Kafka Cluster

3-broker KRaft cluster (no ZooKeeper)
Replication factor 3 — full redundancy
min.insync.replicas=2 — survives 1 broker failure
Systemd service — auto-starts on reboot
Internal + external listener separation

🐳 Docker Services

Kafka UI at :8190 — web dashboard
Kafka Connect at :8083 — JDBC + PG connectors
Kafka Exporter at :9308 — broker metrics
JMX Exporter at :7072 — Connect JVM metrics
Grafana dashboards for lag + connectors

Key Takeaways

  • Same UUID on all 3 brokers — generated once on Broker 1, applied to all via kafka-storage.sh format
  • Comment out bootstrap.servers — the controller.quorum.bootstrap.servers line conflicts with the static voters config
  • Start brokers close together — quorum needs 2 of 3 to be reachable; large time gaps cause leader election delays
  • INTERNAL listener (19092) for Connect + UI — don't use the EXTERNAL port for internal services
  • Kafka Connect takes ~60s to initialise — it must create its internal topics on first start
  • Add more connectors by dropping JARs into connect-plugins/ and restarting the container
🔜
Next Steps Add TLS between brokers for encrypted inter-broker communication. Configure SASL authentication for producer/consumer access control. Set up topic-level retention policies and quotas per client. Consider deploying Kafka on Kubernetes using the Strimzi operator for fully automated cluster management.