update kafka docs (#13)
* update kafka configs/docs
* update Dockerfile entrypoint
haobibo authored Apr 21, 2023
1 parent 8d1d2e5 commit 52e4031
Showing 9 changed files with 193 additions and 137 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -17,8 +17,8 @@ Please generously STAR★ our project or donate to us! [![GitHub Starts](https:
## Building blocks for data lake and pipelines projects

Building blocks for the following big data project use cases are supported in this project:
- Flink
- Spark
- Kafka
- Elasticsearch
- GreenplumDB

- [Flink / Spark](https://hub.docker.com/r/qpod/bigdata)
- [Kafka](https://hub.docker.com/r/qpod/kafka)
- [Elasticsearch](https://hub.docker.com/r/qpod/elasticsearch)
- [GreenplumDB](https://hub.docker.com/r/qpod/greenplum)
5 changes: 4 additions & 1 deletion docker_kafka_confluent/Dockerfile
@@ -22,7 +22,10 @@ RUN source /opt/utils/script-confluent-kafka.sh \
&& echo "Clean up" && list_installed_packages && install__clean

ENV PATH=$PATH:$KAFKA_HOME/bin \
CLUSTER_ID="pUyrmY_RQHClQc9LPBJqTw"
CLUSTER_ID="pUyrmY_RQHClQc9LPBJqTw" \
KAFKA_LOG_DIRS="/var/lib/${COMPONENT}/data"

EXPOSE 9092
VOLUME ["/var/lib/${COMPONENT}/data", "/etc/${COMPONENT}/secrets"]
ENTRYPOINT ["tini", "-g", "--"]
CMD ["/etc/confluent/docker/run"]
75 changes: 2 additions & 73 deletions docker_kafka_confluent/README.md
@@ -1,76 +1,5 @@
# Confluent Kafka in KRaft mode

In the KRaft mode, zookeeper is not needed.
To start a standalone mode KRaft kafka, use the `docker-compose.yml` file in this folder.
In the [KRaft](https://developer.confluent.io/learn/kraft/) mode, ZooKeeper is no longer needed.

## Development - debug inside docker

```bash
docker run -it \
--name=cp-ckafka \
-h=broker \
-p=9092:9092 \
-v $(pwd):/root/dev \
qpod/jdk11 bash

export KAFKA_BROKER_ID=1
export KAFKA_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
export KAFKA_ADVERTISED_LISTENERS='PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092'
export KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
export KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0
export KAFKA_TRANSACTION_STATE_LOG_MIN_ISR=1
export KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1
export KAFKA_JMX_PORT=9101
export KAFKA_JMX_HOSTNAME=localhost
export KAFKA_PROCESS_ROLES='broker,controller'
export KAFKA_NODE_ID=1
export KAFKA_CONTROLLER_QUORUM_VOTERS='1@broker:29093'
export KAFKA_LISTENERS='PLAINTEXT://broker:29092,CONTROLLER://broker:29093,PLAINTEXT_HOST://0.0.0.0:9092'
export KAFKA_INTER_BROKER_LISTENER_NAME='PLAINTEXT'
export KAFKA_CONTROLLER_LISTENER_NAMES='CONTROLLER'
export KAFKA_LOG_DIRS='/tmp/kraft-combined-logs'
export KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1

export COMPONENT=kafka
export KAFKA_HOME=/opt/kafka
export PATH=$PATH:$KAFKA_HOME/bin

export CLUSTER_ID="${CLUSTER_ID:-$(kafka-storage random-uuid)}"
kafka-storage format --ignore-formatted -t "${CLUSTER_ID}" -c /etc/kafka/kafka.properties

sed -i '1i\
export CLUSTER_ID="${CLUSTER_ID:-$(kafka-storage random-uuid)}"
' /etc/confluent/docker/run

/etc/confluent/docker/run > /tmp/kafka.log
```

## Development - build the docker image and run

```bash
docker build -t qpod/kafka --build-arg "BASE_NAMESPACE=qpod" .

docker run -it \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT' \
-e KAFKA_ADVERTISED_LISTENERS='PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092' \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
-e KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0 \
-e KAFKA_TRANSACTION_STATE_LOG_MIN_ISR=1 \
-e KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1 \
-e KAFKA_JMX_PORT=9101 \
-e KAFKA_JMX_HOSTNAME=localhost \
-e KAFKA_PROCESS_ROLES='broker,controller' \
-e KAFKA_NODE_ID=1 \
-e KAFKA_CONTROLLER_QUORUM_VOTERS='1@broker:29093' \
-e KAFKA_LISTENERS='PLAINTEXT://broker:29092,CONTROLLER://broker:29093,PLAINTEXT_HOST://0.0.0.0:9092' \
-e KAFKA_INTER_BROKER_LISTENER_NAME='PLAINTEXT' \
-e KAFKA_CONTROLLER_LISTENER_NAMES='CONTROLLER' \
-e KAFKA_LOG_DIRS='/tmp/kraft-combined-logs' \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
--name cp-kafka \
-h broker \
-p 9092:9092 \
-v $(pwd):/root/dev \
qpod0dev/cp-kafka
```
To start a standalone KRaft Kafka broker, refer to the `docker-compose.yml` file in the `example/kafka-standalone-confluent` folder.
26 changes: 0 additions & 26 deletions docker_kafka_confluent/docker-compose-bitnami.yml

This file was deleted.

32 changes: 0 additions & 32 deletions docker_kafka_confluent/docker-compose.yml

This file was deleted.

41 changes: 41 additions & 0 deletions docker_kafka_confluent/example/kafka-standalone-bitnami/README.md
@@ -0,0 +1,41 @@
# Start a standalone single-node Kafka cluster in KRaft mode (using the bitnami Docker image)

Reference: https://github.com/bitnami/containers/tree/main/bitnami/kafka#kafka-without-zookeeper-kraft

Notice: the `localhost` in `KAFKA_CFG_ADVERTISED_LISTENERS` needs to be changed to your host's external IP/hostname.
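
One way to avoid leaving `localhost` in place is to build the value from variables before starting the container. The snippet below is only a minimal sketch: `ADVERTISED_HOST`, `ADVERTISED_PORT`, the placeholder `kafka.example.com`, and the helper `advertised_listener` are hypothetical names chosen for illustration and are not part of the bitnami image.

```bash
# Hypothetical helper: build a PLAINTEXT advertised listener from a host and a port.
# It refuses an empty host or `localhost`, since remote clients cannot reach those.
advertised_listener() {
  local host="$1" port="${2:-9092}"
  if [ -z "$host" ] || [ "$host" = "localhost" ]; then
    echo "advertised host must be reachable by clients, got '${host}'" >&2
    return 1
  fi
  printf 'PLAINTEXT://%s:%s\n' "$host" "$port"
}

# Assumed placeholders: replace with your host's external IP/hostname and port.
ADVERTISED_HOST="${ADVERTISED_HOST:-kafka.example.com}"
ADVERTISED_PORT="${ADVERTISED_PORT:-9092}"

KAFKA_CFG_ADVERTISED_LISTENERS="$(advertised_listener "$ADVERTISED_HOST" "$ADVERTISED_PORT")"
export KAFKA_CFG_ADVERTISED_LISTENERS
echo "$KAFKA_CFG_ADVERTISED_LISTENERS"
```

The exported value can then be passed to `docker run` with `-e KAFKA_CFG_ADVERTISED_LISTENERS`, or referenced from a compose file through variable substitution.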

```bash
KAFKA_DATA_DIR="/data/kafka-bitnami/broker" && mkdir -pv $KAFKA_DATA_DIR && chmod -R ugo+rws $KAFKA_DATA_DIR
docker-compose up -d
```

## Debug and Development

```bash
docker run -it --rm \
-e ALLOW_PLAINTEXT_LISTENER='yes' \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_ENABLE_KRAFT='yes' \
-e KAFKA_KRAFT_CLUSTER_ID='k4hJjYlsRYSk9UQcZjN0rA' \
-e KAFKA_CFG_PROCESS_ROLES='broker,controller' \
-e KAFKA_CFG_CONTROLLER_QUORUM_VOTERS='[email protected]:9093' \
-e KAFKA_CFG_CONTROLLER_LISTENER_NAMES='CONTROLLER' \
-e KAFKA_CFG_INTER_BROKER_LISTENER_NAME='PLAINTEXT' \
-e KAFKA_CFG_LISTENERS='CONTROLLER://:9093,PLAINTEXT://:9092' \
-e KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT' \
-e KAFKA_CFG_ADVERTISED_LISTENERS='PLAINTEXT://localhost:9092' \
--name kafka-broker \
-h broker \
-p 9092:9092 \
-v /data/kafka-bitnami/broker:/bitnami \
-u root \
docker.io/bitnami/kafka:latest bash

apt update && apt install vim

vim /opt/bitnami/scripts/kafka/setup.sh

bash /opt/bitnami/scripts/kafka/setup.sh

bash /opt/bitnami/scripts/kafka/entrypoint.sh /opt/bitnami/scripts/kafka/run.sh
```
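
When editing the listener settings while debugging, it is easy to add a listener name without a matching entry in the security protocol map, which Kafka rejects at startup. The check below is a rough sketch in plain shell; the function name `listeners_covered_by_map` is hypothetical, and the check only compares names, it does not validate hosts, ports, or protocols.

```bash
# Return 0 when every listener name in $1 has a "NAME:PROTOCOL" entry in the map $2.
listeners_covered_by_map() {
  local listeners="$1" protocol_map="$2"
  local entry name missing=0
  local old_ifs="$IFS"
  IFS=','                               # split the listener list on commas
  for entry in $listeners; do
    name="${entry%%://*}"               # "CONTROLLER://:9093" -> "CONTROLLER"
    case ",${protocol_map}," in
      *",${name}:"*) ;;                 # the map contains "NAME:..."
      *) echo "listener '${name}' has no security protocol mapping" >&2
         missing=1 ;;
    esac
  done
  IFS="$old_ifs"
  return "$missing"
}

# Values copied from the `docker run` command above.
listeners_covered_by_map \
  'CONTROLLER://:9093,PLAINTEXT://:9092' \
  'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT' \
  && echo "listener names and protocol map agree"
```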
@@ -0,0 +1,29 @@
version: "3"

# Standalone KRaft mode, ref: https://github.com/bitnami/containers/tree/main/bitnami/kafka#kafka-without-zookeeper-kraft
# Note: the volume folder must be readable and writable by uid 1001 and gid 1000. Use the command below for debugging:
# `KAFKA_DATA_DIR="/data/database/kafka-bitnami/broker" && mkdir -pv $KAFKA_DATA_DIR && chmod -R ugo+rws $KAFKA_DATA_DIR`

services:
  kafka-broker:
    image: docker.io/bitnami/kafka:latest
    container_name: kafka-broker
    hostname: broker
    ports:
      - "9092:9092"
    environment:
      ALLOW_PLAINTEXT_LISTENER: 'yes'
      KAFKA_BROKER_ID: 1
      KAFKA_ENABLE_KRAFT: 'yes'
      KAFKA_KRAFT_CLUSTER_ID: 'k4hJjYlsRYSk9UQcZjN0rA'
      KAFKA_CFG_PROCESS_ROLES: 'broker,controller'
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: '[email protected]:9093'
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_CFG_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_CFG_LISTENERS: 'CONTROLLER://:9093,PLAINTEXT://:9092'
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT'
      # the `localhost` below needs to be changed to your host's external IP/hostname and external port
      KAFKA_CFG_ADVERTISED_LISTENERS: 'PLAINTEXT://localhost:19092'

    volumes:
      - /data/database/kafka-bitnami/broker:/bitnami
@@ -0,0 +1,76 @@
# Start a standalone single-node Kafka cluster in KRaft mode (using Confluent Kafka)

Notice: the `localhost` in `KAFKA_ADVERTISED_LISTENERS` needs to be changed to your host's external IP/hostname.

```bash
KAFKA_DATA_DIR="/data/kafka-confluent/broker" && mkdir -pv $KAFKA_DATA_DIR && chmod -R ugo+rws $KAFKA_DATA_DIR
docker-compose up -d
```
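
The compose file in this example mounts the `data` and `secrets` subfolders separately. If they do not exist yet, Docker typically creates missing bind-mount sources owned by root, which can leave them unwritable for the container user. A possible way to prepare them up front is sketched below; the function name `prepare_broker_dirs` and the probe file name are hypothetical, and the `chmod` mode simply mirrors the command above.

```bash
# Create <base>/data and <base>/secrets, open up permissions, and verify that
# the data folder is actually writable by creating and removing a probe file.
prepare_broker_dirs() {
  local base="$1" sub
  for sub in data secrets; do
    mkdir -p "${base}/${sub}" || return 1
  done
  chmod -R ugo+rws "$base" || return 1
  touch "${base}/data/.write-probe" && rm -f "${base}/data/.write-probe"
}

# Assumed default, matching the volume sources used in this example.
KAFKA_DATA_DIR="${KAFKA_DATA_DIR:-/data/kafka-confluent/broker}"

# Typical use (commented out so the snippet has no side effects when sourced):
# prepare_broker_dirs "$KAFKA_DATA_DIR" && docker-compose up -d
```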

## Debug

```bash
docker run -it --rm \
-e CLUSTER_ID='k4hJjYlsRYSk9UQcZjN0rA' \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_NODE_ID=1 \
-e KAFKA_LOG_DIRS='/var/lib/kafka/data' \
-e KAFKA_PROCESS_ROLES='broker,controller' \
-e KAFKA_CONTROLLER_QUORUM_VOTERS='[email protected]:9093' \
-e KAFKA_CONTROLLER_LISTENER_NAMES='CONTROLLER' \
-e KAFKA_INTER_BROKER_LISTENER_NAME='PLAINTEXT' \
-e KAFKA_LISTENERS='CONTROLLER://:9093,PLAINTEXT://:9092' \
-e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT' \
-e KAFKA_ADVERTISED_LISTENERS='PLAINTEXT://localhost:9092' \
--name kafka-broker \
-h broker \
-p 9092:9092 \
-v /data/kafka-confluent/broker/data:/var/lib/kafka/data \
-v /data/kafka-confluent/broker/secrets:/etc/kafka/secrets \
docker.io/qpod/kafka:latest bash

/etc/confluent/docker/run
```

## Development

```bash
docker run -it \
--name=kafka-broker-confluent-ce \
-h=broker \
-p=9092:9092 \
-v /data/kafka-confluent/broker/:/var/lib/kafka-broker/ \
-v $(pwd):/root/dev \
qpod/jdk11 bash

export KAFKA_BROKER_ID=1
export KAFKA_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
export KAFKA_ADVERTISED_LISTENERS='PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092'
export KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
export KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0
export KAFKA_TRANSACTION_STATE_LOG_MIN_ISR=1
export KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1
export KAFKA_JMX_PORT=9101
export KAFKA_JMX_HOSTNAME=localhost
export KAFKA_PROCESS_ROLES='broker,controller'
export KAFKA_NODE_ID=1
export KAFKA_CONTROLLER_QUORUM_VOTERS='1@broker:29093'
export KAFKA_LISTENERS='PLAINTEXT://broker:29092,CONTROLLER://broker:29093,PLAINTEXT_HOST://0.0.0.0:9092'
export KAFKA_INTER_BROKER_LISTENER_NAME='PLAINTEXT'
export KAFKA_CONTROLLER_LISTENER_NAMES='CONTROLLER'
export KAFKA_LOG_DIRS='/tmp/kraft-combined-logs'
export KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1

export COMPONENT=kafka
export KAFKA_HOME=/opt/kafka
export PATH=$PATH:$KAFKA_HOME/bin

export CLUSTER_ID="${CLUSTER_ID:-$(kafka-storage random-uuid)}"
kafka-storage format --ignore-formatted -t "${CLUSTER_ID}" -c /etc/kafka/kafka.properties

sed -i '1i\
export CLUSTER_ID="${CLUSTER_ID:-$(kafka-storage random-uuid)}"
' /etc/confluent/docker/run

/etc/confluent/docker/run > /tmp/kafka.log
```
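
The cluster IDs used in this example (and the output of `kafka-storage random-uuid`) are 16-byte UUIDs in URL-safe base64, which gives 22 characters. Before formatting storage with a hand-picked `CLUSTER_ID`, a quick shape check can catch copy-paste mistakes. This is only a sanity-check sketch, not Kafka's own validation, and the function name `looks_like_cluster_id` is hypothetical.

```bash
# Return 0 when the argument is 22 characters from the URL-safe base64 alphabet.
looks_like_cluster_id() {
  case "$1" in
    *[!A-Za-z0-9_-]*) return 1 ;;   # contains a character outside A-Z a-z 0-9 _ -
  esac
  [ "${#1}" -eq 22 ]
}

# The fallback value is the ID used in the `docker run` example above.
CLUSTER_ID="${CLUSTER_ID:-k4hJjYlsRYSk9UQcZjN0rA}"
if looks_like_cluster_id "$CLUSTER_ID"; then
  echo "CLUSTER_ID has the expected shape"
else
  echo "CLUSTER_ID '${CLUSTER_ID}' does not look like a KRaft cluster ID" >&2
fi
```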
@@ -0,0 +1,36 @@
version: '3'

## Standalone KRaft mode - confluent kafka: https://github.com/confluentinc/cp-all-in-one/blob/7.3.0-post/cp-all-in-one-kraft/docker-compose.yml
# Note: the volume folder must be readable and writable by uid 1001 and gid 1000. Use the command below for debugging:
# `KAFKA_DATA_DIR="/data/kafka-confluent/broker" && mkdir -pv $KAFKA_DATA_DIR && chmod -R ugo+rws $KAFKA_DATA_DIR`

services:
  kafka-broker:
    image: docker.io/qpod/kafka:latest
    container_name: kafka-broker
    hostname: broker
    ports:
      - "9092:9092"
      - "9101:9101"
    environment:
      CLUSTER_ID: 'k4hJjYlsRYSk9UQcZjN0rA'
      KAFKA_BROKER_ID: 1
      KAFKA_NODE_ID: 1
      KAFKA_LOG_DIRS: '/var/lib/kafka/data'
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_CONTROLLER_QUORUM_VOTERS: '[email protected]:9093'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_LISTENERS: 'CONTROLLER://:9093,PLAINTEXT://:9092'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT'
      # the `localhost` in `KAFKA_ADVERTISED_LISTENERS` needs to be changed to your host's external IP/hostname
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://localhost:9092'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9101
      KAFKA_JMX_HOSTNAME: localhost
    volumes:
      - /data/kafka-confluent/broker/data:/var/lib/kafka/data
      - /data/kafka-confluent/broker/secrets:/etc/kafka/secrets
