Using Apache Kafka locally with docker-compose
curl -O https://gist.githubusercontent.com/nfroidure/720e83d6796a7c276f69ec8ad27fd7e9/raw/0bb69bbb8e8d97dd31f5b9dc3655fd6407910480/docker-compose.yml
Innovation is often driven by data. Over my various professional experiences, I naturally came to use message queuing and then stream processing.
I first used Kinesis for its ease of use, but it does not support topics and it is not open source, which makes it hard to stay cloud agnostic. So I decided to switch to Kafka for my new position at DiagRAMS.
The thing is that there is no official Kafka Docker image, which leads to a lack of documentation on how to use it. This article may help you spend less time on it than I had to.
If you like using docker-compose for your developer environment, here is the recipe.
I chose to use the Bitnami images (feel free to share yours!) since no official one exists at the time of this writing.
I also explicitly declare the network options for two main reasons:
- I need to choose the IP range Docker uses to avoid collisions with my various VPCs (which led to a few annoying moments configuring my VPN connection...),
- Apache Kafka uses an advertising system to share the brokers' hosts,
leading to an easier setup if you can rely on fixed IP addresses
for them to fill the
Here is the result:
You can add more brokers if you wish, though it is not generally useful for development. Just beware that you will have to tweak the various environment variables.
Connecting with Kafdrop
Kafdrop can be added directly to the docker-compose file, but I prefer not to do so, to keep the development environment lighter.
It also lets me run Kafdrop selectively against both local and production environments.
So let's run Kafdrop when we need it with this simple command:
docker run --rm -p 9000:9000 \
  -e KAFKA_BROKERCONNECT="10.5.0.1:9092" \
  -e JVM_OPTS="-Xms32M -Xmx256M" \
  --network myapp \
  -e SERVER_SERVLET_CONTEXTPATH="/" \
  obsidiandynamics/kafdrop:latest
--network myapp allows Kafdrop to live in the same network as our Kafka brokers.
Here is the command for production, where you will probably need to add the SSL configuration like this:
docker run --rm -p 9000:9000 \
  -e KAFKA_BROKERCONNECT=$(node -e "process.stdout.write($(terraform output kafka_bootstrap_brokers))") \
  -e JVM_OPTS="-Xms128M -Xmx2G" \
  -e KAFKA_PROPERTIES=$(echo security.protocol=SSL | base64) \
  -e SERVER_SERVLET_CONTEXTPATH="/" \
  obsidiandynamics/kafdrop:latest
As you can see, I retrieve the Kafka brokers directly from my Terraform state. Feel free to do the same or simply add them by hand.
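If you end up needing more than one property, the inline echo in the command above can get awkward. Here is a minimal sketch of one way to build the value first, assuming Kafdrop decodes KAFKA_PROPERTIES as the content of a properties file, as the command above implies. The function name kafka_properties is hypothetical, and the newline stripping is a precaution because GNU base64 wraps long output at 76 columns.

```shell
# Hypothetical helper: print the client properties, one per line.
# Only security.protocol=SSL comes from the article; add your own lines as needed.
kafka_properties() {
  printf '%s\n' \
    'security.protocol=SSL'
}

# Encode as a single line: GNU base64 wraps its output by default,
# and a wrapped value would be harder to pass around as one variable.
KAFKA_PROPERTIES="$(kafka_properties | base64 | tr -d '\n')"

echo "$KAFKA_PROPERTIES"
```

The resulting variable can then be passed to the container with -e KAFKA_PROPERTIES="$KAFKA_PROPERTIES" in place of the inline substitution.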
Using Kafka scripts
While reading the Kafka docs, you will probably be prompted to use the scripts bundled with Kafka. Here is, for example, how you would create a topic with the above setup:
docker-compose exec kafka /opt/bitnami/kafka/bin/kafka-topics.sh \
  --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 1 \
  --topic users
Listing the available commands is done this way:
docker-compose exec kafka ls /opt/bitnami/kafka/bin
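Since every script call repeats the same prefix, a small wrapper can save some typing. This is only a sketch: the kafka_script function and the COMPOSE variable are hypothetical names, the service name kafka and the /opt/bitnami/kafka/bin path are taken from the commands above, and it assumes the script you call accepts --bootstrap-server (which is not the case for every script in that folder).

```shell
# Hypothetical helper: run one of the scripts bundled in the Bitnami image.
# COMPOSE can be overridden, for example with "docker compose".
kafka_script() {
  _kafka_script_name="$1"
  shift
  # COMPOSE is left unquoted on purpose so "docker compose" splits into two words
  ${COMPOSE:-docker-compose} exec kafka \
    "/opt/bitnami/kafka/bin/${_kafka_script_name}.sh" \
    --bootstrap-server localhost:9092 "$@"
}

# Usage (requires the compose stack to be running):
#   kafka_script kafka-topics --list
#   kafka_script kafka-topics --describe --topic users
#   kafka_script kafka-console-consumer --topic users --from-beginning
```

The wrapper only forwards its arguments, so any option documented for a given script can be appended as usual.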
Kafka is an interesting technology. That said, you should be aware that using Kafka is not, on its own, a passport for managing big data.
Finally, I found out that searching for documentation often leads to Confluent-specific tutorials, which is not great. I think that using free software should not be tied to a particular company, so I hope more people will take some time to explain how to use raw Kafka. I will be glad to read it ;).