Clustered Alarm System Example
Extend JAWS with EPICS alarms in a cluster of 12 containers.
Before beginning, make sure to navigate to the cluster example directory:
cd examples/cluster
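Then launch the cluster with either Docker Compose or Docker Swarm, as described in the steps that follow. Finally, check connect status.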
Note: This example uses a single node with docker-compose. For true fault-tolerance and scalability you'll need to deploy the containers across multiple nodes using a container orchestration tool such as Docker Swarm or Kubernetes.
- Launch Zookeeper "ensemble" (3):
docker-compose -f zookeeper.yml up
Wait for them to come up!
- Launch Kafka nodes (3):
docker-compose -f kafka.yml up
Wait for them to come up!
- Launch alarm/support nodes:
docker-compose -f alarm.yml up
Wait for them to come up!
- Launch connect nodes (3):
docker-compose -f connect.yml up
Wait for them to come up!
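The "wait for them to come up" steps can be scripted rather than watched by hand. The following is a minimal sketch and not part of this example's scripts: `wait_until` is a hypothetical helper that re-runs any command until it succeeds or a retry limit is reached. Which command counts as a readiness check for each service is an assumption you would need to fill in yourself.

```shell
# wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0.
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

For instance, after starting a stage in the background with `docker-compose -f connect.yml up -d`, something like `wait_until 30 2 docker exec connect-1 /scripts/show-status.sh` could confirm that the connect node is reachable, assuming the status script exits non-zero while the node is still starting (this has not been verified here).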
Note: You can easily create a single node swarm with:
docker swarm init
Note: The same compose files (v3.2+) used above are used by the Docker Engine in Swarm mode. However, the ability to scale on demand requires more work to set up, as you would need to run each piece as a single scalable service. For example, instead of having three separate fixed Kafka services defined, you would need to define a single dynamic Kafka service that could be scaled, similar to what is described here.
- Launch Zookeeper "ensemble" (3):
docker stack deploy -c zookeeper.yml alarms
Wait for them to come up!
- Launch Kafka nodes (3):
docker stack deploy -c kafka.yml alarms
Wait for them to come up!
- Launch alarm/support nodes:
docker stack deploy -c alarm.yml alarms
Wait for them to come up!
- Launch connect nodes (3):
docker stack deploy -c connect.yml alarms
Wait for them to come up!
Note: If you used swarm then the container names are conveniently scrambled and obfuscated. You'll have to look them up with:
docker container ls
Further, the docker service ls command shows different names, which aren't container names. Despite docker-compose exec docs saying it works with service names, it really only works with container names.
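One way to avoid copying names by hand is to look a container up by a fragment of its name. This is a minimal sketch: `find_container` is a hypothetical helper, and it assumes the generated Swarm container name still contains the service name (for example `connect-1`), which you should confirm against your own `docker container ls` output.

```shell
# find_container PATTERN
# Hypothetical helper: prints the name of the first running container
# whose name matches PATTERN (the name filter matches substrings).
find_container() {
  docker container ls --filter "name=$1" --format '{{.Names}}' | head -n 1
}
```

With that in place, `docker exec -it "$(find_container connect-1)" bash` would open a shell in the matching container under either Compose or Swarm, provided exactly one container matches the pattern.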
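To check connect status, open a shell in one of the connect containers and run the status script:
docker exec -it connect-1 bash
/scripts/show-status.sh
To see a task being rebalanced, stop one of the connect nodes from the host:
docker stop connect-3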
The show-status.sh script should show that the task assigned to connect-3 (identified by IP address) is now in state UNASSIGNED. After the delay specified by scheduled.rebalance.max.delay.ms has elapsed (default 5 minutes), the task should be reassigned and return to state RUNNING, though on a different connect server.
If connect-3 happened to be the connector leader, the Connector will have been moved to a new connect server as well (the leader runs the Connector, which has a ChannelManager that monitors the command channel for changes).
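The UNASSIGNED to RUNNING transition can also be watched from a script. The sketch below does not use show-status.sh, whose output format is not documented here. Instead it assumes you can reach the standard Kafka Connect REST endpoint `GET /connectors/<name>/status`, which returns JSON containing a `state` field for the connector and for each task. `count_state` is a hypothetical helper, and the host, port, and connector name in the usage note are placeholders.

```shell
# count_state STATE
# Hypothetical helper: reads Kafka Connect status JSON on stdin and prints
# how many "state" fields (connector plus tasks) currently equal STATE.
count_state() {
  # grep exits non-zero when nothing matches; "|| true" keeps the count at 0
  # without failing the pipeline.
  { grep -oE "\"state\"[[:space:]]*:[[:space:]]*\"$1\"" || true; } | wc -l | tr -d ' '
}
```

A usage along the lines of `curl -s http://localhost:8083/connectors/<name>/status | count_state UNASSIGNED` should print a non-zero count shortly after `docker stop connect-3` and drop back to 0 once the rebalance delay has elapsed. Port 8083 is the Kafka Connect default; whether this example publishes it to the host is an assumption.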