The MPI Operator makes it easy to run allreduce-style distributed training on Kubernetes. Deploy the operator with the manifests in the `deploy/` directory:

```shell
kubectl create -f deploy/
```
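To confirm the deployment succeeded, you can check that the MPIJob custom resource definition was registered and that the operator pod is running. A minimal sketch; the CRD name and the operator's namespace are assumptions, so verify them against your `kubectl get crd` and `kubectl get pods` output:

```shell
# Confirm the MPIJob CRD was registered (CRD name is an assumption;
# check the full list with `kubectl get crd` if it differs).
kubectl get crd mpijobs.kubeflow.org

# Confirm the operator pod is up (namespace varies by manifest version).
kubectl get pods --all-namespaces | grep mpi-operator
```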
Launch a multi-node TensorFlow benchmarks training job:

```shell
kubectl create -f examples/tensorflow-benchmarks.yaml
```
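The manifest defines an `MPIJob` custom resource with a launcher replica that runs `mpirun` and a set of worker replicas that host the training processes. Below is a trimmed sketch of the general shape, not the exact contents of `examples/tensorflow-benchmarks.yaml`; the `apiVersion`, image, and command are assumptions and depend on the operator version:

```yaml
apiVersion: kubeflow.org/v1          # exact version depends on the installed operator
kind: MPIJob
metadata:
  name: tensorflow-benchmarks
spec:
  slotsPerWorker: 1                  # MPI slots (typically GPUs) per worker pod
  mpiReplicaSpecs:
    Launcher:
      replicas: 1                    # single pod that runs mpirun against the workers
      template:
        spec:
          containers:
          - name: tensorflow-benchmarks
            image: mpioperator/tensorflow-benchmarks   # image name is an assumption
            command: [mpirun, python, tf_cnn_benchmarks.py]
    Worker:
      replicas: 2                    # scale this up for more nodes
      template:
        spec:
          containers:
          - name: tensorflow-benchmarks
            image: mpioperator/tensorflow-benchmarks
            resources:
              limits:
                nvidia.com/gpu: 1    # one GPU per worker slot
```

Increasing `Worker.replicas` (and `slotsPerWorker` on multi-GPU nodes) is how the job scales out; the launcher itself does no training work.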
Once the job's pods are running, the training logs are available from the launcher pod.
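For example, to follow the training output, locate the launcher pod and tail its logs. The `-launcher` naming convention below is an assumption, so verify the pod name with `kubectl get pods` first:

```shell
# List the pods created for the job: one launcher plus the worker replicas.
kubectl get pods | grep tensorflow-benchmarks

# Stream output from the launcher pod (its name carries a generated suffix,
# hence the grep; adjust if your pod names differ).
kubectl logs -f $(kubectl get pods -o name | grep tensorflow-benchmarks-launcher)
```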