This project helps you quickly deploy a Spark cluster in containers.
You can build the image using the `docker build` command. First `cd` into the repository root (the directory containing the `spark/` build context), then run:
```bash
docker build -t reddy-s/spark:1.0 spark
```
Note: Always specify a version tag, as it helps you track which versions of your application are live.
The containers can be run using the compose file:

```bash
docker-compose up -d
```
Note: the `-d` flag runs the containers in the background.
This will start the following services:
- Spark Master
```yaml
master:
  image: reddy-s/spark:1.0
  command: /bin/bash -c "/home/lib/spark-2.2.1-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.master.Master"
  environment:
    - MASTER_PORT_7077_TCP_ADDR=172.18.0.2
    - MASTER_PORT_7077_TCP_PORT=7077
  volumes:
    - type: volume
      source: cluster-data
      target: /home/data
  ports:
    - "8080:8080"
    - "7077:7077"
```
- Spark Worker
```yaml
slave:
  image: reddy-s/spark:1.0
  command: /bin/bash -c "/home/lib/spark-2.2.1-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.worker.Worker 172.18.0.2:7077"
  environment:
    - MASTER_PORT_7077_TCP_ADDR=172.18.0.2
    - MASTER_PORT_7077_TCP_PORT=7077
  volumes:
    - type: volume
      source: cluster-data
      target: /home/data
  depends_on:
    - master
  links:
    - master
```
- A Postgres database to read from and persist data to, if needed
```yaml
db:
  image: manoj10/postgres
  ports:
    - "5431:5430"
```
A sample Spark job (nowhere close to a production one) is included so you can test the cluster.
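If you want a sense of what such a job might look like, here is a minimal PySpark sketch. It is purely illustrative and not the actual `sparkjob.py` shipped with this repo: it reads a text file from the shared `/home/data` volume and writes word counts back to it, and the input file name is an assumption.

```python
# Illustrative sketch only -- not the repo's actual sparkjob.py.
# Assumes a text file has been placed on the shared volume at
# /home/data/input.txt; results are written back to the same volume.
from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = (SparkSession.builder
             .appName("sample-word-count")
             .getOrCreate())

    # Read lines from the shared volume, split them into words and count them.
    lines = spark.read.text("/home/data/input.txt")
    words = lines.rdd.flatMap(lambda row: row.value.split())
    counts = words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)

    # Persist the result next to the input on the shared volume.
    counts.saveAsTextFile("/home/data/word_counts")

    spark.stop()
```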
You can interact with any of the workers by running the following command:

```bash
docker exec -it slave_1 /bin/bash
```
Run your Spark job with the following command:
```bash
./bin/spark-submit \
  --master spark://$MASTER_PORT_7077_TCP_ADDR:$MASTER_PORT_7077_TCP_PORT \
  /home/data/sparkjob.py
```
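If your job needs to read from or persist to the bundled Postgres service, a rough sketch using Spark's JDBC data source is shown below. Treat the hostname, port, database name and credentials as assumptions to be adjusted for the `manoj10/postgres` image, and note that the PostgreSQL JDBC driver has to be on the classpath (for example by adding `--packages org.postgresql:postgresql:42.2.5` to the `spark-submit` command above).

```python
# Illustrative sketch only: persist a DataFrame to the bundled Postgres
# service over JDBC. The hostname "db" is the Compose service name; the
# port, database name and credentials below are assumptions -- adjust them
# to whatever the manoj10/postgres image actually uses, and make sure the
# PostgreSQL JDBC driver is available (e.g. via
# --packages org.postgresql:postgresql:42.2.5 on spark-submit).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("persist-to-postgres").getOrCreate()

# A tiny example DataFrame to write out.
df = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "label"])

(df.write
   .format("jdbc")
   .option("url", "jdbc:postgresql://db:5430/postgres")  # assumed port and database
   .option("dbtable", "sample_results")
   .option("user", "postgres")      # assumed credentials
   .option("password", "postgres")
   .mode("append")
   .save())

spark.stop()
```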