Spark-Master-REST-API

This guide walks you through deploying Spark applications on a standalone cluster with a Python client. Apache Spark is an open-source data processing and analytics engine that supports several cluster managers, one of which is the standalone cluster manager (see the Spark documentation: https://spark.apache.org/docs/latest/cluster-overview.html). The Python client in this repository submits Spark applications to such a cluster.

This library abstracts the process of interacting with the Spark master's REST API to submit applications.
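
For context, the sketch below shows the kind of request the library abstracts away. A Spark standalone master with the REST submission server enabled accepts a CreateSubmissionRequest JSON document, by default on port 6066; the hostname, JAR path, and application name here are placeholders.

import requests

# Spark's standalone REST submission endpoint
# (default port 6066, path /v1/submissions/create).
url = "http://spark-master.example.com:6066/v1/submissions/create"
payload = {
    "action": "CreateSubmissionRequest",
    "appResource": "hdfs:///jars/app.jar",
    "clientSparkVersion": "3.2.1",
    "mainClass": "com.example.Main",
    "appArgs": [],
    "environmentVariables": {},
    "sparkProperties": {
        "spark.app.name": "example-app",
        "spark.master": "spark://spark-master.example.com:7077",
        "spark.submit.deployMode": "cluster",
    },
}

response = requests.post(url, json=payload)
# On acceptance, the response carries a submissionId and a success flag.
print(response.json())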

Prerequisites

Before you begin, make sure you have the following in place:

- A running Spark standalone cluster whose master has the REST submission server enabled (spark.master.rest.enabled=true; it listens on port 6066 by default). A minimal setup sketch follows this list.
- Python 3 and pip on the machine that submits applications.
- Network access from that machine to the master's REST port.
- An application JAR at a location the cluster can reach, such as HDFS or an HTTP URL.
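
If the REST server is not already enabled on your master, the following is a minimal setup sketch, assuming a standard Spark distribution (the hostname is a placeholder):

# conf/spark-defaults.conf on the master host
spark.master.rest.enabled true

# Start (or restart) the master; its REST submission server
# then listens on port 6066 by default.
./sbin/start-master.sh --host spark-master.example.com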

Usage

This section shows how to use the Python client to submit a remote JAR file to a Spark standalone cluster.

Install the package from PyPI:

pip install spark_master_rest_api

from spark_master_rest_api import Client

# Initialize the client with the Spark master's hostname and Spark version
# (the master's REST submission server listens on port 6066 by default)
client = Client('spark-master.example.com', '3.2.1')

# Define the submission parameters; the application JAR must live
# somewhere the cluster can reach (HDFS in this example)
app_resource = "hdfs:///jars/app.jar"
spark_properties = {
    "spark.master": "spark://spark-master.example.com:7077",
    "spark.submit.deployMode": "cluster",
}
main_class = "com.example.Main"
app_args = []

# Submit the Spark application
response, submit_result = client.submit(
    app_resource=app_resource,
    spark_properties=spark_properties,
    main_class=main_class,
    app_args=app_args,
)

# Print the submission result
print(submit_result.submission_id)
print(submit_result.success)
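
Once a submission is accepted, you can inspect the driver without the library by querying the master's REST endpoints directly. A minimal sketch, assuming the default REST port 6066 and reusing submit_result from the example above:

import requests

# Poll the driver's state for the submission returned by client.submit().
status_url = (
    "http://spark-master.example.com:6066/v1/submissions/status/"
    + submit_result.submission_id
)
status = requests.get(status_url).json()
print(status["driverState"])  # e.g. SUBMITTED, RUNNING, FINISHED, FAILED

# A running driver can be stopped with the kill endpoint:
# POST http://spark-master.example.com:6066/v1/submissions/kill/<submission_id>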
