Spark examples

Java, Python and Jupyter notebook

These Spark examples give a quick overview of the Spark API.

Spark is built on the concept of distributed datasets, which contain arbitrary Java or Python objects. You create a dataset from external data, then apply parallel operations to it. The building block of the Spark API is its RDD API.
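As a rough illustration of the create-then-transform flow described above, here is a minimal PySpark sketch of a word count using the RDD API. The names `split_words` and `word_counts`, the app name `rdd-sketch`, and the `local[*]` master are assumptions made for this example and are not taken from the repository. It builds the dataset from an in-memory list with `parallelize`; for external data, `SparkContext.textFile` would be the usual starting point.

```python
def split_words(line):
    """Split one line of text into lowercase words (hypothetical helper)."""
    return line.lower().split()


def word_counts(lines):
    """Count words with the RDD API on a local Spark session.

    Assumes pyspark is installed and a JDK is available on the machine.
    """
    # Imported here so split_words stays usable without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")        # assumption: run locally on all cores
        .appName("rdd-sketch")     # assumption: illustrative app name
        .getOrCreate()
    )
    try:
        # Create a distributed dataset (RDD) from a Python list.
        rdd = spark.sparkContext.parallelize(lines)
        counts = (
            rdd.flatMap(split_words)              # one record per word
            .map(lambda word: (word, 1))          # pair each word with a count of 1
            .reduceByKey(lambda a, b: a + b)      # sum the counts per word
        )
        # collect() brings the (word, count) pairs back to the driver.
        return dict(counts.collect())
    finally:
        spark.stop()
```

Calling `word_counts(["spark is fast", "spark is fun"])` would start a local session, run the three parallel operations, and return a plain dictionary of counts.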

  • Java uses Gradle
  • Python uses PySpark
  • Jupyter notebook

Features

  • Explains the Spark environment setup
  • Uses JDK 11
  • IntelliJ Community Edition IDE
  • PySpark

Tech

The Spark examples rely on the following software, most of it open source:

  • OpenJDK 11
  • PySpark
  • MongoDB
  • Windows 10

nc or netcat

The nc (or netcat) utility is used for just about anything under the sun involving TCP or UDP. It can open TCP connections, send UDP packets, listen on arbitrary TCP and UDP ports, do port scanning, and deal with both IPv4 and IPv6.

The socket examples use the following command, which listens on TCP port 9999 and keeps listening after a client disconnects:

nc -lk 9999

On Windows, download netcat from Nmap and run the following command:

netcat -lk 9999
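If netcat is not available, a small Python script can stand in as the data source for the socket examples. The sketch below is an assumption-laden stand-in and not part of the repository: `open_listener` and `send_lines` are hypothetical names, and unlike `nc -lk` it serves a single client and then closes the connection.

```python
import socket


def open_listener(host="127.0.0.1", port=9999):
    """Open a TCP socket that listens on host:port, like `nc -l`."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for the old socket to time out.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def send_lines(server, lines):
    """Wait for one client, send each line newline-terminated, then hang up."""
    conn, _addr = server.accept()
    with conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))
```

To try it, open the listener with `open_listener()`, start a socket example that connects to port 9999, and then call `send_lines` with a few lines of text. Closing the server socket afterwards releases the port.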

Installation

Spark requires a JDK to run.