These Spark examples give a quick overview of the Spark API.
Spark is built on the concept of distributed datasets, which contain arbitrary Java or Python objects. You create a dataset from external data, then apply parallel operations to it. The building block of the Spark API is the RDD (Resilient Distributed Dataset) API.
The environment used for these examples:
- Java examples are built with Gradle
- Python examples use PySpark
- Jupyter Notebook for interactive exploration
- JDK 11
- IntelliJ IDEA Community Edition
The examples were developed on Windows 10 and rely on the following open source projects:
- OpenJDK 11
- PySpark
- MongoDB
The nc (or netcat) utility is used for just about anything under the sun involving TCP or UDP. It can open TCP connections, send UDP packets, listen on arbitrary TCP and UDP ports, do port scanning, and deal with both IPv4 and IPv6.
The socket examples use the following command to start a listener on TCP port 9999:

nc -lk 9999
On Windows, use ncat from the Nmap project: download it and run

ncat -lk 9999
Spark requires a JDK to run.