VSP (Sacramento), 6/29/2017
Assets for IBM's Apache Spark Proof of Technology
You will be using IBM DSX notebooks and Apache Spark Service on IBM Bluemix Cloud to work on the labs.
- Set up your Spark service in IBM Bluemix:
To set up your IBM Bluemix environment, navigate to https://new-console.ng.bluemix.net, register, and create a Spark service.
- Log in to IBM Data Science Experience (DSX) to create and run notebooks:
To set up your IBM DSX environment, navigate to http://datascience.ibm.com and log in using your Bluemix user ID.
A video tutorial on setting up the environment can be viewed here:
https://www.youtube.com/watch?v=yG3tVVDz1uE
To use these notebooks, copy and paste the URLs below when you create a new notebook.
- Introduction to Spark - Python:
https://github.com/joshishwetha/dsx-spark/blob/master/Lab%201-%20Introduction%20to%20Spark-Student.ipynb
- Introduction to Spark SQL:
https://github.com/joshishwetha/dsx-spark/blob/master/Lab%202:%20Spark%20SQL%20-%20Student.ipynb
- Spark Machine Learning - Python:
https://github.com/joshishwetha/dsx-spark/blob/master/Lab%203%20-%20Machine%20Learning%20Student.ipynb
https://raw.githubusercontent.com/joshishwetha/dsx-spark/master/data.csv
Spark Streaming webinar link: https://www.youtube.com/watch?v=_mFm2F7UQgU
Spark Streaming demo code: https://github.com/smatlapudi/spark-streaming-webinar1
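
If you want to check that the data.csv link listed above is reachable before using it in a notebook, a minimal sketch along these lines may help. It uses only the Python standard library. The helper names `parse_csv_text` and `preview_remote_csv` are hypothetical, the file is assumed to be UTF-8 text, and nothing is assumed about its columns because they are not described here.

```python
import csv
import io
import itertools
import urllib.request

# URL copied from the list above.
DATA_URL = "https://raw.githubusercontent.com/joshishwetha/dsx-spark/master/data.csv"


def parse_csv_text(text, max_rows=5):
    """Return at most max_rows rows of CSV text, each as a list of strings."""
    reader = csv.reader(io.StringIO(text))
    return list(itertools.islice(reader, max_rows))


def preview_remote_csv(url=DATA_URL, max_rows=5):
    """Download a CSV file and return its first rows (assumes UTF-8 encoding)."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return parse_csv_text(text, max_rows)
```

Calling `preview_remote_csv()` in a notebook cell should return the first few rows as lists of strings, which is enough to see whether the file has a header row before loading it with Spark.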