VSP (Sacramento), 6/29/2017
Assets for IBM's Apache Spark Proof of Technology
You will be using IBM DSX notebooks and Apache Spark Service on IBM Bluemix Cloud to work on the labs.
PLEASE READ: https://github.com/tliakos/dsx-spark/blob/master/DSX%20workbook_draft1.docx
-
Set up your Spark service in IBM Bluemix:
To set up your IBM Bluemix environment, navigate to https://new-console.ng.bluemix.net, register, and create a Spark service. -
Log in to IBM Data Science Experience (DSX) to create and run notebooks:
To set up your IBM DSX (Data Science Experience) environment, navigate to http://datascience.ibm.com and log in using your Bluemix user ID.
A video tutorial on setting up the environment can be viewed here:
https://www.youtube.com/watch?v=yG3tVVDz1uE
To use these notebooks, copy and paste the URLs below when you create a new notebook.
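The lab links in this README are GitHub page ("blob") URLs. If a tool needs the file itself rather than the GitHub page, the matching raw URL can be derived from the page URL. The following is a minimal sketch, not part of the lab material: the helper name `github_blob_to_raw` is hypothetical, and it assumes the branch name contains no slash (true for `master`).

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url):
    """Convert a github.com "blob" page URL to its raw.githubusercontent.com form.

    Hypothetical helper for illustration. Assumes the URL has the shape
    https://github.com/<user>/<repo>/blob/<branch>/<path> and that <branch>
    contains no "/" characters.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError("not a github.com URL: %s" % url)

    # Path segments: user, repo, "blob", branch, then the file path.
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a GitHub blob URL: %s" % url)

    user, repo, _blob, branch = segments[:4]
    file_path = segments[4:]

    # Percent-encoding in the original path (e.g. %20) is kept untouched.
    raw_path = "/" + "/".join([user, repo, branch] + file_path)
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))
```

For example, passing the Lab 3 notebook link to `github_blob_to_raw` yields a URL under `https://raw.githubusercontent.com/tliakos/dsx-spark/master/`, the same form as the `data.csv` link in this README.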
-
Introduction to Spark - Python:
https://github.com/tliakos/dsx-spark/blob/master/Lab%201-%20Introduction%20to%20Spark-Student.ipynb -
Introduction to Spark SQL:
https://github.com/tliakos/dsx-spark/blob/master/Lab%202:%20Spark%20SQL%20-%20Student.ipynb -
Spark Machine Learning - Python:
https://github.com/tliakos/dsx-spark/blob/master/Lab%203%20-%20Machine%20Learning%20Student.ipynb
https://raw.githubusercontent.com/tliakos/dsx-spark/master/data.csv
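The raw `data.csv` link above can be read directly from a notebook cell. The following is a minimal sketch using only the Python standard library, not the method used in the lab notebook itself. The function names `fetch_text` and `parse_csv` are hypothetical, and the sketch assumes the file is UTF-8 text whose first row is a header; neither is confirmed by this README.

```python
import csv
import io
from urllib.request import urlopen

# Raw data file listed in this README.
DATA_URL = "https://raw.githubusercontent.com/tliakos/dsx-spark/master/data.csv"


def fetch_text(url=DATA_URL, timeout=30):
    """Download a text file and return its contents (assumes UTF-8)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_csv(text, has_header=True):
    """Split CSV text into (header, rows).

    Assumption: the first row is a header. Pass has_header=False if the
    file turns out to contain data rows only.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    if has_header:
        return rows[0], rows[1:]
    return [], rows
```

In a notebook, `header, rows = parse_csv(fetch_text())` would give plain Python lists. If a SparkContext is available as `sc`, `sc.parallelize(rows)` would turn those rows into an RDD for the exercises.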
Spark Streaming webinar link: https://www.youtube.com/watch?v=_mFm2F7UQgU
Spark Streaming demo code: https://github.com/smatlapudi/spark-streaming-webinar1