floslm/sparkWorkshop


For the Spark workshop by the Open Source Lebanese Movement and Cnam Liban

Open Source Lebanese Movement

ISSAE Cnam Liban

## Venue: Lebanese University (UL) - Beirut - Lebanon

- Date(s): 2nd December 2016

- Time: 09:00-13:00

- By: Pascal Fares

Introduction: Spark is built on the concept of distributed datasets, which contain arbitrary Scala, Java or Python objects. You create a dataset from external data, then apply parallel operations to it. The building block of the Spark API is its RDD API. In the RDD API, there are two types of operations: transformations, which define a new dataset based on previous ones, and actions, which kick off a job to execute on a cluster. On top of Spark’s RDD API, higher-level APIs are provided, e.g. the DataFrame API and the Machine Learning API.
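To make the difference between transformations and actions concrete, here is a minimal single-process sketch in plain Python. It is not the Spark API: `ToyRDD`, `square` and `calls` are hypothetical names introduced only for illustration, and the sketch assumes that a transformation merely records a step while an action runs all recorded steps.

```python
class ToyRDD:
    """A toy, single-process stand-in for an RDD (an assumption, not Spark itself)."""

    def __init__(self, source, steps=()):
        self._source = source  # the original data
        self._steps = steps    # recorded transformations, not yet executed

    # Transformations: describe a new dataset, compute nothing yet.
    def map(self, fn):
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._steps + (("filter", pred),))

    # Actions: run every recorded step and return a result.
    def collect(self):
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    def count(self):
        return len(self.collect())


calls = []


def square(x):
    calls.append(x)  # remember when the function really runs
    return x * x


numbers = ToyRDD(range(1, 6))
evens_squared = numbers.map(square).filter(lambda v: v % 2 == 0)

ran_before_action = len(calls)    # nothing has been computed so far
result = evens_squared.collect()  # the action triggers the computation
print(ran_before_action, result)
```

In this sketch `square` is only called once `collect()` runs, which mirrors the idea that transformations define a dataset and actions start the job.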

Aims of the Workshop: Install, configure and get an overview of open tools for big data, data manipulation and clustering. Among other things, you will learn how to install Linux, Java, Scala and Spark on any Intel or AMD PC, then build a small "cluster". We will then use this cluster to demonstrate the map-reduce and machine learning paradigms by applying them to some use cases.
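As a rough idea of what the map-reduce paradigm looks like, here is a small word count written in plain Python on a single machine. It is only a sketch: the sample `lines` are made up for illustration, and on a real cluster the map, shuffle and reduce phases would be distributed across nodes.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input: two short lines of text.
lines = ["spark makes big data simple", "big data needs big clusters"]

# Map phase: emit a (word, 1) pair for every word.
pairs = [(word, 1) for line in lines for word in line.split()]

# Shuffle phase: group the emitted values by key (the word).
groups = defaultdict(list)
for word, one in pairs:
    groups[word].append(one)

# Reduce phase: combine the values of each key into a single count.
counts = {word: reduce(lambda a, b: a + b, ones) for word, ones in groups.items()}

print(counts)
```

Each phase only needs its own inputs, which is what allows a framework to spread the work over several machines.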

## Targeted Audience:

Professionals and graduate students aspiring to learn the basics of open source tools and products in the field of Big Data analytics using the Spark framework, and to become Spark developers (users).

Installing open source tools and products. Learning new programming paradigms in the field of big data and clustering, using "Open Source" tools only.

Prerequisites: Knowledge of programming; Java is preferred.

Laptop Requirements: At least one laptop for every two participants. We will install Linux on it, either directly or in VirtualBox. A CPU with hardware virtualization features, which help accelerate VirtualBox, is best.

Flyer

Registration: click here

Useful links