Assignment 1: Data Analysis
Assignment Documentation.pdf
Complete chronological screenshots.pdf
Data extraction query codes from Stack Exchange.txt
Pig ETL Source Code.txt
Hive Query Source Code.txt
Hadoop source code for TF-IDF.txt
mapper1.py
mapper2.py
mapper3.py
mapper4.py
reducer1.py
reducer2.py
reducer3.py
Source for mapper and reducer scripts: https://github.com/devangpatel01/TF-IDF-implementation-using-map-reduce-Hadoop-python-.git
General HDFS Commands Source Code.txt
Extracted data from Stack Exchange.zip
ETL Transformations using Pig.zip
Data queried from Hive.zip
TF-IDF calculated using MapReduce.zip