priyanktejani/hadoop-mapreduce-average-calculator

Hadoop MapReduce average calculator

AverageCalculator is a simple MapReduce application that computes the average grade for each module in a given input set. The application operates in two stages: a map phase and a reduce phase.

1. Map phase

The mapper takes the file as input and processes it one line at a time. For each line, the values of the 4th (module) and 5th (grade) columns are stored as a key-value pair in a HashMap. If the key already exists, the new grade is added to the grade total already stored for that key. A second HashMap tracks the number of times each module appears. A cleanup step then converts the HashMap values into a list of <Module, IntPair(valueGrade, valueCount)> pairs. Here, the key is the module and the value is an integer pair holding the sum of grades and the total count for that key.
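The map-side aggregation described above can be sketched in plain Java, without the Hadoop classes, to show the bookkeeping only. This is a minimal illustration, not the repository's code: the class name MapPhaseSketch, the nested IntPair stand-in, and the comma delimiter are assumptions, since the README does not state the input format beyond the column positions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the map-side aggregation; not the repository's Mapper class.
class MapPhaseSketch {
    // Stand-in for the IntPair value: (sum of grades, number of records).
    static final class IntPair {
        final int gradeSum;
        final int count;

        IntPair(int gradeSum, int count) {
            this.gradeSum = gradeSum;
            this.count = count;
        }
    }

    // One map holds the running grade total per module, the other the record count.
    private final Map<String, Integer> gradeSums = new LinkedHashMap<>();
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Plays the role of the per-line map step.
    // Assumption: columns are comma-separated; module is column 4, grade is column 5.
    void map(String line) {
        String[] cols = line.split(",");
        if (cols.length < 5) {
            return; // skip lines that do not have a 5th column
        }
        String module = cols[3].trim();
        int grade;
        try {
            grade = Integer.parseInt(cols[4].trim());
        } catch (NumberFormatException e) {
            return; // skip lines whose grade is not an integer (e.g. a header row)
        }
        // Add the grade to the existing total, or start a new total for a new module.
        gradeSums.merge(module, grade, Integer::sum);
        counts.merge(module, 1, Integer::sum);
    }

    // Plays the role of the cleanup step: one (gradeSum, count) pair per module.
    Map<String, IntPair> cleanup() {
        Map<String, IntPair> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : gradeSums.entrySet()) {
            out.put(e.getKey(), new IntPair(e.getValue(), counts.get(e.getKey())));
        }
        return out;
    }
}
```

Emitting one pair per module from the cleanup step, rather than one record per input line, keeps the amount of data passed to the reduce phase small.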

2. Reduce phase

This phase calculates the average grade of each module. The reducer receives the list of <Module, IntPair(valueGrade, valueCount)> pairs from the Map class. For each module, it iterates over the integer pairs and accumulates the grade sums and the counts. Finally, the sum of grades is divided by the total count recorded during the iteration, which gives the final average for each module.
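The reduce step for a single module can be sketched the same way, again in plain Java rather than with the Hadoop Reducer class. The names ReducePhaseSketch, IntPair, and average are hypothetical, and returning a double is an assumption; the README does not say which numeric type the application outputs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the per-module reduce step; not the repository's Reducer class.
class ReducePhaseSketch {
    // Stand-in for the IntPair value: (sum of grades, number of records).
    static final class IntPair {
        final int gradeSum;
        final int count;

        IntPair(int gradeSum, int count) {
            this.gradeSum = gradeSum;
            this.count = count;
        }
    }

    // Accumulates all partial pairs for one module, then divides sum by count.
    static double average(Iterable<IntPair> partials) {
        long gradeSum = 0;
        long count = 0;
        for (IntPair p : partials) {
            gradeSum += p.gradeSum;
            count += p.count;
        }
        if (count == 0) {
            throw new IllegalArgumentException("no records for this module");
        }
        return (double) gradeSum / count;
    }

    public static void main(String[] args) {
        // Two partial results for the same module, e.g. from two mappers.
        List<IntPair> partials = Arrays.asList(new IntPair(150, 2), new IntPair(55, 1));
        System.out.println(average(partials));
    }
}
```

Carrying the count alongside the sum is what keeps the result correct when a module's records are split across mappers. With the partial pairs (150, 2) and (55, 1), the average is 205 / 3, roughly 68.33, whereas averaging the two partial averages 75 and 55 would give 65.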

License

MIT. Copyright (c) MIT License.
