This repository has been archived by the owner on Oct 8, 2019. It is now read-only.

The number of mappers is less than splits in Hadoop 2.x

myui edited this page Oct 24, 2014 · 5 revisions

The default value of hive.input.format is org.apache.hadoop.hive.ql.io.CombineHiveInputFormat. Because this input format combines multiple splits into one, it can launch fewer mappers than the number of splits (i.e., the number of HDFS blocks) of the input table.

To get one mapper per split, try setting hive.input.format to org.apache.hadoop.hive.ql.io.HiveInputFormat.

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
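The difference between the two input formats can be illustrated with a rough back-of-the-envelope model. The sketch below is not Hive code: the function names, file sizes, block size, and maximum combined split size are all hypothetical, and it ignores node/rack locality and the other details that the real input formats take into account.

```python
import math

MB = 1024 * 1024


def mappers_without_combining(file_sizes, block_size):
    """Simplified model of HiveInputFormat: one split (one mapper) per HDFS block of each file."""
    return sum(math.ceil(size / block_size) for size in file_sizes if size > 0)


def mappers_with_combining(file_sizes, max_split_size):
    """Simplified model of CombineHiveInputFormat: pack all input bytes into
    combined splits of at most max_split_size bytes (locality is ignored)."""
    total = sum(file_sizes)
    return math.ceil(total / max_split_size) if total > 0 else 0


if __name__ == "__main__":
    # Hypothetical input table made of three files.
    files = [300 * MB, 300 * MB, 40 * MB]
    block_size = 128 * MB       # assumed HDFS block size
    max_split_size = 256 * MB   # assumed maximum combined split size

    print(mappers_without_combining(files, block_size))   # 3 + 3 + 1 blocks
    print(mappers_with_combining(files, max_split_size))  # ceil(640 MB / 256 MB)
```

Under these assumed numbers the uncombined case yields one mapper per block, while the combined case yields noticeably fewer, which is the behavior described above.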

Note that Apache Tez uses org.apache.hadoop.hive.ql.io.HiveInputFormat by default. You can check the current value as follows:

set hive.tez.input.format;

hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat


You can then control the maximum number of mappers by setting:

set mapreduce.job.maps=128;