Friday, February 2, 2018

Increase the MapReduce timeout value and JVM heap size, or set a different log level for a MapReduce job

Update mapred-site.xml on the client machine only:

For increasing heap size:

  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Djava.net.preferIPv4Stack=true -Xmx3600m</value>
  </property>

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>

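Make sure the value of "mapreduce.map.memory.mb" is greater than the "-Xmx" value in "mapreduce.map.java.opts". The container has to hold the JVM heap plus non-heap memory (thread stacks, metaspace, native buffers).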
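
The heap and container sizes can be checked against each other before a job is submitted. The following is a minimal sketch using only the Java standard library. HeapHeadroomCheck and its methods are hypothetical names and not part of Hadoop, and the parsing covers only a plain -Xmx option with an optional k, m or g suffix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Hadoop): compares a java.opts string with a container size. */
final class HeapHeadroomCheck {
    // Matches -Xmx followed by digits and an optional k/m/g unit suffix.
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the -Xmx value in megabytes, or -1 if the option string contains no -Xmx. */
    static long parseXmxMb(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase()) {
            case "g": return value * 1024;
            case "m": return value;
            case "k": return value / 1024;
            default:  return value / (1024 * 1024); // no suffix means bytes
        }
    }

    /** True only when -Xmx is present and the container is strictly larger than the heap. */
    static boolean hasHeadroom(int containerMb, String javaOpts) {
        long xmxMb = parseXmxMb(javaOpts);
        return xmxMb >= 0 && containerMb > xmxMb;
    }

    public static void main(String[] args) {
        String opts = "-Djava.net.preferIPv4Stack=true -Xmx3600m";
        System.out.println(parseXmxMb(opts));        // heap size in MB
        System.out.println(hasHeadroom(4096, opts)); // container 4096 MB vs. heap 3600 MB
    }
}
```

With the values from this post, the check reports 3600 MB of heap inside a 4096 MB container, so it passes. A heap equal to the container size (for example 2048 and -Xmx2048m) fails it.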

For no timeout:

  <property>
    <name>mapreduce.task.timeout</name>
    <value>0</value>
  </property>

The default is 600000 milliseconds (10 minutes). A value of 0 disables the timeout.
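
Since "mapreduce.task.timeout" is given in milliseconds, a longer but finite timeout is easy to get wrong by a factor of 1000. The following is a small sketch that derives the property value from a java.time.Duration. TaskTimeout is a hypothetical helper name, and the 30-minute figure is only an illustration.

```java
import java.time.Duration;

/** Hypothetical helper: converts a timeout into the millisecond string used as the property value. */
final class TaskTimeout {
    static String toPropertyValue(Duration timeout) {
        return Long.toString(timeout.toMillis());
    }

    public static void main(String[] args) {
        System.out.println(toPropertyValue(Duration.ofMinutes(10))); // 600000, the default
        System.out.println(toPropertyValue(Duration.ofMinutes(30))); // 1800000
        System.out.println(toPropertyValue(Duration.ZERO));          // 0, no timeout
    }
}
```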

Add these to mapred-site.xml to lower the log level from INFO to WARN:

  <property>
    <name>mapreduce.map.log.level</name>
    <value>WARN</value>
  </property>

  <property>
    <name>mapreduce.reduce.log.level</name>
    <value>WARN</value>
  </property>

Of course, we can also do this in Java code. For example, the following code sets the memory for the MapReduce ApplicationMaster:

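// conf is the job's org.apache.hadoop.conf.Configuration.
// Container size for the MapReduce ApplicationMaster, in MB.
conf.setInt("yarn.app.mapreduce.am.resource.mb", 2048);
// Keep the heap below the container size (here roughly 80%) to leave room for non-heap memory.
String opt = "-Xmx1638m";
conf.set("yarn.app.mapreduce.am.command-opts", opt);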
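
The map-side settings from mapred-site.xml shown earlier can be applied per job in the same way. The following is a sketch under some assumptions. JobTuning is a hypothetical helper, not a Hadoop class. It takes a generic (name, value) setter so that it runs without Hadoop on the classpath, and the main method uses a plain Map as a stand-in for the job configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper: applies this post's settings through any (name, value) setter. */
final class JobTuning {
    static void apply(BiConsumer<String, String> setter) {
        // The container size must stay above the -Xmx heap size.
        setter.accept("mapreduce.map.memory.mb", "4096");
        setter.accept("mapreduce.map.java.opts", "-Djava.net.preferIPv4Stack=true -Xmx3600m");
        // 0 disables the task timeout.
        setter.accept("mapreduce.task.timeout", "0");
        // Lower the task log level from INFO to WARN.
        setter.accept("mapreduce.map.log.level", "WARN");
        setter.accept("mapreduce.reduce.log.level", "WARN");
    }

    public static void main(String[] args) {
        // Stand-in for a Hadoop Configuration, used only so the sketch runs on its own.
        Map<String, String> settings = new LinkedHashMap<>();
        apply(settings::put);
        settings.forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

In a real job, the same method should work with the job's Configuration object by passing conf::set as the setter, since Configuration has a set(String, String) method.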