Tuesday, March 21, 2017

MapReduce logs behave oddly on HDP 2.3 with Tez

When a MapReduce job is launched on Tez, our logs do not show up in the HDP UI. Clicking ‘History’ shows nothing from our code, although the Hadoop system logs are still visible.

However, the following command does print the MapReduce logs to stdout:

sudo -u hdfs yarn logs -applicationId application_1490133530166_0002

But the log format is different:

2017-03-21 15:04:24,708 [ERROR] [TezChild] |common.FindAndExitMapRunner|: caught throwable when run mapper: java.lang.UnsupportedOperationException: Input only available on map

The ‘source’ field is changed to ‘TezChild’.
The package name is truncated to its last segment, so the Java class name is no longer fully qualified. In this example, “com.xxx.hadoop.common.FindAndExitMapRunner” becomes “common.FindAndExitMapRunner”.

For comparison, here is a normal MapReduce log line (i.e. without Tez), which shows the full package name and class name:

2017-03-20 14:29:54,778 INFO [main] com.xxx.hadoop.common.ColumnMap: Columnar Mapper
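
If you post-process these logs, the two layouts need different patterns. Here is a rough sketch that pulls the level and the logger name out of either format with sed. The function name extract_level_and_logger is made up for this example, and the patterns are only inferred from the two sample lines in this post, so other lines may not match.

```shell
# Sample lines copied from the two formats shown in this post.
tez_line='2017-03-21 15:04:24,708 [ERROR] [TezChild] |common.FindAndExitMapRunner|: caught throwable when run mapper: java.lang.UnsupportedOperationException: Input only available on map'
mr_line='2017-03-20 14:29:54,778 INFO [main] com.xxx.hadoop.common.ColumnMap: Columnar Mapper'

# Hypothetical helper: read log lines on stdin, print "LEVEL LOGGER".
# Assumption: the layouts are exactly the two seen above.
extract_level_and_logger() {
  sed -E \
    -e 's/^[0-9-]+ [0-9:,]+ \[([A-Z]+)\] \[[^]]+\] [|]([^|]+)[|]: .*/\1 \2/' \
    -e 's/^[0-9-]+ [0-9:,]+ ([A-Z]+) \[[^]]+\] ([^ :]+): .*/\1 \2/'
}

# First expression handles the Tez layout, second the plain MapReduce one.
printf '%s\n' "$tez_line" | extract_level_and_logger
printf '%s\n' "$mr_line" | extract_level_and_logger
```

The Tez line yields the truncated logger name and the plain line yields the fully qualified one, so any filter keyed on the full class name has to be loosened for Tez output.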

Important! Important! Important! ---->
To view the logs, you have to run the command as the exact user who launched the application (here: sudo -u hdfs).
Otherwise, you will see the following error:
"Log aggregation has not completed or is not enabled."

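To avoid mistyping the user, the command can be wrapped in a tiny helper. This is only a sketch: yarn_logs_cmd is a made-up name, and it just prints the command line (owner first, application id second) instead of running it, so you can check it before pasting.

```shell
# Hypothetical helper: print the log command for a given owner and application id.
# It does not run anything; copy the printed line to execute it.
yarn_logs_cmd() {
  app_owner=$1
  app_id=$2
  printf 'sudo -u %s yarn logs -applicationId %s\n' "$app_owner" "$app_id"
}

yarn_logs_cmd hdfs application_1490133530166_0002
```

yarn logs also has an -appOwner option for naming the owner explicitly. I have not tried it on this cluster, so whether it removes the need for sudo here is an open question.
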
Monday, March 6, 2017

Additional jar files when running Spark under Hadoop YARN mode (CDH 5.10.0 with Scala 2.10 and Spark 1.6.0)



lrwxrwxrwx 1 root root     91 Feb 23 15:02 spark-core_2.10-1.6.0-cdh5.10.0.jar -> /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/spark-core_2.10-1.6.0-cdh5.10.0.jar
lrwxrwxrwx 1 root root     80 Feb 23 15:15 scala-library-2.10.6.jar -> /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/scala-library-2.10.6.jar
lrwxrwxrwx 1 root root     37 Mar  6 14:06 commons-lang3-3.3.2.jar -> ../../../jars/commons-lang3-3.3.2.jar
-rw-r--r-- 1 root root 185676 Mar  6 14:11 typesafe-config-2.10.1.jar
lrwxrwxrwx 1 root root     55 Mar  6 14:26 akka-actor_2.10-2.2.3-shaded-protobuf.jar -> ../../../jars/akka-actor_2.10-2.2.3-shaded-protobuf.jar
lrwxrwxrwx 1 root root     56 Mar  6 14:28 akka-remote_2.10-2.2.3-shaded-protobuf.jar -> ../../../jars/akka-remote_2.10-2.2.3-shaded-protobuf.jar
lrwxrwxrwx 1 root root     55 Mar  6 14:29 akka-slf4j_2.10-2.2.3-shaded-protobuf.jar -> ../../../jars/akka-slf4j_2.10-2.2.3-shaded-protobuf.jar
lrwxrwxrwx 1 root root     70 Mar  6 14:40 spark-assembly-1.6.0-cdh5.10.0-hadoop2.6.0-cdh5.10.0.jar -> ../../../jars/spark-assembly-1.6.0-cdh5.10.0-hadoop2.6.0-cdh5.10.0.jar
[root@john2 lib]# pwd
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop-yarn/lib
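
The listing does not show how the links were made. One plausible way to get a relative link like the commons-lang3 one is ln -s from inside the lib directory. The sketch below tries it in a scratch directory that only mimics the parcel layout (the jar is an empty placeholder), so nothing under /opt/cloudera is touched.

```shell
# Scratch directory standing in for the parcel root (assumption: same relative layout).
parcel=$(mktemp -d)
mkdir -p "$parcel/jars" "$parcel/lib/hadoop-yarn/lib"
touch "$parcel/jars/commons-lang3-3.3.2.jar"   # empty placeholder, not a real jar

(
  # Link from lib/hadoop-yarn/lib back up to the parcel's jars directory.
  cd "$parcel/lib/hadoop-yarn/lib" &&
  ln -s ../../../jars/commons-lang3-3.3.2.jar commons-lang3-3.3.2.jar &&
  ls -l
)
```

On a real node the same ln -s line would be run as root inside the hadoop-yarn/lib directory shown by pwd above.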