Tuesday, August 1, 2017

log4j trick when creating ConsoleAppender from Java code

The right way to do it:

/**
 * Dynamically initialize a log4j console appender so we can log info to stdout
 * with specified level and pattern layout.
 */
public static void initStdoutLogging(Level logLevel, String patternLayout) {
    LogManager.resetConfiguration();
    ConsoleAppender appender = new ConsoleAppender(new PatternLayout(patternLayout));
    appender.setThreshold(logLevel);
    LogManager.getRootLogger().addAppender(appender);
}

initStdoutLogging(Level.INFO, PatternLayout.TTCC_CONVERSION_PATTERN);

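When enabling logging for Spark, you need to call initStdoutLogging twice: once in the driver and once in the executors. The driver and each executor run in separate JVMs, each with its own log4j configuration, so if you only call it in the driver you will only see the driver's logs.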

To read the executor logs, you will have to use something like:

sudo -u hdfs yarn logs -applicationId application_1501527690446_0066


Analysis:

If I initialize the appender with the no-argument constructor and set the layout afterwards:
ConsoleAppender ca = new ConsoleAppender();
ca.setLayout(new PatternLayout(PatternLayout.TTCC_CONVERSION_PATTERN));
logging breaks with the following error:
log4j:ERROR No output stream or file set for the appender named [null].
If I pass the layout to the constructor instead, it works fine:
ConsoleAppender ca = new ConsoleAppender(new PatternLayout(PatternLayout.TTCC_CONVERSION_PATTERN));

The reason:
If you look at the source for ConsoleAppender:
  public ConsoleAppender(Layout layout) {
    this(layout, SYSTEM_OUT);
  }

  public ConsoleAppender(Layout layout, String target) {
    setLayout(layout);
    setTarget(target);
    activateOptions();
  }
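You can see that ConsoleAppender(Layout) passes SYSTEM_OUT as the target and calls activateOptions after setting the layout and target. The no-argument constructor does none of this, so the appender never gets an output writer, which is exactly what the error message complains about.
If you use the no-argument constructor and call setLayout yourself, you also need to set the target explicitly and call activateOptions.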
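
For completeness, here is a minimal sketch of that manual route, assuming log4j 1.2 is on the classpath. The class name ManualConsoleAppender and the helper buildAppender are made up for illustration; the log4j calls are the same ones the ConsoleAppender(Layout, String) constructor shown above makes.

```java
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

// Hypothetical class name, used only for this sketch.
public class ManualConsoleAppender {

    /**
     * Builds a console appender with the no-argument constructor, doing by
     * hand what ConsoleAppender(Layout, String) does internally.
     */
    public static ConsoleAppender buildAppender(Level threshold, String pattern) {
        ConsoleAppender appender = new ConsoleAppender();
        appender.setLayout(new PatternLayout(pattern));
        appender.setTarget(ConsoleAppender.SYSTEM_OUT);
        appender.setThreshold(threshold);
        // Without this call the appender has no writer and log4j reports
        // "No output stream or file set for the appender".
        appender.activateOptions();
        return appender;
    }

    public static void main(String[] args) {
        LogManager.resetConfiguration();
        LogManager.getRootLogger().addAppender(
                buildAppender(Level.INFO, PatternLayout.TTCC_CONVERSION_PATTERN));
        Logger.getLogger(ManualConsoleAppender.class).info("console appender is active");
    }
}
```

Unless you need to change other options before activation, passing the layout to the constructor as in initStdoutLogging is shorter and harder to get wrong.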