Debugging MapReduce jobs


#1

I’ve a MapReduce job that seems to be failing:

> var job = AwesomeMapReduceJob.create(. . .);
> AwesomeMapReduceJob.start(job);
> job.status()
C3.typesys.Obj {step: "map", errors: Array(1), started: DateTime, startedby: "BA", status: "failing"}
> job.status().errors[0]
C3.typesys.Obj {failedActionId: "7405.5611781", errorMsg: "wrapped org.mozilla.javascript.EcmaError: Referenc…ined. (AwesomeMapReduceJob_map.js#57)", errorCodes: "ScriptError", errorLog: "c3.love.exceptions.C3RuntimeException: wrapped org…a:617)↵	at java.lang.Thread.run(Thread.java:748)↵"}

When I look in MapReduceQueue, I just see that all the jobs I’ve started have failed:

c3Grid(MapReduceQueue.countAll())

Where should I look for more information on the error that is causing the jobs to fail?
Is there an easy, iterative way to debug MapReduce jobs?


#2

To get the full stack trace of the error inside the job, this works:

c3Grid(job.status())

I’m still looking for an easy way to debug the map() and reduce() functions: they are executed by the framework, so console.log won’t work there!


#3

You can always run the map or reduce function in the console (debugger mode) line by line to better understand the root cause of the error.

For example for the map function, you would:

  1. Fetch some objs of the source type of your map-reduce job, using the same “filter” and “include” as the job’s filter and include parameters.
  2. Fetch the map-reduce job instance that was failing.
  3. Copy-paste the map function into the console.
  4. Run map(batch, objs, job), where batch is just an int giving the number of the batch being processed by the map function.
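
Once map is pasted into the console, one way to narrow things down is to feed it one obj at a time and record which inputs throw. The snippet below is only a sketch: findFailingInputs is a hypothetical helper, not part of the platform, and the map function and sample objs are stand-ins so the snippet runs on its own. Replace them with your pasted map function, your fetched objs, and your job instance.

```javascript
// Stand-in for the real map function; paste your own in its place.
// Assumed behavior for this demo only: it throws on objs without a `value` field.
function map(batch, objs, job) {
  var out = {};
  objs.forEach(function (obj) {
    if (obj.value === undefined) {
      throw new ReferenceError('value is not defined for ' + obj.id);
    }
    out[obj.id] = obj.value * 2;
  });
  return out;
}

// Hypothetical helper: call mapFn on one obj at a time and collect the failures,
// so the offending input and its stack trace are easy to spot.
function findFailingInputs(mapFn, objs, job) {
  var failures = [];
  objs.forEach(function (obj, i) {
    try {
      mapFn(i, [obj], job);
    } catch (e) {
      failures.push({ index: i, id: obj.id, message: e.message, stack: e.stack });
    }
  });
  return failures;
}

// Made-up sample data; in practice use the objs fetched in step 1.
var sampleObjs = [{ id: 'a', value: 1 }, { id: 'b' }, { id: 'c', value: 3 }];
var failures = findFailingInputs(map, sampleObjs, {});
failures.forEach(function (f) {
  console.log(f.index, f.id, f.message);
});
```

Running one obj per call is slower than a real batch, but it points directly at the input that triggers the error.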

Another option is to add loggers to your map and reduce functions and analyze the log output in Splunk.


#4

Thanks @romain, I ended up doing something similar. Here is my code:

function runner() {
  var targetTypeRef = TypeRef.fromString(awesomeTypeName);
  var options = {
    id: DateTime.now().getMillis(),
    targetType: targetTypeRef,
    //batchSize: 10,
    //limit: -1,
    filter: Filter.eq("id", 1),
    . . .
  };

  var job = AwesomeMapReduceJob.make(options);
  job.merge({include: "id, targetType, filter, startDate, endDate, interval, oid"});
  // input
  // use the same filter as the job (a bare `filter` is not defined in this scope)
  var objs = AwesomeType.fetch({filter: options.filter}).objs;
  console.log('count', objs.length);
  // map() phase
  var mapResult = AwesomeMapReduceJob.map(objs.length, objs, job);
  // reduce() phase
  console.log(mapResult);
  Object.entries(mapResult).forEach(function(arr) {
    var outKey = arr[0];
    var interValues = [arr[1]];
    console.log(outKey, '=', interValues);
    AwesomeMapReduceJob.reduce(outKey, interValues, job);
  });
}
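
If you run map over several batches, it may help to merge the outputs by key before calling reduce, so that each key is reduced once with all of its intermediate values. This is a sketch under two assumptions: that map returns a plain object keyed by output key (as the Object.entries loop above implies), and that the framework passes reduce every intermediate value for a key. groupByKey is a hypothetical helper, and the batch outputs and the summing reduce are made-up stand-ins.

```javascript
// Hypothetical helper: merge the outputs of several map() calls into
// { key: [value, value, ...] } so each key is reduced once with all its values.
function groupByKey(mapResults) {
  var grouped = {};
  mapResults.forEach(function (result) {
    Object.keys(result).forEach(function (key) {
      if (!Object.prototype.hasOwnProperty.call(grouped, key)) {
        grouped[key] = [];
      }
      grouped[key].push(result[key]);
    });
  });
  return grouped;
}

// Stand-in map outputs from two batches (made-up data).
var batch0 = { x: 1, y: 2 };
var batch1 = { x: 3 };
var grouped = groupByKey([batch0, batch1]);

// Stand-in reduce that sums the intermediate values; use your own reduce instead.
function reduce(outKey, interValues, job) {
  return interValues.reduce(function (a, b) { return a + b; }, 0);
}

var totals = {};
Object.keys(grouped).forEach(function (key) {
  totals[key] = reduce(key, grouped[key], {});
});
console.log(grouped, totals);
```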