EvalMetricsSpec Example (RevPro)

Here is how I write an EvalMetricsSpec to calculate features for ML:

var end = "2018-10-01";

// One daily reading: the window runs from one day before "end" up to "end".
var evalSpec = EvalMetricsSpec.make({expressions: metrics, start: DateTime(end).plusDays(-1), end: DateTime(end), interval: "DAY"});

In this code, I have a start date of 1 day before the end date, and I am taking daily readings (specified by interval: "DAY"), so there will be exactly one reading, for 2018-09-30. The metrics in the list “metrics” will be computed and only the value for that day will be returned.

For RevPro, we usually set “end” to be 1 month before the inspection, for two reasons: 1) the data is loaded on a monthly basis, so we cannot assume that we have data within one month of the inspection, and 2) inspectors sometimes cut power to a consumer a week before they record the inspection outcome, and we don’t want the drop resulting from the power cut to be a leaky feature indicative of an upcoming TPE.
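The date arithmetic above can be sketched in plain JavaScript. This is only an illustration and does not call the platform: specWindowForInspection and its monthsBefore parameter are hypothetical names, and the inspection date 2018-11-01 is made up so that the result matches the example.

```javascript
// Hypothetical helper (not part of the platform API): derive the one-day
// window from an inspection date given as "YYYY-MM-DD".
function specWindowForInspection(inspectionDate, monthsBefore = 1) {
  const end = new Date(inspectionDate + "T00:00:00Z");
  // Step back by whole calendar months. Caveat: day-of-month overflow
  // (e.g. the 31st) rolls forward into the next month with this approach.
  end.setUTCMonth(end.getUTCMonth() - monthsBefore);
  const start = new Date(end.getTime());
  start.setUTCDate(start.getUTCDate() - 1); // one day before end
  const iso = (d) => d.toISOString().slice(0, 10);
  return { start: iso(start), end: iso(end), interval: "DAY" };
}

const specWindow = specWindowForInspection("2018-11-01");
console.log(specWindow);
// { start: '2018-09-30', end: '2018-10-01', interval: 'DAY' }
```

The returned strings could then be passed into the spec in place of the hard-coded “end” value.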

Think of the evalSpec as saying “First compute the metrics based on the metric definition and the dates in my spec. If there is an eval function, compute that. When you’re done, return a time slice for the start and end dates that I’ve asked for (in this case time slice is only 1 day).”

Some of the metrics in the list “metrics” could be rolling metrics. Here’s how we would define them:

{
  "id": "MyMetric_Max_Days_withReset",
  "name": "MyMetric_Max_Days_withReset",
  "expression": "eval('SUM','DAY',rolling('MAX', MyMetric, ResetCondition, 0), dateTime('2008-01-01'))"
}

The metric computes the running MAX of MyMetric since 2008-01-01, resetting to 0 whenever ResetCondition (which could be a boolean metric) is true. The “SUM”, “DAY” part has no effect here because my evalSpec interval is also “DAY”. If the evalSpec interval were “MONTH”, it would add up the MAX values for all the days in the month.

Without that eval wrapper, the metric would return the MAX of MyMetric (still resetting to 0 on ResetCondition) only since 2018-09-30, because that is the start date in the evalSpec.
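To make the effect of the start date concrete, here is a small plain-JavaScript simulation of a rolling MAX with a reset. It is a sketch of the behavior described above, not the platform's implementation: rollingMaxWithReset is a hypothetical function, the data is made up, and I am assuming that on a reset day the running value is set to the reset value and that day's reading is ignored.

```javascript
// Illustration only; it does not call the platform.
// Assumption: when resets[i] is true, the running value becomes resetValue
// and values[i] is not folded into the maximum for that day.
function rollingMaxWithReset(values, resets, resetValue, startIndex = 0) {
  const out = [];
  let running = resetValue; // assumed seed before the first reading
  for (let i = startIndex; i < values.length; i++) {
    running = resets[i] ? resetValue : Math.max(running, values[i]);
    out.push(running);
  }
  return out;
}

const myMetric = [3, 5, 2, 4, 1];
const resetCondition = [false, false, true, false, false];

// Accumulating from the first day (the role the explicit date plays in eval).
const fromEarlyStart = rollingMaxWithReset(myMetric, resetCondition, 0);
// Accumulating only from the last day (the role of the evalSpec start date).
const fromSpecStart = rollingMaxWithReset(myMetric, resetCondition, 0, 4);

console.log(fromEarlyStart); // [ 3, 5, 0, 4, 4 ]
console.log(fromSpecStart);  // [ 1 ]
```

The value for the last day is 4 when history is accumulated from the early date, but only 1 when accumulation starts on the last day itself, which is why the explicit date in the eval matters.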

If I want a relative date to compute the sum (say 1 year before the current date), I would use the “window” function, starting at -365 and ending at 0 (the present), at an interval of “DAY”. Rolling is computed on absolute dates (specified either in the evalSpec or in the eval function), while window is relative to the evalSpec.
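The difference between an absolute start and a relative window can also be sketched in plain JavaScript. Again this is an illustration, not the platform's window function: trailingWindowSum is a hypothetical helper, the data is made up, and I am assuming that both ends of the window are included.

```javascript
// Illustration only. Sums the readings from daysBack days before index i
// up to and including index i (assumed inclusive on both ends).
function trailingWindowSum(values, i, daysBack) {
  const from = Math.max(0, i - daysBack); // clamp at the start of the data
  let total = 0;
  for (let k = from; k <= i; k++) {
    total += values[k];
  }
  return total;
}

const daily = [1, 2, 3, 4, 5, 6];

// The window moves with the day being evaluated, unlike a fixed start date.
console.log(trailingWindowSum(daily, 5, 2)); // 4 + 5 + 6 = 15
console.log(trailingWindowSum(daily, 3, 2)); // 2 + 3 + 4 = 9
```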

cc @varun.krishna