Custom Normalizer, Time Zone Interaction


#1

We have noticed a couple of problems when running a custom normalizer on a shared environment that do not happen on docker/localhost (v7.6.1):

  1. sometimes we get the error "Incorrect construction of timeseries meta. Length of field {} mismatch. expected=N, actual=N-1" even though we are sure that the normalize function returns N points
  2. sometimes the start of the normalized data appears to be pushed back by one interval, which may be indirectly causing the above error

The strangest part is that Jenkins tests fail because the result of evalMetric is one point shorter on the shared environment than on docker/localhost. In that case the above error is not logged; the tests simply fail. (They pass if I run them on the shared environment using c3-tester-node as my user.)

We attempted various time zone related fixes. We are still debugging in Splunk…

Update: We removed the tag where the above was happening, then reprovisioned and ran a couple of tests manually with success. Then we loaded real data, and the error started happening again. We then noticed that the following log call:

 log.info('t_tag={}, RAW9[{}, {}]: {}',
        c3c.tag, ts.data().length, specHeader.differential,
        JSON.stringify(ts));

returned in one case:

t_tag=vincent, RAW9[122.0, true]: {"type":"NormTimeseriesDouble","m_start":"2015-11-10T00:00:00.000","m_end":"2016-03-11T00:00:00.000","m_data":[7.247790697675068,7.247790697675068,7.247790697675068,7.247790697675068,7.247790697675072,7.247790697675065,7.247790697675072,7.247790697675065,7.247790697675072,7.247790697675072,7.2477906976750575,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.2477906976750575,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675043,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675043,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675072,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.247790697675043,7.247790697675043,7.2477906976751,7.247790697675043,7.2477906976751,7.2477906976751,7.247790697674986,7.2477906976751,7.2477906976751,7.2477906976751,7.247790697674986,7.2477906976751,7.2477906976751,7.247790697674986,7.2477906976751,7.2477906976751,7.2477906976751,7.247790697674986,7.24779069761928,8.099999999976717,8.100000000023215,12.200000000000045,11.830000000016298,11.82999999998367,10.034000000008405,10.034000000008405,10.034000000008291,10.034000000008405,10.033999999966454,13.010000000000105,10.959999999999923,8.990000000000009,11.058333333329415,11.058333333329529,11.058333333329415,11.058333333329415,11.058333333329529,11.05833333335272,
9.953333333328715,9.953333333328601,9.953333333342698,8.75,11.379999999999995,11.013999999989778,11.013999999989778,11.013999999989665,11.013999999989778,11.014000000040937,11.210000000020955,11.209999999979118,11.373333333331743,11.373333333331857,11.373333333336404,11.059999999999945], ...

but there are actually 121 elements in m_data. (In another case, we see RAW9[122.0, false] but m_data correctly contains 122 elements, which is equal to ts.data().length.) The first value of ts that has ts.data().length == 122 and ts.m_data.length == 121 is obtained by calling Timeseries.makeNorm:

    Timeseries.makeNorm(
        NormTimeseriesDoubleSpec.make({
            start: tsStart,
            end: tsEnd,
            interval: interval,
            data: _.pluck(data, 'value'),
            unit: unit,
            missing: NormTimeseriesDouble.range2norm(
                tsStart, tsEnd, interval, missingRanges),
            estimates: NormTimeseriesDouble.range2norm(
                tsStart, tsEnd, interval, estimatesRanges)
        }))

So in the above call, data.length == 121 but the resulting series has data().length == 122. What is wrong? The documentation says:

data: [ V ]
Data points for this time series. The no of points in this field should be equal to the number of intervals between start and end

so perhaps something (0s?) should be added that corresponds to missing and/or estimates arguments, which may be non-empty at the start and/or end of the period.

Thanks


#2

@AlexBakic this sounds like a bug in the custom normalizer code. Is this custom normalizer the platform default, or was it developed in the application code? If it is the latter, you will have to debug further. Remember that normalization needs to make sure the time series being constructed always has exactly as many points as there are intervals between start and end, irrespective of the zone the data is in.

A function used in the default normalizer that will be helpful is withoutZone on DateTime.

It strips the zone information from the DateTime object once the zone processing is done, after which dealing with distances becomes much easier.


#3

Yes, it is a custom normalizer. It also strips the time zone using:

obj = obj.putField('start', obj.start.withoutZone());

but the problem is probably not the time zone but the constraints on calling makeNorm. I will try adding 0s to fill in the holes.

Thanks


#4

Yes, you have to provide as many points as there are intervals between start and end.
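
As a rough illustration of that constraint, the expected point count can be derived from start, end and interval alone. This is a plain JavaScript sketch, not platform code: countIntervals and the INTERVAL_MS table are hypothetical names, and it assumes fixed-length intervals and zone-less timestamps (parsed as UTC here so that DST cannot change the count).

```javascript
// Hypothetical helper, not part of the platform API.
// Assumes fixed-length intervals; months or years would need calendar logic.
const INTERVAL_MS = { HOUR: 60 * 60 * 1000, DAY: 24 * 60 * 60 * 1000 };

function countIntervals(start, end, interval) {
  const step = INTERVAL_MS[interval];
  if (!step) {
    throw new Error('Unsupported interval: ' + interval);
  }
  // Appending 'Z' forces UTC parsing, so the host time zone is irrelevant.
  const startMs = Date.parse(start + 'Z');
  const endMs = Date.parse(end + 'Z');
  if (Number.isNaN(startMs) || Number.isNaN(endMs)) {
    throw new Error('Unparseable start or end');
  }
  // Assumption: a partial trailing interval still counts as one point.
  return Math.ceil((endMs - startMs) / step);
}

// Start and end taken from the logged series in post #1.
const expectedPoints = countIntervals(
  '2015-11-10T00:00:00.000', '2016-03-11T00:00:00.000', 'DAY');
console.log(expectedPoints); // 122
```

For the logged series this gives 122, which matches ts.data().length and is one more than the 121 values that were passed in as data.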


#5

I computed the number of intervals (I already had a function for that), subtracted the length of the data array from it, and then pushed 0 onto the data array that many times. It works, although it is not obvious in which case(s) the 0s are added.
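
For anyone hitting the same issue, the padding step can be sketched roughly as below. This is plain JavaScript, not platform code: padToLength is a hypothetical helper, the expected count is assumed to be computed elsewhere, and the fill values are appended at the end, which is only correct if the hole really is at the end of the period. Returning the number of added points makes it possible to log the cases in which padding happens.

```javascript
// Hypothetical helper, not part of the platform API.
// Pads values at the END up to expectedCount; never truncates.
function padToLength(values, expectedCount, fill = 0) {
  if (values.length > expectedCount) {
    throw new Error(
      'Too many points: expected=' + expectedCount +
      ', actual=' + values.length);
  }
  const added = expectedCount - values.length;
  // concat returns a new array, so the caller's array is left untouched.
  const data = values.concat(new Array(added).fill(fill));
  return { data: data, added: added };
}

const padded = padToLength([7.2, 8.1, 12.2], 5);
console.log(padded.data);  // [ 7.2, 8.1, 12.2, 0, 0 ]
console.log(padded.added); // 2
```

If the missing point is at the start of the period instead, the values would have to be prepended rather than appended, otherwise every point shifts by one interval.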