Get normalized timeseries based off data that was available before a given date


#1

Our customer needs to calculate consumption and provide measurement data for the consumption values, but data can arrive at a later date and change the values of these metrics/timeseries. They want to be able to evaluate metrics using only the data that was available on or before a given date.

I spoke with @rohit.sureka a few weeks back and he mentioned the TimeseriesDataBase type, which sounds like it would provide exactly what we’re looking for - it provides a way to normalize a timeseries with filtering and/or grouping the data points. There isn’t really much documentation on this type, however, and I couldn’t find any samples where it’s being used.

Could anyone provide an example of how to use this type? Specifically for the method evalTimeseriesHistorical(), I’m not sure what to provide for EvalTimeseriesHistoricalSpec#parent:

// parent Id for the timeseries. This will be used to look up the parent
parent: string

#2

@scott.kruyswyk Here is the basic outline of how to create the types and data for this feature:

  1. Create a shared header type:
    entity type TestTimeseriesInfo mixes TimeseriesInfo schema name 'NORMTEST_TTI'

  2. Create the data point type:
    @db(compactType=true,
    datastore='cassandra',
    partitionKeyField='parent',
    persistenceOrder='start,end',
    persistDuplicates=false,
    shortId=true,
    shortIdReservationRange=100000,
    columnarStorage=true)
    entity type RawIntervalTimeseriesData mixes TimeseriesData<ServicePoint, TestTimeseriesInfo> schema name 'normalizertest_RawIntervalTimeseriesData' {
    @ts
    quantity: double
    }

  3. Create data points in the above types

  4. Query using: RawIntervalTimeseriesData.evalTimeseriesHistorical({…})


#3

Thanks Rohit, this is perfect. One last thing: I noticed that end is required on the TimeseriesData type. How do I get around this if I want to deal with point measurement data? I tried just setting end = start, but ended up getting no data back.


#5

@scott.kruyswyk For the historical timeseries, mix in PointTimeseriesData


#6

What’s the correct format of the parent field in EvalTimeseriesHistoricalSpec?
When I use the same id as the one used in TestTimeseriesInfo I see this error:

EvalTimeseries error : Parent Id not in correct format. Cannot retrieve record for type: -type-RawIntervalTimeseriesData.RawIntervalTimeseriesData

#7

@bachr You would use the id of the parent object that the timeseries data is related to. In the example Rohit gave, that would be the ServicePoint id.

For example:

remix type ServicePoint {
    measurementTimeseries: [PhysicalMeasurementTimeseriesData](parent)
    measurementsFilteredByDate: function(obj: Obj, spec: TSEvalSpec, metric: Metric, beforeDate: datetime): Timeseries js server
}

@db(compactType=true,
  datastore='cassandra',
  partitionKeyField='parent',
  persistenceOrder='start, createdDate',
  persistDuplicates=false,
  shortId=true,
  shortIdReservationRange=100000,
  columnarStorage=true)
entity type PhysicalMeasurementTimeseriesData mixes PointTimeseriesData<ServicePoint, PhysicalMeasurementTimeseriesInfo> schema name 'RMTD' {
  @ts
  quantity: double

  info: PhysicalMeasurementTimeseriesInfo

  createdDate: datetime
}

Calling evalTimeseriesHistorical would then look like this:

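// Id of the parent ServicePoint, not of the timeseries header.
var servicePointId = 'id-1';
// spec and filter are not defined in this snippet; presumably spec is the
// TSEvalSpec passed to measurementsFilteredByDate and filter is built from beforeDate.
PhysicalMeasurementTimeseriesData.evalTimeseriesHistorical({
  start: spec.start,
  end: spec.end,
  parent: servicePointId,
  filter: filter,
  interval: spec.interval
});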
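
To make the intent of the date filter concrete, here is a minimal plain-JavaScript sketch of the selection it is meant to express: keep only points whose createdDate is on or before the cut-off, then keep the most recently created point for each start. It is not C3 API; pointsAsOf and samplePoints are hypothetical names, and it assumes that a later createdDate for the same start is a correction that supersedes the earlier value. How the platform actually resolves such duplicates should be checked against its documentation.

```javascript
// Plain JavaScript illustration only; none of this is C3 platform API.
// Each point mirrors the fields of PhysicalMeasurementTimeseriesData above.
function pointsAsOf(points, beforeDate) {
  var cutoff = new Date(beforeDate).getTime();
  var latestByStart = {};
  points.forEach(function (p) {
    var created = new Date(p.createdDate).getTime();
    if (created > cutoff) {
      return; // arrived after the cut-off, so it was not yet available
    }
    var seen = latestByStart[p.start];
    if (!seen || created > new Date(seen.createdDate).getTime()) {
      latestByStart[p.start] = p; // keep the most recent version per timestamp
    }
  });
  // Return the surviving points ordered by start (ISO strings sort chronologically).
  return Object.keys(latestByStart).sort().map(function (start) {
    return latestByStart[start];
  });
}

// Hypothetical sample data: the second point is a late correction of the first.
var samplePoints = [
  { start: '2018-01-01T00:00:00Z', createdDate: '2018-01-02T00:00:00Z', quantity: 10 },
  { start: '2018-01-01T00:00:00Z', createdDate: '2018-02-15T00:00:00Z', quantity: 12 },
  { start: '2018-01-01T01:00:00Z', createdDate: '2018-01-02T00:00:00Z', quantity: 7 }
];

var asOfJanuary = pointsAsOf(samplePoints, '2018-01-31T00:00:00Z');
var asOfMarch = pointsAsOf(samplePoints, '2018-03-01T00:00:00Z');
```

With this sample data, evaluating as of 2018-01-31 sees the original reading (10) for the first timestamp, while evaluating as of 2018-03-01 sees the late correction (12).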

#8

Can this behavior be activated automatically when calling evalMetric[s]?
In practice, we need to evaluate a CompoundMetric passing a cut-off datetime.


#9

@AlexBakic This would be a new feature request. It is currently not on the product roadmap, but we can help prioritize it.


#10

(Moved to TimeMachine: Evaluating metrics in the past.)

