What is Interval Splitting?

#1

The documentation I’m working from shows a multi-stage process for normalizing time-series data, with “Interval Splitting” as the final phase before the normalized time series is output.

However, Interval Splitting itself is not documented. What is it?

0 Likes

#2

Interval Splitting, sometimes referred to as “Interval Alignment”, is the process by which de-duplicated/de-overlapped/interpolated/etc. data points are aligned to regular* intervals. Each interval has a start and end spanning exactly the normalization interval, which is either determined by the Interval Detection normalization step or specified by the interval field on the header (see TimedDataFields.interval). Options for normalization intervals include MONTH, DAY, and HOUR; see the Interval enum type for the full list.

This step entails aggregating multiple data points that fall in the same interval, as well as disaggregating data points that span several intervals.

Here is an illustrated example: (single horizontal axis, depicting time; data points delimited by pipes)

Given: normalization interval `DAY`
Input: (continuous timeseries, data points at irregular intervals)
  1       2     3 4   5   6   7   8  9   <-- numbers for data point reference only
|----|---------|-|--|----|--|----|-|----|

Output: (continuous timeseries, each data point spans exactly 1 day)
  A    B    C    D    E    F    G    H
|----|----|----|----|----|----|----|----|

This is a list of some of the Interval Splitting operations happening in this example:

  • Data point 2 is disaggregated and used for outputs B and C
  • Data points 3 and 4 are aggregated and used for output D
  • Data point 7 is disaggregated into data points 7.1 and 7.2, which are used for outputs F and G, respectively
    • Data points 6 and 7.1 are aggregated and used for output F
    • Data points 7.2 and 8 are aggregated and used for output G
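
The operations listed above can be sketched in a few lines of Python. This is only an illustration, not the actual implementation: `split_intervals` is a hypothetical helper, time is measured in abstract integer ticks (5 ticks = 1 `DAY`, matching the character widths in the diagram), and it assumes each value is an additive total that is prorated by time overlap when a data point crosses an interval boundary. The real step may aggregate or disaggregate differently depending on the kind of data.

```python
from fractions import Fraction

# Hypothetical helper for illustration; not part of any documented API.
# ASSUMPTION: each value is an additive total over its span and is
# prorated by time overlap when a point crosses an interval boundary.
def split_intervals(points, interval):
    """Align (start, end, value) points to buckets of length `interval`.

    start/end are integers with start < end.
    Returns {k: value}, where bucket k covers [k*interval, (k+1)*interval).
    """
    buckets = {}
    for start, end, value in points:
        first = start // interval          # first bucket the point touches
        last = (end - 1) // interval       # last bucket the point touches
        for k in range(first, last + 1):
            overlap = min(end, (k + 1) * interval) - max(start, k * interval)
            share = Fraction(overlap, end - start)   # exact fraction of the point
            buckets[k] = buckets.get(k, 0) + value * share
    return buckets

# The nine input points from the diagram (5 ticks = 1 DAY), each worth 10.
edges = [0, 5, 15, 17, 20, 25, 28, 33, 35, 40]
points = [(a, b, 10) for a, b in zip(edges, edges[1:])]
daily = split_intervals(points, 5)
for label, k in zip("ABCDEFGH", sorted(daily)):
    print(label, daily[k])
```

Under these assumptions, point 2 contributes 5 each to B and C, points 3 and 4 sum to 20 in D, and point 7 is split 4/6 between F and G, giving F = 14 and G = 16. The total across all outputs stays 90, the same as the input.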

* “regular” in the sense that each data point represents 1 <INTERVAL>. E.g., each data point at MONTH interval spans exactly 1 month, although the number of days and milliseconds represented by each data point obviously differs.
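
To make the footnote concrete, here is a small sketch showing that calendar-aligned MONTH intervals have different lengths. `month_bounds` is a hypothetical helper, and the sketch assumes naive datetimes with no time zone or daylight-saving handling, which a real implementation would likely need.

```python
from datetime import datetime

# Hypothetical helper for illustration only.
# ASSUMPTION: naive datetimes; time zones and DST are ignored.
def month_bounds(ts):
    """Return the [start, end) bounds of the calendar month containing ts."""
    start = ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end

for ts in (datetime(2023, 1, 15), datetime(2023, 2, 15)):
    start, end = month_bounds(ts)
    print(start.date(), (end - start).days)   # 31 days, then 28 days
```

Both outputs are exactly one MONTH interval, even though one spans 31 days and the other 28.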

3 Likes

#3

Thank you very much for the detailed answer.

0 Likes