DataLoadQueue and AnalyticsQueue errors using FileTimedData

#1

I'm looking at a couple of different issues in an environment that uses types mixing in FileTimedDataHeader/FileTimedDataPoint, loads data at a second-level interval, and has several analytics set up based on the measurements coming in from this data. I have been seeing a ton of errors in both the DataLoadQueue and the AnalyticsQueue, so there are really two parts to this post.

Question 1

DataLoadQueue errors all look like this:

Could not acquire lock for key 3#FDI#672#CON055.PCV - Gas Pressure

  1. I saw this Jira ticket talking about a similar data load issue and was wondering: is the FileDataConfig#indexAfterUpsert flag enabled by default? It's not set in this environment, so I'm not sure whether setting it to false would help (a rough sketch of my mental model follows this list).
  2. If this isn't the issue, are there any other reasons why this would be happening so often?
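
To make question 1 concrete, here is a purely hypothetical sketch of how I'm thinking about it. This is not the real FileDataConfig API, just a model of a nullable flag falling back to a platform default I don't know:

```typescript
// Hypothetical model of the question, not the real FileDataConfig API.
// The flag is unset in this environment, so whatever the platform default
// is will apply; that default is exactly what I'm asking about.

interface FileDataConfigView {
  indexAfterUpsert?: boolean; // assumed to be a nullable boolean field
}

function effectiveIndexAfterUpsert(
  cfg: FileDataConfigView,
  platformDefault: boolean // unknown to me; the point of question 1
): boolean {
  return cfg.indexAfterUpsert ?? platformDefault;
}

// Our config today: the flag is not set.
const cfg: FileDataConfigView = {};
console.log(effectiveIndexAfterUpsert(cfg, true));  // -> true if the default is enabled
console.log(effectiveIndexAfterUpsert(cfg, false)); // -> false if the default is disabled
```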

Question 2

The AnalyticsQueue issue that they're having is similar to this community post and this Jira ticket. It looks like there are multiple analytics trying to access the normalized timeseries for data that's continuously being invalidated (a rough model of how I picture that contention follows the error below).

unable to preprocess sources for 17 DFEs:

C3RuntimeException: unable to load metric 'PumpTorqueSDGasThroughPump' for ID CON044 (start 2018-05-29T11:40:00.000Z, end 2018-05-29T15:20:00.000Z):

C3RuntimeException: c3.love.exceptions.C3RuntimeException: c3.love.exceptions.C3RuntimeException: MetricEngine error : c3.love.exceptions.C3RuntimeException: Error c3.love.exceptions.C3RuntimeException: Unable to normalize timeseries action=fetchNormalizedData,type=WellStreamMeasurementSeries,parentId=CON044.Pump Speed Actual,error=Internal Error : Could not resolve concurrency scenario for normalization after 3 attempts.
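
For what it's worth, here is a rough, hypothetical model of how I picture the failure. This is my guess at an optimistic-concurrency retry pattern, not C3 internals: each normalization attempt races against a fresh invalidation from the data-load side and gives up after three tries.

```typescript
// Hypothetical model only: the normalizer reads a version of the series,
// computes normalized data, and commits only if nothing invalidated the
// series in the meantime. With data arriving at a second-level interval,
// every attempt can lose the race.

interface SeriesState {
  version: number; // bumped whenever new raw data invalidates the series
}

function commitSucceeds(series: SeriesState, versionSeen: number): boolean {
  // Commit succeeds only if no invalidation happened since we started.
  return series.version === versionSeen;
}

function normalizeWithRetry(series: SeriesState, maxAttempts = 3): void {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const versionSeen = series.version;
    // Simulate the data-load side invalidating the series mid-normalization.
    series.version += 1;
    if (commitSucceeds(series, versionSeen)) return;
  }
  throw new Error(
    `Could not resolve concurrency scenario for normalization after ${maxAttempts} attempts`
  );
}

try {
  normalizeWithRetry({ version: 0 });
} catch (e) {
  console.log((e as Error).message);
}
```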

I have a couple of questions here:

  1. Is the only real workaround for this to pause the AnalyticsQueue until data loading cools down a little bit, as mentioned in https://c3energy.atlassian.net/browse/EXC-14025 ?
  2. Would changing the Normalization config potentially improve this performance? Incremental normalization is turned on, but it's set to ALL right now. I don't think this will change much, though, because if they have analytics set up to trigger on these metrics, isn't normalization going to happen anyway?
  3. How much benefit would there be in setting bucketInterval to a smaller interval? Right now it's set to DAY for all 99,168 series I see, but a very large portion of these are receiving data at a higher rate than that (86,772 with interval: 'MINUTE', 2,066 with interval: 'HOUR'). Some back-of-the-envelope arithmetic follows this list.
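
For context on question 3, here is the back-of-the-envelope arithmetic on how many raw points land in one bucket at these rates. The assumption that smaller buckets would mean less work re-normalized per invalidation is mine and not confirmed:

```typescript
// Points per normalization bucket for the intervals in this environment.
const SECONDS: Record<string, number> = {
  SECOND: 1,
  MINUTE: 60,
  HOUR: 3_600,
  DAY: 86_400,
};

function pointsPerBucket(dataInterval: string, bucketInterval: string): number {
  return SECONDS[bucketInterval] / SECONDS[dataInterval];
}

// Current config: bucketInterval = DAY for all 99,168 series.
console.log(pointsPerBucket("MINUTE", "DAY")); // 1,440 points per bucket (86,772 series)
console.log(pointsPerBucket("HOUR", "DAY"));   // 24 points per bucket (2,066 series)

// With bucketInterval = HOUR instead:
console.log(pointsPerBucket("MINUTE", "HOUR")); // 60 points per bucket
```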

#2

@rohit.sureka can elaborate, but both of these are being fixed right now. The workaround is to recover the failed actions; they should complete.

  1. Yes, and recover the failed actions.

  2. We are better off normalizing incrementally. If normalization is triggered via analytics, it will perform a full normalization.

  3. This only affects normalization speed and has been tuned empirically. Seems to be working well.


#3

We are better off normalizing incrementally

I wasn't referring to turning off incremental normalization, but rather to changing the TenantConfig to something non-blocking like RECENT or AFTERQUERY; a sketch of what I mean follows. It seems like it could be an issue if, on one side, you have analytics invalidating/normalizing the entire timeseries and, on the other, new data coming in at a second-level interval.
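
To be explicit about the change I have in mind, here is a hypothetical sketch. The field names are made up; only the mode values (ALL, RECENT, AFTERQUERY) come from the tenant settings I described above:

```typescript
// Hypothetical illustration of the proposed config change, not the real
// TenantConfig schema.

type IncrementalNormalizationMode = "ALL" | "RECENT" | "AFTERQUERY";

interface NormalizationTenantSettings {
  incrementalNormalizationEnabled: boolean;               // assumed field name
  incrementalNormalizationMode: IncrementalNormalizationMode; // assumed field name
}

const current: NormalizationTenantSettings = {
  incrementalNormalizationEnabled: true,
  incrementalNormalizationMode: "ALL", // the blocking setting we have today
};

const proposed: NormalizationTenantSettings = {
  ...current,
  incrementalNormalizationMode: "RECENT", // or "AFTERQUERY", the non-blocking options
};

console.log(current, proposed);
```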
