fetchNormalizedData failed : Could not acquire lock for key


#1

When I do tsEval on normalized data, I get the below error message:

message: “DataLoaderImpl Error : Call to fetchNormalizedData failed : Could not acquire lock for key 3#450#ABCD1032401_1913165027_XXX-YY-5000-US-22”

What could be the cause of this error “Could not acquire lock for key”? Any suggestion to resolve this kind of error?


closed #2

#3

Before normalization begins, a DbLock is acquired on the series header (i.e. the same series cannot be normalized
concurrently within the cluster). Currently, the normalization engine attempts to acquire the lock 10 times, with a delay
of 100 ms between attempts, after which it fails with a "Could not acquire lock" exception.
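
To make the retry behavior described above concrete, here is a minimal sketch of a bounded lock-acquisition loop. It is not the platform's DbLock code: `acquireWithRetry`, `tryLock`, and `sleep` are hypothetical names, and the sketch assumes the delay is applied only between attempts.

```javascript
// Hypothetical sketch of "try N times with a fixed delay, then fail".
// tryLock: function returning true when the lock was obtained (assumed).
// sleep:   function that blocks for the given milliseconds (assumed, injected
//          so the loop can be exercised without real waiting).
function acquireWithRetry(tryLock, maxAttempts = 10, delayMs = 100, sleep = () => {}) {
  let waitedMs = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (tryLock()) {
      return { acquired: true, attempts: attempt, waitedMs };
    }
    // Wait only if another attempt is still to come.
    if (attempt < maxAttempts) {
      sleep(delayMs);
      waitedMs += delayMs;
    }
  }
  throw new Error(
    "Could not acquire lock after " + maxAttempts + " attempts (waited " + waitedMs + " ms)"
  );
}
```

Under these assumptions, 10 attempts give up after 900 ms of waiting, so any normalization that holds the lock for longer than about a second would cause a concurrent request for the same series to fail.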

These errors are retryable, so you can retry the normalization by recovering the failed jobs on the queue, or by simply re-querying. If there is high contention, you can raise the number of lock attempts via:

TenantConfig.upsert({id: "NormalizationMaxLockAttempts", value: 100}); // attempts 100 times instead of 10 (default)
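
If you would rather retry from the calling code than raise the limit, something along these lines might work. This is only a sketch: `retryOnLockError` is a hypothetical helper, and matching on the message text is an assumption about how the error surfaces to the caller.

```javascript
// Hypothetical helper: re-runs fn only when the failure looks like the
// lock-contention error; any other error is rethrown immediately.
function retryOnLockError(fn, maxRetries = 3) {
  for (let attempt = 0; ; attempt++) {
    try {
      return fn();
    } catch (err) {
      const retryable = String(err && err.message).includes("Could not acquire lock");
      if (!retryable || attempt >= maxRetries) {
        throw err;
      }
    }
  }
}
```

Here `fn` would wrap whatever call triggers the normalization, and it runs at most `maxRetries + 1` times.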

#5

I’ve run into this same issue with a BatchJob I’ve implemented. I’ve tried increasing the NormalizationMaxLockAttempts to 1000 but still occasionally get the error. After increasing the attempts I also started getting the following error:

errorMsg: c3.love.exceptions.C3RuntimeException: c3.love.exceptions.C3RuntimeException: MetricEngine error : c3.love.exceptions.C3RuntimeException: Error c3.love.exceptions.C3RuntimeException: Unable to normalize timeseries action=fetchNormalizedData,type=RollupHistoricalTestSeries,parentId=PC|F|0000002230|1319,error=Internal Error : Could not resolve concurrency scenario for normalization after 3 attempts. at c3.engine.database.timeseries.normn.FetchNormalizedDataTask.throwError(FetchNormalizedDataTask.java:191) at ...

Both errors show up in BatchQueue.errors().

Each batch of my batch job runs a rollupMetrics() function. There is a small chance that two batches running at the same time have a timeseries ID in common.

Could that be what is causing these errors? Any ideas? Thanks in advance!
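
One way to rule out the overlap would be to deduplicate the IDs before splitting them into batches, so that no ID can land in two batches. A rough sketch, assuming the full list of IDs is known up front (`partitionUniqueIds` is a hypothetical helper, not a platform function):

```javascript
// Hypothetical helper: removes duplicate IDs, then splits the remainder into
// batches of at most batchSize, so each ID appears in exactly one batch.
function partitionUniqueIds(ids, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const unique = Array.from(new Set(ids)); // Set keeps first-seen order
  const batches = [];
  for (let i = 0; i < unique.length; i += batchSize) {
    batches.push(unique.slice(i, i + batchSize));
  }
  return batches;
}
```

This only helps if the shared series comes from the ID lists themselves. It would not help if two different IDs resolve to the same series header during the rollup.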


#6

Can we avoid locking, assuming we know there are no concurrent accesses to a header?