DFE: How to Adjust How Often DFEs Are Triggered

#1

I understand that a DFE will be kicked off whenever the underlying data changes via a data load or change in the environment.

I have two scenarios I want to evaluate:
Scenario 1:
I want to state that I only want to trigger DFEs at most once every 2 hours, so that all data changes within a 2-hour period are batched together when the analytic is evaluated.

I see a ‘frequency’ field on c3ShowType("Ext.DFE") (the @DFE annotation).

Is this field referring to the frequency at which the analytic will be triggered?

Or is there another setting at the TenantConfig or environment level that controls the frequency at which DFEs (and therefore analytics) are invalidated?

Scenario 2:
I have data streaming into the environment every two hours, and it appears that the analytics/DFEs are only being evaluated once per day.

How do I check the settings that control the frequency of DFE evaluation?

#2

A few things to look at:

  1. Check the InvalidationQueues. Analytics are processed asynchronously in the queues, so if the queues are busy with other jobs, resolution of the analytics can be delayed.

  2. Is the analytic result actually being persisted by the analytic, or is there another process in the code base (like a CronJob) that takes the analytic result and generates the result that is expected? Essentially, make sure you’re looking at the right “expected” result.

  3. What is the actual arrival frequency of the data? Double-check that the data is arriving when you think it is arriving (see the sketch after this list).

  4. Once you have checked the first three items (the queues are relatively empty, you properly understand the analytic result, and the data is definitely arriving at the expected frequency), fire the analytic manually for a subset of the entities and see whether results are persisted as you expect. If the manual firing gives you the results you expect, then we can talk about analytic/DFE trigger frequency. If not, follow the standard analytic debugging steps.

  5. Guidance on how to configure the frequency of the DFE triggering is still needed.
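
For item 3, here is a minimal sketch in plain JavaScript of the kind of sanity check meant there. It does not use any real platform API: `fetchRecentTimestamps` is a hypothetical placeholder (stubbed with sample data) for however you pull record arrival times out of your environment.

```js
// Minimal sketch for item 3: verify the real arrival frequency of the data.
// `fetchRecentTimestamps` is a hypothetical placeholder for however you query
// record arrival times from your environment; it is stubbed here with sample data.
function fetchRecentTimestamps() {
  return [
    "2020-01-01T00:05:00Z",
    "2020-01-01T02:10:00Z",
    "2020-01-01T04:02:00Z"
  ].map(function (s) { return new Date(s); });
}

function arrivalsPerHour(timestamps) {
  var counts = {};
  timestamps.forEach(function (t) {
    // Truncate each timestamp to the hour and count arrivals in that bucket.
    var bucket = t.toISOString().slice(0, 13) + ":00Z";
    counts[bucket] = (counts[bucket] || 0) + 1;
  });
  return counts;
}

console.log(arrivalsPerHour(fetchRecentTimestamps()));
// If the buckets do not line up with the cadence you expect,
// the problem is upstream of the DFE/analytic triggering.
```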

#3

@ColumbusL

Today the analytics are fired (entered in the queues) every time a data change happens. The setting for the frequency at which analytics are invoked is disabled and obsolete: the overhead of storing an analytic result per source per execution was too high, so the feature has been deprecated.

To summarize: as it stands today, the analytic will be invoked every time a data change happens, and the application code can do its own state management in the process function to decide whether it wants to do anything. This should answer both scenario 1 and scenario 2.
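
To make the “own state management” idea concrete, here is a minimal sketch in plain JavaScript. It deliberately uses no real platform API: `loadState`, `saveState`, and `evaluateBatch` are hypothetical placeholders for however your application persists per-source state and runs the expensive part of the analytic.

```js
// Sketch of time-based throttling inside a process function: the analytic is
// still invoked on every data change, but the expensive evaluation runs at
// most once per 2-hour window, with intermediate changes batched.
var TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// In-memory stand-in for persisted per-source state (hypothetical).
var stateStore = {};
function loadState(sourceId) { return stateStore[sourceId]; }
function saveState(sourceId, state) { stateStore[sourceId] = state; }

// Hypothetical stand-in for the expensive analytic body.
function evaluateBatch(sourceId, batch) {
  console.log("Evaluating", batch.length, "batched changes for", sourceId);
}

function process(sourceId, change) {
  var state = loadState(sourceId) || { lastRunMs: 0, pending: [] };
  state.pending.push(change);                 // always record the change

  var now = Date.now();
  if (now - state.lastRunMs < TWO_HOURS_MS) {
    saveState(sourceId, state);               // too soon: batch and exit
    return;
  }

  evaluateBatch(sourceId, state.pending);     // at most once per window
  state.pending = [];
  state.lastRunMs = now;
  saveState(sourceId, state);
}
```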

#4

Let’s suppose we get new data every minute. If we don’t want to trigger DFEs all the time, is it possible to trigger them only once a day, for example?

#5

@romain.juban In that case it would be better to write a CronJob, since you don’t care whether data has arrived or not!

#6

@bachr That would work in theory, but it would be inefficient. We don’t want to run a job on ‘everything’; we just want to run a job every x minutes/hours on the instances where the data actually changed. So essentially a ‘CronAnalytic’ kind of thing.
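
A rough sketch of that “CronAnalytic”-style compromise, again in plain JavaScript with hypothetical placeholders (`listSources`, `lastChangeTime`, `runAnalyticFor`) rather than real platform APIs: a scheduled job that runs on a fixed cadence but only processes the instances whose data changed since the previous run.

```js
// Scheduled job that skips instances with no new data since the last run.
// `listSources`, `lastChangeTime`, and `runAnalyticFor` are hypothetical
// placeholders; they are stubbed here so the sketch is self-contained.
var sources = { a: { changedMs: 0 }, b: { changedMs: 0 } };
function listSources() { return Object.keys(sources); }
function lastChangeTime(id) { return sources[id].changedMs; }
function runAnalyticFor(id) { console.log("Running analytic for", id); }

var lastJobRunMs = 0;

function scheduledJob() {
  var startedMs = Date.now();
  listSources().forEach(function (id) {
    // Only touch instances whose data changed since the previous scheduled run.
    if (lastChangeTime(id) > lastJobRunMs) {
      runAnalyticFor(id);
    }
  });
  lastJobRunMs = startedMs;
}

// e.g. schedule every 2 hours:
// setInterval(scheduledJob, 2 * 60 * 60 * 1000);
```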

#7

Can you have a pipeline stage where you create batches from the events that arrive in each 2-hour window, then put them into the next queue every 2 hours? The analytic would then listen to that next queue.
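
Sketching that batching-stage idea in plain JavaScript (with a hypothetical `enqueueDownstream` standing in for the next queue the analytic listens to): incoming events are buffered as they arrive and flushed downstream once per 2-hour window.

```js
// Buffer every incoming data change and hand the whole batch to the
// downstream stage once per 2-hour window. `enqueueDownstream` is a
// hypothetical stand-in for the queue the analytic actually listens to.
var TWO_HOURS_MS = 2 * 60 * 60 * 1000;
var buffer = [];

function onEvent(event) {
  buffer.push(event);                          // stage 1: just accumulate
}

function flush() {
  if (buffer.length === 0) return;             // nothing arrived this window
  enqueueDownstream(buffer.splice(0, buffer.length));
}

function enqueueDownstream(batch) {
  console.log("Forwarding", batch.length, "events to the analytic's queue");
}

setInterval(flush, TWO_HOURS_MS);              // flush on the 2-hour cadence
```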