As I understand it, ACE works by distributing the work across IDs/sources. Loosely speaking, each thread on a machine processes the analytic for one source/ID at a time, and once that processing is done, processing starts for a different source/ID. If that is the case, why is loadContextAll() needed? Shouldn’t loadContext() be enough?
loadContextAll() is helpful if your sources all share a context. For example, maybe you have 1 trained ML model that all your sources are using for predictions. With loadContextAll() you can fetch the model one time and then the model will be passed to all the ‘process’ functions without needing to fetch the same object once per source.
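To make the "fetch once, share with every source" idea concrete, here is a minimal sketch in plain JavaScript. The function signatures, the `fetchModel` helper, and the `runBatch` driver are all hypothetical stand-ins invented for illustration; they are not the actual ACE API, only an imitation of the call pattern described above.

```javascript
// Hypothetical stand-in for an expensive fetch (e.g. loading a trained ML model).
let fetchCount = 0;
function fetchModel() {
  fetchCount += 1;
  return { predict: (x) => x * 2 };
}

// Assumed shape: called once for the whole batch; its result is shared by all sources.
function loadContextAll() {
  return { model: fetchModel() };
}

// Assumed shape: called once per source and handed the shared context.
function processSource(source, sharedContext) {
  return { id: source.id, prediction: sharedContext.model.predict(source.value) };
}

// Tiny driver that imitates what the framework would do; not ACE itself.
function runBatch(sources) {
  const shared = loadContextAll();
  return sources.map((source) => processSource(source, shared));
}

const results = runBatch([
  { id: "a", value: 1 },
  { id: "b", value: 2 },
  { id: "c", value: 3 },
]);
```

In this sketch three sources are processed but `fetchModel` runs only once, which is the saving loadContextAll() is meant to provide.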
To follow up on this: what is the benefit of doing something in loadContext() vs. processSource() if you are applying source-specific logic before processing the input?
It is possible that the arrival of a particular piece of data could invalidate several periods for a given source, resulting in the “processSource” method being called several times for a given source (once for each timerange that was invalidated).
In this case, data that would be shared between all timeranges (e.g. again, an ML model) can be fetched one time in loadContext() and the object will be passed to each call to processSource().
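A rough sketch of that situation, again in plain JavaScript: one source has several invalidated timeranges, so the per-source context is loaded once and reused for each processSource call. The signatures, the `baseline` field, and the `runSource` driver are assumptions made up for the example, not the real ACE interface.

```javascript
// Counts how often the per-source context is loaded.
let loadCount = 0;

// Assumed shape: called once per source; this is where any fetch would happen.
function loadContext(sourceId) {
  loadCount += 1;
  return { sourceId, baseline: 100 }; // hypothetical per-source data
}

// Assumed shape: called once per invalidated timerange, with no I/O of its own.
function processSource(context, timeRange) {
  return {
    sourceId: context.sourceId,
    start: timeRange.start,
    end: timeRange.end,
    value: context.baseline + (timeRange.end - timeRange.start),
  };
}

// Driver imitating the framework: one context load, many process calls.
function runSource(sourceId, invalidatedRanges) {
  const context = loadContext(sourceId);
  return invalidatedRanges.map((range) => processSource(context, range));
}

const outputs = runSource("sensor-1", [
  { start: 0, end: 10 },
  { start: 10, end: 30 },
  { start: 30, end: 60 },
]);
```

Here three timeranges are processed with a single call to loadContext, so anything fetched there is paid for once per source rather than once per timerange.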
Now suppose all my sources are partitioned into 3 groups according to some field. Hence I have 3 different ML models.
I would fetch the 3 models once in the loadContextAll() function.
Now for each source I can get the applicable model in loadContext() or in processSource() (no additional I/O, just some JS logic). Which one is better, or does it even matter?
I think that in your use-case it probably depends, e.g., on the size of the model in memory and the number of sources. You’d want to do some performance testing. It’s not clear that one way or the other would obviously be better.
Performance-wise there might not be a big difference today (as Riley said: to be benchmarked).
However, from a code perspective I’d select the model in the loadContext() function. I believe it’s a better separation of concerns between the 2 actions (getting everything I need vs. processing my data).
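For the three-groups case, the split suggested here could look roughly like the sketch below: all models are fetched in loadContextAll(), the right one is picked per source in loadContext(), and processSource() only computes. The signatures, the `group` field, the group names, and the toy models are hypothetical and only illustrate the separation of concerns; they are not taken from the ACE API.

```javascript
// Assumed shape: fetch every model once; keyed by the partitioning field.
function loadContextAll() {
  return {
    models: {
      groupA: { predict: (x) => x * 10 },
      groupB: { predict: (x) => x * 100 },
      groupC: { predict: (x) => x * 1000 },
    },
  };
}

// Assumed shape: per-source selection, pure in-memory lookup with no I/O.
function loadContext(source, shared) {
  const model = shared.models[source.group];
  if (!model) {
    throw new Error(`No model available for group "${source.group}"`);
  }
  return { model };
}

// Assumed shape: processing only uses what the context already holds.
function processSource(source, context) {
  return context.model.predict(source.value);
}

// Driver imitating the framework's call order; not ACE itself.
function runAll(sources) {
  const shared = loadContextAll();
  return sources.map((source) => processSource(source, loadContext(source, shared)));
}

const predictions = runAll([
  { id: "s1", group: "groupA", value: 1 },
  { id: "s2", group: "groupB", value: 2 },
  { id: "s3", group: "groupC", value: 3 },
]);
```

Doing the lookup in loadContext() also gives one obvious place to fail early when a source belongs to a group that has no model, instead of discovering it in the middle of processing.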
What Louis says is definitely accurate: you should use loadContext() to fetch objects, not process() (basically, you should avoid doing ANY I/O in the process functions).
I think the question was about the difference between loadContextAll() and loadContext() in the case where the object <-> source mapping is somewhere between 1<->many and 1<->1, like the few<->1 case. In few<->1 you’re going to have to play with it to see what’s fastest.