How to abort SourceFile.processAll that's gone bad


#1

I synced a whole bunch of files and wanted to process them all.

I ran:

SourceFile.processAll({filter: 'status=="initial"'})

and found that the command was processing the files one by one, which is obviously not ideal. I realized I should have used this instead:

SourceFile.processAll({filter: 'status=="initial"', async: true})

which would process them all in the background in parallel. However, all the files from my initial operation have moved from initial status to scheduled status, and I can't seem to undo this.

What does scheduled actually mean, and how do I get the files unscheduled so I can re-issue the command correctly?


#2

@paulyip processAll schedules all the files, but a batch of them at a time; async only makes the processAll call itself return almost immediately.
Once the files are scheduled (chunked and placed in the queue), the chunks are processed asynchronously and in parallel.
For more information on the various source file statuses, see SourceFileStatus.
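
To make the "chunked and placed in the queue" idea concrete, here is a rough toy model in plain JavaScript. It is only a sketch of the concept and not how the platform implements it: chunk, scheduleAll, drain, the chunk size of 2, and the "done" status are all made-up names and values for illustration.

```javascript
// Toy model only: none of these functions are platform APIs.

// Split a list into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// "Scheduling": mark every file as scheduled and enqueue it chunk by chunk.
// Returns the number of chunks waiting in the queue.
function scheduleAll(files, queue, chunkSize) {
  for (const batch of chunk(files, chunkSize)) {
    for (const f of batch) f.status = "scheduled";
    queue.push(batch);
  }
  return queue.length;
}

// A worker takes chunks off the queue until it is empty.
// "done" is a placeholder status, not a real SourceFileStatus value.
function drain(queue) {
  let processed = 0;
  while (queue.length > 0) {
    const batch = queue.shift();
    for (const f of batch) {
      f.status = "done";
      processed++;
    }
  }
  return processed;
}

// Five hypothetical files, all starting in "initial" status.
const files = Array.from({ length: 5 }, (_, i) => ({ id: i, status: "initial" }));
const queue = [];
const queuedChunks = scheduleAll(files, queue, 2);
```

In this model, every file already reads "scheduled" as soon as scheduleAll returns, even though no worker has touched it yet, so a filter on status=="initial" would no longer match any of them.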

If you still want to undo the processing, you will have to clean up the queues:
JMS.purgeQueues("jms queue names you care about")

Then run processAll again.


#3

Thank you for replying.

From what I observed, I am very certain that when I run it WITHOUT the {async: true} option, the processing of files is serialized.

When I monitored the queues, there was very little buildup in DataLoadQueue, for example. When I ran it WITH {async: true}, there was a much stronger backlog in DataLoadQueue, which let more workers participate in the work and gave more throughput. Am I going out of my mind?

We did seem to have a problem with a runaway process, and Ops helped cancel that processAll() action.

From there, I purged the queues using JMS.purgeQueues as suggested. I will now try to get this data load restarted.