SourceFile.fetch() after SourceFile.process() leads to SourceFile stuck in Processing


#1

I run SourceFile.process(<>) and then SourceFile.fetch().

The SourceFile I processed now shows a status of processing, but no activity is visible in either the JMSDataLoadQueue or Cluster.actionDump(), and no records are loaded. So I am fairly certain the file was not actually processed.

What are some helpful steps to debug this?


#3

I find the following are good places to look when debugging loading issues:

DataLoadUploadLog.fetch({order: "descending(meta.updated)"})
– very similar data to SourceFile

DataLoadProcessLog.fetch({order: "descending(meta.updated)"})
– gives information on each chunk (including errors) after that chunk has been processed.

Check InvalidationQueue.countAll() and look for queues with failures, then inspect those failures with .errors().


#4

@ColumbusL have you verified that FileSourceSystem and FileSourceCollection exist, FileSourceCollection.get('fsc_id').rootUrl() returns a meaningful path, and that the path is accessible via S3.listFiles()?
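
If it helps, the checks above can be run in order from the console with a small helper that stops at the first step that fails. This is only a minimal sketch: `runChecks` is a hypothetical helper and not part of the platform, it assumes a console that supports modern JavaScript syntax, and the calls in the usage comment are the ones named in this thread with their arguments guessed.

```javascript
// Hypothetical helper (not a platform API): run named checks in order
// and stop at the first one that returns a falsy value or throws.
function runChecks(checks) {
  const results = [];
  for (const { name, run } of checks) {
    try {
      const value = run();
      const ok = Boolean(value);
      results.push({ name, ok, value });
      if (!ok) break; // later checks depend on earlier ones
    } catch (err) {
      results.push({ name, ok: false, error: String(err) });
      break;
    }
  }
  return results;
}

// Possible usage in the console. The argument passed to S3.listFiles is an
// assumption; the thread only names the call.
// runChecks([
//   { name: 'collection exists', run: () => FileSourceCollection.get('fsc_id') },
//   { name: 'rootUrl is set',    run: () => FileSourceCollection.get('fsc_id').rootUrl() },
//   { name: 'path is listable',  run: () => S3.listFiles(FileSourceCollection.get('fsc_id').rootUrl()) },
// ]);
```

The last entry in the returned array tells you which step to look at first.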


#5

Another thing I would try is S3File.fromString('path').readObjs({serType:MyCanonicalType}) to see whether the platform can read the file and parse it as the intended canonical type.


#6

Hi @yaroslav / @ColumbusL: I have noticed this issue under two scenarios in v7.8:

  1. No data in files:
    This is a bug where SourceFile shows the wrong status. The platform team is aware of this issue and has a ticket to fix it.

  2. ContentType: UTF-16
    We have seen this happen when the contentType is UTF-16. This is also a bug and we have a ticket for it. As a workaround, the file can be processed if JMS processing is disabled on the FileSourceCollection:

    FileSourceCollection.merge({id:'XYZ','jmsDisabled':true}) and then reprocess the file.
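
To tell which of the two scenarios a given file falls into before reprocessing, you can look at a local copy of the file. The sketch below is not a platform API: `classifyFile` is a hypothetical helper that treats "no data" as a zero-byte file and recognises UTF-16 only by its byte order mark, so a UTF-16 file without a BOM, or a file that contains only a header row, will come back as 'other'.

```javascript
// Hypothetical helper: classify a file from its raw bytes.
// Accepts a Node Buffer, a Uint8Array, or a plain array of byte values.
function classifyFile(bytes) {
  if (bytes.length === 0) return 'empty';
  if (bytes.length >= 2) {
    // UTF-16 byte order marks: FF FE (little endian), FE FF (big endian).
    // Note: a UTF-32LE BOM also starts with FF FE and is not distinguished here.
    if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le';
    if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be';
  }
  return 'other';
}

// Possible usage in Node on a downloaded copy (the path is a placeholder):
// classifyFile(require('fs').readFileSync('path'));
```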