Syncing Files Resets Status to Initial

#1

Both FileSourceCollection.get('CanonicalName').sync and SourceFile.syncAll(fsc.rootUrl()) reset the status of existing SourceFile records to initial, which causes files to be re-processed repeatedly.
This appears to have started with 7.8.8.

Our current process:
Files are placed in the Azure blob by an external tool. A cron job periodically calls SourceFile.syncAll(), passing in the FileSourceCollection’s rootUrl, to create the SourceFile records. A second cron job then periodically fetches the SourceFile records whose status is initial, updates their contentType (the external tool cannot set it), syncs them, and processes them.

That used to work fine, but now old files keep getting reset back to initial, so they keep getting re-processed. Is this expected behavior, or is there a better way of doing this?

Some other notes from experimenting:
If the file in the Azure blob has a contentType set (manually in Azure, by a different external tool that can set it properly, or by a previous File API call), then both FileSourceCollection.sync and SourceFile.syncAll wipe that contentType from the SourceFile record. SourceFile.syncFile, however, retains or restores the contentType, with the downside that you need to know the specific file name.
SourceFile.syncFile also does not appear to reset the status back to initial, unlike the other two methods.

My current thought is to switch to doing a FileSourceCollection.listFiles and then calling SourceFile.syncFile for each file, as that appears to be the only sync method that does not wipe out / reset information for existing files. Is this the correct way of doing things?
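
For illustration, here is a rough sketch of the loop I have in mind, written in Python rather than against the real platform API. The list_files and sync_file callables are hypothetical stand-ins for FileSourceCollection.listFiles and SourceFile.syncFile; their actual signatures and return types are assumptions on my part.

```python
from typing import Callable, Dict, Iterable, List


def sync_each_file(
    list_files: Callable[[], Iterable[str]],  # hypothetical wrapper around listFiles
    sync_file: Callable[[str], None],         # hypothetical wrapper around syncFile
) -> Dict[str, List[str]]:
    """Sync files one at a time and report which URLs succeeded or failed."""
    result: Dict[str, List[str]] = {"synced": [], "failed": []}
    for url in list_files():
        try:
            sync_file(url)
            result["synced"].append(url)
        except Exception:
            # Keep going so that one bad file does not block the rest.
            result["failed"].append(url)
    return result
```

Syncing per file costs one call per file instead of one bulk call, so it may be noticeably slower on large collections.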


#2

@mjlovell This is not expected behavior. syncAll/sync always updates a file's SourceFile record when any of the file's metadata in the source has changed, e.g. contentType, lastModifiedDate, or contentEncoding.
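
To make that rule concrete, here is a minimal sketch in Python. It is an illustration only, not the platform's actual implementation: the FileMeta class and needs_update function are hypothetical, and treating unchanged metadata as "no update" is my reading of the intended behavior. Only the three field names come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical snapshot of the metadata a sync would compare."""
    content_type: Optional[str]
    last_modified: Optional[str]
    content_encoding: Optional[str]


def needs_update(stored: FileMeta, source: FileMeta) -> bool:
    """Return True when any compared metadata field differs."""
    # Dataclass equality compares all fields, so any difference triggers an update.
    return stored != source
```

Under this reading, a file whose metadata is identical in the source and in SourceFile should be left alone by a sync.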

This could be a bug. Can you create a ticket that includes the environment and example files we can use to reproduce it? Thanks.
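
One way to collect a reproducible example is to snapshot each record's status before and after a sync and list the ones that went back to initial. A small sketch in Python, with plain dicts (URL to status) standing in for SourceFile records; find_resets is a hypothetical helper, not a platform call.

```python
from typing import Dict, List


def find_resets(
    before: Dict[str, str],
    after: Dict[str, str],
    initial: str = "initial",
) -> List[str]:
    """Return URLs whose status was not initial before the sync but is afterwards."""
    return sorted(
        url
        for url, status in after.items()
        # Files first seen in this sync are legitimately initial, so skip them.
        if status == initial and before.get(url, initial) != initial
    )
```

The returned URLs, together with their metadata in the source, would make a compact reproduction case.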
