Best Practice for Serving Files to Users?


#1

What are best practices for serving files from the FileSystem (S3 Bucket or Azure Blob) to a user?

The API /file/1/tenant/tag/url appears to partially work and so far may be the best option. However, we cannot find any way to secure that API by restricting it to specific users and specific file paths. It also opens the file in the browser window (content type ‘text/csv’) rather than downloading it, even when the ‘download’ attribute is set on the anchor tag.

For small files, reading the file contents as either binary or string and passing them back to the browser as an API result for a client-side save-as works fine. However, many of these files are rather large (~100MB), so this does not seem like a good option for those.

The use case is as follows:
The application does monthly data processing to generate a large table of calculated values, which is displayed as part of the application. The business wants to save a snapshot of this table each month so its data scientist can download it and compare how the calculated results change over time. Saving a snapshot of the table to the FileSystem is complete and working fine. Now we just need a way for the people who need it to get it.


#2

You can use the File (or S3File) API to download files from the FileSystem.
Like any other type in the type system, it is called via '/api/1/tenant/tag/' + typename + '?action=' + action
So in your case, for S3 files, use the read or readString action:

POST /api/1/tenant/tag/S3File?action=read

The body of the POST should be something like:

{
    "this": {
        "type": "S3File",
        "url": "full_s3_path"
    }
}
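
As a rough illustration of the URL pattern and body above, here is a minimal Python sketch that only builds the request (it does not send it). The function name build_action_request and the host example.invalid are hypothetical, and authentication is left out because the thread does not describe it.

```python
import json
from urllib.parse import quote, urlencode


def build_action_request(base_url, tenant, tag, type_name, action, file_url):
    """Return (url, body) for a POST that invokes `action` on `type_name`."""
    # Path follows the pattern quoted above:
    # /api/1/<tenant>/<tag>/<typename>?action=<action>
    path = "/api/1/{}/{}/{}".format(quote(tenant), quote(tag), quote(type_name))
    url = base_url.rstrip("/") + path + "?" + urlencode({"action": action})
    # Body mirrors the JSON shown above: the instance to act on goes under "this".
    body = json.dumps({"this": {"type": type_name, "url": file_url}})
    return url, body


url, body = build_action_request(
    "https://example.invalid",  # hypothetical host, replace with your environment
    "tenant", "tag", "S3File", "read", "full_s3_path",
)
print(url)
print(body)
```

The same helper would cover readString by passing a different action value.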

Now you can use the C3 permission framework (with AdminGroups, Roles and Permissions) to control access to the S3File API.
For more details on how to make those network calls, check the network tab in your browser.

You can also wrap the call to S3File.read in your own Type/Action if you need some logic to run before deciding what to return. See the post "Create an API for a 3rd party service".


#3

The /file/ endpoint is the preferred option for accessing content in the C3 File System. @mjlovell, when you say ‘partially works’, what do you mean specifically? If you are just opening a link in a browser window, then the browser will decide whether to download or display the content based on the content type.

Note that if you have large textual content, you should definitely use compression (.gz).

Using the File API as @bachr suggested is fine (but do avoid using specific subtypes like S3File). However, wrapping the call to the File API, or making the call from browser JS, risks running out of memory for large files.
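
To make the memory concern concrete, here is a generic Python sketch of reading a source in fixed-size blocks instead of loading it all at once. It does not show how File.read behaves internally. The helper iter_chunks and the in-memory stand-in source are assumptions for illustration only.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive blocks of at most chunk_size bytes from a binary stream."""
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        yield block


# Stand-in for a large file; a real source would be an open file or HTTP response.
source = io.BytesIO(b"x" * 200_000)
sizes = [len(block) for block in iter_chunks(source)]
print(sizes)
```

With this pattern, peak memory is bounded by chunk_size rather than by the file size, which is the property a wrapper that returns the whole content in one API result cannot offer.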

As far as permissions go, you can control access to the File type; however, individual file ACLs are only available starting from v7.9.


#4

By ‘partially works’ I mostly mean that we do not know how to control security for the /file/ endpoint. Is that just a shorthand to a Type that could be controlled via normal roles, or is it handled differently?

Is there some way to have the /file/ endpoint automatically perform compression, or do I need to manually zip the file after creating it (using File.zip, for example)?

We could do an API call to File.read, but does that stream the file back to the browser, or does it first read the full file into JS and then return it all at once? For large files, the latter could add a long delay before the download starts for the user. We do have an API wrapper around File.read that we are already using for smaller files, but that approach did not sound good for large files.


#5

The /file/ endpoint has the same security as the File C3 type, i.e. you can control which users have access to the endpoint, but not at the individual file level (the latter requires v7.9).

You should compress as you create the file, e.g. by setting File.contentEncoding or, if you are letting it guess metadata from the file name, by making sure the file name ends with the .gz extension.

Using the zip method is less efficient, as it will write the file multiple times.
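
The idea of compressing while writing, rather than zipping afterwards, can be sketched in plain Python as follows. This uses the standard gzip module and an in-memory buffer as a stand-in for the target file. It does not use File.contentEncoding, whose exact usage is not shown in this thread, and the function name write_csv_gz and the sample rows are hypothetical.

```python
import csv
import gzip
import io


def write_csv_gz(rows, raw_out):
    """Write rows as gzip-compressed CSV to a binary stream in a single pass."""
    # Text goes through the gzip layer as it is produced, so no uncompressed
    # copy is written first and then rewritten.
    with gzip.open(raw_out, "wt", encoding="utf-8", newline="") as text:
        csv.writer(text).writerows(rows)


# Hypothetical snapshot data standing in for the monthly table.
rows = [("month", "value")] + [("2024-01", i) for i in range(10000)]
buffer = io.BytesIO()  # stand-in for the destination file
write_csv_gz(rows, buffer)

compressed = buffer.getvalue()
plain = gzip.decompress(compressed)
print(len(plain), len(compressed))
```

Repetitive tabular text like this typically compresses well, which is why compression matters for snapshots in the ~100MB range.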

Yes, as I said, downloading file content in browser JS should be avoided.