Getting data into S3 via Typesystem

#1

Is there a way to send files (via the typesystem) to an S3 bucket? Are there any special roles/keys needed to do this action?

#2

You can use curl and the following command to load data from your terminal into S3 via the typesystem:

curl -H "Content-Type: text/csv" -H "Authorization: <YOUR_AUTH_TOKEN>" -H "Connection: close" -X PUT --data-binary @<FILENAME> https://ENV/file/1/TENANT/TAG/<PATH_IN_DEFAULT_BUCKET>/<FILENAME> --ssl

Each tenant/tag has a default S3 bucket. The exact name is based on the pod name, tenant name, and tag name, and has the following format: c3--<pod>-<tenant>-<tag>. If the following command were used to post a file to S3:

curl -v --ssl -X PUT -H "Content-Type: text/csv" -u <USER:PASSWORD> --data-binary @<FILENAME> https://ENV/file/1/solarlab/prod/CanonicalFacility/2017-07-09T00:11:11.csv

The file will be found at the following path:

s3://c3--<pod>-<tenant>-<tag>/CanonicalFacility/2017-07-09T00:11:11.csv
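
The URL-to-bucket mapping above can be sketched as two small shell helpers. This is only an illustration of the naming rule described in this post: the function names and the example pod name `mypod` are hypothetical, and the bucket pattern is assumed to be exactly the one shown in the path above.

```shell
# Hypothetical helpers; the names are illustrative, not part of any C3 tooling.

# Build the upload URL: https://ENV/file/1/TENANT/TAG/PATH
c3_upload_url() {
  # $1 = environment host, $2 = tenant, $3 = tag, $4 = path inside the default bucket
  printf 'https://%s/file/1/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Predict the S3 location, ASSUMING the bucket is named c3--<pod>-<tenant>-<tag>
c3_s3_path() {
  # $1 = pod, $2 = tenant, $3 = tag, $4 = path inside the default bucket
  printf 's3://c3--%s-%s-%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example: the same relative path appears in both the URL and the S3 location.
c3_upload_url ENV solarlab prod CanonicalFacility/2017-07-09T00:11:11.csv
c3_s3_path mypod solarlab prod CanonicalFacility/2017-07-09T00:11:11.csv
```

Everything after TENANT/TAG in the URL is used as the path inside the bucket, which is why both helpers take the same fourth argument.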

If you want to load timeseries data into a FileData type so that you can create a normalized timeseries stored in Cassandra, do the same thing but with a more specific file path. For example, if your FileData type is 'GenerationSensorMeasurement' and you have a series header with ID = tag_1010_8, load all of your files under this path:
https://ENV/file/1/TENANT/TAG/data/GenerationSensorMeasurement/tag_1010_8/
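
As a rough sketch of that path convention, the helper below assembles the target URL for one file of a series. The function name `c3_filedata_url` and the sample file path are hypothetical; only the `data/<FileData type>/<series header ID>/` layout comes from the example above.

```shell
# Hypothetical helper; builds the upload URL for one file belonging to a series.
c3_filedata_url() {
  # $1 = env host, $2 = tenant, $3 = tag,
  # $4 = FileData type, $5 = series header ID, $6 = local file
  printf 'https://%s/file/1/%s/%s/data/%s/%s/%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$(basename "$6")"
}

# Example: only the file's base name is appended to the series directory.
c3_filedata_url ENV TENANT TAG GenerationSensorMeasurement tag_1010_8 ./readings/2017-07-09.csv
```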

#3

Further clarifications:

  • curl supports the -u option, so you can use a username/password instead of an auth token
  • The Content-Type header is required and significant, so make sure it matches the file content
  • Content-Encoding can be used to upload gzip content (and this is recommended); the C3 type system makes gzip files as convenient to work with as uncompressed files
  • In general, the user needs access to the File and FileSystem types, but permissions can be further limited to upload only (e.g. to limit the security impact if the data-upload user's credentials are compromised)
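
Since Content-Type has to match the file content, one way to avoid mistakes in a script is to derive it from the file extension. The sketch below is an assumption-laden illustration: `c3_content_type` is a hypothetical name, the values are the standard MIME types for these formats, and whether the platform accepts each of them is not confirmed in this thread.

```shell
# Hypothetical helper: pick a Content-Type header value from the file extension.
# A trailing .gz is ignored, because compression is declared via Content-Encoding.
c3_content_type() {
  case "${1%.gz}" in
    *.csv)  echo 'text/csv' ;;
    *.json) echo 'application/json' ;;
    *.xml)  echo 'application/xml' ;;
    *)      return 1 ;;   # unknown extension: choose explicitly rather than guess
  esac
}
```

For example, `c3_content_type SummaryBillDemoData.csv.gz` prints `text/csv`, to be sent together with `Content-Encoding: gzip`.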

#4

@DavidT Thanks for the clarifications. What Content-Encoding types are supported in the Typesystem? Assuming I have a text/csv file, what encodings (e.g. ASCII, UTF-8, UTF-16) and compression formats (e.g. zip, gzip) are supported?

#5

An example of using curl to upload a file from a local machine to S3 via the C3 typesystem:

curl -H "Content-Type: text/csv" -H "Content-Encoding: gzip"  -H "Authorization: AUTH" -H "Connection: close" -X PUT --data-binary @SummaryBillDemoData.csv.gz https://your-env.c3-e.com/file/1/tenant/tag/path/to/file/SummaryBillDemoData.csv.gz --ssl -v

Note that the file is uploaded to the default bucket for the tenant and tag, at the path specified in the curl request. In the example above, the file is placed at s3://default-bucket-for-tenant-tag/path/to/file/SummaryBillDemoData.csv.gz
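
For anyone scripting this, here is a minimal sketch of the compress-then-upload step as a dry run. The function name `c3_gzip_upload_cmd` is hypothetical; it only compresses the file and prints the matching curl command rather than running it. `$AUTH` is printed literally as a placeholder for your own token, and paths are assumed to contain no spaces.

```shell
# Hypothetical helper: gzip a CSV and print (not run) the matching curl command.
c3_gzip_upload_cmd() {
  # $1 = local CSV file, $2 = full upload URL without the .gz suffix
  # -c writes the compressed data to stdout, leaving the original file intact
  gzip -c "$1" > "$1.gz" || return 1
  # Single quotes keep $AUTH literal so it is expanded when the command is pasted
  printf 'curl -X PUT -H "Content-Type: text/csv" -H "Content-Encoding: gzip" -H "Authorization: $AUTH" --data-binary @%s %s.gz\n' \
    "$1.gz" "$2"
}

# Usage (prints the command to review before running it):
#   c3_gzip_upload_cmd SummaryBillDemoData.csv https://your-env.c3-e.com/file/1/tenant/tag/path/to/file/SummaryBillDemoData.csv
```

Printing the command first makes it easy to check the headers and the target path before any data leaves the machine.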

#6

How do we know what the default bucket is for a tenant/tag?

#7

I use this in the console:
S3.bucketName('DEFAULT')

#8

c3ShowType(ContentType)

#9

Hey David, that helps me understand, but it doesn’t give me the answer explicitly. I can only guess which types are supported based on the methods shown, e.g. plainText(), binary(), csv(), etc. It would be helpful if the mimeType and charset fields were enums.

I asked this question to understand which types/charsets/encodings are supported by the integration engine. This is what I understand:

  • Content-Types: csv, xml, json, parquet, avro
  • Content-Encodings: gzip, zip
  • Data transfer protocols: AWS IoT, SFTP, REST

#10

Good point! ContentEncoding and Charset are already enums, but we can also add a MimeType enum.

P.S. It’s not clear to me what “data transfer protocol” means.

#11

Oh, that helps! I also noticed that c3ShowType(ContentEncoding) doesn’t show anything. Maybe the c3ShowType command was not built to display enum types?

I just mean the protocol for actually sending data to our platform. I’m asking so that I can give our customers accurate information about how to get data into our platform. To clarify, we currently support data ingestion via:

  • MQTT (with the AWS IoT SDK)
  • HTTP
  • SFTP (which we expose to customers, but we ultimately push data from that server to our platform via HTTP)

Is my understanding of this process correct? Am I missing anything?

#12

Is this visible? https://host/api/1/c3/c3/documentation/type/ContentEncoding

HTTP, SFTP, JMS, MQTT, AWS SQS, and AWS Kinesis are all supported, with varying degrees of convenience and completeness. If we have a specific use case, we can enhance support for any of these or easily add new ones.

#13

@caljep MediaType.values() shows all the Content-Types supported.
