I am working on helping a customer reduce storage costs, and I have a couple of questions about choosing between S3 and Cassandra. Right now the data has about 50M rows and 9 columns, and it could grow much larger in the future, say to 5B rows.
I know that S3 is much cheaper than Cassandra. Is there any way for us to estimate the cost of storing data in Cassandra based on the features of the dataset (for example, the number of rows, the number of unique partition keys, etc.)?
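To make the question concrete, here is a rough sketch of the kind of back-of-envelope estimate I have in mind. Everything in it is an assumption on my side: the names (`DatasetShape`, `cassandra_size_gb`, etc.) are hypothetical, and the average value size, replication factor, overhead factor, compression ratio, and price per GB are placeholders that would have to be measured or taken from the actual deployment. It also does not model any per-partition overhead, which is part of what I am asking about.

```python
from dataclasses import dataclass


@dataclass
class DatasetShape:
    rows: int
    columns: int
    avg_bytes_per_value: float  # placeholder; should be measured on a sample


def raw_size_gb(shape: DatasetShape) -> float:
    """Uncompressed payload size in GB (1 GB = 1e9 bytes)."""
    return shape.rows * shape.columns * shape.avg_bytes_per_value / 1e9


def cassandra_size_gb(raw_gb: float, replication_factor: int,
                      overhead_factor: float, compression_ratio: float) -> float:
    """Estimated on-disk size across all replicas.

    overhead_factor and compression_ratio are assumed inputs, not
    values taken from any Cassandra documentation.
    """
    return raw_gb * replication_factor * overhead_factor / compression_ratio


def monthly_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Storage cost per month for a caller-supplied price per GB."""
    return size_gb * price_per_gb_month


if __name__ == "__main__":
    # All numbers below are placeholders for illustration only.
    current = DatasetShape(rows=50_000_000, columns=9, avg_bytes_per_value=20.0)
    raw = raw_size_gb(current)
    on_disk = cassandra_size_gb(raw, replication_factor=3,
                                overhead_factor=1.5, compression_ratio=2.0)
    print(f"raw: {raw:.2f} GB, estimated Cassandra footprint: {on_disk:.2f} GB")
```

If a formula along these lines is reasonable, the same functions could be run with the 5B-row projection and with an S3 price per GB to compare the two options.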
Based on the post "When to use FileData", I know that Normalized data will still be stored primarily in Cassandra, as it's assumed to be used more frequently. But what if the data won't be used frequently? Is there a threshold for choosing between hot and cold storage?
Are there any other rules we should be aware of when choosing the storage framework, besides cost, query frequency, and data size?
Thanks for your advice!