Monitoring C3 Cluster

Hello everybody,
I’m new to the C3IoT platform and would like to monitor a cluster’s health:

  • Cluster status (up or down)
  • Number of workers running, status, health
  • Maybe other interesting information

I would appreciate any help.

There is no endpoint of the form GET api/health that you can query to get a summary of health information. The C3 platform is complex, with a lot of moving parts, so each component has to be checked separately.

What you can do is use:

  • Cluster.hosts() to get cluster machine information and status
  • Cassandra.status() for Cassandra node status and role
  • Cassandra.compactionStatus() to check the status of compaction jobs on Cassandra
  • PostgreSQL.status() for PostgreSQL status
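
Since there is no single health endpoint, one option is to wrap the calls above in a small helper and run them together from the console. The following is only a sketch: runChecks is a hypothetical helper, not part of the C3 API, and it makes no assumption about what each status call returns. It records only whether the call returned or threw, so "ok" means "the call completed", not "the component is healthy"; the raw result still has to be inspected.

```javascript
// Hypothetical helper (not part of the C3 API): run each named check and
// record whether it returned a value or threw an error.
function runChecks(checks) {
  var report = {};
  Object.keys(checks).forEach(function (name) {
    try {
      // Keep the raw result; its shape depends on the platform call.
      report[name] = { ok: true, result: checks[name]() };
    } catch (e) {
      report[name] = { ok: false, error: String(e && e.message ? e.message : e) };
    }
  });
  return report;
}

// Usage in the C3 console, where the platform types are defined
// (commented out so the helper itself runs anywhere):
// var report = runChecks({
//   hosts: function () { return Cluster.hosts(); },
//   cassandra: function () { return Cassandra.status(); },
//   compaction: function () { return Cassandra.compactionStatus(); },
//   postgres: function () { return PostgreSQL.status(); }
// });
```

Wrapping each call in a function means one failing component does not prevent the others from being checked.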

To understand the current load, you can use:

  • Cluster.actionDump() to list the actions currently being executed by the cluster machines
  • InvalidationQueue.countAll() to get the current number of entries in the queue and their respective status (running, pending, failed)
  • SqlBatchQueueEntry.fetch() to look for pending entries in the BatchQueue or any other queue.
  • BatchQueue.errors() to look for the errors in the BatchQueue or any other queue.
  • InvalidationQueueError.fetch() for failed actions/entries.

For data integration, you can check the status of jobs with:

  • DataLoadUploadLog.fetch() for import jobs
  • DataLoadProcessLog.fetch() for the processing of the data being imported

You can also evaluate CloudWatch metrics on any AWS resource. For example, the following snippet shows the steps to chart CPUUtilization for a given PG resource:

// find out all Postgres instances:
// get one PG instance
var r = AwsRds.make({resourceName: 'some-postgres-resource'})
// list the CloudWatch metrics available for a given PG
// check documentation on evaluating CloudWatch metrics
c3ShowFunc(r, 'evalCloudWatchMetric')
// evaluate CPU utilization
var ts = r.evalCloudWatchMetric({start: '2018-04-15', end: '2018-04-17', interval: 'HOUR', metric: 'CPUUtilization', statistic: 'AVERAGE'})
// visualize the metric

In v7.8, listResources() returns an empty list. Could we get this information again?