StarTree Cloud cluster health dashboard

The StarTree Cloud cluster health dashboard gives an overview of Pinot health checks. It shows the pass/fail status of each check and lets you filter checks by instance or table, offering a holistic view of the overall health of the cluster.

ClusterHealthCheckTask runs every 20 minutes by default. The check results shown on the dashboard are cached in memory and overwritten on every run.

To run or fetch checks on demand, use these controller API calls:

  • GET - /periodictask/run?taskName=ClusterHealthCheckTask (to run the checks now)
  • GET - /clusterHealth (to fetch cluster health)
  • GET - /clusterHealth/list (to list all available cluster health checks)
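
As a rough illustration, the three controller endpoints listed above could be called from a script along the following lines. This is a minimal sketch, not an official client: the controller address `http://localhost:9000`, the optional bearer-token header, and all helper names (`build_url`, `get`, `run_checks_now`, `fetch_cluster_health`, `list_checks`) are assumptions made for the example. Only the endpoint paths and the `taskName` value come from this page. Responses are returned as raw text because the response format is not described here.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder controller address; replace with your own.
CONTROLLER_URL = "http://localhost:9000"


def build_url(base_url, path, params=None):
    """Join the controller base URL, an endpoint path, and optional query parameters."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def get(base_url, path, params=None, token=None, timeout=30):
    """Send a GET request and return the response body as text."""
    request = urllib.request.Request(build_url(base_url, path, params), method="GET")
    if token:
        # Assumption: the controller accepts a bearer token; adjust to your auth setup.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def run_checks_now(base_url, token=None):
    """Trigger ClusterHealthCheckTask instead of waiting for the next scheduled run."""
    return get(base_url, "/periodictask/run", {"taskName": "ClusterHealthCheckTask"}, token)


def fetch_cluster_health(base_url, token=None):
    """Fetch the cached cluster health results."""
    return get(base_url, "/clusterHealth", token=token)


def list_checks(base_url, token=None):
    """List all available cluster health checks."""
    return get(base_url, "/clusterHealth/list", token=token)
```

For example, calling `run_checks_now(CONTROLLER_URL)` and then `fetch_cluster_health(CONTROLLER_URL)` requests a fresh run and reads the cached results; the results may take a moment to refresh after the run is requested.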

To view the cluster health dashboard

Log into StarTree Cloud and do the following:

  1. Click the organization, then select the workspace whose monitoring metrics you want to view.
  2. Click the Services tab.
  3. Click the link next to My Apps.
  4. Click the Pinot Control Panel tile.

A dashboard appears with a list of checks. For each check, it indicates whether the check passed or failed and shows additional details.

List of health checks

Current health checks are listed here.

Description

Checks whether a table has any segments whose ExternalView state does not match its IdealState.