
Observability

Real-time monitoring, alerting, and diagnostics built into the platform.

Health Score

HyperSDK computes a system health score (0-100) based on:

Metric | Weight | Source
CPU usage | 25% | /proc/stat
Memory usage | 25% | /proc/meminfo
Disk usage | 25% | /proc/diskstats + filesystem stats
Network health | 15% | /proc/net/dev
Service status | 10% | Process monitoring

The score includes bottleneck detection: when health drops, it identifies the top contributing factor.
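To make the weighting concrete, here is a minimal sketch of how a weighted score and its bottleneck could be derived. The weights come from the table above. Everything else is an assumption for illustration: the `computeHealth` helper, the idea that each metric is first normalized to a 0-100 sub-score (higher is healthier), and the rule that the bottleneck is the metric costing the most weighted points. HyperSDK's actual formula is not documented here.

```javascript
// Weights in percent, taken from the table above (they sum to 100).
const WEIGHTS = { cpu: 25, memory: 25, disk: 25, network: 15, services: 10 };

// Hypothetical helper: combines per-metric sub-scores (0-100, higher is healthier)
// into one weighted score and reports the metric that costs the most points.
function computeHealth(subScores) {
  let weightedSum = 0;
  let bottleneck = null;
  let worstLoss = 0;
  for (const [metric, weight] of Object.entries(WEIGHTS)) {
    const sub = Math.min(100, Math.max(0, subScores[metric])); // clamp to 0-100
    weightedSum += weight * sub;
    const loss = weight * (100 - sub); // weighted points lost by this metric
    if (loss > worstLoss) {
      worstLoss = loss;
      bottleneck = metric;
    }
  }
  return { score: Math.round(weightedSum / 100), bottleneck };
}

const health = computeHealth({ cpu: 40, memory: 80, disk: 80, network: 100, services: 100 });
console.log(health); // { score: 75, bottleneck: 'cpu' }
```

In this sketch a fully healthy system scores 100 and reports no bottleneck (`null`).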

Smart Alerts

Configure threshold-based alerts for:

  • CPU usage exceeding threshold
  • Memory pressure and swap usage
  • Disk space running low
  • Network errors or packet loss
  • Job failures

Alerts appear in the dashboard and are available via the /api/v1/system/alerts endpoint.
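As an illustration of what a threshold-based alert amounts to, the sketch below compares a metrics sample against configured limits. The metric names, limit values, and the `evaluateAlerts` helper are all hypothetical; the platform's real configuration keys and the response format of /api/v1/system/alerts are not shown on this page.

```javascript
// Hypothetical threshold configuration; names and limits are illustrative only.
const thresholds = {
  cpuPercent: 85,
  memoryPercent: 90,
  diskPercent: 80,
};

// Returns one alert object per metric whose current value exceeds its limit.
// Metrics missing from the sample are skipped rather than treated as failures.
function evaluateAlerts(sample, limits) {
  const alerts = [];
  for (const [metric, limit] of Object.entries(limits)) {
    const value = sample[metric];
    if (typeof value === 'number' && value > limit) {
      alerts.push({ metric, value, limit });
    }
  }
  return alerts;
}

const firing = evaluateAlerts({ cpuPercent: 87, memoryPercent: 60, diskPercent: 80 }, thresholds);
console.log(firing); // [ { metric: 'cpuPercent', value: 87, limit: 85 } ]
```

The comparison here is strict, so a value exactly at the limit does not fire. Whether the platform treats the boundary the same way is not stated.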

Explain Mode

When a metric is elevated, click Explain in the dashboard to see:

  • Contributing factors ranked by impact
  • Historical trend for the metric
  • Recommended actions

Example: "CPU is at 87% — top contributors: migration job (62%), system indexing (15%), other processes (10%)"

Metrics

Dashboard Views

The dashboard provides 8 observability views:

View | What It Shows
Health Overview | System health score with bottleneck indicator
Alerts | Active and historical alerts
Processes | Running processes with CPU/memory usage
Containers | Container status and resource consumption
Security | Failed logins, rate limit events, audit trail
Debug | Logs, request tracing, error analysis
Network | Interface stats, connection counts, bandwidth
Storage | Pool status, disk I/O, capacity planning

Prometheus Export

curl -sk https://your-server:5080/api/v1/metrics

Returns metrics in Prometheus exposition format for integration with Grafana or other monitoring systems.
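If you want to read the exported metrics from a script rather than from Grafana, the exposition format is plain text that can be parsed line by line. The following is a simplified sketch: it ignores timestamps, does not handle escaped characters inside label values, and turns special values such as +Inf into NaN. The metric names in the sample payload are placeholders, not HyperSDK's real metric names.

```javascript
// Parses Prometheus text exposition lines of the form `name{labels} value`.
function parsePrometheus(text) {
  const samples = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks, HELP and TYPE lines
    const match = line.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/);
    if (!match) continue; // skip lines this simplified parser does not understand
    const [, name, labelText, valueText] = match;
    const labels = {};
    if (labelText) {
      // Collect key="value" pairs (no support for escaped quotes).
      for (const pair of labelText.matchAll(/([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g)) {
        labels[pair[1]] = pair[2];
      }
    }
    samples.push({ name, labels, value: Number(valueText) });
  }
  return samples;
}

// Hypothetical payload for illustration.
const sampleText = [
  '# HELP example_cpu_usage_percent CPU usage.',
  '# TYPE example_cpu_usage_percent gauge',
  'example_cpu_usage_percent 87',
  'example_disk_used_bytes{mount="/"} 1024',
].join('\n');

console.log(parsePrometheus(sampleText));
```

For production use, an existing Prometheus client or scraper is likely a better choice than a hand-written parser.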

WebSocket

Real-time metrics stream via WebSocket:

const ws = new WebSocket('wss://your-server:5080/ws');
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  // data.type: 'metrics', 'alert', 'job_update'
};
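One way to act on the three message types is to route each frame to a per-type handler. The sketch below assumes only what the snippet above shows: each message is JSON with a `type` field. The `dispatchMessage` helper and the handler bodies are hypothetical, and the payload fields beyond `type` are not documented on this page.

```javascript
// Routes a raw WebSocket frame to the handler registered for its `type`.
// Returns true if a handler ran, false for invalid JSON or unknown types.
function dispatchMessage(raw, handlers) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return false; // ignore frames that are not valid JSON
  }
  if (data === null || typeof data !== 'object') return false;
  const known = Object.prototype.hasOwnProperty.call(handlers, data.type);
  if (!known || typeof handlers[data.type] !== 'function') return false;
  handlers[data.type](data);
  return true;
}

// Hypothetical handlers that just record what arrived.
const received = [];
const handlers = {
  metrics: (msg) => received.push(['metrics', msg]),
  alert: (msg) => received.push(['alert', msg]),
  job_update: (msg) => received.push(['job_update', msg]),
};

// In the browser: ws.onmessage = (event) => dispatchMessage(event.data, handlers);
console.log(dispatchMessage('{"type":"alert"}', handlers)); // true
console.log(dispatchMessage('{"type":"unknown"}', handlers)); // false
```

Returning a boolean instead of throwing keeps a single malformed frame from interrupting the stream.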

Carbon-Aware Scheduling

Schedule migration jobs during low-carbon grid periods:

curl -sk https://your-server:5080/api/v1/carbon/schedule

Returns optimal time windows based on real-time electricity grid carbon intensity data. Typical reduction: 30-50% CO2 per migration batch.
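Once the time windows are available, choosing one is a simple minimum search. The sketch below is built entirely on an assumed response shape (a `windows` array with `start`, `end`, and `intensity` fields, with intensity in gCO2/kWh); the real fields returned by /api/v1/carbon/schedule are not documented on this page, so adapt the names to the actual response.

```javascript
// Hypothetical response shape, for illustration only.
const exampleResponse = {
  windows: [
    { start: '2024-01-01T02:00:00Z', end: '2024-01-01T04:00:00Z', intensity: 120 },
    { start: '2024-01-01T10:00:00Z', end: '2024-01-01T12:00:00Z', intensity: 310 },
    { start: '2024-01-01T14:00:00Z', end: '2024-01-01T16:00:00Z', intensity: 95 },
  ],
};

// Returns the window with the lowest carbon intensity, or null if none exist.
function pickGreenestWindow(windows) {
  if (!Array.isArray(windows) || windows.length === 0) return null;
  return windows.reduce((best, w) => (w.intensity < best.intensity ? w : best));
}

const best = pickGreenestWindow(exampleResponse.windows);
console.log(best.start); // 2024-01-01T14:00:00Z
```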

