Metrics and Monitoring

Overview

Prometheus is a robust, open-source tool designed for event monitoring and alerting. It collects and stores metrics as time series data, enabling querying and visualization of system performance. Metrics for Aggregator or Attester nodes can be gathered from:

- Standard Prometheus metrics: Node uptime, performance statistics, and more.
- libp2p metrics: Peer-to-peer networking data.
- gossipsub metrics: Data on the topic-based publish/subscribe communication protocol.

Enabling metrics may impact performance and is therefore optional.

Enabling Metrics

To activate metrics, pass the --metrics flag when starting the node, as shown under Usage below.

libp2p Metrics Scraping

Since libp2p metrics do not emit events, data must be actively scraped. To facilitate this, a dedicated server will be launched to expose metrics.

By default, this server listens on port 6060.
You can specify an alternative port using the --metrics.port <port> option.

Usage

Enable metrics on an aggregator node using the default port (6060):

othentic-cli node aggregator --metrics

Enable metrics on an attester node with a custom port:

othentic-cli node attester --metrics --metrics.port 7070
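
To check that the metrics server is reachable after starting a node, you can fetch its output and filter it. The following is a minimal sketch, not a documented procedure: it assumes the server serves the Prometheus text format at the `/metrics` path, which this page does not confirm, and `filter_metrics` and the `libp2p_` prefix are illustrative names.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print only the sample lines that start with the given prefix.
filter_metrics() {
  # Skip "# HELP" / "# TYPE" comment lines; index() == 1 means "starts with".
  awk -v p="$1" '!/^#/ && index($0, p) == 1'
}

# Against a running node (the /metrics path is an assumption):
#   curl -s http://localhost:6060/metrics | filter_metrics libp2p_
```

If the request fails, confirm that the node was started with --metrics and that the port matches the --metrics.port value.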

Pushgateway Support

In distributed systems, monitoring is crucial for understanding node health, performance, and task execution. However, such systems often have dynamic topologies: nodes may join or leave the network, which can make direct scraping unreliable.

To address these challenges, the Prometheus Pushgateway plays a vital role in ensuring that critical metrics are captured and made available for network developers and operators.

Starting from v1.6.0, in addition to exposing metrics for scraping, nodes can push metrics to a predefined Prometheus Pushgateway URL. This adds flexibility in setups where scraping nodes directly is not always practical.

Usage

To enable metrics pushing, pass the --metrics.export-url <url> flag with the Pushgateway URL. When the flag is set, the node pushes metrics to that URL rather than only exposing them for scraping.
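
For example, an attester could push to a Pushgateway running locally. This is a sketch only: `http://localhost:9091` is an assumed local Pushgateway address, the exact URL form the CLI expects (base address or full push path) is not specified on this page, passing --metrics alongside the flag simply mirrors the earlier examples, and the wrapper function is hypothetical.

```shell
# Hypothetical wrapper: start an attester that pushes metrics to a Pushgateway.
# The http(s) check is a convenience added here, not something the CLI documents.
start_attester_with_push() {
  case "$1" in
    http://*|https://*) ;;
    *) echo "expected an http(s) Pushgateway URL, got: $1" >&2; return 1 ;;
  esac
  othentic-cli node attester --metrics --metrics.export-url "$1"
}

# Example (assumed local Pushgateway address):
#   start_attester_with_push http://localhost:9091
```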

Supported Metrics

The following custom metrics are pushed to the Pushgateway:

Aggregator

`p2p_attestation_per_address_count`

Tracks attestation counts per address for monitoring network participation.

Labels

address: Address of the Operator

`p2p_attesters_included_in_attestation_count`

Tracks the total number of tasks where attestations of a given operator ID were included in the final aggregation. This allows global visibility across the network into how often each operator is included, useful for fairness checks, performance auditing, or reputation monitoring.

Labels

operator_id: ID of the Operator

Attester

`p2p_task_received_count`

Counts the number of tasks received by the attester from the aggregator or peer nodes.

`p2p_task_approved_count`

Tracks how many of the received tasks were approved by the attester.

`p2p_task_rejected_count`

Tracks how many of the received tasks were rejected by the attester.

`p2p_task_included_in_attestation_count`

Tracks how many of the received tasks were successfully included in an attestation during aggregation. This provides per-node visibility into participation frequency and helps detect connectivity or reputation issues at the attester level.
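
As one way to use these counters together, an approval rate can be derived from `p2p_task_approved_count` and `p2p_task_received_count`. The sketch below computes it from Prometheus text-format output. It assumes each sample line ends with the value (no trailing timestamp), and `approval_ratio` is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and print
# approved / received, summed over every matching sample line.
approval_ratio() {
  awk '
    /^p2p_task_received_count/ { received += $NF }   # $NF is the sample value
    /^p2p_task_approved_count/ { approved += $NF }
    END {
      if (received > 0) { printf "%.2f\n", approved / received }
      else              { print "n/a" }              # no tasks received yet
    }
  '
}

# Example, reading the metrics a Pushgateway currently holds (address assumed):
#   curl -s http://localhost:9091/metrics | approval_ratio
```

Because the helper sums all matching lines, the result is an aggregate when several attesters push to the same Pushgateway. Filter the input first to get a per-node figure.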

Metric Expiration: Since Pushgateway retains metrics until explicitly deleted, implement cleanup policies to prevent stale data from cluttering the system.
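
One way to implement such cleanup is the Pushgateway HTTP API, which deletes all metrics in a group when it receives a `DELETE` request on that group's URL. The sketch below is an illustration under assumptions: the job name `othentic` and instance `attester-1` are placeholders, because the grouping key the CLI actually uses is not documented on this page (it is visible in the Pushgateway web UI).

```shell
# Build the Pushgateway URL for a metric group:
#   <base>/metrics/job/<job>[/instance/<instance>]
pushgateway_group_url() {
  url="${1%/}/metrics/job/$2"        # strip one trailing slash from the base URL
  if [ -n "${3:-}" ]; then
    url="$url/instance/$3"
  fi
  printf '%s\n' "$url"
}

# Delete every metric in that group.
delete_pushgateway_group() {
  curl -fsS -X DELETE "$(pushgateway_group_url "$@")"
}

# Example (placeholder job and instance names):
#   delete_pushgateway_group http://localhost:9091 othentic attester-1
```

A delete only affects the group whose grouping key matches exactly, so removing the job-only group leaves groups that also carry an instance label untouched. Running this from a scheduled job for nodes that have left the network is one possible cleanup policy.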

Prometheus and Grafana Integration

Overview

To monitor your AVS nodes, you need to set up Prometheus for collecting metrics, configure your Attester or Aggregator node to expose metrics (as explained earlier), and use Grafana to visualize the data. Docker Compose streamlines this by defining and managing these services in a single configuration file.

For a comprehensive example, refer to the Simple Price Oracle Example repository, which contains a working configuration for Prometheus, Grafana, and the node setup.

Docker Compose Configuration

Prometheus

Prometheus acts as the core of the monitoring system, collecting metrics from the node. The setup uses the prom/prometheus:latest image and binds a custom configuration file, prometheus.yaml, to the container.

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yaml  # Bind mount the config file
    ports:
      - "9090:9090"  # Expose Prometheus on port 9090
    command:
      - '--config.file=/etc/prometheus/prometheus.yaml'  # Specify the config file location
    restart: unless-stopped

Prometheus Pushgateway

Add the following service to docker-compose.yaml to run the Pushgateway alongside Prometheus, which scrapes the pushed metrics from it:

services:
  pushgateway:
    image: prom/pushgateway:latest
    container_name: pushgateway
    ports:
      - "9091:9091"
    restart: unless-stopped

Prometheus Configuration (prometheus.yaml)

To scrape metrics from the Pushgateway, use the following Prometheus configuration:

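global:
  scrape_interval: 15s  # scrape_interval must sit under the global block

scrape_configs:
  - job_name: 'pushgateway'
    honor_labels: true  # Keep the job/instance labels pushed by the nodes
    static_configs:
      - targets: ['pushgateway:9091']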
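
Before starting the stack, the configuration file can be validated with `promtool`, which ships in the `prom/prometheus` image. A minimal sketch, assuming Docker is available and the file sits in the current directory; `check_prometheus_config` is a hypothetical helper name.

```shell
# Hypothetical helper: validate a Prometheus config with promtool from the
# official image, without starting Prometheus itself.
check_prometheus_config() {
  config="${1:-prometheus.yaml}"
  docker run --rm \
    -v "$(pwd)/$config:/etc/prometheus/prometheus.yaml:ro" \
    --entrypoint promtool \
    prom/prometheus:latest check config /etc/prometheus/prometheus.yaml
}

# Example:
#   check_prometheus_config prometheus.yaml
```

promtool exits with a non-zero status when the file has syntax or schema errors, so the helper can also gate a deployment script.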

Grafana

Grafana provides a user-friendly interface to visualize and analyze the data collected by Prometheus. This setup uses the grafana/grafana:latest image and configures persistent data storage using a Docker volume.

services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"  # Expose Grafana on port 3000
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin  # Set the admin user password
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards
      - grafana-storage:/var/lib/grafana

volumes:
  grafana-storage: {}

Volume Details:

./grafana/provisioning:/etc/grafana/provisioning

This volume is used for provisioning Grafana with predefined configurations, such as data sources and dashboard templates. You can place YAML files in this directory to automate the setup of data sources or other settings upon container startup.

Example YAML file for a data source:

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    access: proxy
    isDefault: true

./grafana/dashboards:/var/lib/grafana/dashboards

This volume stores dashboard JSON files, which define the visual layouts and metrics to display in Grafana. By placing preconfigured dashboards here, you ensure they are automatically available after the container starts.
Example dashboard configuration

grafana-storage:/var/lib/grafana

This Docker volume ensures persistent storage for Grafana's internal database, including user-created dashboards, settings, and other data. If the Grafana container is recreated, this volume retains all saved configurations.

Starting the Setup

Follow these steps to start the monitoring setup:

Prepare the Configuration:
- Ensure the prometheus.yaml file is in the same directory as the Docker Compose file.
- Customize the Grafana provisioning and dashboards directories as needed.
Start the Services:
- Run the following command to start Prometheus and Grafana in detached mode:
  docker-compose up -d
Access the Services:
- Prometheus will be available at: http://localhost:9090
- Grafana will be available at: http://localhost:3000
- Log in to Grafana using the default credentials (admin/admin, unless changed in the configuration).
Visualize Metrics:
- Once the setup is complete, use Grafana to create or import dashboards to visualize the metrics collected by Prometheus.
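
To confirm that the services came up, Prometheus and Grafana each expose a health endpoint (`/-/ready` and `/api/health` respectively). The sketch below polls a URL until it responds. `wait_for_http`, the default of 30 attempts, and the one-second delay are illustrative choices.

```shell
# Hypothetical helper: poll a URL until it answers with a success status.
# $1 = URL, $2 = maximum number of attempts (default 30, one second apart).
wait_for_http() {
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    if curl -fsS -o /dev/null "$1"; then
      echo "ready: $1"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "timed out: $1" >&2
  return 1
}

# Example, using the ports published in the Compose file:
#   wait_for_http http://localhost:9090/-/ready
#   wait_for_http http://localhost:3000/api/health
```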

Screenshots of the Othentic-CLI dashboard in Grafana

