Telemetry
Piri uses OpenTelemetry to emit metrics and traces for observability. You can configure custom collectors to send this data to your own monitoring infrastructure.
Metrics
Piri emits metrics via OTLP (OpenTelemetry Protocol) that can be consumed by any compatible collector.
Host Metrics
System-level metrics for monitoring node health:
| Metric | Type | Unit | Description |
|---|---|---|---|
| system_cpu_utilization | Gauge | 0-1 | System-wide CPU utilization |
| system_memory_used_bytes | Gauge | bytes | System memory in use |
| system_memory_total_bytes | Gauge | bytes | Total system memory |
| piri_datadir_used_bytes | Gauge | bytes | Disk space used by data directory |
| piri_datadir_free_bytes | Gauge | bytes | Free disk space for data directory |
| piri_datadir_total_bytes | Gauge | bytes | Total disk space for data directory |
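As an illustration of how these gauges might be used, here is a minimal sketch of a Prometheus alerting rule file. It assumes the metrics reach Prometheus under the names listed in the table; the group name, alert names, thresholds, and durations are hypothetical and should be tuned for your node.

```yaml
# piri-host-rules.yaml (hypothetical file name)
groups:
  - name: piri-host
    rules:
      # Fires when less than 10% of the data directory's disk is free.
      - alert: PiriDataDirLowSpace
        expr: piri_datadir_free_bytes / piri_datadir_total_bytes < 0.10
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Piri data directory has less than 10% free space"
      # Fires when more than 90% of system memory is in use.
      - alert: PiriHighMemoryUsage
        expr: system_memory_used_bytes / system_memory_total_bytes > 0.90
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Piri host memory usage is above 90%"
```

Depending on the exporter, metric names may be normalized or suffixed on the way into Prometheus, so it is worth checking the collector's scrape output before relying on these expressions.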
Job Queue Metrics
Track task execution in internal job queues:
| Metric | Type | Description |
|---|---|---|
| active_jobs | UpDownCounter | Currently running jobs |
| queued_jobs | UpDownCounter | Jobs waiting in queue |
| failed_jobs | Counter | Permanently failed jobs |
| job_duration | Histogram | Job execution duration (seconds) |
Labels:
| Label | Description |
|---|---|
| queue | Name of the job queue (e.g., replicator, aggregator, egress_tracker) |
| job | Type of job being executed |
| status | Job outcome (success or failure) |
| attempt | Retry attempt number (1-based) |
| failure_reason | Reason for permanent failure (only on failed_jobs) |
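The labels above make it possible to break queue health down per queue. The following is a sketch of Prometheus rules under the assumption that the collector's Prometheus exporter applies its usual naming (a `_total` suffix on counters, a unit suffix and `_bucket` series on histograms); the resulting names `failed_jobs_total` and `job_duration_seconds_bucket` are assumptions, as are the rule and group names.

```yaml
# piri-job-queue-rules.yaml (hypothetical file name)
groups:
  - name: piri-job-queues
    rules:
      # 95th percentile job duration per queue over the last 5 minutes.
      - record: piri:job_duration_seconds:p95
        expr: >
          histogram_quantile(0.95,
            sum by (le, queue) (rate(job_duration_seconds_bucket[5m])))
      # Any permanently failed job in the last 15 minutes, grouped by
      # queue and failure reason.
      - alert: PiriJobsPermanentlyFailing
        expr: sum by (queue, failure_reason) (increase(failed_jobs_total[15m])) > 0
        labels:
          severity: warning
        annotations:
          summary: "Jobs are failing permanently in queue {{ $labels.queue }}"
```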
HTTP Server Metrics
Standard OpenTelemetry HTTP instrumentation:
| Metric | Type | Description |
|---|---|---|
| http.server.request.duration | Histogram | Request latency |
| http.server.request.body.size | Histogram | Request body size |
| http.server.response.body.size | Histogram | Response body size |
PDP Metrics
Provable Data Possession task metrics:
| Metric | Type | Description |
|---|---|---|
| chain_current_epoch | Gauge | Current Filecoin chain epoch |
| next_challenge_window_start_epoch | Gauge | Epoch when next challenge window starts |
| pdp_next_failure | Counter | Next proving period task failures |
| pdp_prove_failure | Counter | Proof generation task failures |
| message_send_failure | Counter | Blockchain message send failures |
| message_estimate_gas_failure | Counter | Gas estimation failures |
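The two epoch gauges can be combined to show how far away the next challenge window is, and the failure counters lend themselves to simple alerts. This is a sketch only: it assumes both gauges are exported with matching label sets, and that the counter appears in Prometheus as `pdp_prove_failure_total`. The rule and group names are hypothetical.

```yaml
# piri-pdp-rules.yaml (hypothetical file name)
groups:
  - name: piri-pdp
    rules:
      # Number of epochs remaining until the next challenge window opens.
      - record: piri:epochs_until_next_challenge_window
        expr: next_challenge_window_start_epoch - chain_current_epoch
      # Any proof generation failure within the last hour.
      - alert: PiriPDPProveFailures
        expr: increase(pdp_prove_failure_total[1h]) > 0
        labels:
          severity: critical
        annotations:
          summary: "PDP proof generation failed in the last hour"
```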
Replication Metrics
| Metric | Type | Description |
|---|---|---|
| transfer_duration | Histogram | Replica transfer operation duration |
Labels:
| Label | Description |
|---|---|
| source | Origin endpoint where data is pulled from |
| sink | Destination endpoint where data is written to (this node) |
Server Info
Build and runtime information:
| Metric | Type | Description |
|---|---|---|
| piri_server_info | Info | Server metadata |
Labels:
| Label | Description |
|---|---|
| version | Piri software version |
| commit | Git commit hash of the build |
| built_by | Build system identifier |
| build_date | When the binary was compiled |
| start_time_unix | Server start time (Unix timestamp) |
| server_type | Server mode (full or ucan) |
| did | Server's Decentralized Identifier |
| owner_address | Ethereum address of node owner |
| public_url | Server's publicly accessible URL |
| proof_set | PDP proof set ID |
Traces
Distributed tracing provides end-to-end visibility into operations:
| Span | Description |
|---|---|
| blob.accept | Blob acceptance operations |
| blob.allocate | Blob allocation operations |
| space.content.retrieve | Content retrieval operations |
| AddRoots | PDP root addition operations |
Traces use parent-based sampling and integrate with W3C Trace Context propagation.
Integration
Prometheus
Use an OpenTelemetry Collector with a Prometheus exporter:
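# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      http:
        # 4317 is conventionally the OTLP/gRPC port (OTLP/HTTP usually uses 4318).
        # Whichever port you bind here must match the endpoint Piri sends to.
        endpoint: "0.0.0.0:4317"
exporters:
  prometheus:
    # Address the collector exposes for Prometheus to scrape
    endpoint: "0.0.0.0:9090"
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]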
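With a collector configured this way, Prometheus still needs to scrape the exporter's endpoint. A minimal scrape configuration might look like the following; the job name and the `otel-collector` hostname are hypothetical and depend on where your collector runs.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: piri-otel-collector   # hypothetical job name
    scrape_interval: 30s
    static_configs:
      # Host and port of the collector's Prometheus exporter endpoint
      - targets: ["otel-collector:9090"]
```

Note that 9090 is also the default port of the Prometheus server itself, so if both run on the same host one of them will need a different port.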
Then configure Piri to send metrics to your collector, using the telemetry options described under Configuration below.
Jaeger
For distributed tracing, send traces to a Jaeger backend with OTLP support.
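One way to do this is to route traces through an OpenTelemetry Collector. The sketch below assumes a Jaeger instance with its OTLP gRPC receiver enabled, reachable at the hypothetical address `jaeger:4317` without TLS; the receiver mirrors the one used in the Prometheus example.

```yaml
# otel-collector-config.yaml (traces pipeline sketch)
receivers:
  otlp:
    protocols:
      http:
        endpoint: "0.0.0.0:4317"
exporters:
  otlp/jaeger:
    # Hypothetical Jaeger address; Jaeger accepts OTLP over gRPC on 4317
    endpoint: "jaeger:4317"
    tls:
      insecure: true   # assumes an unencrypted, trusted network
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]
```

A single collector can carry both pipelines; in that case, add the `traces` pipeline and the `otlp/jaeger` exporter alongside the existing `metrics` pipeline instead of running a second collector.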
Grafana
Connect your Prometheus datasource and create dashboards using the metrics above. Key metrics to monitor:
- System health: system_cpu_utilization, system_memory_used_bytes, piri_datadir_free_bytes
- Job queue health: active_jobs, failed_jobs, job_duration
- API performance: http.server.request.duration (p95, p99)
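For the p95 and p99 latency panels, precomputing the quantiles with Prometheus recording rules keeps dashboard queries cheap. This sketch assumes the HTTP duration histogram is exposed to Prometheus as `http_server_request_duration_seconds_bucket` (dots replaced by underscores, with a unit suffix), which should be verified against your collector's output. The rule names are hypothetical.

```yaml
# piri-http-rules.yaml (hypothetical file name)
groups:
  - name: piri-http
    rules:
      # 95th percentile request latency over the last 5 minutes
      - record: piri:http_server_request_duration_seconds:p95
        expr: >
          histogram_quantile(0.95,
            sum by (le) (rate(http_server_request_duration_seconds_bucket[5m])))
      # 99th percentile request latency over the last 5 minutes
      - record: piri:http_server_request_duration_seconds:p99
        expr: >
          histogram_quantile(0.99,
            sum by (le) (rate(http_server_request_duration_seconds_bucket[5m])))
```

A Grafana panel can then query the recorded series directly instead of evaluating `histogram_quantile` on every refresh.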
Configuration
See Configuration > telemetry for collector setup options.
Analytics
Piri can optionally send anonymized analytics to Storacha to help improve the software. See Operations > Telemetry for details and opt-out instructions.