Monitor with Prometheus
Enable the Prometheus metrics endpoint to monitor Concourse CI.
Enable Metrics
Metrics are enabled by default on port 9391:
# Verify metrics are enabled
juju config concourse-ci enable-metrics
# Should show: true
# To disable (not recommended)
juju config concourse-ci enable-metrics=false
Access Metrics Endpoint
# Get web server IP
juju status concourse-ci
# Fetch metrics
curl http://<web-ip>:9391/metrics
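The same check can be scripted. The following is a minimal sketch, assuming the endpoint serves the standard Prometheus text exposition format without per-sample timestamps; the helper names `parse_metrics` and `fetch_metrics` are hypothetical and not part of the charm.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition output into {sample: value}.

    The key is the sample name including any label set, exactly as it
    appears on the line. Assumes no trailing timestamps on sample lines.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        name, sep, value = line.rpartition(" ")
        if not sep:
            continue
        samples[name] = float(value)
    return samples


def fetch_metrics(web_ip, port=9391, timeout=5):
    """Fetch and parse /metrics from the Concourse web unit (hypothetical helper)."""
    url = f"http://{web_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics("<web-ip>")` with the address from `juju status concourse-ci` would return a dictionary that can be inspected or compared between runs.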
Integrate with Prometheus Charm
# Deploy Prometheus (machine charm)
juju deploy prometheus-machine prometheus --channel edge
# Create relation
juju integrate concourse-ci:monitoring prometheus:target
# Verify relation
juju status --relations
Result: Prometheus automatically scrapes metrics from the Concourse CI web server.
Note: This guide uses the prometheus-machine and grafana-machine charms, which are machine charms and therefore compatible with this Concourse CI machine charm deployment.
Key Metrics to Monitor
Worker Metrics
- concourse_workers_running - Active workers
- concourse_workers_stalled - Stalled workers
- concourse_containers - Running containers
Build Metrics
- concourse_builds_running - Builds in progress
- concourse_builds_finished_total - Completed builds
- concourse_builds_duration_seconds - Build duration histogram
Database Metrics
- concourse_db_connections - Active DB connections
- concourse_db_queries_total - Query count
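To confirm that a deployment actually exposes the metrics listed above, the scraped text can be checked against the list. This is a sketch only: the metric names are copied from this page, the function names are hypothetical, and it assumes a histogram shows up through its `_bucket`, `_sum`, and `_count` samples.

```python
import re

# Metric names as listed in this guide.
EXPECTED_METRICS = [
    "concourse_workers_running",
    "concourse_workers_stalled",
    "concourse_containers",
    "concourse_builds_running",
    "concourse_builds_finished_total",
    "concourse_builds_duration_seconds",
    "concourse_db_connections",
    "concourse_db_queries_total",
]

_SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")
_HISTOGRAM_SUFFIXES = ("_bucket", "_sum", "_count")


def sample_names(exposition_text):
    """Collect the metric names of all sample lines (comment lines are ignored)."""
    names = set()
    for line in exposition_text.splitlines():
        if line.startswith("#"):
            continue
        match = _SAMPLE_NAME.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_metrics(exposition_text, expected=EXPECTED_METRICS):
    """Return the expected metrics that have no sample in the scraped text."""
    names = sample_names(exposition_text)
    missing = []
    for metric in expected:
        # A histogram is present if any of its derived series is present.
        variants = {metric} | {metric + suffix for suffix in _HISTOGRAM_SUFFIXES}
        if not names & variants:
            missing.append(metric)
    return missing
```

An empty list from `missing_metrics` means every metric named in this section was found in the output of the metrics endpoint.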
Sample Prometheus Queries
# Number of running workers
concourse_workers_running
# Build success rate (last hour)
# sum() drops the status label so that both sides of the division match
sum(rate(concourse_builds_finished_total{status="succeeded"}[1h]))
/ sum(rate(concourse_builds_finished_total[1h]))
# Average build duration
rate(concourse_builds_duration_seconds_sum[5m])
/ rate(concourse_builds_duration_seconds_count[5m])
# Worker utilization (containers per running worker)
concourse_containers / concourse_workers_running
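The queries above can also be run outside the Prometheus UI through the Prometheus HTTP API (`/api/v1/query`). The sketch below assumes Prometheus is reachable without authentication; `query_url`, `vector_values`, and `instant_query` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def vector_values(response_body):
    """Extract (labels, value) pairs from an instant-query JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each sample carries its value as [unix_timestamp, "value-as-string"].
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def instant_query(base_url, promql, timeout=5):
    """Run an instant query against Prometheus and return its samples."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        return vector_values(resp.read().decode("utf-8"))
```

For example, `instant_query("http://<prometheus-ip>:9090", "concourse_workers_running")` would return one `(labels, value)` pair per matching series.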
Set Up Grafana Dashboard
# Deploy Grafana (machine charm)
juju deploy grafana-machine grafana --channel edge
# Integrate with Prometheus
juju integrate grafana:grafana-source prometheus:grafana-source
# Get the Grafana admin password (the Grafana address is shown by juju status grafana)
juju run grafana/leader get-admin-password
Import a Concourse CI dashboard from Grafana.com.
Alert Rules Example
groups:
  - name: concourse
    interval: 30s
    rules:
      - alert: ConcourseNoWorkersRunning
        expr: concourse_workers_running == 0
        for: 5m
        annotations:
          summary: "No Concourse workers running"
      - alert: ConcourseStalledWorkers
        expr: concourse_workers_stalled > 0
        for: 10m
        annotations:
          summary: "{{ $value }} stalled workers detected"
      - alert: ConcourseHighBuildDuration
        # Average duration = rate of the histogram sum / rate of the histogram count
        expr: rate(concourse_builds_duration_seconds_sum[5m]) / rate(concourse_builds_duration_seconds_count[5m]) > 300
        for: 15m
        annotations:
          summary: "Average build duration exceeds 5 minutes"
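The `for` and `interval` fields in these rules interact: an alert whose expression becomes true is first pending, and it fires only once the expression has stayed true for the whole `for` duration. The following is a simplified model of that behaviour, not Prometheus code; `alert_states` is a hypothetical function and it assumes evaluations happen exactly `interval_s` seconds apart.

```python
def alert_states(condition_results, interval_s, for_s):
    """Return the alert state after each rule evaluation.

    condition_results: one boolean per evaluation (True = expr matched).
    interval_s: seconds between evaluations (the group's `interval`).
    for_s: seconds the condition must hold before firing (the rule's `for`).
    """
    states = []
    active_since = None  # evaluation time at which the condition became true
    for index, is_true in enumerate(condition_results):
        now = index * interval_s
        if not is_true:
            # Any false evaluation resets the pending timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held_for = now - active_since
        states.append("firing" if held_for >= for_s else "pending")
    return states
```

Under this model, `ConcourseNoWorkersRunning` (`interval: 30s`, `for: 5m`) is pending for ten evaluations and fires on the eleventh, provided the worker count stays at zero throughout.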
Verify Metrics Working
# Check Prometheus targets
# Open Prometheus UI: http://<prometheus-ip>:9090/targets
# Look for concourse-ci endpoint with status "UP"
# Run test query in Prometheus UI
concourse_workers_running
✅ Success: The query returns the current number of workers, which should match the output of fly workers.
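The target check can also be automated with the Prometheus targets API (`/api/v1/targets`). This sketch assumes the Concourse target is scraped on port 9391, as described earlier, and identifies it by that port in the scrape URL; the function names are hypothetical.

```python
import json
import urllib.request


def target_health(targets_body, port_fragment=":9391"):
    """Map scrape URL -> health for active targets scraped on the Concourse port."""
    payload = json.loads(targets_body)
    health = {}
    for target in payload["data"]["activeTargets"]:
        url = target.get("scrapeUrl", "")
        if port_fragment in url:
            # Prometheus reports health as "up", "down", or "unknown".
            health[url] = target.get("health", "unknown")
    return health


def fetch_target_health(prometheus_ip, port=9090, timeout=5):
    """Query the Prometheus targets API and return Concourse target health."""
    url = f"http://{prometheus_ip}:{port}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return target_health(resp.read().decode("utf-8"))
```

A healthy setup would return a dictionary in which every value is `"up"`; an empty dictionary suggests the relation has not produced a scrape target yet.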
Related Documentation
- Configuration Reference - All monitoring options
- Troubleshooting - Debug monitoring issues