Prometheus scrape labels

Every service in a Server AI Hub Compose file is tagged with a small set of prometheus.* Docker labels. The labels do nothing in Compose-land today (the running prometheus.yml is still hand-written), but they are the migration contract for the eventual move to k3s / Prometheus Operator. When that lands, a translator script reads these labels off each container spec and emits a corresponding ServiceMonitor resource with zero hand-editing.

The convention

```yaml
services:
  langfuse:
    image: langfuse/langfuse:latest
    container_name: langfuse
    restart: unless-stopped
    # ... rest of service config ...
    labels:
      - "prometheus.scrape=false"   # or true
      - "prometheus.port=3000"      # numeric port INSIDE the container
      - "prometheus.path=/metrics"  # default; omit if /metrics
      - "prometheus.job=langfuse"   # PromQL job label
```
| Label | Required | Meaning |
| --- | --- | --- |
| prometheus.scrape | yes | true or false. The single boolean every scraper / discovery process checks first |
| prometheus.port | when scrape=true | Container-internal port serving /metrics |
| prometheus.path | when scrape=true | URL path. Defaults to /metrics if omitted |
| prometheus.job | yes | The job label that appears in the up{} series, usually the service's container_name |
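Nothing in the repo consumes these labels at runtime yet. As an illustration only, a discovery helper built on the Docker SDK (the helper below is hypothetical, not part of the Hub) would see the Compose "key=value" entries as a flat dict per container and could turn them into scrape targets like this:

```python
# Hypothetical helper: build scrape targets from running containers' prometheus.* labels.
# Assumes the `docker` Python SDK and that containers are reachable by container_name
# on a shared Docker network.
import docker

def scrape_targets():
    client = docker.from_env()
    targets = []
    for c in client.containers.list():
        labels = c.labels  # Compose "key=value" label entries arrive here as a dict
        if labels.get("prometheus.scrape") != "true":
            continue
        targets.append({
            "job": labels.get("prometheus.job", c.name),
            "target": f'{c.name}:{labels["prometheus.port"]}',
            "path": labels.get("prometheus.path", "/metrics"),
        })
    return targets

if __name__ == "__main__":
    for t in scrape_targets():
        print(t)
```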

Why this matters

When the time comes to move to k3s, a service like the one above becomes a ServiceMonitor:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: langfuse
  labels:
    prometheus: serveraihub
spec:
  selector:
    matchLabels:
      app: langfuse
  endpoints:
    - port: web        # named port matching prometheus.port
      path: /metrics
      interval: 15s
```

The translator is a small Python script that walks every Compose service in the repo, reads the prometheus.* labels off it, and emits one ServiceMonitor file per service. No archaeology, no manual matching, no surprises. Drop the result into kustomize and apply.
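The script itself isn't reproduced here. A minimal sketch of the idea, assuming PyYAML, the label names above, and a Kubernetes Service whose metrics port is named "web", looks roughly like this:

```python
# Sketch of the Compose-to-ServiceMonitor translator; illustrative, not the actual script.
# Assumes labels are written in the list form shown above ("key=value").
import sys
import yaml

def parse_labels(raw):
    """Normalize Compose labels (list or dict form) into a plain dict."""
    if isinstance(raw, dict):
        return {str(k): str(v) for k, v in raw.items()}
    return dict(item.split("=", 1) for item in raw or [])

def service_monitor(name, labels):
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "labels": {"prometheus": "serveraihub"}},
        "spec": {
            "selector": {"matchLabels": {"app": name}},
            "endpoints": [{
                "port": "web",  # assumes the k8s Service names its metrics port "web"
                "path": labels.get("prometheus.path", "/metrics"),
                "interval": "15s",
            }],
        },
    }

def main(compose_path):
    with open(compose_path) as f:
        compose = yaml.safe_load(f)
    for name, svc in compose.get("services", {}).items():
        labels = parse_labels(svc.get("labels"))
        if labels.get("prometheus.scrape") == "true":
            job = labels.get("prometheus.job", name)
            with open(f"servicemonitor-{job}.yaml", "w") as f:
                yaml.safe_dump(service_monitor(job, labels), f, sort_keys=False)

if __name__ == "__main__":
    main(sys.argv[1])
```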

Three buckets of "scrape" today

Across the Hub's 26 tagged services, scrape decisions fall into three buckets:

| Bucket | scrape | Examples | Why |
| --- | --- | --- | --- |
| Native /metrics exists | true | Qdrant, Caddy, vLLM (bespoke-mc-7b, bge-m3-reranker), Prometheus, Grafana, cAdvisor, node-exporter | Already exposes Prometheus exposition format on a documented port |
| Needs sidecar exporter | false (today) | redis-llm, serveraihub-redis, langfuse-db (postgres), langfuse-redis | Switch to true once the exporter sidecar is added |
| Custom service, needs instrumentation | false (today) | BGEembed, hem-nli, graph-explorer, griff-api, serveraihub-dashboard | Add prometheus_client to the Python service and expose /metrics (see the sketch below) |

The YAML comment above each labels: block records which bucket the service falls into.
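For the third bucket, instrumentation is a few lines of prometheus_client. A minimal sketch, with illustrative metric names, endpoint, and port (a real service would pick whatever it sets in prometheus.port):

```python
# Minimal sketch of instrumenting a custom Python service with prometheus_client.
# Metric names, the endpoint, and port 8000 are illustrative, not taken from the Hub.
import time
import random
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("hub_requests_total", "Requests handled", ["endpoint"])
LATENCY = Histogram("hub_request_seconds", "Request latency in seconds", ["endpoint"])

def handle_request(endpoint):
    with LATENCY.labels(endpoint=endpoint).time():
        REQUESTS.labels(endpoint=endpoint).inc()
        time.sleep(random.uniform(0.01, 0.05))  # stand-in for real work

if __name__ == "__main__":
    start_http_server(8000)  # serves /metrics; set prometheus.port=8000 on the service
    while True:
        handle_request("/embed")
```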

The rule of thumb

cAdvisor + node-exporter cover container-level CPU / memory / network / disk / restarts / OOM for everything that runs in Docker. That's ~90% of "is this thing healthy" out of the box. Only add scrape configs for services where the business metric is what you care about — request latency, model accuracy, queue depth, retrieval QPS — not just "is the container alive". Container-alive is free via cAdvisor.

Maintenance

When adding a new service:

  1. Pick a container_name.
  2. Decide whether it has, or should have, a /metrics endpoint.
  3. Add the four labels at the end of the service's compose entry.
  4. If you've added a sidecar exporter, the sidecar gets its own service entry with its own labels.

Re-running scripts/add_prom_labels.py (kept in /tmp in this repo's tree) is idempotent — it strips any prior prometheus.* labels and re-injects from the central registry. If you'd rather hand-edit, that's fine too; the script just keeps the registry honest.
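The script isn't reproduced here; its strip-and-reinject behavior amounts to something like the sketch below. The registry contents, file name, and list-form label assumption are illustrative, and the real add_prom_labels.py may differ.

```python
# Illustrative sketch of the strip-and-reinject behavior described above.
# Assumes labels are in the list form ("key=value") used throughout the Hub's compose files.
import yaml

# Central registry: container_name -> prometheus.* labels (values here are examples only)
REGISTRY = {
    "langfuse": {"scrape": "false", "port": "3000", "path": "/metrics", "job": "langfuse"},
}

def reinject(compose_path):
    with open(compose_path) as f:
        compose = yaml.safe_load(f)
    for name, svc in compose.get("services", {}).items():
        # Drop any prior prometheus.* labels, then re-add from the registry: idempotent.
        labels = [l for l in svc.get("labels", []) if not l.startswith("prometheus.")]
        for key, value in REGISTRY.get(name, {}).items():
            labels.append(f"prometheus.{key}={value}")
        if labels:
            svc["labels"] = labels
    with open(compose_path, "w") as f:
        yaml.safe_dump(compose, f, sort_keys=False)

if __name__ == "__main__":
    reinject("docker-compose.yml")
```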