Prometheus scrape labels

Every service in a Server AI Hub Compose file is tagged with a small set of prometheus.* Docker labels. The labels do nothing in Compose-land today (the running prometheus.yml is still hand-written), but they are the migration contract for the eventual move to k3s / Prometheus Operator. When that lands, a translator script reads these labels off each container spec and emits a corresponding ServiceMonitor resource with zero hand-editing.

The convention

```yaml
services:
  langfuse:
    image: langfuse/langfuse:latest
    container_name: langfuse
    restart: unless-stopped
    # ... rest of service config ...
    labels:
      - "prometheus.scrape=false"   # or true
      - "prometheus.port=3000"      # numeric port INSIDE the container
      - "prometheus.path=/metrics"  # default; omit if /metrics
      - "prometheus.job=langfuse"   # PromQL job label
```
| Label | Required | Meaning |
| --- | --- | --- |
| prometheus.scrape | yes | true or false. The single boolean every scraper / discovery process checks first |
| prometheus.port | when scrape=true | Container-internal port serving /metrics |
| prometheus.path | when scrape=true | URL path. Defaults to /metrics if omitted |
| prometheus.job | yes | The job label that appears in the up{} series, usually the service's container_name |
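Nothing in the repo consumes these labels at runtime yet. As an illustration only, a discovery helper built on the Docker SDK (the helper below is hypothetical, not part of the Hub) would see the Compose "key=value" entries as a flat dict per container and could turn them into scrape targets like this:

```python
# Hypothetical helper: build scrape targets from running containers' prometheus.* labels.
# Assumes the `docker` Python SDK and that containers are reachable by container_name
# on a shared Docker network.
import docker

def scrape_targets():
    client = docker.from_env()
    targets = []
    for c in client.containers.list():
        labels = c.labels  # Compose "key=value" label entries arrive here as a dict
        if labels.get("prometheus.scrape") != "true":
            continue
        targets.append({
            "job": labels.get("prometheus.job", c.name),
            "target": f'{c.name}:{labels["prometheus.port"]}',
            "path": labels.get("prometheus.path", "/metrics"),
        })
    return targets

if __name__ == "__main__":
    for t in scrape_targets():
        print(t)
```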

Why this matters

When the time comes to move to k3s, a service like the one above becomes a ServiceMonitor:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: langfuse
  labels:
    prometheus: serveraihub
spec:
  selector:
    matchLabels:
      app: langfuse
  endpoints:
    - port: web        # named port matching prometheus.port
      path: /metrics
      interval: 15s
```

The translator is a small Python script that walks every Compose service in the repo, reads the prometheus.* labels off it, and emits one ServiceMonitor file per service. No archaeology, no manual matching, no surprises. Drop the result into kustomize and apply.
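The script itself isn't reproduced here. A minimal sketch of the idea, assuming PyYAML, the label names above, and a Kubernetes Service whose metrics port is named "web", looks roughly like this:

```python
# Sketch of the Compose-to-ServiceMonitor translator; illustrative, not the actual script.
# Assumes labels are written in the list form shown above ("key=value").
import sys
import yaml

def parse_labels(raw):
    """Normalize Compose labels (list or dict form) into a plain dict."""
    if isinstance(raw, dict):
        return {str(k): str(v) for k, v in raw.items()}
    return dict(item.split("=", 1) for item in raw or [])

def service_monitor(name, labels):
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "labels": {"prometheus": "serveraihub"}},
        "spec": {
            "selector": {"matchLabels": {"app": name}},
            "endpoints": [{
                "port": "web",  # assumes the k8s Service names its metrics port "web"
                "path": labels.get("prometheus.path", "/metrics"),
                "interval": "15s",
            }],
        },
    }

def main(compose_path):
    with open(compose_path) as f:
        compose = yaml.safe_load(f)
    for name, svc in compose.get("services", {}).items():
        labels = parse_labels(svc.get("labels"))
        if labels.get("prometheus.scrape") == "true":
            job = labels.get("prometheus.job", name)
            with open(f"servicemonitor-{job}.yaml", "w") as f:
                yaml.safe_dump(service_monitor(job, labels), f, sort_keys=False)

if __name__ == "__main__":
    main(sys.argv[1])
```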

Three buckets of "scrape" today

Across the Hub's 26 tagged services, scrape decisions fall into three buckets:

| Bucket | scrape | Examples | Why |
| --- | --- | --- | --- |
| Native /metrics exists | true | Qdrant, Caddy, vLLM (bespoke-mc-7b, bge-m3-reranker), Prometheus, Grafana, cAdvisor, node-exporter | Already exposes Prometheus exposition format on a documented port |
| Needs sidecar exporter | false (today) | redis-llm, serveraihub-redis, langfuse-db (postgres), langfuse-redis | Switch to true once the exporter sidecar is added |
| Custom service, needs instrumentation | false (today) | BGEembed, hem-nli, graph-explorer, griff-api, serveraihub-dashboard | Add prometheus_client to the Python service and expose /metrics (see the sketch below) |

The YAML comment above each labels: block records which bucket the service falls into.
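For the third bucket, instrumentation is a few lines of prometheus_client. A minimal sketch, with illustrative metric names, endpoint, and port (a real service would pick whatever it sets in prometheus.port):

```python
# Minimal sketch of instrumenting a custom Python service with prometheus_client.
# Metric names, the endpoint, and port 8000 are illustrative, not taken from the Hub.
import time
import random
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("hub_requests_total", "Requests handled", ["endpoint"])
LATENCY = Histogram("hub_request_seconds", "Request latency in seconds", ["endpoint"])

def handle_request(endpoint):
    with LATENCY.labels(endpoint=endpoint).time():
        REQUESTS.labels(endpoint=endpoint).inc()
        time.sleep(random.uniform(0.01, 0.05))  # stand-in for real work

if __name__ == "__main__":
    start_http_server(8000)  # serves /metrics; set prometheus.port=8000 on the service
    while True:
        handle_request("/embed")
```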

The rule of thumb

cAdvisor + node-exporter cover container-level CPU / memory / network / disk / restarts / OOM for everything that runs in Docker. That's ~90% of "is this thing healthy" out of the box. Only add scrape configs for services where the business metric is what you care about — request latency, model accuracy, queue depth, retrieval QPS — not just "is the container alive". Container-alive is free via cAdvisor.

Maintenance

When adding a new service:

  1. Pick a container_name.
  2. Decide whether it has, or should have, a /metrics endpoint.
  3. Add the four labels at the end of the service's compose entry.
  4. If you've added a sidecar exporter, the sidecar gets its own service entry with its own labels.

Re-running scripts/add_prom_labels.py (kept in /tmp in this repo's tree) is idempotent — it strips any prior prometheus.* labels and re-injects from the central registry. If you'd rather hand-edit, that's fine too; the script just keeps the registry honest.
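The script isn't reproduced here; its strip-and-reinject behavior amounts to something like the sketch below. The registry contents, file name, and list-form label assumption are illustrative, and the real add_prom_labels.py may differ.

```python
# Illustrative sketch of the strip-and-reinject behavior described above.
# Assumes labels are in the list form ("key=value") used throughout the Hub's compose files.
import yaml

# Central registry: container_name -> prometheus.* labels (values here are examples only)
REGISTRY = {
    "langfuse": {"scrape": "false", "port": "3000", "path": "/metrics", "job": "langfuse"},
}

def reinject(compose_path):
    with open(compose_path) as f:
        compose = yaml.safe_load(f)
    for name, svc in compose.get("services", {}).items():
        # Drop any prior prometheus.* labels, then re-add from the registry: idempotent.
        labels = [l for l in svc.get("labels", []) if not l.startswith("prometheus.")]
        for key, value in REGISTRY.get(name, {}).items():
            labels.append(f"prometheus.{key}={value}")
        if labels:
            svc["labels"] = labels
    with open(compose_path, "w") as f:
        yaml.safe_dump(compose, f, sort_keys=False)

if __name__ == "__main__":
    reinject("docker-compose.yml")
```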