Monitoring Measure system health with metrics and alerts. YAML1 2 3 4 5 6# Prometheus alert example - alert: HighErrorRate expr: sum(rate(http_requests_total{status=~"5.."}[5m])) > 5 for: 10m labels: severity: critical