Good observability is a feature. Here is a lean setup for Next.js apps using the App Router.
Tracing with OpenTelemetry
- Install `@opentelemetry/api`, `@opentelemetry/sdk-trace-node` for server code (add `@opentelemetry/sdk-trace-web` only if you also need browser traces), and an exporter (OTLP/HTTP is a safe default).
- Wrap fetch handlers and critical server actions with spans; propagate context headers to downstream services.
- Sample smartly: keep full traces in staging; in production, use low-rate probabilistic (head) sampling, and add tail-based sampling if your collector or backend supports it.
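
To make context propagation and probabilistic sampling concrete, here is a minimal, dependency-free sketch. It assumes the W3C `traceparent` header format (version `00`) and uses hypothetical helper names (`parseTraceparent`, `formatTraceparent`, `shouldSample`, `downstreamHeaders`) that are not part of any OpenTelemetry package. In a real app the OpenTelemetry SDK's propagators and samplers would normally do this for you; the sampling rule below is a simplified illustration, not the SDK's exact algorithm.

```typescript
// Hypothetical helpers illustrating W3C Trace Context propagation and
// ratio-based head sampling. Not an OpenTelemetry API.

interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  sampled: boolean;
}

// Only version-00-shaped headers are accepted in this sketch.
const TRACEPARENT_RE =
  /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Parse an incoming traceparent header; returns null when it is missing or invalid.
function parseTraceparent(header: string | null | undefined): TraceContext | null {
  if (!header) return null;
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;
  const [, version, traceId, spanId, flags] = match;
  if (version === "ff") return null; // version ff is reserved as invalid
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null; // all-zero IDs are invalid
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Serialize a context back into a traceparent header value.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Deterministic probabilistic sampling: the same trace ID always yields the
// same decision, so every service using the same ratio agrees.
function shouldSample(traceId: string, ratio: number): boolean {
  if (ratio >= 1) return true;
  if (ratio <= 0) return false;
  // Map the first 32 bits of the trace ID onto [0, 1).
  const bucket = parseInt(traceId.slice(0, 8), 16) / 0x100000000;
  return bucket < ratio;
}

// Headers to attach to a downstream call made from a child span.
function downstreamHeaders(ctx: TraceContext, childSpanId: string): Record<string, string> {
  return { traceparent: formatTraceparent({ ...ctx, spanId: childSpanId }) };
}
```

A route handler could parse the incoming header, decide sampling once at the entry point, and pass `downstreamHeaders(...)` to each outgoing `fetch` so downstream services join the same trace.
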
Metrics that matter
- Track request latency (p50/p95/p99), error rate, cache hit rate, and external call duration.
- Emit lightweight counters and histograms; avoid high-cardinality label values.
- Surface build-time metrics (bundle size, route generation) in CI for regression detection.
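
The latency percentiles and the low-cardinality advice above can be sketched as follows. This is an illustrative in-memory example, not a metrics library: `percentile` uses the nearest-rank method on raw samples (production systems usually derive percentiles from histogram buckets instead), and `normalizeRoute` is a hypothetical helper that collapses numeric and UUID path segments so label values stay bounded.

```typescript
// Nearest-rank percentile over raw samples, with p in (0, 100].
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples recorded");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Replace numeric and UUID path segments with a placeholder so that
// "/users/1" and "/users/2" share one label value.
function normalizeRoute(pathname: string): string {
  return pathname
    .split("/")
    .map((segment) =>
      /^\d+$/.test(segment) || UUID_RE.test(segment) ? ":id" : segment
    )
    .join("/");
}

// Summarize request latencies (in milliseconds) for a dashboard or log line.
function latencySummary(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

Normalizing the route before using it as a label keeps the number of time series proportional to the number of routes rather than the number of users or records.
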
Structured logging
- Use a logger that emits JSON with fields: `timestamp`, `level`, `message`, `requestId`, `route`, `userId` (when available).
- Correlate logs to traces by injecting `traceId`/`spanId` into log context.
- Redact PII at the edge; avoid logging secrets or tokens.
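
As a rough illustration of the logging points above, the sketch below formats one JSON log line with the listed fields, carries `traceId`/`spanId` for correlation, and redacts a few sensitive keys. The names `formatLogLine`, `redact`, and `SENSITIVE_KEYS` are hypothetical, the key list is only an assumed starting point, and redaction here is shallow (top-level keys only); an established logging library would typically be the better choice in production.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogContext {
  requestId: string;
  route: string;
  userId?: string;
  traceId?: string; // copied from the active span for log/trace correlation
  spanId?: string;
  [key: string]: unknown;
}

// Assumed list of keys whose values must never be logged.
const SENSITIVE_KEYS = new Set(["password", "token", "authorization", "cookie", "secret"]);

// Shallow redaction: replaces values of sensitive top-level keys.
function redact(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : value;
  }
  return out;
}

// Build one JSON log line. Core fields are written last so that context
// keys with the same names cannot override them.
function formatLogLine(
  level: LogLevel,
  message: string,
  context: LogContext,
  now: Date = new Date()
): string {
  return JSON.stringify({
    ...redact(context),
    timestamp: now.toISOString(),
    level,
    message,
  });
}
```

For example, `console.log(formatLogLine("info", "checkout started", { requestId, route, traceId, spanId }))` writes a single line that a log pipeline can index and join to the matching trace.
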
Deployment tips
- Run an OpenTelemetry Collector as a sidecar or gateway to fan out to backends (e.g., Tempo, Jaeger, Honeycomb).
- Keep the client bundle clean: only instrument what runs server-side unless you need client traces.
- Add health endpoints that expose minimal metrics (e.g., uptime, last successful build ID) without sensitive data.
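
A health endpoint of the kind described above might return a payload like the one built below. This is a sketch under assumptions: `buildHealthPayload` and the `HealthPayload` shape are hypothetical, and how the build ID reaches the process (for instance an environment variable set by your CI) is left to your deployment.

```typescript
interface HealthPayload {
  status: "ok";
  uptimeSeconds: number;
  buildId: string;
}

// Build a minimal health payload: no hostnames, versions of dependencies,
// or other details that could help an attacker.
function buildHealthPayload(
  startedAtMs: number,
  nowMs: number,
  buildId: string | undefined
): HealthPayload {
  return {
    status: "ok",
    // Clamp to zero in case of clock adjustments.
    uptimeSeconds: Math.max(0, Math.floor((nowMs - startedAtMs) / 1000)),
    buildId: buildId ?? "unknown",
  };
}
```

In an App Router route file (for example a hypothetical `app/api/health/route.ts`), an exported `GET` handler could capture a start timestamp at module load and return `Response.json(buildHealthPayload(startedAt, Date.now(), buildId))`.
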
Next steps
- Add CI checks that fail on missing trace context in server routes touching external dependencies.
- Build a “golden path” dashboard: core pages, API routes, and third-party calls with shared SLOs.
- Run a game day: deliberately break an upstream dependency and verify alerts and dashboards guide you to root cause.

