From Traces to Profiles: How OpenTelemetry’s New Profiling Signal and eBPF Auto‑Instrumentation Upgrade Reliability Metrics

Modern reliability work is increasingly shaped by open standards. Over the past 18 months, three threads have converged in a meaningful way for SREs and platform teams:

1) traces can now be distilled into RED metrics inside the Collector, with exemplars linking those metrics back to representative traces;
2) eBPF auto‑instrumentation can close coverage gaps without code changes or redeploys; and
3) profiling is becoming a first‑class OpenTelemetry signal, carried over OTLP alongside traces, metrics, and logs.

This article distills what changed recently, why it matters for reliability metrics, and how to add these capabilities to an existing OpenTelemetry stack with minimal fuss.

Note on recency: OpenTelemetry announced early profiling support in March 2024, followed by guidance and feature‑gated Collector support later that year. As of today (September 22, 2025), profiling in OTLP remains in “development” status, with practical ways to experiment now and production readiness on the horizon. (opentelemetry.io)

Why this matters for reliability work

What changed lately (and is worth adopting)

Turn traces into reliability metrics with span‑to‑metrics

RED metrics (Rate, Errors, Duration) are a reliable way to quantify service health and drive SLOs. If you already have traces, you don’t need to build separate counters: the OpenTelemetry Collector’s spanmetrics connector aggregates span streams into request counts, error counts, and latency histograms, and it cleanly replaces the earlier spanmetrics processor. (opentelemetry.io)

Example Collector configuration to generate RED metrics from traces:

receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:

connectors:
  spanmetrics:
    metrics_flush_interval: 15s   # adjust to your window

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
  otlp:
    endpoint: "${BACKEND_OTLP_GRPC}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [spanmetrics, otlp]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus, otlp]

This setup turns spans into:

1) a request counter (“calls”), dimensioned by service, span name, span kind, and status code, which covers rate and error rate; and
2) a latency histogram (“duration”) with the same dimensions, ready for percentile and SLO queries.

You’ll see usable RED metrics in Prometheus or any OTLP‑capable metrics backend. Red Hat’s docs and several vendor distributions ship similar examples. (docs.openshift.com)
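
To make those metrics immediately useful, a few Prometheus recording rules can precompute the RED series you alert on. Treat this as a sketch: the metric and label names below (calls_total, duration_milliseconds_bucket, service_name, status_code) reflect the connector and Prometheus exporter defaults at the time of writing, and newer connector versions may prefix them with a traces_span_metrics namespace, so match them to whatever your /metrics endpoint actually exposes.

groups:
  - name: red-from-spans
    rules:
      # Rate: requests per second, per service
      - record: service:request_rate:rate5m
        expr: sum by (service_name) (rate(calls_total[5m]))
      # Errors: share of requests whose span status is error
      - record: service:error_ratio:rate5m
        expr: |
          sum by (service_name) (rate(calls_total{status_code="STATUS_CODE_ERROR"}[5m]))
            /
          sum by (service_name) (rate(calls_total[5m]))
      # Duration: p95 latency derived from the spanmetrics histogram
      - record: service:latency:p95_5m
        expr: |
          histogram_quantile(0.95,
            sum by (service_name, le) (rate(duration_milliseconds_bucket[5m])))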

Wire metrics to traces with exemplars

When a metrics chart spikes, you want a one‑click hop to a representative trace. Exemplars make that reliable: the SDK records a small sample of measurements together with the active trace and span IDs, Prometheus stores them as exemplars, and Grafana (or a comparable backend) surfaces them as clickable dots that link straight to the trace in Tempo or Jaeger.

Minimal .NET setup sketch (abbreviated):

using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var resource = ResourceBuilder.CreateDefault().AddService("checkout");

// Traces: exporter omitted for brevity
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddSource("app")
    .SetResourceBuilder(resource)
    .Build();

// Metrics: trace-based exemplar sampling attaches the active trace/span IDs to sampled measurements
using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .SetResourceBuilder(resource)
    .SetExemplarFilter(ExemplarFilterType.TraceBased)
    .AddMeter("app")
    .AddPrometheusExporter()   // requires the Prometheus exporter package
    .Build();

With exemplars on, a spike in your latency histogram includes a trace_id you can click to jump straight to the culprit span. (opentelemetry.io)
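
One backend detail that is easy to miss: the Collector’s Prometheus exporter only emits exemplars when it serves the OpenMetrics format, and Prometheus itself must run with the exemplar-storage feature flag (--enable-feature=exemplar-storage) to keep them. A minimal tweak to the exporter from the earlier configuration, assuming the same 8889 port:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    enable_open_metrics: true   # exemplars are only exposed in the OpenMetrics format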

Fill coverage gaps with eBPF auto‑instrumentation (zero code)

Even mature OTel rollouts have blind spots: legacy services, third‑party apps, or teams reluctant to add agents or rebuild binaries. eBPF auto‑instrumentation helps by attaching to network and user‑space hooks at runtime and emitting spans and RED metrics per request—no code changes, no restarts.
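
Concretely, the deployment pattern is a per‑node agent rather than a per‑service change. The Kubernetes sketch below shows the general shape (host PID visibility, elevated privileges for eBPF, OTLP export to your Collector); the image name is a placeholder for whichever agent you choose, and the environment variable assumes the agent honors the standard OTLP exporter settings:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ebpf-autoinstrument
  namespace: observability
spec:
  selector:
    matchLabels:
      app: ebpf-autoinstrument
  template:
    metadata:
      labels:
        app: ebpf-autoinstrument
    spec:
      hostPID: true                      # the agent needs to see processes running on the node
      containers:
        - name: agent
          image: example.registry/ebpf-autoinstrument:latest   # placeholder: your agent's image
          securityContext:
            privileged: true             # or the narrower capabilities your agent documents
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://otel-collector.observability:4318"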

Practical advice:

Bring profiles into the same pipeline

Profiles show where CPU time, allocations, and context switches happen. Combining them with traces and metrics helps in two critical reliability moments:

1) during an incident, when a trace shows a span that is slow “inside itself” and you need to know which code path is burning the time; and
2) after the incident, when you want to quantify the regression or standing inefficiency so the fix (and its SLO impact) is measurable.

Here’s what’s feasible today:

1) Transport and spec
OTLP defines gRPC/HTTP message types for “profiles.” The default HTTP path is currently /v1development/profiles while the signal hardens. Expect changes; pin versions across agents, Collector, and backends. (opentelemetry.io)

2) Collector support (behind a feature gate)
Recent releases of the OpenTelemetry Collector can receive, process, and export profiles if you enable the profiles feature gate. That lets you prototype an end‑to‑end profiles pipeline alongside your existing traces/metrics/logs. (opentelemetry.io)

A minimal prototype looks like:

receivers:
  otlp:
    protocols:
      grpc:

exporters:
  otlp:
    endpoint: "pyroscope:4040"     # any backend that understands OTLP profiles
    tls:
      insecure: true

service:
  # feature gate enablement is done via collector args; consult release notes
  pipelines:
    profiles:
      receivers: [otlp]
      exporters: [otlp]
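
How you flip the feature gate depends on how the Collector runs; in a docker-compose lab it is just an extra command argument. The gate identifier below is the one documented at the time of writing, so verify it against your Collector release notes:

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command:
      - "--config=/etc/otelcol/config.yaml"
      - "--feature-gates=service.profilesSupport"   # assumed gate name; check your release notes
    volumes:
      - ./collector-config.yaml:/etc/otelcol/config.yaml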

Backends: Grafana Pyroscope 1.10+ can receive and visualize OTLP profiles (marked experimental) and provides notes on symbolization and compatibility. If you’re already on Grafana, it’s a low‑friction way to try profiles without adding a separate protocol. (grafana.com)

Agents: You can experiment with the opentelemetry‑ebpf‑profiler repository, the whole‑system profiling agent that Elastic contributed to OpenTelemetry (originally its Universal Profiling agent), which is designed for very low overhead. (github.com)

Caveats:

A pragmatic rollout plan

Phase 1: Get reliable RED metrics from traces

Phase 2: Wire metrics to traces with exemplars

Phase 3: Raise coverage with eBPF auto‑instrumentation

Phase 4: Pilot profiles

Phase 5: Turn insights into SLOs and budgets

Common pitfalls and how to avoid them

What to watch next

TL;DR: A reliability‑first recipe

If you already standardized on OpenTelemetry for traces and metrics, you’re closer than you think: the next increment is mostly configuration and a lab environment. The payoff is faster incident triage and more credible SLOs—because you can see, with precision, how user pain maps to real code running on real CPUs.