OpenTelemetry has become the default observability stack for modern Java services. It's vendor-neutral (you can ship to Datadog, Honeycomb, Grafana Tempo, or Jaeger with the same code), it covers traces, metrics, and logs in one SDK, and Spring Boot's integration story is finally clean. This is the production setup that holds up for real workloads.
Why OpenTelemetry, and Why Now
The pre-OTel era had three problems:
- Vendor lock-in. Datadog APM, New Relic agents, AppDynamics: switching meant rewriting instrumentation.
- Three separate APIs. Logs through SLF4J, metrics through Micrometer, traces through... whatever your vendor shipped. Correlating them was bespoke per stack.
- No standard for context propagation. Every vendor had its own header (X-Datadog-Trace-Id, X-NewRelic-...); traces broke at microservice boundaries.
OpenTelemetry fixes all three. One SDK, one wire format (OTLP), one set of headers (W3C Trace Context). You can swap backends by changing an environment variable.
Minimum Viable Setup
Add the Spring Boot starter:
dependencies {
implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter")
implementation("io.micrometer:micrometer-registry-otlp")
}
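The starter line above is declared without a version. One common approach (an assumption about your build setup, not something the starter itself requires) is to import the OpenTelemetry instrumentation BOM in the same dependencies block so all OTel artifacts resolve to compatible versions:

implementation(platform("io.opentelemetry.instrumentation:opentelemetry-instrumentation-bom:<version>")) // replace <version> with a real release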
Configure the exporter via environment variables (the OTel convention):
OTEL_SERVICE_NAME=order-service
OTEL_EXPORTER_OTLP_ENDPOINT=https://otel-collector.example.com:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_RESOURCE_ATTRIBUTES=deployment.environment=production,service.version=1.42.0
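If you'd rather keep this in Spring configuration than in the environment, recent versions of the starter can read the same settings as otel.* properties. A sketch, assuming your starter version supports property-based configuration; the values mirror the variables above:

# application.yml
otel:
  service:
    name: order-service
  exporter:
    otlp:
      endpoint: https://otel-collector.example.com:4317
      protocol: grpc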
That's the entire setup. Spring Boot auto-instruments:
- Every HTTP request handled by the embedded server
- Every JDBC query (if you have the JDBC instrumentation module)
- Every outbound HTTP call via RestClient/WebClient
- Every JPA repository method
- Kafka producer/consumer operations
- Redis commands
You'll see distributed traces flowing immediately. The hard work is what comes next: making the traces useful.
Custom Spans Where the Auto-Instrumentation Falls Short
Auto-instrumentation gives you wide coverage but shallow detail. The traces show "this method was called and took 320ms"; they don't tell you why. Add custom spans where the business logic matters:
import io.opentelemetry.instrumentation.annotations.SpanAttribute;
import io.opentelemetry.instrumentation.annotations.WithSpan;
import org.springframework.stereotype.Service;

import java.util.UUID;

@Service
public class CheckoutService {

    @WithSpan("checkout.process")
    public CheckoutResult checkout(
            @SpanAttribute("checkout.cart_id") UUID cartId,
            @SpanAttribute("checkout.customer_tier") String tier
    ) {
        var pricing = applyDiscounts(cartId);
        var payment = chargeCard(pricing);
        var order = createOrder(payment);
        return CheckoutResult.success(order);
    }
}
The @WithSpan annotation creates a span. The @SpanAttribute annotations attach the values as searchable tags. In your observability backend, you can now filter "all checkouts for customer_tier = enterprise that took > 2s", which is the kind of query that actually answers production questions.
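Annotations cover attributes you know at the method boundary, but some values only exist partway through the call. A minimal sketch using the OpenTelemetry API directly; Span.current() returns the active span (here, the one opened by @WithSpan), and the pricing accessors are hypothetical:

import io.opentelemetry.api.trace.Span;

// inside checkout(), after pricing has been computed
Span span = Span.current(); // the span opened by @WithSpan("checkout.process")
span.setAttribute("checkout.item_count", pricing.itemCount()); // hypothetical accessor
span.setAttribute("checkout.total_cents", pricing.totalCents()); // hypothetical accessor
span.addEvent("discounts.applied"); // timestamped marker visible on the trace waterfall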
Structured Logging with Trace Correlation
Logs without trace IDs are 10× harder to debug than logs with them. Spring Boot 3.2+ wires this up automatically when OTel is on the classpath:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

private static final Logger log = LoggerFactory.getLogger(CheckoutService.class);

public Order placeOrder(NewOrderCommand cmd) {
    log.info("Placing order for customer {}", cmd.customerId());
    // logs include trace_id and span_id automatically
    return orderRepo.save(Order.from(cmd));
}
Set the log format to JSON for ingestion into your log backend:
# application.yml
logging:
  pattern:
    console: "%d{ISO8601} %-5level [%X{trace_id:-},%X{span_id:-}] %logger{36} - %msg%n"
  structured:
    format:
      console: ecs # or logstash, gelf
Now your logs look like:
2026-04-07T14:23:11.482Z INFO [4f2c1a...,8e9b...] CheckoutService - Placing order for customer 1234
In Datadog, Honeycomb, or Tempo, clicking a log line takes you to the full trace. The full trace links back to all the related logs. You can debug a customer report in seconds instead of grep-and-pray.
Custom Metrics with Micrometer
Micrometer is the metrics façade Spring Boot has used for years. With OTel, it ships those same metrics over OTLP, with no new API to learn:
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class CheckoutService {

    private final Counter checkoutSuccess;
    private final Counter checkoutFailure;
    private final Timer checkoutDuration;

    public CheckoutService(MeterRegistry registry) {
        this.checkoutSuccess = registry.counter("checkout.outcome", "result", "success");
        this.checkoutFailure = registry.counter("checkout.outcome", "result", "failure");
        this.checkoutDuration = registry.timer("checkout.duration");
    }

    public CheckoutResult checkout(NewOrderCommand cmd) {
        return checkoutDuration.record(() -> {
            try {
                var result = doCheckout(cmd);
                checkoutSuccess.increment();
                return result;
            } catch (PaymentException e) {
                checkoutFailure.increment();
                throw e;
            }
        });
    }
}
Every metric is automatically tagged with your service name, environment, and version (from the OTEL_RESOURCE_ATTRIBUTES). You can build the same dashboard for any service in your fleet without per-service configuration.
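The counter-and-timer pattern above covers rates and averages. If your dashboards also need latency percentiles, Micrometer's Timer builder can publish histogram buckets; a sketch that would replace the plain registry.timer(...) call in the constructor above:

import io.micrometer.core.instrument.Timer;

this.checkoutDuration = Timer.builder("checkout.duration")
        .publishPercentileHistogram() // export bucketed histograms so the backend can compute p95/p99
        .register(registry);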
Sampling: The Setting Most Teams Get Wrong
Sending 100% of traces to your observability backend at production volume can easily run to five figures a month. Sampling at the edge is mandatory.
The sampler to reach for is parentbased_traceidratio: it samples a fixed fraction of new traces and respects the upstream sampling decision so traces stay coherent across services. (The SDK default, parentbased_always_on, keeps 100% of traces, so you do need to change it.)
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.05 # sample 5% of traces
The pattern that actually works in production: head-based sampling for the bulk, tail-based sampling for the interesting ones.
Head-based: sample 5% of traces uniformly. Cheap, captures the typical case.
Tail-based: keep 100% of traces for requests that errored or were slow. This requires a tail-sampling collector (the OTel Collector's tail_sampling processor handles it): it buffers spans, waits for the trace to complete, and decides what to keep based on the full picture.
# otel-collector-config.yaml
processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors-policy
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow-policy
        type: latency
        latency: { threshold_ms: 1000 }
      - name: probabilistic-policy
        type: probabilistic
        probabilistic: { sampling_percentage: 5 }
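Defining the processor isn't enough on its own; it also has to be listed in the collector's traces pipeline. A sketch of the service section, where the receiver and exporter names are placeholders for whatever your collector already uses:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]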
You ingest 5% of traces + every error + every slow request. Storage costs collapse; debugging quality stays high.
What "Done" Looks Like
A Spring Boot service with mature OTel observability hits these checkpoints:
- Every incoming and outgoing HTTP call has a span (auto-instrumented)
- Every business-critical method (checkout, payment, fulfilment) has a custom span with domain attributes
- Every log line carries trace_id and span_id
- Custom metrics for business KPIs (checkout success rate, payment latency, etc.)
- Sampling configured: 5% head + 100% tail of errors and slow requests
- One environment variable change can move you from Datadog to Honeycomb to self-hosted Tempo
Get there once, and observability stops being the thing you build per service. It becomes the platform every service inherits.