Engineering March 31, 2026 • 18 min read

OpenTelemetry on Kubernetes: The Complete Production Setup Guide

A production-ready guide to deploying OpenTelemetry on Kubernetes—covering the Collector Operator, DaemonSet agents, gateway deployments, RBAC, resource limits, and how to get logs, metrics, and traces flowing in under an hour.


If you're running microservices on Kubernetes, OpenTelemetry is no longer optional—it's the standard. The CNCF-graduated project has become the lingua franca of observability instrumentation, and for good reason: it's vendor-neutral, actively maintained, and supported by virtually every cloud provider and monitoring platform on the planet.

But the gap between “OpenTelemetry works in my local environment” and “OpenTelemetry is running reliably in production Kubernetes” is wider than most guides let on.

This post is the guide we wished existed when we were setting this up ourselves. You'll get a production-ready architecture, real YAML configs, the mistakes to avoid, and an explanation of why each decision matters—not just a pile of kubectl apply commands.

By the end, you'll have all three telemetry signals—logs, metrics, and traces—flowing from your Kubernetes cluster into a backend of your choice.

Why OpenTelemetry + Kubernetes Is the New Standard

OpenTelemetry (OTel) solves a problem that anyone who's ever switched monitoring vendors knows intimately: instrumentation lock-in. Before OTel, adding distributed tracing meant choosing a vendor (Jaeger, Zipkin, Datadog, etc.) and writing instrumentation tied to that vendor's SDK. Switching backends meant rewriting your instrumentation layer.

OpenTelemetry breaks that coupling. You instrument once using the OTel SDK, emit data to the OTel Collector, and route to any backend. The Collector handles format translation, enrichment, filtering, and fan-out. Your application code stays clean and portable.

Kubernetes amplifies this value. In a dynamic cluster environment, pods come and go, namespaces multiply, and understanding which pod generated which log or span is genuinely hard without a structured approach to telemetry collection and metadata enrichment.

The combination—OTel-instrumented apps + OTel Collector on Kubernetes—gives you:

- Instrument once, export anywhere: switching backends is a config change, not a rewrite
- Every log, metric, and span automatically tagged with pod, namespace, and workload metadata
- One pipeline for all three signals instead of separate logging, metrics, and tracing agents
- A central place to filter, sample, and route telemetry without touching application code

Let's build it.

The Two-Tier Collector Architecture

The production-ready approach is a two-tier architecture:

[Instrumented Pods]
       │ OTLP (gRPC/HTTP)
       ▼
[Agent Collectors — DaemonSet]   ← one per node
       │ OTLP
       ▼
[Gateway Collector — Deployment] ← centralized processing
       │ OTLP (or backend-native protocol)
       ▼
[Observability Backend]

Why two tiers?

The DaemonSet agent runs one Collector per node. It's responsible for:

- Receiving OTLP from the pods on its node
- Tailing container logs from /var/log/pods
- Collecting host metrics for the node
- Enriching every signal with Kubernetes metadata
- Forwarding everything to the gateway

The gateway Deployment runs a small number of replicas (typically 2–3 for HA). It's responsible for:

- Aggregating telemetry from all agents
- Cluster-level processing such as sampling and filtering
- Batching, retrying, and exporting to the backend

This separation of concerns gives you resilience (gateway goes down → agents buffer locally), scale isolation (tune gateway replicas to match backend throughput without touching every node), and a clean place to put expensive processing (like tail-based sampling decisions).
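The configs in the rest of this guide assume an `observability` namespace, so create it first. As a manifest (the imperative `kubectl create namespace observability` works just as well):

```yaml
# Namespace that will hold the agent and gateway Collectors
apiVersion: v1
kind: Namespace
metadata:
  name: observability
```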

Step 1: Install the OpenTelemetry Operator

The OpenTelemetry Operator is the recommended way to manage Collectors declaratively on Kubernetes. It provides custom resources (OpenTelemetryCollector, Instrumentation) that plug into your GitOps workflow and handle upgrades cleanly.

First, install cert-manager (required by the Operator):

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
kubectl wait --for=condition=Available deployment --all -n cert-manager --timeout=120s

Then install the Operator:

kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml

Verify it's running:

kubectl get pods -n opentelemetry-operator-system

Step 2: Configure RBAC

The k8sattributes processor needs permission to query the Kubernetes API for pod metadata. Create a ClusterRole and bind it to the Collector's service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: observability
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
      - namespaces
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["extensions"]
    resources: ["replicasets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: observability

Step 3: Deploy the DaemonSet Agent

Create the agent OpenTelemetryCollector resource. This runs on every node and handles local signal collection and enrichment:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-agent
  namespace: observability
spec:
  mode: daemonset
  serviceAccount: otel-collector
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule

  resources:
    limits:
      memory: 512Mi
      cpu: 250m
    requests:
      memory: 128Mi
      cpu: 50m

  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318

      hostmetrics:
        collection_interval: 30s
        scrapers:
          cpu:
          memory:
          disk:
          filesystem:
          network:
          load:

      filelog:
        include:
          - /var/log/pods/*/*/*.log
        start_at: beginning  # consider "end" in production to avoid re-reading old logs on restart
        include_file_path: true
        include_file_name: false
        operators:
          - type: router
            id: get-format
            routes:
              - output: parser-docker
                expr: 'body matches "^\\{"'
              - output: parser-crio
                expr: 'body matches "^[^ Z]+ "'
          - type: json_parser
            id: parser-docker
            output: extract-metadata-from-filepath
          - type: regex_parser
            id: parser-crio
            regex: '^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
            output: extract-metadata-from-filepath
          - type: regex_parser
            id: extract-metadata-from-filepath
            regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
            parse_from: attributes["log.file.path"]
            cache:
              size: 128
          - type: move
            from: attributes.log
            to: body

    processors:
      memory_limiter:
        check_interval: 5s
        limit_percentage: 80
        spike_limit_percentage: 25

      batch:
        send_batch_size: 1000
        timeout: 5s
        send_batch_max_size: 2000

      k8sattributes:
        auth_type: serviceAccount
        passthrough: false
        filter:
          node_from_env_var: KUBE_NODE_NAME
        extract:
          metadata:
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.deployment.name
            - k8s.statefulset.name
            - k8s.daemonset.name
            - k8s.cronjob.name
            - k8s.namespace.name
            - k8s.node.name
            - k8s.pod.start_time
            - k8s.pod.ip
            - container.image.name
            - container.image.tag
          labels:
            - tag_name: app.label.component
              key: app.kubernetes.io/component
              from: pod
          annotations:
            - tag_name: annotation.prometheus_io_scrape
              key: prometheus.io/scrape
              from: pod
        pod_association:
          - sources:
              - from: resource_attribute
                name: k8s.pod.ip
          - sources:
              - from: resource_attribute
                name: k8s.pod.uid
          - sources:
              - from: connection

      resourcedetection:
        detectors: [env, k8snode]
        k8snode:
          node_from_env_var: KUBE_NODE_NAME

    exporters:
      otlp:
        endpoint: otel-gateway-collector.observability.svc.cluster.local:4317
        tls:
          # assumes the gateway Service has a certificate (e.g., issued by
          # cert-manager); use insecure: true for plaintext inside the cluster
          insecure: false
          ca_file: /etc/ssl/certs/ca-certificates.crt

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, k8sattributes, resourcedetection, batch]
          exporters: [otlp]
        metrics:
          receivers: [otlp, hostmetrics]
          processors: [memory_limiter, k8sattributes, resourcedetection, batch]
          exporters: [otlp]
        logs:
          receivers: [otlp, filelog]
          processors: [memory_limiter, k8sattributes, resourcedetection, batch]
          exporters: [otlp]

  env:
    - name: KUBE_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName

  volumeMounts:
    - name: varlogpods
      mountPath: /var/log/pods
      readOnly: true
    - name: varlibdockercontainers
      mountPath: /var/lib/docker/containers
      readOnly: true

  volumes:
    - name: varlogpods
      hostPath:
        path: /var/log/pods
    - name: varlibdockercontainers
      hostPath:
        path: /var/lib/docker/containers

The key processors here are:

memory_limiter — Always put this first. It prevents the Collector from being OOM-killed (and pressuring the node) when a burst of telemetry arrives. With limit_percentage: 80 and spike_limit_percentage: 25, the hard limit is 80% of the container's memory limit and the soft limit is 55% (80 − 25): above the soft limit the Collector refuses incoming data to apply backpressure, and above the hard limit it forces garbage collection.

k8sattributes — This is what makes Kubernetes observability actually useful. Without it, every log is just text with no context about which pod, deployment, or namespace generated it. With it, you automatically get k8s.pod.name, k8s.namespace.name, k8s.deployment.name, and more attached to every span, metric, and log line.

batch — Batching reduces the number of export requests and dramatically improves throughput to your backend. Don't skip it.

Step 4: Deploy the Gateway

The gateway receives aggregated telemetry from all agents, applies cluster-level processing, and exports to your backend:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-gateway
  namespace: observability
spec:
  mode: deployment
  replicas: 2
  serviceAccount: otel-collector

  resources:
    limits:
      memory: 1Gi
      cpu: 500m
    requests:
      memory: 256Mi
      cpu: 100m

  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318

    processors:
      memory_limiter:
        check_interval: 5s
        limit_percentage: 80
        spike_limit_percentage: 25

      batch:
        send_batch_size: 5000
        timeout: 10s
        send_batch_max_size: 10000

      # Optional: sampling at the gateway. Note that probabilistic_sampler is
      # head-based; for tail-based sampling use the tail_sampling processor.
      # probabilistic_sampler:
      #   sampling_percentage: 20

    exporters:
      otlphttp:
        endpoint: https://YOUR_BACKEND_ENDPOINT
        headers:
          Authorization: "Bearer ${BACKEND_API_KEY}"
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_interval: 30s
          max_elapsed_time: 300s
        sending_queue:
          enabled: true
          num_consumers: 10
          queue_size: 1000

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlphttp]
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlphttp]
        logs:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlphttp]

Note the retry_on_failure and sending_queue configuration on the exporter. These are critical for production—without them, any blip in backend availability causes telemetry loss. The sending queue buffers data in memory while retrying, giving you a window of resilience.
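If an in-memory window isn't enough—say, your backend can be down longer than the queue can hold—the Collector's file_storage extension can back the sending queue with disk so buffered data survives restarts. A sketch (the directory path is an assumption; mount a volume there):

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/queue  # assumed path; back it with a persistent volume

exporters:
  otlphttp:
    endpoint: https://YOUR_BACKEND_ENDPOINT
    sending_queue:
      enabled: true
      storage: file_storage  # queued batches persist across Collector restarts

service:
  extensions: [file_storage]  # the extension must be enabled here to take effect
```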

Step 5: Auto-Instrument Your Applications

Once the Collector infrastructure is in place, you can auto-instrument applications without changing their code using the Instrumentation custom resource:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: otel-instrumentation
  namespace: my-app
spec:
  exporter:
    endpoint: http://otel-agent-collector.observability.svc.cluster.local:4318

  propagators:
    - tracecontext
    - baggage
    - b3

  sampler:
    type: parentbased_traceidratio
    argument: "0.1"   # Sample 10% of traces in production

  # pin image versions in production instead of :latest
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  dotnet:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest
  go:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-go:latest

Then annotate your deployments to opt in:

annotations:
  instrumentation.opentelemetry.io/inject-java: "my-app/otel-instrumentation"
  # or for Python:
  instrumentation.opentelemetry.io/inject-python: "my-app/otel-instrumentation"

The Operator automatically injects the appropriate SDK as an init container, sets the required environment variables, and wires everything to the local agent Collector. Zero code changes required.
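A common gotcha: the annotation must go on the pod template, not on the Deployment's own metadata, or the Operator's admission webhook will never see it. A minimal sketch (all names here are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # lives on the pod template so the webhook injects on pod creation
        instrumentation.opentelemetry.io/inject-java: "my-app/otel-instrumentation"
    spec:
      containers:
        - name: my-app
          image: my-app:1.0.0
```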

Common Production Mistakes

1. Skipping the memory limiter

The most common cause of OTel Collector crashes in production. Add memory_limiter as the first processor in every pipeline, always.

2. Running a single Collector as a Deployment without the DaemonSet

This forces all pod telemetry to traverse the network to reach a central Collector, adds latency, and means the Collector can't efficiently tail logs from /var/log/pods. Use the two-tier architecture.

3. Not setting resource limits

Collectors without resource limits can consume unbounded memory during traffic spikes and get OOM-killed, causing data loss and node pressure. Set both requests and limits.

4. Expecting k8sattributes to enrich host metrics

Metrics from the hostmetrics receiver are node-level and don't originate from any pod, so k8sattributes can't attach pod or namespace context to them. Keep resourcedetection in the metrics pipeline so they at least carry k8s.node.name, and collect per-pod resource metrics separately if you need to correlate CPU spikes with specific workloads.
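For per-pod resource metrics, the kubeletstats receiver (added to the agent's config) scrapes the kubelet on each node. A sketch, assuming the RBAC from Step 2 (which already grants nodes/proxy):

```yaml
receivers:
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount          # reuses the Collector's service account token
    endpoint: https://${env:KUBE_NODE_NAME}:10250
    insecure_skip_verify: true         # kubelet serving certs are often self-signed
    metric_groups: [node, pod, container]
```

Add kubeletstats to the agent's metrics pipeline alongside hostmetrics; its pod and container metrics carry pod identity natively.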

5. Sampling at the agent instead of the gateway

Tail-based sampling (where you decide to keep a trace after seeing all its spans) must happen at a single point—typically the gateway—because spans from a single trace arrive from different pods on different nodes. Agent-level probabilistic sampling works, but tail-based sampling requires the gateway.

6. Ignoring backpressure

If your backend slows down or becomes unavailable, your Collectors need somewhere to buffer. Configure sending_queue and retry_on_failure on all exporters, and size the queue appropriately for your expected traffic burst duration.

Sending Your Telemetry to Qorrelate

Once your OTel pipeline is running, you need a backend that can actually handle the data at scale—fast enough to be useful during an incident, cheap enough that you're not being punished for storing history.

Qorrelate is purpose-built for exactly this. We're a full-stack observability platform built on ClickHouse—the columnar database that ingests and queries telemetry orders of magnitude faster than Elasticsearch-based stacks, at a fraction of the storage cost. We natively ingest OTLP over HTTP, so your OTel Collector gateway just needs one config change:

exporters:
  otlphttp:
    endpoint: https://ingest.qorrelate.io
    headers:
      Authorization: "Bearer YOUR_QORRELATE_API_KEY"

That's it. Your logs, metrics, and traces will start appearing in Qorrelate within seconds.

Beyond the basics, Qorrelate layers capabilities on top of your OTel data that you won't find elsewhere, from multi-signal alerting to one-click log-to-trace correlation.

Start Free

Try the interactive sandbox with live demo data, or sign up free — 5 GB logs, 500K trace spans, and 5K active unique time series per month, no credit card required.

What's Next

With the two-tier Collector architecture in place and your applications instrumented, you have a solid foundation. Here's what to tackle next:

Tail-based sampling — Once you're generating significant trace volume, head-based sampling (where you decide at trace start whether to keep it) loses valuable data. Add the tailsampling processor to your gateway to make smarter decisions based on error rates, latency outliers, and span count.
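A starting point for the gateway's tail_sampling processor, keeping every errored trace and 10% of the rest (policy names, wait time, and percentages here are illustrative, not prescriptive):

```yaml
processors:
  tail_sampling:
    decision_wait: 10s        # hold spans this long so the whole trace arrives
    num_traces: 50000         # traces kept in memory awaiting a decision
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: sample-the-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
```

Because decisions need all spans of a trace, this processor belongs only on the gateway, never on the per-node agents.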

Cost controls — Add filter processors to drop high-cardinality, low-value metrics before they reach your backend. Kubernetes emits thousands of metrics per node; most of them you'll never look at.
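For example, a filter processor on the gateway can drop metrics you never query before they incur ingest cost. A sketch using OTTL conditions (the metric name and namespace are illustrative):

```yaml
processors:
  filter/drop-noise:
    error_mode: ignore
    metrics:
      metric:
        # a data point matching ANY condition is dropped before export
        - 'name == "k8s.pod.network.io"'
        - 'resource.attributes["k8s.namespace.name"] == "kube-system"'
```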

Alerting — Pair your observability data with alert rules that page on signal, not noise. Qorrelate's workflow automation lets you set up multi-signal alerts that trigger only when both an error rate spikes and latency degrades—dramatically reducing alert fatigue.

Log-to-trace correlation — Ensure your application logs include trace_id and span_id in their structured output. Most OTel SDKs inject these automatically. Qorrelate uses them to let you click from a log line directly to the parent trace—the single most useful feature for debugging production issues.


Have questions about your OTel setup? We're happy to help — reach out at support@qorrelate.io or join our Slack community.

Ready to see your OTel data in action?

Get logs, metrics, and traces flowing into Qorrelate in under 5 minutes. No credit card required.
