eBPF for Backend Engineers — Zero-Instrumentation Observability

Introduction

eBPF (extended Berkeley Packet Filter) runs sandboxed programs in the Linux kernel. For observability, eBPF intercepts syscalls, network packets, and kernel events without instrumenting your code. This post covers eBPF concepts, Cilium for networking, Hubble for service flows, and continuous profiling.

What eBPF Is (Kernel Programs Without Kernel Modules)

eBPF programs:

  • Run in kernel (privileged context)
  • Are sandboxed (can't crash the kernel)
  • Hook into kernel events (syscalls, network packets, function calls)
  • Are verified before loading (safe to run)

Unlike kernel modules, eBPF programs don't require:

  • Recompiling the kernel
  • Rebooting the system
  • Kernel version matching

Example: observe TCP traffic without touching application code. The tc commands below attach a compiled eBPF object (tcptrack.o, a program you supply) to the interface's ingress hook:

# Attach an eBPF program to eth0's ingress path
# No code instrumentation, no app restart
tc qdisc add dev eth0 ingress
tc filter add dev eth0 ingress bpf direct-action object-file tcptrack.o section trace_connect
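
Once a program is attached, you can confirm the kernel's verifier accepted and loaded it. A quick sanity check (assumes the tc attachment above and bpftool from your distribution's kernel tools package; both need root):

```shell
# Show the eBPF filter attached to the ingress hook
tc filter show dev eth0 ingress

# List every eBPF program currently loaded in the kernel
bpftool prog show
```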

Cilium for Kubernetes Network Observability

Cilium uses eBPF to replace iptables and observe all network flows in Kubernetes. It provides:

  • Network policy enforcement (no iptables needed)
  • Service load balancing (faster than kube-proxy)
  • Network observability (every packet is visible)

# Deploy Cilium with Hubble via Helm (the Cilium datapath is eBPF by default)
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true
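
Before trusting the flow data, verify the datapath is healthy. A minimal check, assuming kubectl access and the cilium CLI installed locally:

```shell
# Wait for the Cilium agents to roll out on every node
kubectl -n kube-system rollout status daemonset/cilium

# Summary of agent, operator, and Hubble health
cilium status --wait
```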

Cilium monitors:

  • Pod-to-pod traffic
  • Pod-to-external traffic
  • DNS queries
  • HTTP requests (L7 visibility)
  • Policy verdicts (forwarded, dropped, and denied connections)
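
Policy enforcement runs through the same eBPF datapath. As an illustration (pod labels here are hypothetical), a policy allowing only api-server pods to reach Postgres on port 5432 might look like:

```yaml
# db-allow-api.yaml - only api-server pods may reach Postgres
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: db-allow-api
spec:
  endpointSelector:
    matchLabels:
      app: db-postgres
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: api-server
      toPorts:
        - ports:
            - port: "5432"
              protocol: TCP
```

Traffic denied by this policy shows up in Hubble with a DROPPED verdict, which is what makes policy debugging tractable.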

Hubble for Service-to-Service Flow Visibility

Hubble, built on top of Cilium, visualizes service-to-service flows and exports them to observability backends such as Prometheus or Elasticsearch.

# Observe traffic between services with the Hubble CLI
# (Hubble Relay was enabled via the Cilium Helm values above)
cilium hubble port-forward &

# View traffic in real time, filtered by pod label
hubble observe --label app=api-server --output json

# Abbreviated example of a flow record (fields trimmed for readability):
# {
#   "time": "2026-03-15T10:30:00Z",
#   "source": {"namespace": "default", "pod_name": "api-server-1"},
#   "destination": {"namespace": "default", "pod_name": "db-postgres-1"},
#   "l4": {"TCP": {"destination_port": 5432}},
#   "verdict": "FORWARDED"
# }

Export to Prometheus:

# hubble-prometheus.yaml - Scrape Hubble metrics
# Requires metrics to be enabled in the Cilium chart, e.g.:
#   --set hubble.metrics.enabled="{dns,drop,tcp,flow,http}"
apiVersion: v1
kind: ConfigMap
metadata:
  name: hubble-prometheus-config
data:
  prometheus.yml: |
    scrape_configs:
      - job_name: 'hubble-metrics'
        static_configs:
          # Each Cilium agent serves Hubble metrics on port 9965
          - targets: ['hubble-metrics.kube-system.svc.cluster.local:9965']

Continuous Profiling with Parca/Pyroscope

Always-on profiling captures CPU usage at function granularity. eBPF enables zero-instrumentation profiling.

# parca-deployment.yaml - Deploy Parca for continuous profiling
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: parca-agent
spec:
  selector:
    matchLabels:
      app: parca-agent
  template:
    metadata:
      labels:
        app: parca-agent
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: parca-agent
        image: ghcr.io/parca-dev/parca-agent:latest  # pin a released version in production
        args:
          # CPU profiling is on by default; the agent pushes profiles to a Parca server
          # (the address below assumes a `parca` Service in a `parca` namespace)
          - "--node=$(NODE_NAME)"
          - "--remote-store-address=parca.parca.svc.cluster.local:7070"
          - "--remote-store-insecure"
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
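
The agent needs a Parca server to push profiles to. A minimal, non-HA sketch of the server side, assuming the namespace and Service name used in the agent's remote-store address above:

```yaml
# parca-server.yaml - single-replica Parca server with default storage
apiVersion: apps/v1
kind: Deployment
metadata:
  name: parca
  namespace: parca
spec:
  replicas: 1
  selector:
    matchLabels:
      app: parca
  template:
    metadata:
      labels:
        app: parca
    spec:
      containers:
      - name: parca
        image: ghcr.io/parca-dev/parca:latest  # pin a released version in production
        ports:
        - containerPort: 7070  # gRPC ingest + web UI
---
apiVersion: v1
kind: Service
metadata:
  name: parca
  namespace: parca
spec:
  selector:
    app: parca
  ports:
  - port: 7070
    targetPort: 7070
```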

Query Parca for bottlenecks through its web UI (served on the server's port 7070 by default): pick a profile type such as CPU samples, set a time range (e.g. the last two hours), and read the flame graph for the heaviest call paths. To validate an optimization, use the UI's compare mode with one time range from before the change and one from after; regressions and wins show up directly in the diffed flame graph.

bpftrace for Ad-Hoc Investigation

bpftrace is a high-level tracing language. Write one-liners to investigate system behavior.
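
Before writing a one-liner, discover which probes your kernel actually exposes; bpftrace -l accepts the same wildcard syntax as the probes themselves:

```shell
# List matching syscall tracepoints
bpftrace -l 'tracepoint:syscalls:sys_enter_open*'

# List kernel functions attachable as kprobes
bpftrace -l 'kprobe:tcp_*' | head
```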

# Trace all file opens by nginx
bpftrace -e 'tracepoint:syscalls:sys_enter_open* /comm == "nginx"/ { printf("%s %s\n", comm, str(args->filename)); }'

# Trace slow syscalls (>10ms), counted per process
bpftrace -e '
  tracepoint:raw_syscalls:sys_enter { @start[tid] = nsecs; }
  tracepoint:raw_syscalls:sys_exit /@start[tid]/ {
    if (nsecs - @start[tid] > 10000000) { @slow[comm] = count(); }
    delete(@start[tid]);
  }
'

# Histogram of kernel memory allocation sizes (stable kmem tracepoint)
bpftrace -e 'tracepoint:kmem:kmalloc { @alloc_bytes = hist(args->bytes_req); }'

# Per-process bytes read from TCP sockets (a rough proxy for request traffic;
# kprobes expose function arguments as arg0..argN)
bpftrace -e '
  kprobe:tcp_cleanup_rbuf /arg1 > 0/ { @rx_bytes[comm] = sum(arg1); }
'

TCP Retransmit Tracing

High TCP retransmits indicate network problems. Use eBPF to detect and locate them.

# Monitor TCP retransmits in real time (tcp tracepoints exist since kernel 4.15)
bpftrace -e '
  tracepoint:tcp:tcp_retransmit_skb {
    printf("Retransmit: %s:%d -> %s:%d\n",
      ntop(args->saddr), args->sport,
      ntop(args->daddr), args->dport);
  }
'

# Export to Prometheus
# (bpftrace output → custom exporter → Prometheus)
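
One lightweight way to wire that pipeline together without writing a full exporter is node_exporter's textfile collector. A sketch, assuming node_exporter runs with --collector.textfile.directory=/var/lib/node_exporter (path and metric name are illustrative):

```shell
# Emit a retransmit count every 10s and rewrite a .prom file node_exporter scrapes
bpftrace -e '
  tracepoint:tcp:tcp_retransmit_skb { @n = count(); }
  interval:s:10 { print(@n); clear(@n); }
' | while read -r line; do
  case "$line" in
    # bpftrace prints the map as "@n: <count>"; strip the prefix
    @n:*) echo "tcp_retransmits_per_10s ${line#@n: }" \
            > /var/lib/node_exporter/tcp_retrans.prom ;;
  esac
done
```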

Latency Profiling at Syscall Level

Identify bottlenecks by measuring time spent in syscalls.

# Per-syscall latency histograms, keyed by syscall number
bpftrace -e '
  tracepoint:raw_syscalls:sys_enter {
    @syscall_start[tid] = nsecs;
  }
  tracepoint:raw_syscalls:sys_exit /@syscall_start[tid]/ {
    @latency_ns[args->id] = hist(nsecs - @syscall_start[tid]);
    delete(@syscall_start[tid]);
  }
'

# Example output (syscall 232 is epoll_wait on x86-64; buckets are nanoseconds):
# @latency_ns[232]:
# [512, 1K)   10234 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
# [1K, 2K)     5432 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                         |
# [2K, 4K)      123 |@                                                   |
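
Mapping the histogram's raw syscall numbers back to names is a one-liner. The header path below is the Debian/Ubuntu x86-64 location and may differ on your distribution; ausyscall ships with the audit userspace tools:

```shell
# Translate syscall number 232 to its name (expects epoll_wait on x86-64)
ausyscall 232

# Or grep the kernel UAPI header directly
grep -w 232 /usr/include/x86_64-linux-gnu/asm/unistd_64.h
```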

eBPF vs Sidecar Overhead Comparison

Approach             | CPU overhead | Memory  | Latency impact | Deployment
---------------------|--------------|---------|----------------|-------------------------------------
eBPF                 | 2-5%         | ~100 MB | < 1 µs         | In-kernel programs (agent DaemonSet)
Sidecar (Envoy)      | 10-20%       | 1 GB+   | 5-10 µs        | Container per pod
Code instrumentation | 5-15%        | ~200 MB | 2-5 µs         | Recompile/redeploy the app
eBPF + sidecar       | 15-25%       | 1.1 GB+ | 10 µs+         | Both

eBPF wins on efficiency; sidecars win on control.

Checklist

  • Deploy Cilium for network observability in Kubernetes
  • Use Hubble to visualize service flows
  • Set up Parca for always-on CPU/memory profiling
  • Write bpftrace one-liners for quick investigations
  • Monitor TCP retransmits as a network health indicator
  • Profile syscall latency to find bottlenecks
  • Compare eBPF vs instrumentation (eBPF usually cheaper)
  • Test eBPF programs in staging before production
  • Monitor Cilium CPU overhead
  • Export Hubble/Parca data to long-term storage

Conclusion

eBPF provides observability without instrumenting code. Cilium and Hubble visualize network flows. Parca profiles CPU and memory continuously. For ad-hoc diagnosis of production issues, bpftrace is hard to beat. Start with Cilium + Hubble for network visibility; add Parca for continuous profiling; reach for bpftrace during investigations.