eBPF Thread Profiler

A kernel-level thread profiler for Python and Java applications, built on eBPF syscall tracing.


The Problem

Debugging thread-level performance issues in production is hard. Traditional profilers add overhead and often miss the real bottlenecks. When your servers are slow under load and you don’t know whether the cause is lock contention, I/O blocking, or something else, you need visibility at the kernel level.

How It Works

The profiler traces syscalls at the kernel level using eBPF. For each thread, it tracks request lifecycles, measures lock wait times, and identifies I/O blocking, giving you X-ray vision into your application’s threading behavior. No code changes are required: the eBPF program runs in the kernel and attaches to your process from outside.
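The per-thread measurement described above can be sketched as pairing each syscall entry with its exit and summing the elapsed time by category. This is a minimal userspace model in plain Python, not the profiler's actual code: in an eBPF tool this pairing is commonly done in kernel-side maps keyed by thread ID. The event tuple layout, the names `LOCK_SYSCALLS`, `IO_SYSCALLS`, `classify`, `aggregate`, and `SAMPLE_EVENTS`, and the choice of which syscalls count as lock waits or I/O are all assumptions made for illustration. (On Linux, contended userspace locks generally end up in the `futex` syscall, which is why it stands in for lock waits here.)

```python
from collections import defaultdict

# Assumed classification; the profiler's real syscall list is not given in the source.
LOCK_SYSCALLS = {"futex"}
IO_SYSCALLS = {"read", "write", "recvfrom", "sendto", "poll", "epoll_wait"}


def classify(syscall):
    """Map a syscall name to a coarse category (hypothetical categories)."""
    if syscall in LOCK_SYSCALLS:
        return "lock_wait"
    if syscall in IO_SYSCALLS:
        return "io_blocking"
    return "other"


def aggregate(events):
    """Pair enter/exit events per thread and sum time spent per category.

    events: iterable of (timestamp_ns, tid, phase, syscall),
            where phase is "enter" or "exit".
    Returns {tid: {category: total_ns}}.
    """
    in_flight = {}  # tid -> (syscall, enter_ts); a thread is in at most one syscall
    totals = defaultdict(lambda: defaultdict(int))
    for ts, tid, phase, syscall in sorted(events):
        if phase == "enter":
            in_flight[tid] = (syscall, ts)
        elif phase == "exit":
            started = in_flight.pop(tid, None)
            # Skip exits with no matching entry (e.g. tracing began mid-syscall).
            if started is None or started[0] != syscall:
                continue
            totals[tid][classify(syscall)] += ts - started[1]
    return {tid: dict(cats) for tid, cats in totals.items()}


# Hypothetical trace: two threads of one process, timestamps in nanoseconds.
SAMPLE_EVENTS = [
    (1_000, 101, "enter", "futex"),
    (1_500, 102, "enter", "read"),
    (4_000, 101, "exit", "futex"),
    (9_500, 102, "exit", "read"),
    (10_000, 101, "enter", "write"),
    (10_200, 101, "exit", "write"),
]

if __name__ == "__main__":
    for tid, cats in sorted(aggregate(SAMPLE_EVENTS).items()):
        print(tid, cats)
```

Keying the in-flight table by thread ID works because a thread can be inside only one syscall at a time, so a single slot per thread is enough to match each exit to its entry.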

Built and deployed in production at Clickpost, where it helped identify performance bottlenecks and scale servers while reducing cloud costs.

Features

  • zero code changes: attaches to running processes from outside
  • kernel-level tracing: syscall-level visibility, not sampling
  • lock wait time: measures exactly how long threads wait on locks
  • I/O blocking: identifies which I/O calls are blocking threads
  • minimal overhead: eBPF runs in the kernel, not in your app
  • Python & Java: works with both runtimes
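To illustrate why exact tracing and sampling can report different lock wait times, the sketch below compares the two on the same set of wait intervals. It is a toy model under stated assumptions, not a benchmark of this or any other profiler: the functions `traced_wait_ns` and `sampled_wait_ns`, the `WAITS` data, and the 500 µs sampling period are all hypothetical.

```python
def traced_wait_ns(intervals):
    """Exact blocked time: sum of (exit - enter) over every wait interval."""
    return sum(end - start for start, end in intervals)


def sampled_wait_ns(intervals, period_ns, horizon_ns):
    """Estimate blocked time by checking the thread's state every period_ns.

    Each sample that lands inside a wait interval is credited one full period.
    """
    hits = 0
    for t in range(0, horizon_ns, period_ns):
        if any(start <= t < end for start, end in intervals):
            hits += 1
    return hits * period_ns


# Hypothetical lock waits of one thread, as (enter_ns, exit_ns) pairs.
WAITS = [(100_000, 400_000), (1_200_000, 1_900_000), (2_050_000, 2_950_000)]

if __name__ == "__main__":
    print("traced :", traced_wait_ns(WAITS))
    print("sampled:", sampled_wait_ns(WAITS, period_ns=500_000, horizon_ns=3_000_000))
```

With this data the traced total is 1,900,000 ns, while the sampler credits only the two samples that land inside a wait, giving 1,000,000 ns. The first wait falls entirely between samples and is missed.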

Stack

  • language: C, Python
  • technique: eBPF · bpftrace · BCC · syscall tracing


Copyright 2026 Deepanshu Kartikey, all rights reserved — kartikey406@gmail.com