eBPF · Causal Graphs · Deterministic Replay
Kernel-level visibility
for financial systems.
kprobe attaches to the Linux kernel using eBPF and captures what application-layer tools miss: CPU scheduling decisions, memory pressure events, and network packet timing, all without touching a single line of your code.
No instrumentation required
Attaches directly to kernel tracepoints. No library imports, no
redeployment, no code changes ever.
Full causal chain to kernel level
Not just what failed but exactly why, down to the kernel event
that triggered it, across every service.
Deterministic incident replay
Reproduce any production incident on a dev machine hours after
it happened. Test fixes before shipping.
live · kernel event stream
03:47:12.004 tcp_sendmsg PID 2841 payment-handler → settlement-svc · 1.2kb
03:47:12.005 sys_write PID 2841 fd=7 · ledger write initiated
03:47:12.821 mm_page_fault PID 4721 addr=0x7f3a · batch-job memory pressure
03:47:12.822 sched_switch PID 2841 preempted · CPU 3 → PID 4721
03:47:13.210 sched_switch PID 2841 resumed · CPU 3 · 388ms delayed
03:47:13.621 sys_write PID 2841 fd=7 · completed · 802ms total
03:47:13.622 tcp_recvmsg PID 2841 timeout — payment-handler · ERR
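To make the stream concrete, here is a minimal sketch of how the 388ms scheduling delay could be derived from the two sched_switch lines above. The line format is taken from the sample stream only; the helper names `parse` and `preemption_delay_ms` are hypothetical and are not part of kprobe.

```python
from datetime import datetime

# The two sched_switch lines, copied from the sample stream above.
LINES = [
    "03:47:12.822 sched_switch PID 2841 preempted · CPU 3 → PID 4721",
    "03:47:13.210 sched_switch PID 2841 resumed · CPU 3 · 388ms delayed",
]


def parse(line):
    """Split a stream line into (timestamp, event name, pid, detail)."""
    ts, event, _, pid, detail = line.split(" ", 4)
    return datetime.strptime(ts, "%H:%M:%S.%f"), event, int(pid), detail


def preemption_delay_ms(lines, pid):
    """Milliseconds between a PID being preempted and being resumed."""
    preempted_at = None
    for line in lines:
        ts, event, event_pid, detail = parse(line)
        if event != "sched_switch" or event_pid != pid:
            continue
        if detail.startswith("preempted"):
            preempted_at = ts
        elif detail.startswith("resumed") and preempted_at is not None:
            return round((ts - preempted_at).total_seconds() * 1000)
    return None  # no complete preempt/resume pair found for this PID


print(preemption_delay_ms(LINES, 2841))
```

The result matches the "388ms delayed" annotation on the resume line, which is the gap the causal trace attributes to batch-job memory pressure.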
capabilities
Zero-instrumentation capture
An eBPF probe attaches to kernel tracepoints as a Kubernetes
DaemonSet. No library imports, no code changes, no redeployment.
Capture starts the moment the probe deploys.
tcp_sendmsg · tcp_recvmsg · sched_switch · sys_write · mm_page_fault
Causal graph engine
Correlates kernel events with OTel traces on PID + timestamp.
Builds a directed causal graph in Neo4j, traversable from any
financial event to its root kernel cause in milliseconds.
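As an illustration of correlating on PID + timestamp, the sketch below attaches each kernel event to the trace span that has the same PID and whose time window contains the event. It is a simplified stand-in, not kprobe's engine: the `KernelEvent` and `Span` types, their field names, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class KernelEvent:
    ts_ns: int   # event timestamp in nanoseconds (hypothetical field)
    pid: int
    name: str


@dataclass
class Span:
    trace_id: str  # hypothetical identifier for an application trace span
    pid: int       # PID of the process that emitted the span (assumed available)
    start_ns: int
    end_ns: int


def correlate(events, spans):
    """Map each span's trace_id to the kernel events that share its PID
    and fall inside its [start_ns, end_ns] window."""
    matched = {}
    for span in spans:
        matched[span.trace_id] = [
            e.name
            for e in events
            if e.pid == span.pid and span.start_ns <= e.ts_ns <= span.end_ns
        ]
    return matched


# Illustrative data only; values are made up for the example.
events = [
    KernelEvent(ts_ns=1_000_000, pid=2841, name="sys_write"),
    KernelEvent(ts_ns=817_000_000, pid=4721, name="mm_page_fault"),
    KernelEvent(ts_ns=818_000_000, pid=2841, name="sched_switch"),
]
spans = [Span(trace_id="payment-98721", pid=2841, start_ns=0, end_ns=2_000_000_000)]

result = correlate(events, spans)
print(result)
```

A PID-and-window join like this only links events within one process. Connecting the page fault in PID 4721 to the preemption of PID 2841 needs the cross-process edges that the causal graph adds.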
Deterministic replay
Reproduce any production incident exactly on a dev machine via
ptrace. Inject timing changes, test fixes against the real event log
before shipping.
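The idea of injecting a timing change can be shown with the figures from the example incident: an 802ms ledger write, a 388ms scheduling delay, and a 750ms timeout. The sketch below is arithmetic on those recorded numbers only. It does not use ptrace and is not how kprobe's replay engine works; the function names are hypothetical.

```python
TIMEOUT_MS = 750  # payment timeout, taken from the causal trace


def replayed_write_ms(recorded_total_ms, recorded_delay_ms, injected_delay_ms):
    """Swap the recorded scheduling delay for an injected one."""
    return recorded_total_ms - recorded_delay_ms + injected_delay_ms


def times_out(duration_ms, timeout_ms=TIMEOUT_MS):
    """True when the write would exceed the payment timeout."""
    return duration_ms > timeout_ms


# As recorded: the 388ms preemption stays in place.
as_recorded = replayed_write_ms(802, 388, 388)
# What-if: the batch job never preempts the payment handler.
no_preemption = replayed_write_ms(802, 388, 0)

print(as_recorded, times_out(as_recorded))
print(no_preemption, times_out(no_preemption))
```

With the delay removed the write takes 414ms and stays under the timeout, which is the kind of check a candidate fix (for example, isolating the batch job from CPU 3) would need to pass against the recorded log.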
Nanosecond precision
Every event timestamped at nanosecond resolution. Timeline view
zoomable to microsecond level across all services simultaneously.
Financial domain primitives
Settlement boundaries, clearing windows, ledger writes, order book
ops — kprobe maps kernel events to your financial domain natively.
causal trace
Payment #98721 received
Risk check passed · 0.4ms
Settlement write initiated · 1.2ms
Memory pressure (kernel)
Settlement write completed · 800ms
Payment failed · timeout exceeded 750ms · +52ms
5 min · median investigation time with kprobe
4 hrs · median investigation time without it
0 · lines of application code changed
kprobe was recording the entire time. The causal graph was ready before anyone woke up.
coverage
What no other tool sees.
| Signal | Datadog | Jaeger | OpenTelemetry | kprobe |
|---|---|---|---|---|
| Distributed traces | Yes | Yes | Instrumented only | Zero instrumentation |
| Database query timing | Partial | Partial | Partial | Yes |
| CPU scheduling decisions | — | — | — | Yes |
| Memory pressure events | — | — | — | Yes |
| Network packet-level timing | — | — | — | Yes |
| Cross-process causal chain | — | — | — | Yes |
| Root cause to kernel level | — | — | — | Yes |
| Deterministic incident replay | — | — | — | Yes |
One command into any cluster.
Requires Kubernetes 1.26+ and Linux kernel 5.15+. No changes to existing services.
$ helm repo add kprobe https://charts.kprobe.io
$ helm install kprobe kprobe/kprobe --namespace monitoring --create-namespace
built on
Every technical decision was deliberate.
Rust + Aya
Kernel-side eBPF programs and userspace loader. Compiled to eBPF
bytecode. No C anywhere — memory-safe from the kernel up.
Go
Causal engine, replay engine, gRPC API server. Syscall
interception for deterministic replay via ptrace.
Apache Kafka
KRaft mode, topic-per-event-type. Handles millions of kernel
events per second. Durable and replayable, no ZooKeeper.
ClickHouse
Columnar storage for billions of timestamped kernel events.
Sub-second analytical queries at scale.
Neo4j
Graph database for causal relationships. Cypher queries traverse
from any financial event to root kernel cause in milliseconds.
Kubernetes
eBPF probe deployed as a DaemonSet across all nodes. Single Helm
install deploys the full stack into any existing cluster.
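The traversal from a financial event to its root kernel cause, described under Neo4j above, can be sketched without a database. The example below uses a plain in-memory dictionary as a stand-in for the graph store, not Cypher or the Neo4j driver. The node labels are hypothetical names based on the sample causal trace.

```python
# effect -> cause edges; labels are illustrative, derived from the sample incident.
CAUSED_BY = {
    "payment_failed": "settlement_write_slow",
    "settlement_write_slow": "sched_switch_preempted",
    "sched_switch_preempted": "mm_page_fault",
}


def causal_path(graph, node):
    """Follow cause edges from an event until a node with no recorded cause."""
    path = [node]
    seen = {node}
    while path[-1] in graph:
        cause = graph[path[-1]]
        if cause in seen:  # guard against cycles in malformed data
            break
        path.append(cause)
        seen.add(cause)
    return path


path = causal_path(CAUSED_BY, "payment_failed")
print(" <- ".join(path))
```

The last element of the path is the root cause candidate. In a graph database the same walk would typically be a variable-length path query from the financial event node.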