Hooking SslTransportLayer via ByteBuddy
Companion piece to "Decrypting Kafka TLS without a proxy". Same POC, this time with the code that made it work and the two missteps that ate the most hours.
Choosing where to hook
Three places make sense for intercepting Kafka client TLS:
| Hook point | What you see | What you don't |
|---|---|---|
| SSLEngine.wrap / unwrap (JDK) | All TLS traffic from any consumer of the engine | Application-level framing context |
| Socket layer (SocketChannel.read/write) | Encrypted bytes | Plaintext |
| org.apache.kafka.common.network.SslTransportLayer | Plaintext Kafka wire bytes, scoped to the Kafka client | Anything outside the Kafka client |
We went with SslTransportLayer. Narrowest cut: Kafka traffic only,
plaintext already, one class to instrument, stable surface across kafka-clients since
2.x. Hooking SSLEngine would have caught every TLS user in the JVM (JMX
console, Schema Registry HTTP client, anything else). More noise, slower path to a
useful capture.
What ByteBuddy is doing
ByteBuddy is a bytecode rewriter. Paired with the JDK's Java Instrumentation API, it can redefine classes at load time or retransform classes already loaded. We use it to insert entry and exit advice into the methods we care about, no source changes, no recompile.
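For orientation, here is roughly what the raw Instrumentation API looks like underneath ByteBuddy. This is a minimal sketch, not part of the agent: `LoadWatcher` is a hypothetical name, and it only counts sightings of the target class without rewriting anything.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical observer: counts sightings of the target class, rewrites nothing. */
final class LoadWatcher implements ClassFileTransformer {
    // The JVM hands transformers class names in internal form (slashes, not dots).
    static final String TARGET = "org/apache/kafka/common/network/SslTransportLayer";

    final AtomicInteger sightings = new AtomicInteger();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className can be null for some classes; equals() on the constant handles that.
        if (TARGET.equals(className)) {
            sightings.incrementAndGet();
        }
        return null; // null tells the JVM to keep the original class bytes
    }
}
```

Registered from premain with `inst.addTransformer(new LoadWatcher(), true)`, this would see the class at load time and again on retransformation. ByteBuddy generates the rewritten bytes that a transformer like this would otherwise have to build by hand.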
The agent boils down to one AgentBuilder call wired from
premain:
public static void premain(String args, Instrumentation inst) {
    TapPublisher.start(); // boots the UDS writer thread
    new AgentBuilder.Default()
        .disableClassFormatChanges()
        .with(AgentBuilder.RedefinitionStrategy.RETRANSFORMATION)
        .with(AgentBuilder.InitializationStrategy.NoOp.INSTANCE)
        .with(AgentBuilder.TypeStrategy.Default.REDEFINE)
        // one combined ignore matcher covering both prefixes
        .ignore(nameStartsWith("net.bytebuddy.")
            .or(nameStartsWith("io.kapture.tap.")))
        .type(named("org.apache.kafka.common.network.SslTransportLayer"))
        .transform((builder, type, cl, module, pd) ->
            builder
                .visit(Advice.to(ReadAdvice.class)
                    .on(named("read").and(takesArguments(ByteBuffer.class))))
                .visit(Advice.to(WriteAdvice.class)
                    .on(named("write")
                        .and(takesArgument(0, ByteBuffer[].class))
                        .and(takesArguments(3))))
        )
        .installOn(inst);
}
ReadAdvice grabs the buffer after read returns.
WriteAdvice grabs the buffer array before write runs. Both
push bytes onto a bounded queue that a dedicated writer thread drains to a Unix domain
socket.
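The queue-and-drain shape described above can be sketched with JDK classes only. This is an illustration under assumptions, not the agent's TapPublisher: `TapQueueSketch` is a hypothetical name, the 4-byte length prefix is an assumed framing, and the sink is any `WritableByteChannel` rather than a real Unix domain socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for the capture queue; framing is an assumption. */
final class TapQueueSketch {
    private final BlockingQueue<byte[]> frames;

    TapQueueSketch(int capacity) {
        this.frames = new ArrayBlockingQueue<>(capacity);
    }

    /** Called from advice: never blocks; returns false when the frame is dropped. */
    boolean offer(byte[] payload) {
        return frames.offer(payload);
    }

    /** Writes each queued frame as [4-byte length][payload]; returns frames written. */
    int drainTo(WritableByteChannel sink) {
        int written = 0;
        byte[] frame;
        while ((frame = frames.poll()) != null) {
            ByteBuffer out = ByteBuffer.allocate(4 + frame.length);
            out.putInt(frame.length).put(frame).flip();
            try {
                while (out.hasRemaining()) {
                    sink.write(out); // a channel may accept fewer bytes than offered
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            written++;
        }
        return written;
    }
}
```

The point of `offer` over `put` is that a full queue drops the frame instead of stalling the Kafka client's I/O thread. In a real agent the drain loop would presumably run on a daemon thread, and on JDK 16+ the sink could be a `SocketChannel` opened with `StandardProtocolFamily.UNIX`.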
Trap one: which write overload do you actually hook
SslTransportLayer implements three write overloads, declared by
GatheringByteChannel and its parent WritableByteChannel:
public int write(ByteBuffer src)
public long write(ByteBuffer[] srcs)
public long write(ByteBuffer[] srcs, int offset, int length)
The Kafka client calls write(srcs, 0, length) from
KafkaChannel.write(). Internally, the (ByteBuffer[]) overload
delegates to the three-arg form. The single-buffer form is rarely called by Kafka client
code itself, but it fires inside the TLS wrap loop.
I hooked all three. The receiver dumped every frame two or three times in a row, identical byte content each time. The agent was firing on the public method, the delegated method, and the inner loop method. Same data, three captures. Took me longer than I'd like to admit to figure out why.
Fix: hook only the three-arg form, the one that actually moves bytes:
.on(named("write")
    .and(takesArgument(0, ByteBuffer[].class))
    .and(takesArguments(3)))
Before: 90 captured frames for a 10-message producer. After: 14. That 14 is correct: three handshake RPCs, then one Produce per message after batching settles.
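The duplication is easy to reproduce without Kafka or ByteBuddy. The toy below models the delegation chain as described above (array overload into three-arg overload into single-buffer loop). It is not the Kafka source, and `OverloadDemo` and its `Hook` enum are hypothetical names standing in for advice on each overload.

```java
import java.nio.ByteBuffer;
import java.util.Set;

/** Toy model of three delegating write overloads with a counter per hooked one. */
final class OverloadDemo {
    enum Hook { SINGLE, ARRAY, RANGE }

    private final Set<Hook> hooked;
    int captures; // how many times any installed hook fired

    OverloadDemo(Set<Hook> hooked) {
        this.hooked = hooked;
    }

    private void fire(Hook h) {
        if (hooked.contains(h)) captures++;
    }

    // Inner loop method: one call per buffer.
    int write(ByteBuffer src) {
        fire(Hook.SINGLE);
        int n = src.remaining();
        src.position(src.limit()); // pretend the bytes were wrapped and sent
        return n;
    }

    // Convenience overload: delegates to the three-arg form.
    long write(ByteBuffer[] srcs) {
        fire(Hook.ARRAY);
        return write(srcs, 0, srcs.length);
    }

    // Three-arg form: loops over the requested range of buffers.
    long write(ByteBuffer[] srcs, int offset, int length) {
        fire(Hook.RANGE);
        long total = 0;
        for (int i = offset; i < offset + length; i++) {
            total += write(srcs[i]);
        }
        return total;
    }

    /** Sends one logical frame through the array overload and counts captures. */
    static int capturesFor(Set<Hook> hooked) {
        OverloadDemo ch = new OverloadDemo(hooked);
        ch.write(new ByteBuffer[] { ByteBuffer.wrap(new byte[8]) });
        return ch.captures;
    }
}
```

With every overload hooked, one logical frame is counted once per layer it passes through. With only `RANGE` hooked it is counted once, which is the effect the narrowed matcher has.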
Trap two: ByteBuddy 1.14.x does not support Java 25
I was on a JDK 25 dev box. ByteBuddy 1.14.19, the latest stable at the time, officially
supports up to Java 23. The premain installed fine, our "installed" banner
printed, everything looked happy. The first time SslTransportLayer loaded,
ByteBuddy threw:
java.lang.IllegalArgumentException: Java 25 (69) is not supported by the current
version of Byte Buddy which officially supports Java 23 (67) - update Byte
Buddy or set io.kapture.tap.shaded.bytebuddy.experimental as a VM property
Silently, because no AgentBuilder.Listener was wired up. The class loaded
as if no instrumentation had been requested. Two hours of "why does my matcher
never fire" later, I bolted on a listener:
.with(new AgentBuilder.Listener.Adapter() {
    @Override
    public void onError(String typeName, ClassLoader cl,
                        JavaModule m, boolean loaded, Throwable th) {
        if (typeName.contains("TransportLayer")) {
            System.err.println("[agent] error on " + typeName + ": " + th);
        }
    }
})
First run told me exactly what was broken. Wire Listener.onError on every
AgentBuilder you ever write. Silent failure mode is the default and it will
eat your afternoon.
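Two ways out: move to a newer ByteBuddy line (1.15+) once a release officially lists Java 25 support, or set
-Dio.kapture.tap.shaded.bytebuddy.experimental=true until then. That property is the relocated form of net.bytebuddy.experimental, renamed because the agent shades ByteBuddy.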
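A cheap complement to the listener is a startup check that compares the running JVM against the newest release the bundled ByteBuddy claims to support. The sketch below is an assumption about how one might do that: `JvmPreflight` is a hypothetical helper, and the supported ceiling (23 here, taken from the error message above) has to be maintained by hand.

```java
/** Hypothetical preflight check for an agent's premain; JDK-only. */
final class JvmPreflight {
    /** Class file major version of a Java feature release: 23 -> 67, 25 -> 69. */
    static int classFileMajor(int featureRelease) {
        return featureRelease + 44;
    }

    /** Returns a warning line, or null when the JVM is within the supported range. */
    static String warningFor(int runningFeature, int maxSupportedFeature) {
        if (runningFeature <= maxSupportedFeature) {
            return null;
        }
        return "[agent] JVM is Java " + runningFeature
            + " (class file " + classFileMajor(runningFeature) + ")"
            + " but the bundled ByteBuddy supports up to Java " + maxSupportedFeature
            + " (class file " + classFileMajor(maxSupportedFeature) + ")";
    }

    /** Checks the JVM this code is running on. */
    static String check(int maxSupportedFeature) {
        return warningFor(Runtime.version().feature(), maxSupportedFeature);
    }
}
```

Printing the result of `JvmPreflight.check(23)` to stderr at the top of premain would surface the mismatch before any class is instrumented. It does not replace the listener, since it only covers this one failure mode.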
What the advice looks like
ReadAdvice runs at method exit and slices freshly-decrypted bytes out of
the destination buffer:
public class ReadAdvice {
    // Remember where the destination buffer stood before read() ran.
    @Advice.OnMethodEnter
    public static int enter(@Advice.Argument(0) ByteBuffer dst) {
        return dst == null ? -1 : dst.position();
    }

    // After read() returns, copy out only the bytes it appended.
    @Advice.OnMethodExit(suppress = Throwable.class)
    public static void exit(@Advice.This Object self,
                            @Advice.Argument(0) ByteBuffer dst,
                            @Advice.Enter int oldPos) {
        if (dst == null || oldPos < 0) return;
        int n = dst.position() - oldPos;
        if (n <= 0) return;
        byte[] payload = new byte[n];
        // Work on a duplicate so the caller's position and limit stay untouched.
        ByteBuffer dup = dst.duplicate();
        dup.position(oldPos);
        dup.limit(oldPos + n);
        dup.get(payload);
        TapPublisher.capture(self, (byte) 1, payload);
    }
}
Two non-negotiables. dst.duplicate() so the advice never touches the
original buffer's position. Corrupt that and the Kafka client starts reading
garbage. suppress = Throwable.class so a bug in observation code can never
bubble up into the client's hot path. The application must never see an exception
that came from us.
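The duplicate() guarantee can be checked in isolation. The helper below repeats the slicing logic from the advice as a plain static method, so it runs without ByteBuddy. `BufferSlice` and `copyWritten` are hypothetical names used only for this sketch.

```java
import java.nio.ByteBuffer;

/** Standalone version of the advice's slicing step, for experimentation. */
final class BufferSlice {
    /** Copies the bytes written between oldPos and dst's current position. */
    static byte[] copyWritten(ByteBuffer dst, int oldPos) {
        int n = dst.position() - oldPos;
        if (n <= 0) {
            return new byte[0];
        }
        byte[] payload = new byte[n];
        // duplicate() shares the content but has its own position and limit.
        ByteBuffer dup = dst.duplicate();
        dup.position(oldPos);
        dup.limit(oldPos + n);
        dup.get(payload);
        return payload;
    }
}
```

Reading through `dst` directly would require moving its position back and restoring it afterwards, and any mistake there hands the Kafka client a buffer in the wrong state. The duplicate avoids that class of bug entirely.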
What's left on the floor
The agent has a bounded 8192-frame queue and a dedicated writer thread. Fine for
localhost. High-throughput brokers will need batching or shared memory. Next: bump
ByteBuddy, replace the toy Rust receiver with Kapture's wire decoder, surface tap
sessions in the Protocol tab with a source: tap badge.
Next: Why eBPF isn't needed for JVM TLS, and where it actually is the right tool.