Emmanuel's production-ready Kafka framework: extending dlt the right way
- Aman Gupta,
Data Engineer
Kafka in production isn't "hello world." It's "oh no world."
Think: messy topics, shifting schemas, and the occasional midnight panic when offsets decide to cosplay as Schrödinger's cat.
Enter Emmanuel Ogunwede. Instead of rage-quitting Kafka (tempting) or spinning up a Spark cluster the size of a minor moon, he built a slim framework on top of dlt that levels up the vanilla Kafka source into something you'd actually trust to run in production.
Check out the repo on GitHub.
What vanilla dlt gives you (and where it stops short)
dlt's built-in Kafka source is great if you just need a simple pipeline up and running:
- You point it at specific topics
- It happily ingests UTF-8 text (JSON out of the box)
- But no Schema Registry integration
- Adding new topics after the first run requires manual handling
Perfect for getting started and, honestly, way easier than most first tries at Kafka.
But in production, Kafka throws curveballs, and that's where Emmanuel's framework comes in.
Emmanuel's upgrades
Instead of reinventing the wheel, Emmanuel identified the specific gaps and filled them systematically:
- Dynamic topic discovery via regex patterns (`.*_events` finds all event topics automatically)
- Avro + Schema Registry support with proper deserialization and schema evolution
- Clean CLI interface that feels like a real tool
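To make the regex-based discovery concrete, here is a minimal sketch of the matching step only. It is not Emmanuel's code: the `discover_topics` helper and the topic names are hypothetical, and in a real setup the list of available topics would come from the broker's metadata rather than a hard-coded list.

```python
import re
from typing import Iterable, List


def discover_topics(available: Iterable[str], pattern: str) -> List[str]:
    """Return the topics whose full name matches the regex, sorted for stable output."""
    compiled = re.compile(pattern)
    # fullmatch requires the whole name to match, so "user_events_archive"
    # is not picked up by a pattern that ends in "_events".
    return sorted(t for t in available if compiled.fullmatch(t))


# Hypothetical topic names, standing in for what the broker would report.
topics = [
    "user_events",
    "order_events",
    "user_events_archive",
    "payments",
    "__consumer_offsets",
]

matched = discover_topics(topics, r".*_events")
print(matched)
```

Running the same function on every pipeline run is what lets newly created topics get picked up without touching the configuration.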
The genius? He built on top of dlt, not around it.
See it in action
Emmanuel even made a video walking through it (watch at 2× speed if you're impatient).
Done. The framework handles topic discovery, schema fetching, offset management, and loading.
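For a feel of what "schema fetching" involves, here is a small sketch that assumes the topics use the Confluent Schema Registry wire format, where each message starts with a magic byte `0` and a 4-byte big-endian schema ID ahead of the Avro payload. Whether the framework parses messages exactly this way is an assumption; the `split_confluent_frame` helper and the sample bytes are made up for illustration.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of the Confluent Schema Registry framing
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def split_confluent_frame(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message too short to carry a schema ID header")
    # ">BI" = big-endian, one unsigned byte followed by one unsigned 32-bit int
    magic, schema_id = struct.unpack(">BI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]


# Hypothetical message: schema ID 42 followed by made-up payload bytes.
framed = struct.pack(">BI", MAGIC_BYTE, 42) + b"\x02\x06foo"
schema_id, payload = split_confluent_frame(framed)
print(schema_id, payload)
```

The schema ID is what a consumer would look up in the registry to get the writer schema before decoding the remaining bytes as Avro.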
Why this matters
Most Kafka setups live on two extremes:
- Too complex: Spark/Flink sized for Mars missions
- Too hacky: cron + script duct-taped together
Emmanuel found the middle ground: micro-batch ingestion that's production-ready and maintainable.
It builds on what dlt already does well, such as handling schema changes, normalization, and data type inference, and extends it to handle the messy realities of Kafka:
- Topics appearing and disappearing
- Offset management across restarts
- Multiple serialization formats
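The "offset management across restarts" point can be illustrated with a toy micro-batch loop. This is a sketch under simplifying assumptions, not the framework's implementation: messages are plain tuples instead of real Kafka records, they arrive ordered by offset within each partition, and the hypothetical `run_micro_batch` helper keeps the last loaded offset per partition in a JSON-serializable dict.

```python
import json
from typing import Dict, List, Tuple

# (topic, partition, offset, value): a stand-in for consumed Kafka messages
Message = Tuple[str, int, int, str]
OffsetState = Dict[str, int]


def run_micro_batch(messages: List[Message], state: OffsetState, batch_size: int) -> List[str]:
    """Load up to batch_size unseen messages and advance the stored offsets."""
    loaded: List[str] = []
    for topic, partition, offset, value in messages:
        key = f"{topic}:{partition}"
        # Skip anything at or below the last offset already loaded for this partition.
        if offset <= state.get(key, -1):
            continue
        if len(loaded) >= batch_size:
            break
        loaded.append(value)
        state[key] = offset
    return loaded


log = [("orders", 0, 0, "a"), ("orders", 0, 1, "b"), ("orders", 0, 2, "c")]

state: OffsetState = {}
first = run_micro_batch(log, state, batch_size=2)

# Simulate a restart: the state survives only because it was persisted.
restored: OffsetState = json.loads(json.dumps(state))
second = run_micro_batch(log, restored, batch_size=2)

print(first, second, restored)
```

The second run neither reloads `a` and `b` nor skips `c`, which is the property that makes small scheduled batches safe to rerun.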
Build smarter, not bigger
Don't fight your tools, extend them thoughtfully.
Emmanuel didn't rage-quit dlt because it lacked Avro. He bridged the gaps and ended up with something that feels like a natural extension, not a replacement.
Try it now
Read Emmanuel's implementation and technical design doc.
⭐ Don't forget to give the repo a star.
Emmanuel showed us that production-grade doesn't have to mean complex. Sometimes it just means being thoughtful about the details that matter.