Blog

December 22, 2025 •
- Tutorials,
- Industry
11 Pythonic Data Quality Recipes for every day
- Aman Gupta,
  Data Engineer
11 practical, copy-paste data quality recipes for dlt. From schema freezes to alerts, learn how to keep pipelines clean, safe, and production-ready
Read more
December 16, 2025 •
- Community,
- Product
DuckLake to MotherDuck: Validate locally, deploy to cloud in minutes
- Aman Gupta,
  Data Engineer
Start local with DuckLake, validate your data, then deploy to MotherDuck in minutes. Same pipeline, same code, just switch the destination.
Read more
December 8, 2025 •
- Industry,
- Engineering
Data contract agreement vs enforcement
- Adrian Brudaru,
  Co-Founder & CDO
Data contracts keep systems predictable by pairing clear rules with checks that catch bad data before it flows downstream.
Read more
November 20, 2025 •
- Industry,
- Engineering
Convergence: The Anti-Entropy Engine
- Adrian Brudaru,
  Co-Founder & CDO
Most LLM runs don’t fail. They converge fast, and the secret isn’t smarter models but better scaffolds that guide the work instead of forcing it.
Read more
October 20, 2025 •
- Industry,
- Engineering
Openflow vs. dlt for Snowflake users
- Adrian Brudaru,
  Co-Founder & CDO
Openflow and dltHub represent two distinct but valuable visions for the future of data ingestion.
Read more
October 14, 2025 •
- Product
Surviving the AI code Deluge: Data quality in the Spotlight
- Adrian Brudaru,
  Co-Founder & CDO
This is, we’re told, the great democratization of data engineering. The tedious work is gone. The barrier to entry is gone. Everyone can now be a data engineer.
Read more
September 24, 2025 •
- Engineering,
- Industry
Motherduck Europe & dlt DuckLake support
- Adrian Brudaru,
  Co-Founder & CDO
MotherDuck lands in Europe with serverless DuckDB warehousing. dlt adds DuckLake support, giving EU teams a fast, modern data stack.
Read more
September 22, 2025 •
- Engineering
SAP Data Ingestion with Python: A Technical Breakdown of Using the SAP RFC Protocol
- Mateusz Paździor,
  Head of Data Engineering at The Scalable Way
SAP data is hard to extract. Dominik’s new Python connector replaces pyRFC, enabling faster, chunked ingestion into modern pipelines.
Read more
September 19, 2025 •
- Engineering
"Scaled Mediocrity": The counterintuitive AI Strategy that's delivering ROI
- Adrian Brudaru,
  Co-Founder & CDO
LLM leaders agree: the true win is "scaled mediocrity." We're empowering teams with good enough tools for massive, real-world impact.
Read more
September 10, 2025 •
- Updates
Supercharge your data loading: Go beyond `pandas.to_sql()`
- Adrian Brudaru,
  Co-Founder & CDO
For quick tasks, df.to_sql() is perfect. But for production pipelines, it quickly shows its limits when data volume, frequency, and schema change.
Read more
September 9, 2025 •
- Product,
- Tutorials
SCD2 Deep Dive with dlt: How nested data affects queries and costs
- Aman Gupta,
  Data Engineer
Learn how dlt automates SCD2 for nested JSON data without complex SQL headaches. Real BigQuery benchmarks show incremental loading cuts costs by 25-35%.
Read more
August 20, 2025 •
- Community,
- Engineering
Emmanuel's production-ready Kafka framework: extending dlt the right way
- Aman Gupta,
  Data Engineer
Emmanuel built a slim framework on top of dlt that levels up the vanilla Kafka source into a production-ready setup. Check it out 🚀
Read more
August 6, 2025 •
- Product
Build → Deploy → Share: A roadmap for sharing on dltHub
- Adrian Brudaru,
  Co-Founder & CDO
You want connectors, and you want them to be many, high quality and customisable? A man can dream? Here’s our roadmap to making those dreams a reality, and how you can help us today.
Read more
August 5, 2025 •
- Industry,
- Product
Sling vs dlt SQL connector Benchmark: Spend 3x Less, Load Faster with dlt
- Adrian Brudaru,
  Co-Founder & CDO
- Aman Gupta,
  Data Engineer
- Shreyas Gowda,
  Working Student
We compared dlt and Sling for data ingestion performance, cost, and flexibility. See how they stack up and which might suit your data needs best.
Read more
July 31, 2025 •
- Community,
- Engineering
Why a simple task speaks volumes
- Aman Gupta,
  Data Engineer
Ajay Moorjani turned a deceptively simple JSON to Snowflake task into a rock solid pipeline using dlt, dbt, and Airflow, built in less than a coffee break.
Read more
July 28, 2025 •
- Community,
- Industry
Leveraging Claude Code to Build a dlt & Visivo Project
- Jared Jesionek,
  CEO & Co-founder of Visivo
Leveraging AI to build a dlt extract and load of Coldplay data from Spotify and visualize it in Visivo.
Read more
July 24, 2025 •
- Tutorials,
- Community
Michael, dlt, and the art of unbreakable API pipelines
- Adrian Brudaru,
  Co-Founder & CDO
Built another pipeline just to keep a dashboard alive? Then it broke again? Michael Shoemaker shows how dlt makes API pipelines fix themselves, no drama.
Read more
July 16, 2025 •
- Updates
We’re building dltHub to make data engineering accessible for all Python developers
- Matthaus Krzykowski,
  Co-Founder & CEO
We’re excited to announce that we’re building dltHub, an LLM-native data engineering platform that enables any Python developer to build, run dlt pipelines, and deliver valuable end-user-ready reports.
Read more
July 15, 2025 •
- Product
A Practitioner’s Guide to LLM-native Pipeline Building with dltHub Workspace
- Adrian Brudaru,
  Co-Founder & CDO
LLM-native scaffolds for 1000+ APIs. The IKEA moment in data engineering is here. Build pipelines with LLMs, faster and cleaner.
Read more
July 7, 2025 •
- Tutorials
Turn your Documentation into a Queryable Knowledge Graph for High retrieval accuracy and low hallucinations
- Hiba Jamal,
  Junior Data & AI Manager
Using dlt + Cognee, we took API docs from Slack, PayPal, and TicketMaster and built a knowledge graph.
Read more
June 29, 2025 •
- Tutorials,
- Community
How I went from “I’ll never build a pipeline” to doing it in an hour with Cursor
- Roshni Melwani,
  Working Student
Dev takes Alena’s dlt course, then uses AI to build a WHOOP sleep-data pipeline, saving the data to Parquet, demonstrating that beginners can master pipelines quickly.
Read more
June 25, 2025 •
- Product,
- Industry
We've been using LanceDB to make AI development smoother
- Adrian Brudaru,
  Co-Founder & CDO
We've been using LanceDB for months at dltHub to build AI systems more quickly. The same setup works locally and in the cloud, and it handles structured and vector data in one place.
Read more
June 18, 2025 •
- Industry,
- Product
Building Engine-Agnostic Data Stacks
- Adrian Brudaru,
  Co-Founder & CDO
Mixing Spark, DuckDB, and Snowflake? Iceberg decouples data and Ibis decouples logic, so you can run your analytics anywhere, without rewrites or vendor lock-in.
Read more
June 16, 2025 •
- Industry,
- Product
Iceberg-First Ingestion: How Taktile cut 70% of costs
- Adrian Brudaru,
  Co-Founder & CDO
Taktile cut 70% of data loading costs by shifting ingestion to Iceberg via Lambda + dlt, keeping Snowflake for analytics. Smart layers, big savings.
Read more
June 3, 2025 •
- Industry,
- Product
From Singer to simplicity: Why Data Teams choose dlt.
- Adrian Brudaru,
  Co-Founder & CDO
Singer was Stitch's incomplete competitive response to Fivetran. Meltano completed what Stitch never intended to fully open source. dlt learned from both and built the fitting abstraction for pythonic data teams.
Read more
May 28, 2025 •
- Industry,
- Product
Fivetran vs dlt: Quickstart vs Endgame
- Adrian Brudaru,
  Co-Founder & CDO
A side-by-side look at Fivetran and dlt, covering cost models, customization, and how each approach affects team workflows as your data needs evolve.
Read more
May 27, 2025 •
- Tutorials
The REST API Integration costs: How AI + dlt is finally making it bearable
- Aman Gupta,
  Data Engineer
REST API integrations come with hidden costs: pagination, schema drift, rate limits. With dlt + Cursor, you skip the boilerplate and build pipelines in minutes, not days. Less code. Less chaos. More time to build.
Read more
May 14, 2025 •
- Community
Materializing Multi-Asset REST API Sources with dlt, Dagster, and DuckDB
- Jairus Martinez,
  Analytics Engineer at Brooklyn Data Company
A hands-on guide to combining dlt and Dagster for orchestrating multi-endpoint API ingestion pipelines, with assets materialized into DuckDB. Three patterns. One powerful workflow. Plus, a peek at the new CLI and DuckDB UI.
Read more
May 13, 2025 •
- Product
Breaking free from SQL: A Normie's guide to portable data pipelines
- Adrian Brudaru,
  Co-Founder & CDO
Data engineering shouldn't require rewriting the same logic multiple times for different environments. dlt's dataset interface gives you one consistent way to work with your data, regardless of where it lives.
Read more
May 5, 2025 •
- Updates
What’s new in dlt for Databricks: built-in staging, zero-config notebooks, no headaches
- Aman Gupta,
  Data Engineer
Ingesting to Databricks should be simple. With dlt, it finally is. No config files, no staging, just Python and go.
Read more
April 29, 2025 •
- Product
Vibe Coding: Why Building Data Pipelines with LLMs Actually Works
- Adrian Brudaru,
  Co-Founder & CDO
Vibe coding so clean, it will make your old code look bad.
Read more
April 28, 2025 •
- Community
Julian Alves and dlt: when expertise meets simplicity
- Adrian Brudaru,
  Co-Founder & CDO
Julian Alves builds reliable, simple data infrastructure. He partners with dlt to help companies create systems that deliver value, not burden.
Read more
April 22, 2025 •
- Updates
Celebrating our 3,000th OSS dlt customer as dlt’s momentum accelerates
- Matthaus Krzykowski,
  Co-Founder & CEO
dlt has grown from 1,000 to over 3,000 open-source users in just six months, with monthly downloads surpassing 1.4 million. This momentum reflects a growing demand for Python-native, modular, and AI-ready data tools — and dlt is building exactly that.
Read more
April 22, 2025 •
- Updates
What’s next for dlt in 2025: a simpler solution for solving complex problems
- Marcin Rudolf,
  Co-Founder & CTO
dlt started as a tool for handling JSON documents. It was meant for the average Python user that does not want to deal with creating and evolving schemas, SQL models, backends and data engineers that control them.
Read more
April 15, 2025 •
- Engineering
The future's Re-Composable: Converting Connectors Between Solutions with LLMs
- Adrian Brudaru,
  Co-Founder & CDO
Let's stop reinventing connectors in isolation. Use LLMs to transform scattered integrations into shared, reusable solutions.
Read more
April 14, 2025 •
- Community
Fabric + dlt, Course and Explorations
- Adrian Brudaru,
  Co-Founder & CDO
As Rakesh was exploring Fabric, dlt kept showing up in Rakesh's stack. Not by design, but because it just worked. Different projects, same ingestion layer, quietly doing its job.
Read more
April 3, 2025 •
- Tutorials
AI built the Pipeline, I plugged the leaks
- Adrian Brudaru,
  Co-Founder & CDO
I tried Vibe-coding a Singer tap (Pipedrive) into dlt and it worked, but it needed some user intervention.
Read more
April 2, 2025 •
- Community
How to run dlt with Airflow (Or any other Python thing)
- Francesco Mucio,
  Untitled Data Company
Explore four ways to run dlt with Apache Airflow, from PythonOperators to KubernetesPods, and learn which setup scales best for clean, reliable pipelines.
Read more
March 31, 2025 •
- Tutorials
Towards a Benchmark for AI-Generated Data Pipelines
- Adrian Brudaru,
  Co-Founder & CDO
Building pipelines with AI isn’t one task, it's many. In this post, we explore how to split and test them individually, so failures are easier to diagnose and fix.
Read more
March 29, 2025 •
- Engineering
Are you moving the right data? Write. Audit. Publish. (WAP)
- Aman Gupta,
  Data Engineer
The Write. Audit. Publish. (WAP) framework brings discipline from software engineering: write in isolation, audit for correctness, quality, and compliance, publish with confidence. But can data engineering really follow suit? Let's discuss.
Read more
March 26, 2025 •
- Tutorials
Erase tech debt from data loading with dlt + Cursor + LLMs
- Adrian Brudaru,
  Co-Founder & CDO
Modernisation at its finest, from trash to cutting edge in seconds. It works amazingly well, so give it a try and stop paying for tech debt.
Read more
March 25, 2025 •
- Tutorials
From Airbyte YAML to Scalable Python Pipelines with dlt (dltHub), Cursor and LLMs
- Adrian Brudaru,
  Co-Founder & CDO
In this microblog + video we explore generating python pipelines (dlt REST API) from Airbyte low code yaml spec. tl;dr: it works well.
Read more
March 21, 2025 •
- Tutorials
Query your API’s data with SQL, no data warehouse needed
- Aman Gupta,
  Data Engineer
Want to run SELECT * on your API data without setting up a database? dlt datasets let you query API data using SQL without setting up a database or data warehouse. They follow the Write-Audit-Publish (WAP) pattern, enabling direct SQL queries while keeping workflows efficient.
Read more
March 18, 2025 •
- Community
Why Iceberg + Python is the Future of Open Data Lakes
- Adrian Brudaru,
  Co-Founder & CDO
Data lakes are broken. Python + Iceberg fixes them. No lock-in. No silos. Just open, AI-ready data. Read on to learn why and how to switch.
Read more
March 4, 2025 •
- Product
Deep dive: Our initial assistants and model context protocol (MCP) on the Continue Hub
- Matthaus Krzykowski,
  Co-Founder & CEO
We do a deep dive on the initial assistants and model context protocol (MCP) that we published on the Continue Hub
Read more
March 4, 2025 •
- Product
Integrating CI/CD Practices into Data Engineering
- Adrian Brudaru,
  Co-Founder & CDO
Software Engineering Has CI/CD, Data Engineering Has YOLO – Until Now
Read more
February 26, 2025 •
- Updates
An early path with to make compound AI systems work for data engineering
- Matthaus Krzykowski,
  Co-Founder & CEO
Today we announce our partnership with Continue and the release of two initial assistants, including one where developers can chat with the dlt documentation from the IDE and pass it to the LLM to help them write dlt code. Developers can also access building blocks that allow them to build their own custom assistants. In this post we want to talk about:
- why we think SaaS connector catalog black box solutions have been a dead end for LLMs
- what we have been doing so far to build for AI data engineering compound systems
- our vision for a dlt+ data infrastructure that generates trusted data that will unlock additional data engineering assistants and building blocks in future
Read more
February 26, 2025 •
- Product
Data Engineers, stop testing in production!
- Adrian Brudaru,
  Co-Founder & CDO
Software engineers don’t test in production. Why are data engineers still doing it? ELT made loading easy, but debugging in the warehouse is a nightmare. dlt+ Staging fixes that.
Read more
February 20, 2025 •
- Updates
We are releasing dlt+ Project & Cache in early access
- Matthaus Krzykowski,
  Co-Founder & CEO
We take the next step in our recent journey from dlt to dlt+ by releasing the initial two features of dlt+, our developer framework for running dlt pipelines in production and at scale:
- dlt+ Project: A declarative yaml collaboration point for your team.
- dlt+ Cache: A database-like portable compute layer for developing, testing and running transformations before loading.
Read more
February 19, 2025 •
- Product
What is dlt+ Cache?
- Adrian Brudaru,
  Co-Founder & CDO
Discover how dlt+ Cache gives data engineers a lightning-fast staging environment to test, validate, and debug transformations before they hit production!
Read more
February 19, 2025 •
- Product
What is dlt+ Project?
- Adrian Brudaru,
  Co-Founder & CDO
Introducing dlt+ Project – the declarative, YAML-powered manifest that transforms data pipeline development!
Read more
February 11, 2025 •
- Community,
- Tutorials
Stop writing SQL models and let your data pipeline do it automatically
- Aman Gupta,
  Data Engineer
This post discusses `sqlmesh init -t dlt` command that integrates dlt’s metadata with SQLMesh’s modeling capabilities. It automatically generates SQL models that accurately handle incremental processing and schema changes. Inspired by David SJ's post, this approach was demonstrated using the Bluesky API, transforming raw data into structured tables without the need for writing SQL.
Read more
February 10, 2025 •
- Tutorials
Real-time Data Replication with Debezium and Python
- Ismail Simsek,
  Senior Data Engineer
When it comes to replicating operational data for analytics, Change Data Capture (CDC) is the gold standard. It offers scalability, near real-time performance, and captures all data modifications, ensuring your analytical datasets are always up-to-date.
Read more
January 28, 2025 •
- Product
From Commoditization to Democratization: Building the Data Platforms of Tomorrow
- Adrian Brudaru,
  Co-Founder & CDO
Moving data isn’t hard because engineers lack skill. It’s hard because commoditised systems bog us down with complexity disguised as simplicity.
Read more
January 21, 2025 •
- Product
Unified Data Access with dlt, from local to cloud
- Adrian Brudaru,
  Co-Founder & CDO
Tired of juggling multiple tools and formats? Discover how a single interface can simplify how you access, transform, and share your data, no matter where it lives.
Read more
January 17, 2025 •
- Engineering
Somewhere Between Data Democracy and Data Anarchy ⚔️
- Hiba Jamal,
  Junior Data & AI Manager
Data democracy is a beautiful thing. People are more empowered, less dependent and unblocked in terms of data curiosity... However, what breaks this utopian dream is when big curious ideas, several undocumented pipelines (perhaps with the same data) and conflicting dashboards cause confusion and indecision.
Read more
January 16, 2025 •
- Community
How dltHub consulting partner Mooncoon speeds up complex dlt pipeline development 2x with Cursor
- Marcel Coetzee,
  Data and AI Plumber at Mooncoon, an analytics and data agency.
AI + dlt = 2x faster pipelines. Mooncoon 🦝 shares how Cursor IDE transforms pipeline dev. AI handles boilerplate; you ship faster. Practical workflows & a live demo inside.
Read more
December 29, 2024 •
- Community
2024 dlt Recap: Moments, Mentions, and Milestones
- Adrian Brudaru,
  Co-Founder & CDO
- Aman Gupta,
  Data Engineer
2024 was a remarkable year for dltHub. Together with our users and partners, we streamlined workflows, introduced powerful capabilities, and laid a stronger foundation for the future.
Read more
December 20, 2024 •
- Updates
Our main benefits from the silver consulting partnership with dltHub: Help with complex integrations, co-development of tools we use, increased revenue
- Adrian Brudaru,
  Co-Founder & CDO
If you are a data engineering consultant or run a data-focused consultancy and want to do more with less, consider joining our partner program.
Read more
December 20, 2024 •
- Updates
Announcing Our Consulting Partnerships Program
- Adrian Brudaru,
  Co-Founder & CDO
dltHub is community driven in partnerships too, featuring an everybody wins model that optimises client satisfaction.
If you're excited about being part of a collaborative ecosystem that amplifies everyone's strengths while delivering exceptional value to clients, we want to hear from you.
Read more
December 9, 2024 •
- Community
10x data engineer with dlt+ and Tower: A Taktile Case Study
- Adrian Brudaru,
  Co-Founder & CDO
With dlt+ and Tower, anyone who writes a bit of Python can ship production data pipelines in under an hour. Fast, open, and headache-free, this is the future of data engineering.
Read more
November 29, 2024 •
- Engineering
Cross-Organisational data mesh as a requirement in decentralised energy infrastructure
- Adrian Brudaru,
  Co-Founder & CDO
Europe’s Energiewende data Challenge: Decentralised cross organisational data mesh and environment portability as baseline requirements.
Read more
November 19, 2024 •
- Community
Shift YOURSELF Left
- Adrian Brudaru,
  Co-Founder & CDO
Data changes, let's just accept that. So how do you get in the change loop when the "left" department just won't add you in? Simple: Get yourself in the loop. Instead of shifting the responsibility to the team on the left, shift your ownership left.
Read more
November 19, 2024 •
- Tutorials
Self hosted tools Benchmarking
- Aman Gupta,
  Data Engineer
SQL is key in data analysis, especially where production databases are used. We benchmarked Meltano, Airbyte, dlt, and Sling.
Read more
November 13, 2024 •
- Community
cognee: Scalable Memory Layer for AI Applications
- Vasilije Markovic,
  Founder Topoteretes
cognee is an open-source, scalable semantic layer for AI applications. You can now use modular ECL pipelines to connect data and reduce hallucinations.
Read more
October 30, 2024 •
- Engineering
SQL Benchmarking: comparing data pipeline tools
- Aman Gupta,
  Data Engineer
Transferring data from SQL databases to data warehouses like BigQuery, Redshift, and Snowflake is an important part of modern data workflows. With various tools available, how do you choose the right one for your needs? We conducted a detailed benchmark test to answer this question, comparing popular tools like Fivetran, Stitch, Airbyte, and the data load tool (dlt).
Read more
October 30, 2024 •
- Product
Semantic data contracts
- Adrian Brudaru,
  Co-Founder & CDO
Data mesh or governance is simplified when using a semantic data contract instead of a governance API.
Read more
October 23, 2024 •
- Product
Portability principle: The path to vendor-agnostic Data Platforms
- Adrian Brudaru,
  Co-Founder & CDO
The current state of the ecosystem towards breaking vendor locks is best described as “incomplete”. By creating a portable data lake as a kind of framework where components are vendor agnostic, we are able to take advantage of the next developments quickly.
Read more
October 22, 2024 •
- Product,
- Community
Harness builds an end to end data platform with dlt + SQLMesh
How Harness chooses dlt + SQLMesh to create an end-to-end next generation data platform.
Read more
October 14, 2024 •
- Product
Metadata as Glue: A dlt-dbt generator
- Adrian Brudaru,
  Co-Founder & CDO
Imagine you go to a burger place and order a cheeseburger. They hand you a paper bag containing the following items:

A package of ready-bake flour. Just add water. A raw beef patty. A slice of cheese. A head of lettuce, a tomato, and an onion. A packet of ketchup and mustard.

Technically, you have everything needed to make a cheeseburger. This scenario mirrors the current state of the modern data stack.
Read more
October 10, 2024 •
- Product,
- Community
dlt-SQLMesh generator: A case of metadata handover
- Adrian Brudaru,
  Co-Founder & CDO
Most tools interact only through the database layer, treating it as a universal translator. However, this approach is limited because it doesn't capture the rich metadata that could enable more intelligent data processing and integration. Without end-to-end metadata flow, the promise of a cohesive data pipeline remains unfulfilled.
Read more
October 3, 2024 •
- Product
Portable data lake: A development environment for data lakes
- Adrian Brudaru,
  Co-Founder & CDO
What if we had a portable data lake? A pip install data platform...
Read more
October 1, 2024 •
- Tutorials
A guide on how to migrate your Hubspot data pipeline from Fivetran to dlt
- Aman Gupta,
  Data Engineer
This guide details the migration of HubSpot data pipelines from Fivetran to the open-source dlt, highlighting dlt's cost-efficiency, speed, and customization capabilities. Providing a step-by-step transition process and strategies for unifying data sources empowers organizations to optimize their data infrastructure for better control and scalability.
Read more
September 16, 2024 •
- Updates
Celebrating 1,000 dlt OSS customers in production
- Matthaus Krzykowski,
  Co-Founder & CEO
Earlier today, Marcin announced the release of dlt version 1.0.0, marking a significant milestone in its evolution into a stable, production-ready library for data movement.
Read more
September 16, 2024 •
- Product,
- Updates
Introducing dlt 1.0.0: A Production-Ready Python Library for Data Movement
- Marcin Rudolf,
  Co-Founder & CTO
We are excited to announce the release of dlt version 1.0.0, a major milestone marking the library’s maturity and readiness for production use. After months of hard work and development, this update integrates key use cases directly into the core library (code for database syncs, files, the REST API toolkit and an SQLAlchemy destination), making dlt more powerful than ever.
Read more
September 3, 2024 •
- Tutorials
Migrate your SQL data pipeline from Airbyte to dlt
- Aman Gupta,
  Data Engineer
In this post, we explore how to migrate your SQL data pipeline from Airbyte to dlt, an open-source solution that offers greater control, speed, and cost-efficiency. If you're ready to take your data strategy to the next level, this guide will show you how to make the switch. Dive in and start your journey.
Read more
September 2, 2024 •
- Tutorials
Migrate your SQL data pipeline from Stitch data to dlt
- Aman Gupta,
  Data Engineer
In this post, we explore how to migrate your SQL data pipeline from Stitch data to dlt, an open-source solution that offers greater control, speed, and cost-efficiency. If you're ready to take your data strategy to the next level, this guide will show you how to make the switch. Dive in and start your journey.
Read more
August 13, 2024 •
- Tutorials
Migrate your SQL data pipeline from Fivetran to dlt
- Aman Gupta,
  Data Engineer
In this post, we explore how to migrate your SQL data pipeline from Fivetran to dlt, an open-source solution that offers greater control, speed, and cost-efficiency. If you're ready to take your data strategy to the next level, this guide will show you how to make the switch. Dive in and start your journey.
Read more
August 12, 2024 •
- Community
RAG playground: Build your own RAG bot
- Adrian Brudaru,
  Co-Founder & CDO
We recently held a workshop at Data Talks Club - LLM Zoomcamp on building a Retrieval-Augmented Generation (RAG) system. The session covered loading data from Notion into LanceDB, creating a RAG Bot with Ollama, and interacting with it through practical examples. Below is a summary of the key resources, tools, and steps from the workshop.
Read more
August 5, 2024 •
- Product
Standardizing Ingestion and its metadata for compliant Data Platforms
- Adrian Brudaru,
  Co-Founder & CDO
💡 What if we had some magic documentation about our data sources, before even filling any tables? What if we could profile source data based on source info, or in flight before loading? And what if we had a single way of doing it that’s not dependent on the storage solution?

Read on to find how.
Read more
July 25, 2024 •
- Engineering
Data Platform Engineers: The Game-Changers of data teams
- Adrian Brudaru,
  Co-Founder & CDO
Imagine a workplace where data governance and access, documentation and infra are seamlessly integrated, empowering every decision. This is the norm for companies with Data Platform Engineers.

Read on to understand who they are, where they come from, and how they can help you.
Read more
July 11, 2024 •
- Community
How dlt uses Apache Arrow
- Jorrit Sandbrink,
  Open Source Software Engineer
Read more
June 21, 2024 •
- Tutorials
Syncing Google Forms data with Notion using dlt
- Aman Gupta,
  Data Engineer
Hello, I'm Aman, and I assist the dltHub team with various data-related tasks. In a recent project, the Operations team needed to gather information through Google Forms and integrate it into a Notion database. Initially, they tried using the Zapier connector as a quick and cost-effective solution, but it didn’t work as expected.
Read more
June 19, 2024 •
- Tutorials
Slowly Changing Dimension Type2: Explanation and code
- Aman Gupta,
  Data Engineer
Read more
June 12, 2024 •
- Product
From Pandas to Production: How we built dlt as the right ELT tool for Normies
- Adrian Brudaru,
  Co-Founder & CDO
Read more
May 28, 2024 •
- Product,
- Updates
Instant pipelines with dlt-init-openapi
- Adrian Brudaru,
  Co-Founder & CDO
Read more
May 23, 2024 •
- Tutorials
How I contributed my first data pipeline to the open source.
- Aman Gupta,
  Data Engineer
Read more
May 14, 2024 •
- Updates,
- Tutorials,
- Engineering
Announcing: REST API Source toolkit from dltHub - A Python-only high level approach to pipelines
- Adrian Brudaru,
  Co-Founder & CDO
Read more
May 7, 2024 •
- Tutorials
On Orchestrators: You Are All Right, But You Are All Wrong Too
- Anuun Chinbat,
  Junior Software Engineer
Read more
April 23, 2024 •
- Tutorials,
- Updates
Replacing SaaS ETL with Python dlt: A painless experience for Yummy.eu
- Adrian Brudaru,
  Co-Founder & CDO
Read more
April 12, 2024 •
- Product
Portable, embeddable ETL - what if pipelines could run anywhere?
- Adrian Brudaru,
  Co-Founder & CDO
Read more
April 11, 2024 •
- Product
The Second Data Warehouse, aka the "disaster recovery" project
- Adrian Brudaru,
  Co-Founder & CDO
Read more
April 5, 2024 •
- Tutorials,
- Updates
Shift Left Data Democracy: the link between democracy, governance, data contracts and data mesh.
- Adrian Brudaru,
  Co-Founder & CDO
Read more
March 28, 2024 •
- Product
Yes code ELT: dlt make easy things easy, and hard things possible
- Adrian Brudaru,
  Co-Founder & CDO
Read more
March 25, 2024 •
- Tutorials
What is so smart about smart dashboarding tools?
- Hiba Jamal,
  Junior Data & AI Manager
Read more
March 25, 2024 •
- Tutorials,
- Updates,
- Product
dlt adds Reverse ETL - build a custom destination in minutes
- Adrian Brudaru,
  Co-Founder & CDO
Read more
March 12, 2024 •
- Product
Coding data pipelines is faster than renting connector catalogs
- Matthaus Krzykowski,
  Co-Founder & CEO
Read more
