Blog
- •
- Community
10x data engineer with dlt+ and Tower: A Taktile Case Study
- Adrian Brudaru,
Co-Founder & CDO
With dlt+ and Tower, anyone who writes a bit of Python can ship production data pipelines in under an hour. Fast, open, and headache-free, this is the future of data engineering.
- •
- Engineering
Cross-Organisational data mesh as a requirement in decentralised energy infrastructure
- Adrian Brudaru,
Co-Founder & CDO
Europe’s Energiewende data challenge: a decentralised, cross-organisational data mesh and environment portability as baseline requirements.
- •
- Community
Shift YOURSELF Left
- Adrian Brudaru,
Co-Founder & CDO
Data changes, let's just accept that. So how do you get in the change loop when the "left" department just won't add you in? Simple: Get yourself in the loop. Instead of shifting the responsibility to the team on the left, shift your ownership left.
- •
- Tutorials
Self hosted tools Benchmarking
- Aman Gupta,
Jr. Data Engineer
SQL is key in data analysis, especially where production databases are used. We benchmarked Meltano, Airbyte, dlt, and Sling.
- •
- Community
cognee: Scalable Memory Layer for AI Applications
- Vasilije Markovic,
Founder Topoteretes
cognee is an open-source, scalable semantic layer for AI applications. You can now use modular ECL pipelines to connect data and reduce hallucinations.
- •
- Engineering
SQL Benchmarking: comparing data pipeline tools
- Aman Gupta,
Jr. Data Engineer
Transferring data from SQL databases to data warehouses like BigQuery, Redshift, and Snowflake is an important part of modern data workflows. With various tools available, how do you choose the right one for your needs? We conducted a detailed benchmark test to answer this question, comparing popular tools like Fivetran, Stitch, Airbyte, and the data load tool (dlt).
- •
- Product
Semantic data contracts
- Adrian Brudaru,
Co-Founder & CDO
Data mesh or governance is simplified when using a semantic data contract instead of a governance API.
- •
- Product
Portability principle: The path to vendor-agnostic Data Platforms
- Adrian Brudaru,
Co-Founder & CDO
The ecosystem's progress towards breaking vendor lock-in is best described as “incomplete”. By creating a portable data lake as a kind of framework where components are vendor agnostic, we are able to take advantage of the next developments quickly.
- •
- Product,
- Community
Harness builds an end to end data platform with dlt + SQLMesh
How Harness chooses dlt + SQLMesh to create an end-to-end next generation data platform.
- •
- Product
Metadata as Glue: A dlt-dbt generator
- Adrian Brudaru,
Co-Founder & CDO
Imagine you go to a burger place and order a cheeseburger. They hand you a paper bag containing the following items:
A package of ready-bake flour. Just add water. A raw beef patty. A slice of cheese. A head of lettuce, a tomato, and an onion. A packet of ketchup and mustard.
Technically, you have everything needed to make a cheeseburger. This scenario mirrors the current state of the modern data stack.
- •
- Product,
- Community
dlt-SQLMesh generator: A case of metadata handover
- Adrian Brudaru,
Co-Founder & CDO
Most tools interact only through the database layer, treating it as a universal translator. However, this approach is limited because it doesn't capture the rich metadata that could enable more intelligent data processing and integration. Without end-to-end metadata flow, the promise of a cohesive data pipeline remains unfulfilled.
- •
- Product
Portable data lake: A development environment for data lakes
- Adrian Brudaru,
Co-Founder & CDO
What if we had a portable data lake? A pip install data platform...
- •
- Tutorials
A guide on how to migrate your Hubspot data pipeline from Fivetran to dlt
- Aman Gupta,
Jr. Data Engineer
This guide details the migration of HubSpot data pipelines from Fivetran to the open-source dlt, highlighting dlt's cost-efficiency, speed, and customization capabilities. Providing a step-by-step transition process and strategies for unifying data sources empowers organizations to optimize their data infrastructure for better control and scalability.
- •
- Updates
Celebrating 1,000 dlt OSS customers in production
- Matthaus Krzykowski,
Co-Founder & CEO
Earlier today, Marcin announced the release of dlt version 1.0.0, marking a significant milestone in its evolution into a stable, production-ready library for data movement.
- •
- Product,
- Updates
Introducing dlt 1.0.0: A Production-Ready Python Library for Data Movement
- Marcin Rudolf,
Co-Founder & CTO
We are excited to announce the release of dlt version 1.0.0, a major milestone marking the library’s maturity and readiness for production use. After months of hard work and development, this update integrates key use cases directly into the core library (code for database syncs, files, the REST API toolkit and an SQLAlchemy destination), making dlt more powerful than ever.
- •
- Tutorials
Migrate your SQL data pipeline from Airbyte to dlt
- Aman Gupta,
Jr. Data Engineer
In this post, we explore how to migrate your SQL data pipeline from Airbyte to dlt, an open-source solution that offers greater control, speed, and cost-efficiency. If you're ready to take your data strategy to the next level, this guide will show you how to make the switch. Dive in and start your journey.
- •
- Tutorials
Migrate your SQL data pipeline from Stitch data to dlt
- Aman Gupta,
Jr. Data Engineer
In this post, we explore how to migrate your SQL data pipeline from Stitch data to dlt, an open-source solution that offers greater control, speed, and cost-efficiency. If you're ready to take your data strategy to the next level, this guide will show you how to make the switch. Dive in and start your journey.
- •
- Tutorials
Migrate your SQL data pipeline from Fivetran to dlt
- Aman Gupta,
Jr. Data Engineer
In this post, we explore how to migrate your SQL data pipeline from Fivetran to dlt, an open-source solution that offers greater control, speed, and cost-efficiency. If you're ready to take your data strategy to the next level, this guide will show you how to make the switch. Dive in and start your journey.
- •
- Community
RAG playground: Build your own RAG bot
- Adrian Brudaru,
Co-Founder & CDO
We recently held a workshop at Data Talks Club - LLM Zoomcamp on building a Retrieval-Augmented Generation (RAG) system. The session covered loading data from Notion into LanceDB, creating a RAG Bot with Ollama, and interacting with it through practical examples. Below is a summary of the key resources, tools, and steps from the workshop.
- •
- Product
Standardizing Ingestion and its metadata for compliant Data Platforms
- Adrian Brudaru,
Co-Founder & CDO
💡 What if we had some magic documentation about our data sources, before even filling any tables? What if we could profile source data based on source info, or in flight before loading? And what if we had a single way of doing it that’s not dependent on the storage solution?
Read on to find out how.
- •
- Engineering
Data Platform Engineers: The Game-Changers of data teams
- Adrian Brudaru,
Co-Founder & CDO
Imagine a workplace where data governance and access, documentation and infra are seamlessly integrated, empowering every decision. This is the norm for companies with Data Platform Engineers.
Read on to understand who they are, where they come from, and how they can help you.
- •
- Community
How dlt uses Apache Arrow
- Jorrit Sandbrink,
Open Source Software Engineer
- •
- Tutorials
Syncing Google Forms data with Notion using dlt
- Aman Gupta,
Jr. Data Engineer
Hello, I'm Aman, and I assist the dlthub team with various data-related tasks. In a recent project, the Operations team needed to gather information through Google Forms and integrate it into a Notion database. Initially, they tried using the Zapier connector as a quick and cost-effective solution, but it didn’t work as expected.
- •
- Tutorials
Slowly Changing Dimension Type2: Explanation and code
- Aman Gupta,
Jr. Data Engineer
- •
- Product
From Pandas to Production: How we built dlt as the right ELT tool for Normies
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product,
- Updates
Instant pipelines with dlt-init-openapi
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
How I contributed my first data pipeline to the open source.
- Aman Gupta,
Jr. Data Engineer
- •
- Updates,
- Tutorials,
- Engineering
Announcing: REST API Source toolkit from dltHub - A Python-only high level approach to pipelines
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
On Orchestrators: You Are All Right, But You Are All Wrong Too
- Anuun Chinbat,
Working Student
- •
- Tutorials,
- Updates
Replacing SaaS ETL with Python dlt: A painless experience for Yummy.eu
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product
Portable, embeddable ETL - what if pipelines could run anywhere?
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product
The Second Data Warehouse, aka the "disaster recovery" project
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials,
- Updates
Shift Left Data Democracy: the link between democracy, governance, data contracts and data mesh.
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product
Yes code ELT: dlt make easy things easy, and hard things possible
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
What is so smart about smart dashboarding tools?
- Hiba Jamal,
Working Student
- •
- Tutorials,
- Updates,
- Product
dlt adds Reverse ETL - build a custom destination in minutes
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product
Coding data pipelines is faster than renting connector catalogs
- Matthaus Krzykowski,
Co-Founder & CEO
- •
- Tutorials
Moving away from Segment to a cost-effective do-it-yourself event streaming pipeline with Cloud Pub/Sub and dlt.
- Zaeem Athar,
Jr. Data Engineer
- •
- Tutorials,
- Product
Saving 75% of work for a Chargebee Custom Source via pipeline code generation with dlt
- Adrian Brudaru,
Co-Founder & CDO
- Violetta Mishechkina,
Solutions Engineer
- •
- Updates
PyAirbyte - what it is and what it’s not
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product
Single pane of glass for pipelines running on various orchestrators
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
API playground: Free APIs for personal data projects
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials,
- Updates
dlt & dbt in Semantic Modelling
- Hiba Jamal,
Working Student
- •
- Tutorials
Comparison running dbt-core and dlt-dbt runner on Google Cloud Functions
- Aman Gupta,
Jr. Data Engineer
- •
- Tutorials
The Modern Data Stack with dlt & Mode
- Hiba Jamal,
Working Student
- •
- Community,
- Tutorials
Streaming Pub/Sub JSON to Cloud SQL PostgreSQL on GCP
- William Laroche,
GCP cloud architect * Backend and data engineer
- •
- Community,
- Tutorials
Why Taktile runs dlt on AWS Lambda to process millions of daily tracking events
- Simon Bumm,
Data and Analytics Lead at Taktile
- •
- Tutorials,
- Updates
From Inbox to Insights: AI-enhanced email analysis with dlt and Kestra
- Anuun Chinbat,
Working Student
- •
- Tutorials
Exploring data replication of SAP HANA to Snowflake using dlt
- Rahul Joshi,
Jr. Solutions Engineer
- •
- Tutorials,
- Updates
Data Lineage using dlt and dbt.
- Zaeem Athar,
Jr. Data Engineer
- •
- Tutorials
Deploy Google Cloud Functions as webhooks to capture event-based data from GitHub, Slack, or Hubspot
- Aman Gupta,
Jr. Data Engineer
- •
- Product
Solving data ingestion for Python coders
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
Orchestrating unstructured data pipeline with Dagster and dlt.
- Zaeem Athar,
Jr. Data Engineer
- •
- Tutorials
Semantic Modeling Capabilities of Power BI, GoodData & Metabase: A Comparison
- Hiba Jamal,
Working Student
- •
- Community
Building resilient pipelines in minutes with dlt + Prefect
- •
- Tutorials
DLT & Deepnote in women's wellness and violence trends: A Visual Analysis
- Hiba Jamal,
Working Student
- •
- Engineering
Get 30x speedups when reading databases with ConnectorX + Arrow + dlt
- Marcin Rudolf,
Co-Founder & CTO
- •
- Product,
- Updates
Running dbt Cloud or core from python - use cases and simple solutions
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials,
- Updates
Your first data warehouse: A practical approach
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product
The role of docs in data products
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
PDF invoices → Real-time financial insights: How I stopped relying on an engineer to automate my workflow and learnt to do it myself
- Anna Hoffmann,
Co-Founder & COO
- •
- Tutorials
Modeling Unstructured Data for Self-Service Analytics with dlt and Holistics
- Zaeem Athar,
Jr. Data Engineer
- •
- Tutorials,
- Engineering
Talk to your Zendesk tickets with Weaviate’s Verba and dlt: A Step by Step Guide
- Anton Burnashev,
Senior Software Engineer
- •
- Product
How to write a data engineering CV for Europe and America - A hiring manager’s perspective
- Adrian Brudaru,
Co-Founder & CDO
- •
- Product,
- Tutorials
Dumpster diving for data: The MongoDB experience
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials,
- Product
The return of ETL in the Python age
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials,
- Updates
Trust your data! Column and row level lineages, an explainer and a recipe.
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
dlt-dbt-DuckDB-MotherDuck: My super simple and highly customizable approach to the Modern Data Stack in a box
- Rahul Joshi,
Jr. Solutions Engineer
- •
- Tutorials
dlt AI Assistant provides answers you need!
- Tong Chen,
Data Engineer Intern at dltHub
- •
- Product,
- Updates
dlt & openAPI code generation: A step beyond APIs and towards 10,000s of live datasets
- Matthaus Krzykowski,
Co-Founder & CEO
- •
- Tutorials
Hey GPT, tell me about dlthub!
- Tong Chen,
Data Engineer Intern at dltHub
- •
- Product
Automating the data engineer: Addressing the talent shortage
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
GPT-accelerated learning: Understanding open source codebases
- Tong Chen,
Data Engineer Intern at dltHub
- •
- Tutorials
Schema Evolution
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials,
- Updates
Using the Google Sheets `dlt` pipeline in analytics and ML workflows
- Rahul Joshi,
Jr. Solutions Engineer
- •
- Tutorials,
- Product
The structured data lake: How schema evolution enables the next generation of data platforms
- Adrian Brudaru,
Co-Founder & CDO
- •
- Tutorials
Using Google BigQuery and Metabase to understand product usage
- Rahul Joshi,
Jr. Solutions Engineer
- •
- Tutorials
Understanding how developers view ELT tools using the Hacker News API and GPT-4
- Rahul Joshi,
Jr. Solutions Engineer
- •
- Tutorials,
- Updates
Internal Dashboard for Google Analytics 4
- Rahul Joshi,
Jr. Solutions Engineer
- •
- Product
Is DuckDB a database for ducks?
- Matthaus Krzykowski,
Co-Founder & CEO
- •
- Product
As DuckDB crosses 1M downloads / month, what do its users do?
- Matthaus Krzykowski,
Co-Founder & CEO
- •
- Product,
- Updates
Who we serve
- Matthaus Krzykowski,
Co-Founder & CEO
- •
- Product
dltHub Mission
- Matthaus Krzykowski,
Co-Founder & CEO