Lightweight Python code to move data
We focus on the needs & constraints of Python-first data platform teams: how to write any data source, achieve data democracy, modernise legacy systems and reduce cloud costs.
Trusted By
3M+ PyPI Downloads
5,000+ OSS companies in production
500+ Snowflake customers in production
OPEN SOURCE
pip install dlt and go
dlt (data load tool) is the most popular production-ready open source Python library for moving data. It loads data from various and often messy data sources into well-structured, live datasets.
Unlike non-Python solutions, the dlt library needs no backends or containers. We do not replace your data platform, deployments, or security models. Simply import it in your favorite AI code editor, or add it to your Jupyter Notebook. You can load data from any source that produces Python data structures, including APIs, files, databases, and more.
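import dlt
from dlt.sources.filesystem import filesystem

# List the CSV files in the bucket; each item describes one matching file
resource = filesystem(
    bucket_url="s3://example-bucket",
    file_glob="*.csv",
)

# Configure a pipeline that writes to a local DuckDB database
pipeline = dlt.pipeline(
    pipeline_name="filesystem_example",
    destination="duckdb",
    dataset_name="filesystem_data",
)

# Extract, normalize, and load the resource in one call
pipeline.run(resource)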
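The same pattern applies to data that already lives in Python. Below is a minimal sketch, assuming a hypothetical `users()` generator that stands in for an API client. The sample records, the pipeline name, the dataset name, and the `load_users()` helper are illustrative and not part of dlt itself.

```python
from typing import Any, Dict, Iterator

# Hypothetical sample records standing in for an API response
SAMPLE_USERS = [
    {"id": 1, "name": "Ada", "tags": ["admin", "dev"]},
    {"id": 2, "name": "Grace", "tags": []},
]


def users() -> Iterator[Dict[str, Any]]:
    """Yield one plain dict per record, as an API client might."""
    for row in SAMPLE_USERS:
        yield row


def load_users():
    """Load the generator's rows into a local DuckDB database with dlt."""
    # Imported here so users() stays usable without dlt installed
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="python_data_example",  # illustrative name
        destination="duckdb",
        dataset_name="example_data",  # illustrative name
    )
    # table_name names the target table for rows from a plain generator
    return pipeline.run(users(), table_name="users")
```

Calling `load_users()` requires dlt with DuckDB support installed, for example via `pip install "dlt[duckdb]"`.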
INITIAL DLTHUB WORKFLOW
Made for LLMs: Data source to Live Reports in Python
In July we released the initial dltHub workflow that lets developers build dlt pipelines and reports with LLMs.
Developers now create tens of thousands of dlt sources each month with an AI code editor of their choice, such as Cursor, Claude, Codex or Continue.
We already support more than 5,000 sources, and see a clear path toward hundreds of thousands. We continue to invest in the dlt workspace, a dedicated environment for creating, debugging, and maintaining dlt pipelines in production, all in one streamlined flow designed for individual developers.
DLTHUB VISION
From Open Source EL to Data Infrastructure That Feels Like Python
dlt makes extracting and loading data simple and Pythonic. With dltHub, we’re taking the next step - extending into ELT, storage, and runtime.
dltHub transforms complex data workflows into something any Python developer can run: deploy pipelines, transformations, and notebooks.
We’re building dltHub in close collaboration with users in highly regulated industries like finance and healthcare - where governance, security, and compliance (like BCBS 239 for risk reporting) are non-negotiable. dltHub brings those guarantees while preserving Pythonic simplicity, complete data lineage, observability, and quality control - all in a platform that feels as natural as writing code.
Our goal is to make dltHub available to individual developers, small teams, and enterprises alike. The first release - dltHub for individual developers - is coming in Q1 2026.
The current machine learning revolution has been enabled by the Cambrian explosion of Python open-source tools that have become so accessible that a wide range of practitioners can use them. As a simple-to-use Python library, dlt is the first tool that this new wave of people can use. By leveraging this library, we can extend the machine learning revolution into enterprise data.

Python and machine learning under security constraints are key to our success. We found that our cloud ETL provider could not meet our needs. dlt is a lightweight yet powerful open source tool we can run together with Snowflake. Our event streaming and batch data loading perform at scale and at low cost. Now anyone who knows Python can self-serve to fulfil their data needs.



