- 4,000+ GitHub stars
- 5,000+ Community members
- 120+ Contributors
- 2M+ Downloads per month
OPEN SOURCE
data load tool (dlt): load data anywhere
dlt (data load tool) is an open source Python library that loads data from often messy data sources into well-structured, live datasets. It automates all your tedious data engineering tasks, with features like schema inference, data normalization and incremental loading.
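As a taste of how little code this takes, here is a minimal sketch that loads a list of Python dicts into DuckDB; the pipeline, dataset, and table names are illustrative placeholders:

```python
# Minimal dlt pipeline sketch: load Python dicts into DuckDB.
# All names below (quickstart, mydata, users) are illustrative.
import dlt

data = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]

# dlt infers the schema, normalizes the data, and creates the table.
pipeline = dlt.pipeline(
    pipeline_name="quickstart",
    destination="duckdb",
    dataset_name="mydata",
)
load_info = pipeline.run(data, table_name="users")
print(load_info)
```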
Run anywhere
Run it wherever Python runs: on Airflow, in serverless functions, in notebooks. No external APIs, backends, or containers required; it scales on micro and large infrastructure alike.
Automated maintenance
With schema inference, schema evolution, and alerts, plus short declarative code, maintenance stays simple.
Declarative
A user-friendly, declarative interface that removes knowledge obstacles for beginners while empowering senior professionals.
Fully customizable
Customize our verified data sources, or any part of the code to suit your needs.

Verified Sources & Destinations
Our verified sources are the simplest way to get started with building your stack. Choose from our 60+ fully customizable pre-built sources, such as any SQL database, Google Sheets, Salesforce, and more.
With our numerous destinations, you can load data into a local database, a warehouse, or a data lake. Choose from Snowflake, Databricks, and more.
Build custom sources
If dlt’s verified sources don’t fit your needs, you can build your own custom source with the REST API source whenever an API is available. Thanks to its declarative configuration, you’ll save a lot of time that would otherwise go into writing custom code. If no API is available, you can build a custom source from scratch in Python. A sketch of the declarative style follows below.
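Here is a minimal sketch of such a declarative REST API source; the base URL and resource names are hypothetical placeholders, not a real API:

```python
# Sketch of a declarative REST API source. The base URL and resource
# names are hypothetical placeholders.
import dlt
from dlt.sources.rest_api import rest_api_source

source = rest_api_source({
    "client": {
        "base_url": "https://api.example.com/v1/",
    },
    "resources": [
        "posts",     # loaded into the `posts` table
        "comments",  # loaded into the `comments` table
    ],
})

pipeline = dlt.pipeline(pipeline_name="rest_demo", destination="duckdb")
pipeline.run(source)
```

Because the source is a plain configuration dict, endpoints, pagination, and authentication can be added or changed without rewriting extraction code.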
We at Untitled Data Company create new data pipelines for our customers all the time, and we now use dlt's new REST API toolkit in consulting projects. The toolkit lets us build data pipelines in very little time and with very little code. This is great for us as well as for our customers: anyone on their end who knows Python can easily maintain the pipelines.

- Willi Müller
- Co-Founder at Untitled Data Company

Sync your databases
Sync database tables from any of 100+ database engines to warehouses, vector databases, files, or custom reverse ETL functions. Benefit from schema inference and evolution, incremental loading, deduplication, SCD2 materializations, and more. Achieve the highest performance with the PyArrow and ConnectorX extraction engines. Simply specify the connection string and the destination you want to sync the data to, and dlt (data load tool) takes care of the rest, as in the sketch below.
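A minimal sketch of a database sync, assuming the sql_database source that ships with dlt; the connection string, table names, and destination are placeholders:

```python
# Sketch of syncing database tables with dlt's sql_database source.
# The connection string and table names below are placeholders.
import dlt
from dlt.sources.sql_database import sql_database

# Reflects the schema from the database; pick the tables to sync.
source = sql_database(
    "postgresql://user:password@localhost:5432/shop",
    table_names=["orders", "customers"],
)

pipeline = dlt.pipeline(pipeline_name="db_sync", destination="snowflake")
pipeline.run(source)
```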
Sync your files
Use dlt (data load tool) to retrieve any files you have stored in S3, Azure, GCS, and other buckets. Parse CSV, Parquet, JSON, PDF, XLS, and other formats efficiently. Process your data on the fly with features such as schema inference and incremental loading, and compose freely with machine learning libraries.
Do the same at the destination end: pick a file format such as Parquet and a storage layout, or use table formats like Delta Lake or Iceberg to easily create your own data lakes.
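A sketch of loading CSV files from a bucket, assuming dlt's filesystem source; the bucket URL, glob, and resource name are placeholders:

```python
# Sketch of reading CSV files from a bucket with dlt's filesystem source.
# The bucket URL and glob pattern are placeholders.
import dlt
from dlt.sources.filesystem import filesystem, read_csv

# List matching files, then parse them as CSV on the fly.
files = filesystem(bucket_url="s3://my-bucket/data/", file_glob="*.csv")
reader = (files | read_csv()).with_name("events")

pipeline = dlt.pipeline(pipeline_name="files_demo", destination="duckdb")
pipeline.run(reader)
```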


OpenAPI toolkit
Pull data from any API with an OpenAPI spec without writing any code. The OpenAPI toolkit generates dlt pipeline code to load the data into any destination of your choice.
What they're saying
#dlt might be the next and only tool you need for data loading.
I have been working with #dlt for about a year now, basically since its very early stage, and I have only positive things to say about it.
While it may not solve every loading problem in data engineering, it solves the most common use cases. I guarantee there is no other tool on the market that solves your integration challenges the way dlt does.

- Don Bosco van Hoi
- Co-Founder / Owner @ Mothership GmbH
Every time a new requirement comes up, I am quite amazed that the dlthub team has thought about these scenarios and built in possibilities for custom logic to be implemented where required. After all, it’s just Python code, and as long as we feed the data to dlt in a way it can work with, it can run pretty much anything. Kudos, dlthub.com!

- Arjun Anandkumar
- Senior Data Platform Engineer @ Norlys