dltHub

Lightweight Python code to move data

We focus on the needs and constraints of Python-first data platform teams: how to write any data source, achieve data democracy, modernize legacy systems, and reduce cloud costs.

OPEN SOURCE

Pip install dlt and go

With over 500k downloads per month, dlt is the most popular Python library for ETL. You can add dlt to your Python scripts to load data from various, often messy, data sources into well-structured, live datasets. Unlike non-Python solutions, dlt needs no backends or containers, and it does not replace your data platform, deployments, or security models. Simply import dlt in a Python file or a Jupyter Notebook cell. You can load data from any source that produces Python data structures, including APIs, files, databases, and more.

pip install dlt
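
Once dlt is installed, a first pipeline can be very small. The sketch below is illustrative rather than official: `fetch_users`, the pipeline name, and the dataset name are hypothetical, and the DuckDB destination is an assumption that requires the `duckdb` extra (`pip install "dlt[duckdb]"`). It shows the pattern described above: any iterable of Python dicts can be handed to a pipeline.

```python
from typing import Any, Dict, Iterator


def fetch_users() -> Iterator[Dict[str, Any]]:
    """Hypothetical source: yields plain Python dicts, including nested fields."""
    yield {"id": 1, "name": "Ada", "tags": ["admin", "dev"]}
    yield {"id": 2, "name": "Grace", "address": {"city": "Berlin"}}


def load_users():
    """Load the records into a local DuckDB database and return dlt's load info."""
    # Imported inside the function so fetch_users() can be reused without dlt.
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="users_demo",  # illustrative name
        destination="duckdb",        # assumed destination for this sketch
        dataset_name="raw_users",    # illustrative name
    )
    # dlt infers a schema from the records and writes them to the "users" table.
    return pipeline.run(fetch_users(), table_name="users")
```

Calling `load_users()` from a script or a notebook cell runs the load; swapping `fetch_users` for a generator that reads an API, a file, or a database cursor keeps the rest unchanged.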

"The current machine learning revolution has been enabled by the Cambrian explosion of Python open-source tools that have become so accessible that a wide range of practitioners can use them. As a simple-to-use Python library, dlt is the first tool that this new wave of people can use. By leveraging this library, we can extend the machine learning revolution into enterprise data."

Julien Chaumond
CTO/Co-Founder at Hugging Face

"Python and machine learning under security constraints are key to our success. We found that our cloud ETL provider could not meet our needs. dlt is a lightweight yet powerful open source tool we can run together with Snowflake. Our event streaming and batch data loading performs at scale and low cost. Now anyone who knows Python can self-serve to fulfil their data needs."

Maximilian Eber
CPTO & Co-Founder at Taktile
OPEN SOURCE

Access any data you want in Python

Today it is easier to pip install dlt and write a custom pipeline than to set up and configure a traditional ETL platform. In June '24 we crossed 5,000 total custom dlt sources created by the community since we launched dlt in summer '23. Because dlt is code, we continue to automate engineering work and pass on productivity gains to organizations using dlt. Our new REST API Source toolkit is a short, declarative, configuration-driven way of creating sources. dlt-init-openapi is a new tool that generates pipeline code from any OpenAPI spec.
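
To give a feel for the declarative style, here is a minimal sketch of a REST API source configuration. It rests on several assumptions: the base URL, resource names, and query parameter are hypothetical, the destination is DuckDB, and the import path `dlt.sources.rest_api` is the one used in recent dlt releases (older versions shipped the toolkit as a separately initialized verified source, so check the docs for your version).

```python
from typing import Any, Dict


def build_config(base_url: str) -> Dict[str, Any]:
    """Describe a hypothetical API as data: a client plus a list of resources."""
    return {
        "client": {"base_url": base_url},
        # Defaults applied to every resource unless overridden.
        "resource_defaults": {"primary_key": "id", "write_disposition": "merge"},
        "resources": [
            "customers",  # shorthand: resource name doubles as the endpoint path
            {
                "name": "orders",
                "endpoint": {"path": "orders", "params": {"status": "open"}},
            },
        ],
    }


def load_from_api(base_url: str):
    """Turn the configuration into a dlt source and load it (assumes DuckDB)."""
    import dlt
    from dlt.sources.rest_api import rest_api_source  # assumed import path

    source = rest_api_source(build_config(base_url))
    pipeline = dlt.pipeline(
        pipeline_name="rest_api_demo",  # illustrative name
        destination="duckdb",
        dataset_name="api_data",        # illustrative name
    )
    return pipeline.run(source)
```

Because the source is only a dictionary, adding an endpoint usually means adding one more entry to `resources` rather than writing new request code.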

"We at Untitled Data Company create new data pipelines for our customers all the time. We are now using dlt's new REST API toolkit in consulting projects. The toolkit allows us to build data pipelines in very little time and very little code. This proves to be great for us as well as our customers as they can easily maintain the data pipelines by anyone who knows Python on their end."

Willi Müller
Co-Founder at Untitled Data Company
FOR DATA PLATFORM TEAMS

Transform your data stack with dlt’s data platform-in-a-box


dltHub’s building blocks for data platform teams enable you to accelerate your data stack modernization goals.

Democratize your data

Securely achieve data democracy, so that your data platform team is no longer a bottleneck to accessing data.

Modernize legacy systems

The data platform-in-a-box enables you to upgrade and standardize ingestion in your legacy data stack step by step.

Reduce cloud costs

Run dlt anywhere with dlt’s advanced orchestrator integrations and specialized runners. No more unexpected cloud bills.

"dlt has enabled me to completely rewrite all of our core SaaS service pipelines in 2 weeks and have data pipelines in production with full confidence. We also achieved data democracy for our data platform. Our product, business, and operation teams can independently satisfy a majority of their data needs through no-code self-service. The teams built multi-touch attribution for how Harness acquires customers, and models for how Harness customers utilize licenses. If the teams want to build anything else to push the company forward, they don't need to wait for permission or data access to do it."

Alex Butler
Senior Data Engineer at Harness

Get started building