Lightweight Python code to move data
We focus on the needs & constraints of Python-first data platform teams: how to write any data source, achieve data democracy, modernise legacy systems and reduce cloud costs.
OPEN SOURCE
Pip install dlt and go
With over 500k downloads per month, dlt is the most popular Python library for ETL. You can add dlt to your Python scripts to load data from varied and often messy data sources into well-structured, live datasets. Unlike non-Python solutions, dlt needs no backends or containers. We do not replace your data platform, deployments, or security models. Simply import dlt in a Python file or a Jupyter Notebook cell. You can load data from any source that produces Python data structures, including APIs, files, databases, and more.
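As a rough illustration of "import dlt and go", the sketch below loads a generator of plain Python dicts into a local database. The function names, the sample records, the pipeline and dataset names, and the choice of DuckDB as the destination are all assumptions made for this example; treat it as a minimal sketch rather than a recommended setup.

```python
from typing import Any, Dict, Iterator


def fetch_users() -> Iterator[Dict[str, Any]]:
    """Hypothetical source: yields plain dicts, as an API client or file reader might."""
    yield {"id": 1, "name": "Ada", "address": {"city": "London"}}
    yield {"id": 2, "name": "Grace", "address": {"city": "New York"}}


def load_users() -> None:
    """Load the records with dlt. Assumes dlt and the duckdb extra are installed."""
    # Imported inside the function so the data-producing part runs without dlt.
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="users_demo",   # hypothetical name
        destination="duckdb",         # assumed destination for a local test
        dataset_name="demo_data",     # hypothetical dataset name
    )
    # run() accepts any iterable of Python data structures.
    load_info = pipeline.run(fetch_users(), table_name="users")
    print(load_info)
```

Calling load_users() from a script or a notebook cell should be enough to run the load, provided dlt was installed with DuckDB support (for example, pip install "dlt[duckdb]").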
OPEN SOURCE
Access any data you want in Python
Today it is easier to pip install dlt and write a custom pipeline than to set up and configure a traditional ETL platform. In June '24 we crossed a total of 5,000 custom dlt sources created by the community since we launched dlt in summer '23. Because dlt is code, we continue to automate engineering work and pass on productivity gains to organisations using dlt. Our new REST API Source toolkit is a short, declarative, configuration-driven way of creating sources. dlt-init-openapi is a new tool that generates pipeline code from any OpenAPI spec.
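To give a feel for the declarative style of the REST API Source toolkit, here is a hedged sketch of a configuration for an imaginary API. The base URL, the resource names, and the query parameter are invented for illustration, and the import path shown is the one used by recent dlt releases; older releases shipped the toolkit as a separately initialised source, so check the path against your installed version.

```python
from typing import Any, Dict


def build_rest_config() -> Dict[str, Any]:
    """Declarative description of a hypothetical REST API with two endpoints."""
    return {
        "client": {"base_url": "https://api.example.com/v1/"},  # placeholder URL
        "resources": [
            # A bare string names a resource whose endpoint path matches its name.
            "customers",
            # A dict lets you set the path and query parameters explicitly.
            {
                "name": "orders",
                "endpoint": {"path": "orders", "params": {"status": "open"}},
            },
        ],
    }


def load_from_rest_api() -> None:
    """Turn the config into a source and load it. Assumes dlt with duckdb installed."""
    import dlt
    # Assumed import path; it may differ in older dlt versions.
    from dlt.sources.rest_api import rest_api_source

    source = rest_api_source(build_rest_config())
    pipeline = dlt.pipeline(
        pipeline_name="rest_demo",    # hypothetical name
        destination="duckdb",         # assumed destination
        dataset_name="rest_data",     # hypothetical dataset name
    )
    print(pipeline.run(source))
```

Keeping the configuration in its own function means it can be inspected or unit-tested without making any network calls.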
FOR DATA PLATFORM TEAMS
Transform your data stack with dlt’s data platform-in-a-box
dltHub’s building blocks for data platform teams enable you to accelerate your data stack modernization goals.
Democratize your data
Securely achieve data democracy, so that your data platform team is no longer a bottleneck to accessing data.
Modernize legacy systems
The data platform-in-a-box enables you to upgrade and standardize ingestion in your legacy data stack step by step.
Reduce cloud costs
Run dlt anywhere with dlt’s advanced orchestrator integrations and specialized runners. No more unexpected cloud bills.