Getting started

What is dlt?
dlt is an open-source Python library that loads data from various, often messy data sources into well-structured datasets. It provides lightweight Python interfaces to extract, load, inspect, and transform the data. dlt and the dlt docs are built from the ground up to be used with LLMs: the LLM-native workflow takes you from pipeline code to data in a notebook for over 5,000 sources.
dlt is designed to be easy to use, flexible, and scalable:
- dlt extracts data from REST APIs, SQL databases, cloud storage, Python data structures, and many more sources.
- dlt infers schemas and data types, normalizes the data, and handles nested data structures.
- dlt supports a variety of popular destinations and has an interface to add custom destinations to create reverse ETL pipelines.
- dlt automates pipeline maintenance with incremental loading, schema evolution, and schema and data contracts.
- dlt supports data access and transformations in Python and SQL, as well as pipeline inspection and data visualization in marimo notebooks.
- dlt can be deployed anywhere Python runs, be it on Airflow, serverless functions, or any other cloud deployment of your choice.
To get started with dlt, install the library using pip (use a clean virtual environment for your experiments!):
pip install dlt
Tip: If you'd like to try out dlt without installing it on your machine, check out the Google Colab demo or use our simple marimo/wasm-based playground on this docs page.