Release highlights: 1.12.3 & 1.13.0 & 1.14.1
Breaking change: Ibis dataset/relation functionality + new non-Ibis relation features
What changed
Accessing Ibis relations using dataset["table_name"] or dataset.table_name no longer works.
Migration required
To get an Ibis table relation, first request the table expression explicitly by setting table_type="ibis", then convert it into a table relation:
# Previously: dataset["customers"] # now returns a regular relation
customers_expression = dataset.table("customers", table_type="ibis")
customers_relation = dataset(customers_expression)
df = customers_relation.df() # fetch as DataFrame
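This two-step flow is what lets you transform the table with Ibis before materializing it. A minimal sketch, assuming the customers table has a numeric id column:
# narrow the Ibis expression before materializing it as a relation
filtered_expression = customers_expression.filter(customers_expression.id > 100)
df = dataset(filtered_expression).df()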
Check out the updated docs and migration guide for full details and examples.
If you were using dataset without having Ibis installed, or without using any Ibis features, no changes are needed.
New: non-Ibis relation enhancements
You can now use a more SQL-like interface on regular relations:
customers = dataset.table("customers")
# Sort
customers.order_by("created_at").fetchall()
# Filter
customers.filter("id = 1").fetchall()
# Aggregation
customers.select("id").max().scalar()
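Assuming these methods chain like the relation operations above, you can combine them before fetching; a quick sketch (the filter value is made up):
# filter, sort, then fetch the result as a DataFrame
recent = customers.filter("created_at >= '2024-01-01'").order_by("created_at").df()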
Marimo playground: Try dlt in your browser
PR: #2832
Overview
We now host an interactive Marimo notebook playground on GitHub Pages:
https://dlt-hub.github.io/dlt/playground/
This lets you test dlt code directly in your browser using Pyodide/WASM; no installation needed.
What's included
The first notebook walks through a simple example of loading a Python data structure into DuckDB using dlt. It's a quick and friendly way to understand how dlt pipelines work without setting up any environment.
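For reference, the kind of snippet the notebook walks through looks roughly like this (names are illustrative, not the notebook's exact code):
import dlt

# load a plain Python data structure into a local DuckDB database
data = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]

pipeline = dlt.pipeline(
    pipeline_name="playground_demo",
    destination="duckdb",
    dataset_name="demo_data",
)
load_info = pipeline.run(data, table_name="people")
print(load_info)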
Docs integration
We've also embedded the notebook directly into our docs, which makes it easier to read and run the code side by side.
REST API: New paginator for header-based cursors
PR: #2798
Overview
The new header_cursor paginator enables pagination for APIs that return the next-page cursor in the response headers, such as those using custom headers like NextPageToken.
Use case
Some REST APIs respond like this:
# response headers
Content-Type: application/json
NextPageToken: 123456
# response JSON
[
{"id": 1, "name": "item1"},
{"id": 2, "name": "item2"}
]
To fetch the next page:
https://api.example.com/items?page=123456
Example configuration
{
"path": "items",
"paginator": {
"type": "header_cursor",
"cursor_key": "NextPageToken", # name of the response header
"cursor_param": "page" # query parameter used to pass the token
}
}
You can now extract cursors from headers just like from JSON paths in the response body.
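For context, here is a minimal sketch of this paginator inside a full rest_api source configuration; the base URL and resource name are placeholders:
from dlt.sources.rest_api import rest_api_source

source = rest_api_source({
    "client": {"base_url": "https://api.example.com"},
    "resources": [
        {
            "name": "items",
            "endpoint": {
                "path": "items",
                "paginator": {
                    "type": "header_cursor",
                    "cursor_key": "NextPageToken",  # response header carrying the cursor
                    "cursor_param": "page",  # query parameter the cursor is sent back in
                },
            },
        }
    ],
})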
Autocompletion in notebooks: Dataset and Relation
PR: #2891
Overview
Working in notebooks just got smoother. We've added tab autocompletion for dlt.Dataset and dlt.Relation objects.
Now you can:
- Use dataset.<TAB> to see available tables
- Use relation.<TAB> to explore available columns
How it works
The feature overrides _ipython_key_completions_ to inspect the dataset/relation structure and return the available names.
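As a rough illustration of the mechanism (a toy class, not dlt's actual implementation), IPython invokes this hook to complete keys inside square brackets:
class CompletingDataset:
    """Toy object demonstrating the IPython key-completion hook."""

    def __init__(self, table_names):
        self._table_names = table_names

    def __getitem__(self, name):
        ...  # look up and return the table relation

    def _ipython_key_completions_(self):
        # IPython calls this when you hit TAB inside dataset["..."]
        return list(self._table_names)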
Example
dataset # <TAB> will suggest all table names
dataset.customers # <TAB> will suggest column names like id, email, created_at
DuckLake setup guide for MotherDuck
PR: #2842
Overview
We've added a dedicated section in the docs for setting up DuckLake, a powerful way to persist MotherDuck databases on external storage like S3.
The guide includes:
- Creating a DuckLake-managed database on S3
- Storing S3 credentials securely in MotherDuck
- Connecting dlt to DuckLake via secrets.toml
Why use DuckLake?
- Offload MotherDuck data to S3
- Control over storage location and retention
- Load data via dlt as with any other DuckDB destination
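A sketch of what the secrets.toml side might look like, assuming a DuckLake-managed database named my_ducklake and a MotherDuck service token (see the guide for the exact settings):
[destination.motherduck.credentials]
database = "my_ducklake"  # the DuckLake-managed database
password = "<motherduck service token>"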
Athena + Iceberg + Lake Formation tags and table properties
PR: #2808
Overview
You can now apply Lake Formation tags to the Glue database used by the Athena destination. All tables created by dlt will inherit these tags.
Example configuration
[destination.athena.lakeformation_config]
enabled = true
[destination.athena.lakeformation_config.tags]
Environment = "prod"
Team = "analytics"
Add post-create SQL statements
PR: #2791
You can now run custom SQL statements after tables are created or altered during schema migration. Useful for foreign keys, constraints, etc.
Example usage
Override _get_table_post_update_sql in your SqlJobClientBase subclass:
def _get_table_post_update_sql(self, partial_table):
    # return extra SQL to run after the table is created or altered
    if partial_table["name"] == "orders":
        return [
            "ALTER TABLE orders ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id)"
        ]
    return []
These run after all tables are updated, allowing dependency-safe logic.
CLI: Friendly drop command
PR: #2720
Overview
The dlt drop command now includes smarter validation and warnings to avoid accidental misconfigurations.
Warning types
- Run directory validation
  You should run this from the same directory as the pipeline script...
- Misaligned config context detection
  WARNING: Active pipeline `my_pipeline` used `/home/user/project/.dlt`...
- Pipeline script vs. run dir mismatch
  WARNING: Your run dir (/home/user/tmp) is different from the pipeline script...
- Credential visibility notice
  WARNING: When accessing data from the command line, dlt does not execute your pipeline code...
- Table drop preview
  Lists the tables that will be deleted before performing the drop.
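A typical invocation, with placeholder pipeline and resource names:
# drop the selected resource's tables and state from the destination
dlt pipeline my_pipeline drop customers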
Shout-out to new contributors
Big thanks to our newest contributors:
- @AyushPatel101 (#2803)
- @nicob3y (#2755)
- @kaliole (#2849)
- @franloza (#2869)
Full release notes