
Release highlights: 1.12.3 & 1.13.0 & 1.14.1

Breaking change: Ibis dataset/relation functionality + new non-Ibis relation features

What changed

Accessing Ibis relations using dataset["table_name"] or dataset.table_name no longer works.

Migration required

To get an Ibis table relation, first request the table as an Ibis expression by setting the table type explicitly, then convert the expression back into a table relation:

# Previously: dataset["customers"] returned an Ibis relation; it now returns a regular relation
customers_expression = dataset.table("customers", table_type="ibis")  # Ibis table expression
customers_relation = dataset(customers_expression)  # convert the expression into a table relation

df = customers_relation.df()  # fetch as a DataFrame

Check out the updated docs and migration guide for full details and examples.
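
For a fuller picture, here is a minimal end-to-end sketch of the new flow. It assumes the ibis extra is installed and that a customers table with an id column was loaded earlier; the pipeline and dataset names are illustrative:

import dlt

# Hypothetical pipeline; assumes a prior run loaded a "customers" table
pipeline = dlt.pipeline(pipeline_name="shop", destination="duckdb", dataset_name="shop_data")
dataset = pipeline.dataset()

customers_expression = dataset.table("customers", table_type="ibis")  # Ibis table expression
filtered = customers_expression.filter(customers_expression.id > 100)  # compose the query in Ibis
df = dataset(filtered).df()  # convert back into a relation and fetch as a DataFrame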

Note

If you were using dataset without having Ibis installed, or without using any Ibis features, no changes are needed.


New: non-Ibis relation enhancements

You can now use a more SQL-like interface on regular relations:

customers = dataset.table("customers")

# Sort
customers.order_by("created_at").fetchall()

# Filter
customers.filter("id = 1").fetchall()

# Aggregation
customers.select("id").max().scalar()
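
If you want to try these methods end to end, here is a self-contained sketch that loads two rows into a local DuckDB database first; the pipeline, dataset, and table names are illustrative:

import dlt

# Load two sample rows into DuckDB (all names here are placeholders)
pipeline = dlt.pipeline(pipeline_name="relation_demo", destination="duckdb", dataset_name="shop")
pipeline.run(
    [
        {"id": 1, "created_at": "2024-01-01"},
        {"id": 2, "created_at": "2024-01-02"},
    ],
    table_name="customers",
)

customers = pipeline.dataset().table("customers")
print(customers.order_by("created_at").fetchall())  # sorted rows
print(customers.filter("id = 1").fetchall())  # filtered rows
print(customers.select("id").max().scalar())  # aggregated scalar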

Marimo playground: Try dlt in your browser

PR: #2832

Overview

We now host an interactive Marimo notebook playground on GitHub Pages:

🔗 https://dlt-hub.github.io/dlt/playground/

This lets you test dlt code directly in your browser via Pyodide/WASM; no installation is needed.

What's included

The first notebook walks through a simple example of loading a Python data structure into DuckDB using dlt. It's a quick and friendly way to understand how dlt pipelines work without setting up any environment.

Docs integration

We've also embedded the notebook directly into our docs:

Playground tutorial page →

This makes it easier to read and run the code side by side.



REST API: New paginator for header-based cursors

PR: #2798

Overview

The new header_cursor paginator enables pagination for APIs that return the next page cursor in the response headers, such as those using custom headers like NextPageToken.

Use case

Some REST APIs respond like this:

# response headers
Content-Type: application/json
NextPageToken: 123456

# response body (JSON)
[
    {"id": 1, "name": "item1"},
    {"id": 2, "name": "item2"}
]

To fetch the next page:

https://api.example.com/items?page=123456

Example configuration

{
    "path": "items",
    "paginator": {
        "type": "header_cursor",
        "cursor_key": "NextPageToken",  # name of the response header
        "cursor_param": "page",  # query parameter used to pass the token
    },
}

You can now extract cursors from headers just like from JSON paths in the response body.
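
As a sketch of how the paginator slots into a full source definition, assuming dlt's built-in rest_api source (the base URL and resource name are made up):

import dlt
from dlt.sources.rest_api import rest_api_source

# Hypothetical API; only the paginator section is specific to this feature
source = rest_api_source({
    "client": {"base_url": "https://api.example.com/"},
    "resources": [
        {
            "name": "items",
            "endpoint": {
                "path": "items",
                "paginator": {
                    "type": "header_cursor",
                    "cursor_key": "NextPageToken",  # response header to read the cursor from
                    "cursor_param": "page",  # query parameter to send it back in
                },
            },
        }
    ],
})

pipeline = dlt.pipeline(pipeline_name="header_cursor_demo", destination="duckdb")
pipeline.run(source)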


Autocompletion in Notebooks: Dataset and relation

PR: #2891

Overview

Working in notebooks just got smoother. We've added tab autocompletion for dlt.Dataset and dlt.Relation objects.

Now you can:

  • Use dataset.<TAB> to see available tables
  • Use relation.<TAB> to explore available columns

How it works

The feature overrides _ipython_key_completions_ to inspect dataset/relation structure and return available names.

Example

dataset  # <TAB> will suggest all table names
dataset.customers # <TAB> will suggest column names like id, email, created_at
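
To illustrate the mechanism with a simplified stand-in (this is not dlt's actual implementation), any object can opt into key completion like this:

class ToyDataset:
    """Simplified stand-in for dlt.Dataset to show the completion hook."""

    def __init__(self, tables):
        self._tables = tables  # e.g. {"customers": [...], "orders": [...]}

    def __getitem__(self, name):
        return self._tables[name]

    def _ipython_key_completions_(self):
        # IPython calls this to complete toy_dataset["<TAB>
        return list(self._tables)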

DuckLake setup guide for MotherDuck

PR: #2842

Overview

We've added a dedicated section in the docs for setting up DuckLake, a powerful way to persist MotherDuck databases on external storage like S3.

The guide includes:

  1. Creating a DuckLake-managed database on S3
  2. Storing S3 credentials securely in MotherDuck
  3. Connecting dlt to DuckLake via secrets.toml (sketched below)
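
A minimal secrets.toml sketch for step 3; the database name and token are placeholders, see the linked guide for the full setup:

[destination.motherduck.credentials]
database = "my_ducklake_db"  # the DuckLake-managed database created in MotherDuck
password = "md_service_token"  # your MotherDuck service token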

Why use DuckLake?

  • Offload MotherDuck data to S3
  • Control over storage location and retention
  • Load data via dlt just as with any other DuckDB destination

DuckLake setup in docs →


Athena + Iceberg + Lake Formation tags and table properties

PR: #2808

Overview

You can now apply Lake Formation tags to the Glue database used by the Athena destination. All tables created by dlt will inherit these tags.

Example configuration

[destination.athena.lakeformation_config]
enabled = true

[destination.athena.lakeformation_config.tags]
Environment = "prod"
Team = "analytics"

Add post-create SQL statements

PR: #2791

You can now run custom SQL statements after tables are created or altered during schema migration. This is useful for adding foreign keys, constraints, and similar post-creation DDL.

Example usage

Override _get_table_post_update_sql in your SqlJobClientBase subclass:

def _get_table_post_update_sql(self, partial_table):
    # return extra DDL for specific tables; runs after schema migration
    if partial_table["name"] == "orders":
        return [
            "ALTER TABLE orders ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id)"
        ]
    return []

These statements run only after all tables have been updated, so they can safely reference other tables (for example, the foreign key above depends on both orders and users existing).


CLI: Friendly drop command 🐕‍🦺

PR: #2720

Overview

The dlt drop command now includes smarter validation and warnings to avoid accidental misconfigurations.

Warning types

  1. Run directory validation

    You should run this from the same directory as the pipeline script...
  2. Misaligned config context detection

    WARNING: Active pipeline `my_pipeline` used `/home/user/project/.dlt`...
  3. Pipeline script vs. run dir mismatch

    WARNING: Your run dir (/home/user/tmp) is different from the pipeline script...
  4. Credential visibility notice

    WARNING: When accessing data from the command line, dlt does not execute your pipeline code...
  5. Table drop preview

    Lists the tables that will be deleted before performing the drop.


Shout-out to new contributors

Big thanks to our newest contributors:

  • @AyushPatel101 (#2803)
  • @nicob3y (#2755)
  • @kaliole (#2849)
  • @franloza (#2869)

Full release notes

View the complete list of changes →
