
Release highlights: 1.15

⚠️ Breaking changes: .gz extensions now added to compressed files

PR: #2835

Starting with v1.15, all compressed files will include the .gz extension by default. This applies to:

  • Filesystem destinations
  • Internal working directories
  • Staging locations (feeding downstream destinations like Redshift, BigQuery)

Notes

  • Does not apply to .parquet files
  • Existing filesystem destinations will keep the old behavior (no .gz) for backward compatibility
  • Compressed files uploaded to staging destinations will now have the .gz extension, even if dlt is configured to keep data in the staging location.
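
Relatedly, if you prefer files without compression (and therefore no .gz suffix at all), the pre-existing data writer option disables compression entirely:

[normalize.data_writer]
# No compression means no .gz extension is added
disable_compression = true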

More details → File Compression in Filesystem Destination


Export schemas in DBML format

PR: #2929

You can now export your pipeline schema in DBML format, ready for visualization in DBML frontends.

# Generate a string that can be rendered in a DBML frontend
dbml_str = pipeline.default_schema.to_dbml()

This includes:

  • Data and dlt tables
  • Table/column metadata
  • User-defined/root-child/parent-child references
  • Grouping by resources
  • and more

[Schema diagram: example DBML rendering of a pipeline schema]
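
A minimal end-to-end sketch (pipeline, resource, and file names are illustrative; assumes the duckdb extra is installed):

import dlt

@dlt.resource
def users():
    yield {"id": 1, "name": "alice"}

pipeline = dlt.pipeline(pipeline_name="dbml_demo", destination="duckdb")
pipeline.run(users())

# Save the DBML string for rendering in a DBML frontend such as dbdiagram.io
with open("schema.dbml", "w", encoding="utf-8") as f:
    f.write(pipeline.default_schema.to_dbml())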


REST API Client Enhancements

1. has_more flag in paginators

PR: #2817

Some APIs provide a has_more boolean instead of a total count or offset limit. You can now configure paginators to stop on that flag.

from dlt.sources.helpers.rest_client.paginators import OffsetPaginator, PageNumberPaginator

# Paginator starts at offset=0 and increments by `limit`
# Stops automatically when response["has_more"] == False
offset_paginator = OffsetPaginator(
    offset=0,
    limit=2,
    has_more_path="has_more",  # JSONPath to the boolean flag
)

# Paginator starts at page=1 and advances the page number by 1
# Stops automatically when response["has_more"] == False
pn_paginator = PageNumberPaginator(
    page=1,
    has_more_path="has_more",  # JSONPath to the boolean in the response
)
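
For illustration, a sketch of using such a paginator with dlt's REST API client (base URL and endpoint are made up):

from dlt.sources.helpers.rest_client import RESTClient
from dlt.sources.helpers.rest_client.paginators import OffsetPaginator

client = RESTClient(
    base_url="https://api.example.com",
    paginator=OffsetPaginator(offset=0, limit=100, has_more_path="has_more"),
)

# Pages are fetched until the API reports has_more == False
for page in client.paginate("/items"):
    print(page)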

2. JSON body pagination support

PR: #2917

Pagination parameters can now be placed in the request body (e.g., for POST requests).

from dlt.sources.helpers.rest_client.paginators import OffsetPaginator, PageNumberPaginator

# We tell dlt to place pagination values INSIDE the request JSON body
offset_paginator = OffsetPaginator(
    offset=0,
    limit=3,
    offset_body_path="offset",  # put 'offset' in request.json["offset"]
    limit_body_path="limit",  # put 'limit' in request.json["limit"]
)

pn_paginator = PageNumberPaginator(
    page=1,
    page_body_path="page",  # put 'page' in request.json["page"]
)
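
A sketch of combining body-path pagination with a POST endpoint (URL, endpoint, and request body are made up):

from dlt.sources.helpers.rest_client import RESTClient
from dlt.sources.helpers.rest_client.paginators import OffsetPaginator

client = RESTClient(
    base_url="https://api.example.com",
    paginator=OffsetPaginator(
        offset=0,
        limit=100,
        offset_body_path="offset",
        limit_body_path="limit",
    ),
)

# Pagination values are merged into the JSON body of each POST request
for page in client.paginate("/search", method="POST", json={"query": "events"}):
    print(page)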

Databricks + Unity Catalog enhancements

PR: #2674

New support for comments, tags, and constraints in Databricks Unity Catalog tables.

  • Add table comments and tags
  • Add column comments and tags
  • Apply primary/foreign key constraints

To create the key constraints, enable create_indexes in your destination config:

[destination.databricks]
# Add PRIMARY KEY and FOREIGN KEY constraints to tables
create_indexes = true

Use the adapter for column- and table-level hints:

from dlt.destinations.adapters import databricks_adapter

databricks_adapter(
    my_resource,
    table_comment="Event data",
    table_tags=["pii", {"cost_center": "12345"}],
    column_hints={
        "user_id": {
            "column_comment": "User ID",
            "column_tags": ["pii"],
        }
    },
)
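
A minimal end-to-end sketch (resource, hints, and pipeline name are illustrative; assumes Databricks credentials are configured):

import dlt
from dlt.destinations.adapters import databricks_adapter

@dlt.resource(primary_key="user_id")
def events():
    yield {"user_id": 1, "event": "signup"}

# Attach the Unity Catalog hints, then load as usual
pipeline = dlt.pipeline(pipeline_name="events_demo", destination="databricks")
pipeline.run(databricks_adapter(events, table_comment="Event data", table_tags=["pii"]))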

SQLAlchemy destination: MSSQL + Trino support

PR: #2951

  • MSSQL fully supported (including JSON fields)
  • Trino partially supported (with limitations: no merge/SCD2, JSON cast to string)
  • Fixes for ClickHouse temp tables and BigQuery numeric handling
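
A sketch of pointing the sqlalchemy destination at MSSQL (the connection string is illustrative and requires pyodbc plus an ODBC driver):

import dlt

pipeline = dlt.pipeline(
    pipeline_name="mssql_demo",
    destination=dlt.destinations.sqlalchemy(
        "mssql+pyodbc://user:password@host/mydb?driver=ODBC+Driver+18+for+SQL+Server"
    ),
)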

Example of customizing a type mapper for Trino →


Delta Lake: control over streamed execution

PR: #2961

By default, dlt runs Delta Lake upserts in streamed mode.

  • Streamed execution = the source table is read row by row, which reduces memory pressure and works well for large datasets.
  • But in streamed mode, table statistics and pruning predicates are not used — meaning Delta can’t skip irrelevant partitions efficiently.

New option: deltalake_streamed_exec

Now you can explicitly choose whether to run Delta merge operations in streamed mode or not.

  • true (default): safe for very large datasets, lower memory usage.
  • false: enables Delta to use table statistics and pruning (better performance when partitions and stats exist).

Example configuration:

[destination.filesystem]
deltalake_streamed_exec = false
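
For context, a sketch of the kind of Delta merge this setting affects (names are illustrative; assumes a configured filesystem bucket_url):

import dlt

@dlt.resource(
    primary_key="id",
    write_disposition={"disposition": "merge", "strategy": "upsert"},
    table_format="delta",
)
def items():
    yield {"id": 1, "value": "a"}

# The upsert merge below runs in streamed mode unless deltalake_streamed_exec = false
pipeline = dlt.pipeline(pipeline_name="delta_demo", destination="filesystem")
pipeline.run(items())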

Python 3.14 support

PR: #2789

Python 3.14 is currently in beta 4. We now provide experimental support for it to make it easier for projects that depend on dlt to prepare for the upgrade.


Switched ConnectorX backend from arrow2 to arrow

PR: #2933

ConnectorX v0.4.2 dropped the arrow2 backend, which was dlt's default.

dlt now defaults to the arrow backend, includes minor adjustments for behavioral differences, and relaxes the ConnectorX version constraints.
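
For example, the backend is selected as before via the sql_database source (connection string illustrative):

import dlt
from dlt.sources.sql_database import sql_database

# ConnectorX now reads through the arrow backend under the hood
source = sql_database(
    "postgresql://user:password@host/db",
    backend="connectorx",
)

pipeline = dlt.pipeline(pipeline_name="cx_demo", destination="duckdb")
pipeline.run(source)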


AI Command IDE integration: now works with all major IDEs (Cursor, Copilot, Windsurf, etc.)

PR: #2937

The AI command now supports all major IDEs, automatically creating the correct rule files for each AI editor/agent. This ensures seamless integration whether you’re using Cursor, Copilot, Windsurf, Claude, or other supported editors.


Shout-out to new contributors

Big thanks to our newest contributors!


Full release notes

View the complete list of changes →

