
3.7x Faster EL Pipelines: Arrow + ADBC vs. SQLAlchemy

  • Aman Gupta,
    Data Engineer

Intro

Moving 5 million rows of partially nested JSON with SQLAlchemy takes long enough to grab a coffee because row-by-row serialization is slow. So what happens if we use Arrow end-to-end?

TL;DR
The test: SQLAlchemy (row-based, no ADBC) vs. Arrow + ADBC (columnar ingestion).
The result: a 3.7x speedup for EL pipelines using the Arrow + ADBC path.

5M Rows from DuckDB to MySQL in Under 100 Seconds

In our benchmarks, switching to the Arrow + ADBC path reduced the 344-second baseline to 92 seconds. Users have documented SQLAlchemy's performance bottlenecks for years. For instance, GitHub #9436 describes hitting this architectural wall when writing to SQL Server, where driver and protocol overhead dominate. The same wall explains why switching from Pandas to Polars doesn't speed up read and write operations (r/dataengineering).

To skip these bottlenecks, we further improved dlt’s Arrow path and added ADBC support for MySQL, SQL Server, and SQLite in our December release.

The comparison: row vs. columnar

To see the difference, we tested two paths:

  • SQLAlchemy: Standard row-based (no ADBC).
  • Arrow + ADBC: Direct columnar ingestion.
Path         | Total Time | Normalize | Load | Speedup
SQLAlchemy   | 344s       | 167s      | 132s | 1x
Arrow + ADBC | 92s        | 0s        | 89s  | 3.7x
Hardware specs: MBP 14” (M4 Pro · 14c CPU · 16GB RAM)

Our experiment - the benchmark

We ran a side-by-side test moving 5M rows with a mix of flat and nested JSON data.

  • Source: DuckDB
  • Destination: MySQL
  • Matchup: SQLAlchemy (row-by-row) vs. Arrow + ADBC (columnar)

The traditional path: SQLAlchemy

Here’s the overhead: the moment we call to_pylist(), the Arrow data is inflated into millions of Python dicts, and that is where the row-by-row serialization cost comes from.

import dlt
import duckdb

@dlt.resource(name="large_table")
def row_by_row_source():
    with duckdb.connect("source_data_5m.duckdb", read_only=True) as conn:
        query_result = conn.execute("SELECT * FROM large_table")
        # Data is inflated into Python objects for the row-by-row path
        yield from (batch.to_pylist() for batch in query_result.fetch_record_batch(50_000))

pipeline = dlt.pipeline(
    pipeline_name="columnar_shift_benchmark_mysql",
    destination="mysql",
    dataset_name="benchmark_db"
)

# We enabled 8 parallel workers to give the row-based path every advantage.
pipeline.run(row_by_row_source(), write_disposition="replace")
Results: SQLAlchemy (average over 5 runs)

Stage     | Time
Extract   | 45.05s
Normalize | 166.75s
Load      | 132.39s
Total     | 344.21s (~5.7 min)

The fast lane: Arrow + ADBC

By streaming Arrow batches directly, we eliminate the Python object handling cost. The batches are passed through “as-is”, preserving the columnar format and letting ADBC bulk load Parquet directly into the destination.

import dlt
import duckdb

@dlt.resource(name="large_table")
def fast_lane_source():
    with duckdb.connect("source_data_5m.duckdb", read_only=True) as conn:
        result = conn.execute("SELECT * FROM large_table")
        yield from result.fetch_record_batch(50_000)

pipeline = dlt.pipeline(
    pipeline_name="benchmark_arrow_mysql",
    destination="mysql",
    dataset_name="benchmark_db_arrow",
)

pipeline.run(
    fast_lane_source(),
    loader_file_format="parquet",
    write_disposition="replace",
)
Results: Arrow + ADBC (average over 5 runs)

Stage     | Time
Extract   | 2.91s
Normalize | 0.01s
Load      | 89.32s
Total     | 92.36s (~1.54 min)

What the runs tell us

Extraction: 45s -> ~3s

By keeping the data in Arrow buffers, we bypass Python object instantiation and serialization overhead. We’ve shifted the bottleneck away from the CPU, allowing the data to flow at the native speed of the source engine.
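
To see this gap for yourself, here is a rough micro-benchmark (illustrative only, not part of our benchmark setup) that compares materializing an Arrow RecordBatch into Python dicts against handing the batch along untouched:

# Illustrative micro-benchmark: Arrow pass-through vs. Python dict materialization
import time

import pyarrow as pa

batch = pa.RecordBatch.from_pydict(
    {"id": list(range(500_000)), "value": ["x"] * 500_000}
)

start = time.perf_counter()
rows = batch.to_pylist()  # materializes 500k Python dicts, one per row
print(f"to_pylist: {time.perf_counter() - start:.3f}s for {len(rows)} dicts")

start = time.perf_counter()
passthrough = batch  # the Arrow buffers are handed along as-is, no per-row work
print(f"pass-through: {time.perf_counter() - start:.6f}s")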

Normalization: 167s -> ~10ms

Normalization is almost instantaneous because it is now a metadata-only operation. dlt aligns the schema using Arrow metadata instead of iterating through objects to infer types.
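
As a small illustration (not dlt internals), the type information already travels with the batch, so there is nothing left to infer row by row:

# Illustrative only: the schema is carried as Arrow metadata
import pyarrow as pa

batch = pa.RecordBatch.from_pydict(
    {"id": [1, 2], "payload": ['{"a": 1}', '{"b": 2}']}
)
print(batch.schema)
# id: int64
# payload: string
# Column names and types are known up front, so schema alignment is a
# metadata lookup rather than a pass over millions of rows.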

Load: 132s -> 89s

We replace row-by-row inserts with ADBC bulk loading. This moves memory buffers directly into the database engine, bypassing the overhead of traditional communication protocols.
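
dlt drives this path internally, but the underlying mechanism is easy to see with a standalone ADBC driver. Here is a minimal sketch using adbc_driver_sqlite (one of the destinations mentioned above); the table name and data are made up for illustration:

# Minimal ADBC bulk-ingest sketch (illustrative, not dlt's internal code)
import pyarrow as pa
import adbc_driver_sqlite.dbapi as adbc_sqlite

arrow_table = pa.table({"id": [1, 2, 3], "value": ["a", "b", "c"]})

with adbc_sqlite.connect() as conn:  # in-memory SQLite database
    with conn.cursor() as cur:
        # One bulk call hands the Arrow buffers to the database,
        # instead of millions of parameterized INSERT statements.
        cur.adbc_ingest("demo_table", arrow_table, mode="create")
        cur.execute("SELECT COUNT(*) FROM demo_table")
        print(cur.fetchone())  # (3,)
    conn.commit()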

Net result: ~344s down to ~92s, which is a 3.7x speedup end-to-end.

The big win

The real win is architectural simplicity. By shortening the path from source to destination, we’ve reduced the number of moving parts, which means a smaller failure surface and simpler state management. And by doing away with the overhead of transforming columnar data into dict-shaped structures, you not only improve performance but also save on compute.

Why Arrow

Arrow allows us to move structured data without the memory bloat and CPU tax of serialization.

By using its in-memory columnar format, we stop wasting cycles deserializing, re-typing, and re-serializing millions of discrete objects. The bottleneck shifts from the CPU to the underlying transport. Removing the overhead of row-based structures means lower compute costs and higher throughput.
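
The memory side of that bloat is easy to approximate (rough illustration with made-up data; sys.getsizeof counts only the dict shells, not the values inside them):

# Rough illustration of the memory overhead of dict-shaped rows
import sys

import pyarrow as pa

batch = pa.RecordBatch.from_pydict(
    {"id": list(range(100_000)), "value": ["x" * 8] * 100_000}
)
print(f"Arrow buffers: {batch.nbytes / 1e6:.1f} MB")

rows = batch.to_pylist()
dict_shells = sum(sys.getsizeof(r) for r in rows)  # container overhead only
print(f"Python dict shells alone: {dict_shells / 1e6:.1f} MB")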

Why dlt

With dlt, Arrow stops being a special case and becomes the default path. You get the performance of a columnar architecture along with the full toolkit for building loading pipelines: a cleaner design with fewer moving parts and simpler pipeline maintenance.

Try it yourself

Arrow pipelines aren’t experimental; they are dlt’s fast lane. Try it yourself by setting loader_file_format="parquet" on a supported destination.

Resources: