Skip to main content
Version: devel

Microsoft SQL Server

Install dlt with MS SQLโ€‹

To install the dlt library with MS SQL dependencies, use:

pip install "dlt[mssql]"

Setup guideโ€‹

Prerequisitesโ€‹

The Microsoft ODBC Driver for SQL Server must be installed to use this destination. This cannot be included with dlt's Python dependencies, so you must install it separately on your system. You can find the official installation instructions here.

Supported driver versions:

  • ODBC Driver 18 for SQL Server
  • ODBC Driver 17 for SQL Server

You can also configure the driver name explicitly.

Create a pipelineโ€‹

1. Initialize a project with a pipeline that loads to MS SQL by running:

dlt init chess mssql

2. Install the necessary dependencies for MS SQL by running:

pip install -r requirements.txt

or run:

pip install "dlt[mssql]"

This will install dlt with the mssql extra, which contains all the dependencies required by the SQL server client.

3. Enter your credentials into .dlt/secrets.toml.

For example, replace with your database connection info:

[destination.mssql.credentials]
database = "dlt_data"
username = "loader"
password = "<password>"
host = "loader.database.windows.net"
port = 1433
connect_timeout = 15
[destination.mssql.credentials.query]
# trust self-signed SSL certificates
TrustServerCertificate="yes"
# require SSL connection
Encrypt="yes"
# send large string as VARCHAR, not legacy TEXT
LongAsMax="yes"

You can also pass a SQLAlchemy-like database connection:

# Keep it at the top of your TOML file, before any section starts
destination.mssql.credentials="mssql://loader:<password>@loader.database.windows.net/dlt_data?TrustServerCertificate=yes&Encrypt=yes&LongAsMax=yes"

You can place any ODBC-specific settings into the query string or destination.mssql.credentials.query TOML table as in the example above.

To connect to an mssql server using Windows authentication, include trusted_connection=yes in the connection string.

destination.mssql.credentials="mssql://loader.database.windows.net/dlt_data?trusted_connection=yes"
note

If you encounter missing credentials errors when using Windows authentication, set the 'username' and 'password' as empty strings in the TOML file.

To connect to a local SQL server instance running without SSL, pass the encrypt=no parameter:

destination.mssql.credentials="mssql://loader:loader@localhost/dlt_data?encrypt=no"

To allow a self-signed SSL certificate when you are getting certificate verify failed: unable to get local issuer certificate:

destination.mssql.credentials="mssql://loader:loader@localhost/dlt_data?TrustServerCertificate=yes"

To use long strings (>8k) and avoid collation errors:

destination.mssql.credentials="mssql://loader:loader@localhost/dlt_data?LongAsMax=yes"

To pass credentials directly, use the explicit instance of the destination

pipeline = dlt.pipeline(
pipeline_name='chess',
destination=dlt.destinations.mssql("mssql://loader:<password>@loader.database.windows.net/dlt_data?connect_timeout=15"),
dataset_name='chess_data')

Write dispositionโ€‹

All write dispositions are supported.

If you set the replace strategy to staging-optimized, the destination tables will be dropped and recreated with an ALTER SCHEMA ... TRANSFER. The operation is atomic: MSSQL supports DDL transactions.

Data loadingโ€‹

Data is loaded via INSERT statements by default. MSSQL has a limit of 1000 rows per INSERT, and this is what we use.

Supported file formatsโ€‹

Supported column hintsโ€‹

mssql will create unique indexes for all columns with unique hints. This behavior may be disabled.

Table and column identifiersโ€‹

SQL Server with the default collation uses case-insensitive identifiers but will preserve the casing of identifiers that are stored in the INFORMATION SCHEMA. You can use case-sensitive naming conventions to keep the identifier casing. Note that you risk generating identifier collisions, which are detected by dlt and will fail the load process.

If you change the SQL Server server/database collation to case-sensitive, this will also affect the identifiers. Configure your destination as below in order to use case-sensitive naming conventions without collisions:

[destination.mssql]
has_case_sensitive_identifiers=true

Syncing of dlt stateโ€‹

This destination fully supports dlt state sync.

Data typesโ€‹

MS SQL does not support JSON columns, so JSON objects are stored as strings in nvarchar columns.

Additional destination optionsโ€‹

The mssql destination does not create UNIQUE indexes by default on columns with the unique hint (i.e., _dlt_id). To enable this behavior:

[destination.mssql]
create_indexes=true

You can explicitly set the ODBC driver name:

[destination.mssql.credentials]
driver="ODBC Driver 18 for SQL Server"

When using a SQLAlchemy connection string, replace spaces with +:

# Keep it at the top of your TOML file, before any section starts
destination.mssql.credentials="mssql://loader:<password>@loader.database.windows.net/dlt_data?driver=ODBC+Driver+18+for+SQL+Server"

dbt supportโ€‹

This destination integrates with dbt via dbt-snowflake.

Additional Setup guidesโ€‹

This demo works on codespaces. Codespaces is a development environment available for free to anyone with a Github account. You'll be asked to fork the demo repository and from there the README guides you with further steps.
The demo uses the Continue VSCode extension.

Off to codespaces!

DHelp

Ask a question

Welcome to "Codex Central", your next-gen help center, driven by OpenAI's GPT-4 model. It's more than just a forum or a FAQ hub โ€“ it's a dynamic knowledge base where coders can find AI-assisted solutions to their pressing problems. With GPT-4's powerful comprehension and predictive abilities, Codex Central provides instantaneous issue resolution, insightful debugging, and personalized guidance. Get your code running smoothly with the unparalleled support at Codex Central - coding help reimagined with AI prowess.