Microsoft SQL Server
Install dlt with MS SQLโ
To install the dlt library with MS SQL dependencies, use:
pip install "dlt[mssql]"
Setup guideโ
Prerequisitesโ
The Microsoft ODBC Driver for SQL Server must be installed to use this destination.
This cannot be included with dlt
's Python dependencies, so you must install it separately on your system. You can find the official installation instructions here.
Supported driver versions:
ODBC Driver 18 for SQL Server
ODBC Driver 17 for SQL Server
You can also configure the driver name explicitly.
Create a pipelineโ
1. Initialize a project with a pipeline that loads to MS SQL by running:
dlt init chess mssql
2. Install the necessary dependencies for MS SQL by running:
pip install -r requirements.txt
or run:
pip install "dlt[mssql]"
This will install dlt
with the mssql
extra, which contains all the dependencies required by the SQL server client.
3. Enter your credentials into .dlt/secrets.toml
.
For example, replace with your database connection info:
[destination.mssql.credentials]
database = "dlt_data"
username = "loader"
password = "<password>"
host = "loader.database.windows.net"
port = 1433
connect_timeout = 15
[destination.mssql.credentials.query]
# trust self-signed SSL certificates
TrustServerCertificate="yes"
# require SSL connection
Encrypt="yes"
# send large string as VARCHAR, not legacy TEXT
LongAsMax="yes"
You can also pass a SQLAlchemy-like database connection:
# Keep it at the top of your TOML file, before any section starts
destination.mssql.credentials="mssql://loader:<password>@loader.database.windows.net/dlt_data?TrustServerCertificate=yes&Encrypt=yes&LongAsMax=yes"
You can place any ODBC-specific settings into the query string or destination.mssql.credentials.query TOML table as in the example above.
To connect to an mssql
server using Windows authentication, include trusted_connection=yes
in the connection string.
destination.mssql.credentials="mssql://loader.database.windows.net/dlt_data?trusted_connection=yes"
To connect to a local SQL server instance running without SSL, pass the encrypt=no
parameter:
destination.mssql.credentials="mssql://loader:loader@localhost/dlt_data?encrypt=no"
To allow a self-signed SSL certificate when you are getting certificate verify failed: unable to get local issuer certificate
:
destination.mssql.credentials="mssql://loader:loader@localhost/dlt_data?TrustServerCertificate=yes"
To use long strings (>8k) and avoid collation errors:
destination.mssql.credentials="mssql://loader:loader@localhost/dlt_data?LongAsMax=yes"
To pass credentials directly, use the explicit instance of the destination
pipeline = dlt.pipeline(
pipeline_name='chess',
destination=dlt.destinations.mssql("mssql://loader:<password>@loader.database.windows.net/dlt_data?connect_timeout=15"),
dataset_name='chess_data')
Write dispositionโ
All write dispositions are supported.
If you set the replace
strategy to staging-optimized
, the destination tables will be dropped and
recreated with an ALTER SCHEMA ... TRANSFER
. The operation is atomic: MSSQL supports DDL transactions.
Data loadingโ
Data is loaded via INSERT statements by default. MSSQL has a limit of 1000 rows per INSERT, and this is what we use.
Supported file formatsโ
- insert-values is used by default
Supported column hintsโ
mssql will create unique indexes for all columns with unique
hints. This behavior may be disabled.
Table and column identifiersโ
SQL Server with the default collation uses case-insensitive identifiers but will preserve the casing of identifiers that are stored in the INFORMATION SCHEMA. You can use case-sensitive naming conventions to keep the identifier casing. Note that you risk generating identifier collisions, which are detected by dlt
and will fail the load process.
If you change the SQL Server server/database collation to case-sensitive, this will also affect the identifiers. Configure your destination as below in order to use case-sensitive naming conventions without collisions:
[destination.mssql]
has_case_sensitive_identifiers=true
Syncing of dlt
stateโ
This destination fully supports dlt state sync.
Data typesโ
MS SQL does not support JSON columns, so JSON objects are stored as strings in nvarchar
columns.
Additional destination optionsโ
The mssql destination does not create UNIQUE indexes by default on columns with the unique
hint (i.e., _dlt_id
). To enable this behavior:
[destination.mssql]
create_indexes=true
You can explicitly set the ODBC driver name:
[destination.mssql.credentials]
driver="ODBC Driver 18 for SQL Server"
When using a SQLAlchemy connection string, replace spaces with +
:
# Keep it at the top of your TOML file, before any section starts
destination.mssql.credentials="mssql://loader:<password>@loader.database.windows.net/dlt_data?driver=ODBC+Driver+18+for+SQL+Server"
dbt supportโ
This destination integrates with dbt via dbt-snowflake.
Additional Setup guidesโ
- Load data from Star Trek to Microsoft SQL Server in python with dlt
- Load data from Zendesk to Microsoft SQL Server in python with dlt
- Load data from Keap to Microsoft SQL Server in python with dlt
- Load data from Klarna to Microsoft SQL Server in python with dlt
- Load data from Google Analytics to Microsoft SQL Server in python with dlt
- Load data from GitLab to Microsoft SQL Server in python with dlt
- Load data from Jira to Microsoft SQL Server in python with dlt
- Load data from HubSpot to Microsoft SQL Server in python with dlt
- Load data from GitHub to Microsoft SQL Server in python with dlt
- Load data from IBM Db2 to Microsoft SQL Server in python with dlt