Destinations

dlt supports four destinations (more coming soon)

Learn how to set up each of the supported destinations below.

Google BigQuery

1. Initialize a project with a pipeline that loads to BigQuery by running

dlt init chess bigquery

2. Install the necessary dependencies for BigQuery by running

pip install -r requirements.txt

3. Log in to or create a Google Cloud account

Sign up for or log in to the Google Cloud Platform in your web browser.

4. Create a new Google Cloud project

After arriving at the Google Cloud console welcome page, click the project selector in the top left, click the New Project button, name the project whatever you would like, and then click the Create button.

5. Create a service account and grant BigQuery permissions

You will then need to create a service account. After clicking the Go to Create service account button on the linked docs page, select the project you just created and name the service account whatever you would like.

Click the Continue button and grant the following roles, so that dlt can create schemas and load data:

  • BigQuery Data Editor
  • BigQuery Job User
  • BigQuery Read Session User

You don't need to grant users access to this service account at this time, so click the Done button.

6. Download the service account JSON

On the service accounts page that you are redirected to after clicking Done, click the three dots under the Actions column for the service account you just created and select Manage keys.

This will take you to a page where you can click the Add key button, then the Create new key button, and finally the Create button, keeping the preselected JSON option.

A JSON file that includes your service account private key will then be downloaded.

7. Update your dlt credentials file with your service account info

Open your dlt credentials file:

open .dlt/secrets.toml

Replace the project_id, private_key, and client_email with the values from the downloaded JSON file:

[destination.bigquery.credentials]
location = "US"
project_id = "project_id" # please set me up!
private_key = "private_key" # please set me up!
client_email = "client_email" # please set me up!
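
If you want to double-check which values go where, the short sketch below prints the three fields from the downloaded key file (the file name is hypothetical; point it at wherever your browser saved the key):

import json

# hypothetical file name; use the path of the key file you just downloaded
with open("dlt-demo-123456-abcdef123456.json") as f:
    key = json.load(f)

print("project_id =", key["project_id"])
print("client_email =", key["client_email"])
print("private_key =", key["private_key"])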

Postgres

1. Initialize a project with a pipeline that loads to Postgres by running

dlt init chess postgres

2. Install the necessary dependencies for Postgres by running

pip install -r requirements.txt

3. After setting up a Postgres instance and psql / query editor, create a new database by running

CREATE DATABASE dlt_data;

Add the dlt_data database to .dlt/secrets.toml.

4. Create a new user by running

CREATE USER loader WITH PASSWORD '<password>';

Add the loader user and <password> password to .dlt/secrets.toml.

5. Give the loader user owner permissions by running

ALTER DATABASE dlt_data OWNER TO loader;

It is possible to set more restrictive permissions (e.g. give the user access only to a specific schema).
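
For example, a hypothetical, more locked-down setup could look like the following. The chess_data schema name is an assumption borrowed from the examples further down; the schema is created up front so that loader only needs rights inside it, and the exact privileges dlt requires can vary with your setup and dlt version:

-- create the target schema up front, then only grant loader rights inside it
CREATE SCHEMA chess_data;
GRANT CONNECT ON DATABASE dlt_data TO loader;
GRANT USAGE, CREATE ON SCHEMA chess_data TO loader;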

6. Your .dlt/secrets.toml should now look like

[destination.postgres.credentials]
database = "dlt_data"
username = "loader"
password = "<password>" # replace with your password
host = "localhost" # or the IP address location of your database
port = 5432
connect_timeout = 15
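
Depending on your dlt version, the same credentials can usually also be provided as a single connection string in .dlt/secrets.toml; the key and URL form below are an assumption modeled on the DuckDB example further down, so check the dlt configuration docs for your version:

destination.postgres.credentials="postgresql://loader:<password>@localhost:5432/dlt_data"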

Amazon Redshift

1. Initialize a project with a pipeline that loads to Redshift by running

dlt init chess redshift

2. Install the necessary dependencies for Redshift by running

pip install -r requirements.txt

3. Edit the dlt credentials file with your Redshift connection info

open .dlt/secrets.toml
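
The exact fields depend on the template that dlt init generates, but the Redshift credentials usually mirror the Postgres example above. The section below is a sketch to check against your generated secrets.toml; all values are placeholders (5439 is the default Redshift port):

[destination.redshift.credentials]
database = "dlt_data"
username = "loader"
password = "<password>" # replace with your password
host = "<cluster_endpoint>" # the endpoint of your Redshift cluster
port = 5439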

DuckDB

1. Initialize a project with a pipeline that loads to DuckDB by running

dlt init chess duckdb

2. Install the necessary dependencies for DuckDB by running

pip install -r requirements.txt

3. Run the pipeline

python3 chess.py

Destination Configuration

By default, a DuckDB database will be created in the current working directory with the name <pipeline_name>.duckdb (chess.duckdb in the example above). After loading, it is available in read/write mode via with pipeline.sql_client() as con:, which is a wrapper over a DuckDBPyConnection. See the duckdb docs for details.
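
As a minimal sketch, assuming the pipeline has already been run (e.g. with python3 chess.py) and that the client exposes an execute_sql helper, which may differ between dlt versions:

import dlt

# attach to the existing chess pipeline and inspect what was loaded
pipeline = dlt.pipeline(pipeline_name='chess', destination='duckdb', dataset_name='chess_data')
with pipeline.sql_client() as con:
    # execute_sql is an assumption; the client wraps a DuckDBPyConnection
    tables = con.execute_sql(
        "SELECT table_name FROM information_schema.tables WHERE table_schema = 'chess_data'"
    )
    print(tables)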

The duckdb credentials do not require any secret values. You are free to pass the configuration explicitly via the credentials parameter to the dlt.pipeline or pipeline.run methods. For example:

# will load data to files/data.db database file
p = dlt.pipeline(pipeline_name='chess', destination='duckdb', dataset_name='chess_data', full_refresh=False, credentials="files/data.db")

# will load data to /var/local/database.duckdb
p = dlt.pipeline(pipeline_name='chess', destination='duckdb', dataset_name='chess_data', full_refresh=False, credentials="/var/local/database.duckdb")

The destination accepts a duckdb connection instance via credentials, so you can also open a database connection yourself and pass it to dlt to use. :memory: databases are supported.

import duckdb

# duckdb.connect() with no arguments opens an in-memory (:memory:) database
db = duckdb.connect()
p = dlt.pipeline(pipeline_name='chess', destination='duckdb', dataset_name='chess_data', full_refresh=False, credentials=db)

This destination accepts database connection strings in the format used by duckdb-engine.

You can configure a DuckDB destination with secret / config values (e.g. using a secrets.toml file):

destination.duckdb.credentials=duckdb:///_storage/test_quack.duckdb