Version: 1.29.1 (latest)

Overview

Key features

Separation of secrets and configs from code - The main role of the configuration system is to keep sensitive information out of your source code.
Built-in credentials - dlt provides built-in support for most common systems with default/machine credential access.
Auto-generated configurations - For functions decorated with @dlt.source, @dlt.resource, and @dlt.destination, dlt automatically generates appropriate configuration specs so they behave like built-in configs and credentials.
Comprehensive configurability - Nearly all aspects of dlt are configurable, including pipelines, normalizers, loaders, and logging, allowing you to change behavior without modifying code. This capability enables performance optimization and other adjustments at runtime.

dlt retrieves configuration and secrets from several locations, such as environment variables, dedicated files, or secure vaults. It understands both simple and verbose layouts of configuration sections. You can use one of the built-in credentials for popular external systems. Functions decorated with @dlt.source, @dlt.resource, or @dlt.destination can be configured without writing additional code: dlt automatically injects missing arguments (like passwords or API keys) when you call them.
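To make the idea of injecting missing arguments concrete, here is a minimal toy sketch in plain Python. It is not dlt's implementation: the INJECT sentinel (standing in for dlt.secrets.value) and the with_config decorator are hypothetical names invented for this illustration, and the values come from a flat dictionary rather than from real config providers.

```python
import inspect
from functools import wraps

# Hypothetical sentinel standing in for dlt.secrets.value / dlt.config.value.
INJECT = object()


def with_config(config):
    """Fill arguments whose default is INJECT from a flat dict of values."""

    def decorator(func):
        signature = inspect.signature(func)

        @wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind_partial(*args, **kwargs)
            for name, param in signature.parameters.items():
                # Only touch arguments the caller did not pass explicitly.
                if name not in bound.arguments and param.default is INJECT:
                    if name not in config:
                        raise KeyError(f"missing configuration value: {name}")
                    bound.arguments[name] = config[name]
            return func(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


@with_config({"api_key": "some_value"})
def notion_databases(database_ids=None, api_key=INJECT):
    return {"database_ids": database_ids, "api_key": api_key}


# api_key is filled in because the caller did not pass it.
result = notion_databases(database_ids=["a"])
```

Arguments passed explicitly always win in this sketch; only the arguments marked with the sentinel and left out of the call are looked up.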

Choose where to store configuration

dlt looks for configuration and secrets in various locations (environment variables, TOML files, or secure vaults) through config providers that are queried when your pipeline runs. You can pick a single location or combine them: for example, define the secret api_key in an environment variable and api_url in a TOML file. Providers are queried in the following order:

Environment Variables: If a value is found in an environment variable, dlt uses it and doesn't check lower-priority providers.
secrets.toml and config.toml files: These files store configuration values and secrets. secrets.toml contains sensitive information, while config.toml holds non-sensitive configuration.
Vaults: Credentials stored in secure vaults like Google Secrets Manager, Azure Key Vault, or AWS Secrets Manager. Airflow Variables are also included here.
Custom Providers added with register_provider: These are custom implementations you can create to use your own configuration formats or perform specialized preprocessing.
Default Argument Values: The values specified in the function signature.
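The priority order above can be pictured as a list of lookups where the first provider that has a value wins. The sketch below is a simplified model in plain Python, not dlt's provider API: the first_match function and the dictionaries standing in for providers are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional, Sequence, Tuple

Provider = Tuple[str, Mapping[str, Any]]


def first_match(providers: Sequence[Provider], key: str) -> Optional[Tuple[str, Any]]:
    """Return (provider_name, value) from the first provider that has the key."""
    for name, values in providers:
        if key in values:
            return name, values[key]
    return None


# Providers listed from highest to lowest priority (values are placeholders).
providers = [
    ("Environment Variables", {"api_key": "from_env"}),
    ("secrets.toml and config.toml", {"api_key": "from_toml", "api_url": "from_toml"}),
    ("Vaults", {}),
    ("Custom Providers", {}),
    ("Default Argument Values", {"page_size": 100}),
]

api_key = first_match(providers, "api_key")      # found in the environment
api_url = first_match(providers, "api_url")      # falls through to the TOML files
page_size = first_match(providers, "page_size")  # falls back to the default
```

Because the lookup stops at the first hit, the api_key value in the TOML files is never consulted in this example.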

tip

Make sure your pipeline name contains only alphanumeric characters, hyphens (-), and underscores (_). Avoid whitespace and other punctuation to ensure compatibility with all configuration providers.
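A quick way to check a name against this rule is a regular expression. This is a minimal sketch that assumes "alphanumeric" means ASCII letters and digits; is_safe_pipeline_name is a hypothetical helper, not a dlt function.

```python
import re

# ASCII letters, digits, hyphens, and underscores only (an assumption of this sketch).
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_pipeline_name(name: str) -> bool:
    """Return True if the name avoids whitespace and other punctuation."""
    return _SAFE_NAME.fullmatch(name) is not None


ok = is_safe_pipeline_name("chess_games")
not_ok = is_safe_pipeline_name("chess games!")
```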

Select a configuration layout

You can define configuration in different ways depending on your project's complexity. For a simple pipeline with a single source and destination, your configuration can be straightforward:

Simplest source configuration:

TOML config provider:

api_key="some_value"

Environment variables:

export API_KEY="some_value"

For a destination, you typically need to configure credentials, which group multiple related keys together. dlt places these under a credentials section.

TOML config provider:

[credentials]
user="dlthub"
password="some_value"

Environment variables:

export CREDENTIALS__USER="dlthub"
export CREDENTIALS__PASSWORD="some_value"

Recommended section layout

When using multiple sources with potentially conflicting argument names, or multiple destinations that need separate credentials, you can organize your config keys into sections. The recommended section layout below is the one most often used in this documentation, and it is also generated by the dlt init command.

Use sources and destination top-level sections to separate their configurations
Use the Python module name where the source function is defined to separate configuration of sources defined in different modules
Use destination type to separate destinations

TOML config provider:

# source defined in notion.py
[sources.notion]
api_key="some_value"

Environment variables:

export SOURCES__NOTION__API_KEY="some_value"

Destination:

TOML config provider:

# use postgres destination
[destination.postgres.credentials]
user="dlthub"
password="some_value"

Environment variables:

export DESTINATION__POSTGRES__CREDENTIALS__USER="dlthub"
export DESTINATION__POSTGRES__CREDENTIALS__PASSWORD="some_value"

Refer to the Add credentials guide for more examples and tips on how to configure a particular source or destination.

How dlt looks for values

dlt starts looking for a particular value using all possible sections. If the value is not found, it eliminates the rightmost section and tries again.

For example, if the source function is in module notion.py:

# module: notion.py

@dlt.source
def notion_databases(api_key: str = dlt.secrets.value):
    pass

dlt will search for the following keys in this order:

sources.notion.notion_databases.api_key
sources.notion.api_key
sources.api_key
api_key

When a source's section differs from its name (e.g., after using .clone(name="my_db", section="my_db_module")), dlt also checks a compact path using just the source name:

sources.my_db_module.my_db.api_key (full path)
sources.my_db_module.api_key
sources.my_db.api_key (compact path)
sources.api_key
api_key

Both the full path and the section path take precedence over the compact path. This lets you use simple, readable keys like sources.my_db.credentials instead of the longer sources.my_db_module.my_db.credentials.

The same applies to destination credentials. In that case, the credentials section is treated as a required grouping and is never eliminated:

destination.postgres.credentials.password
destination.credentials.password
credentials.password
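The rightmost-section elimination described above can be sketched as a small generator of candidate keys. This is an illustrative model only: lookup_keys is a hypothetical helper, and it does not model the compact path that applies when a source's section differs from its name.

```python
from typing import List, Sequence


def lookup_keys(sections: Sequence[str], key: str, required: Sequence[str] = ()) -> List[str]:
    """List candidate keys, most specific first.

    `sections` are dropped one at a time from the right; `required`
    groupings (such as "credentials") are always kept before the key.
    """
    candidates = []
    remaining = list(sections)
    while True:
        candidates.append(".".join([*remaining, *required, key]))
        if not remaining:
            return candidates
        remaining.pop()  # eliminate the rightmost section and try again


source_keys = lookup_keys(["sources", "notion", "notion_databases"], "api_key")
destination_keys = lookup_keys(["destination", "postgres"], "password", required=["credentials"])
```

With these inputs, source_keys and destination_keys reproduce the two lists shown above, in the same order.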

tip

For more detailed information about configuration organization, see configuration and secrets structure.

tip

You can use the pipeline name to create separate configurations for each pipeline in your project. Configuration values are searched first with the pipeline name prefix, then without it:

[pipeline_name_1.sources.google_sheets.credentials]
client_email = "<client_email_1>"
private_key = "<private_key_1>"
project_id = "<project_id_1>"

[pipeline_name_2.sources.google_sheets.credentials]
client_email = "<client_email_2>"
private_key = "<private_key_2>"
project_id = "<project_id_2>"
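The "prefix first, then without it" rule can be sketched on a nested dictionary that mirrors the TOML structure. This is a simplified model, not dlt's resolver: get_path and resolve are hypothetical helpers, the "<shared_project_id>" value is invented for the example, and section elimination is left out to keep the focus on the pipeline prefix.

```python
from typing import Any, Mapping, Optional, Sequence


def get_path(config: Mapping[str, Any], path: Sequence[str]) -> Optional[Any]:
    """Walk a nested mapping along `path`; return None if any part is missing."""
    node: Any = config
    for part in path:
        if not isinstance(node, Mapping) or part not in node:
            return None
        node = node[part]
    return node


def resolve(config: Mapping[str, Any], pipeline_name: str, path: Sequence[str]) -> Optional[Any]:
    """Try the pipeline-prefixed path first, then the same path without the prefix."""
    prefixed = get_path(config, [pipeline_name, *path])
    if prefixed is not None:
        return prefixed
    return get_path(config, path)


config = {
    "pipeline_name_1": {
        "sources": {"google_sheets": {"credentials": {"project_id": "<project_id_1>"}}}
    },
    # Hypothetical shared fallback used by pipelines without their own section.
    "sources": {"google_sheets": {"credentials": {"project_id": "<shared_project_id>"}}},
}

path = ["sources", "google_sheets", "credentials", "project_id"]
own_value = resolve(config, "pipeline_name_1", path)
fallback_value = resolve(config, "pipeline_name_3", path)
```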

Use built-in credential types

Credentials are groups of configs and secrets that are defined together in order to access external systems. dlt implements several built-in credential types to access AWS, Azure, Google Cloud, and other common systems.

Some credential types let you choose how to specify them. For example, to connect to a sql_database source, you can either use a connection string:

[sources.sql_database]
credentials="snowflake://user:password@service-account/database?warehouse=warehouse_name&role=role"

Or set up the connection parameters separately:

[sources.sql_database.credentials]
drivername="snowflake"
username="user"
password="password"
database="database"
host="service-account"
warehouse="warehouse_name"
role="role"
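The two forms carry the same information. As a rough illustration of how the connection string maps onto the separate fields, the sketch below splits it with Python's standard urllib.parse. The split_connection_string helper is hypothetical and is not how dlt parses credentials; it also ignores percent-encoding of special characters.

```python
from urllib.parse import parse_qsl, urlsplit


def split_connection_string(url: str) -> dict:
    """Split a connection URL into the field names used in the TOML example."""
    parts = urlsplit(url)
    fields = {
        "drivername": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }
    # Query parameters (warehouse, role) become additional fields.
    fields.update(parse_qsl(parts.query))
    return fields


fields = split_connection_string(
    "snowflake://user:password@service-account/database?warehouse=warehouse_name&role=role"
)
```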

tip

dlt can discover the default credentials of all major cloud providers. It uses what is already present in the runtime environment: for example, when running in Colab or on a Google VM, it has access to cloud credentials and uses them if nothing is specified in the configuration.

Environment variables

Environment variables provide a convenient way to specify configuration and secrets, especially in deployment environments. When using environment variables, names are capitalized and sections are separated with double underscores (__).

For example, to set the Facebook Ads access token:

export SOURCES__FACEBOOK_ADS__ACCESS_TOKEN="<access_token>"

or when you want to pass a list or dict of values:

export DESTINATION__DUCKDB__CREDENTIALS__PRAGMAS="[\"enable_logging\"]"

Note that you can use environment variables to pass dictionaries and lists: those must be passed as Python literals (not JSON!). You may also need to escape quotation marks, depending on your shell.
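The naming rule can be captured in a one-line helper, and the standard library's ast.literal_eval shows what "Python literal" means in practice. This is only a sketch: to_env_name is a hypothetical helper, and ast.literal_eval is used here to illustrate the literal syntax, not to describe how dlt itself parses values.

```python
import ast


def to_env_name(*sections: str) -> str:
    """Upper-case each section and join them with double underscores."""
    return "__".join(section.upper() for section in sections)


name = to_env_name("destination", "duckdb", "credentials", "pragmas")

# A list written as a Python literal, as it would appear in the variable's value.
value = '["enable_logging"]'
parsed = ast.literal_eval(value)

# Python literals use True/False/None; JSON's true/false/null are not valid literals.
python_style = ast.literal_eval('{"enabled": True}')
try:
    ast.literal_eval('{"enabled": true}')
    json_style_accepted = True
except ValueError:
    json_style_accepted = False
```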

See the examples section for more details on setting up credentials with environment variables.

tip

For local development, you can use python-dotenv to automatically load variables from an .env file, making credential management easier and more secure.

tip

Environment variables can also retrieve secret values from /run/secrets/<secret-name> to seamlessly work with Kubernetes/Docker secrets.

For these secrets, dlt uses an alternative name format with lowercase letters, dashes (-) as separators, and underscores converted to dashes. For example, sources--facebook-ads--access-token would be checked for the above environment variable.

Only values marked as secrets (with dlt.secrets.value or using types like TSecretStrValue) are checked this way. Remember to name your secrets appropriately in Kubernetes resources or Docker Compose files.
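The alternative name format is a mechanical transformation of the environment variable name. The sketch below derives the secret file path under that rule; secret_file_for is a hypothetical helper written for illustration, not part of dlt.

```python
from pathlib import PurePosixPath


def secret_file_for(env_name: str) -> PurePosixPath:
    """Lower-case the name and turn every underscore into a dash."""
    secret_name = env_name.lower().replace("_", "-")
    return PurePosixPath("/run/secrets") / secret_name


secret_path = secret_file_for("SOURCES__FACEBOOK_ADS__ACCESS_TOKEN")
```

Since every underscore becomes a dash, the double-underscore section separator turns into a double dash.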

Vaults

dlt may read configuration from secure vaults - specialized services for storing credentials.

For Google Cloud Secrets Manager, see our example walkthrough and reference.
For Airflow Variables, see the reference and examples.
For other vault integrations, such as AWS Secrets Manager or Azure Key Vault, we are happy to take contributions. There's an abstract class (look for VaultDocProvider) that does all the heavy lifting.

secrets.toml and config.toml

The TOML configuration provider uses two separate files:

config.toml:

Contains non-sensitive configuration data that defines pipeline behavior
Includes settings like file paths, database hosts, timeouts, API URLs, and performance options
Values are accessible in code through the dlt.config dictionary
Can be safely committed to version control

secrets.toml:

Contains sensitive information that must be kept confidential
Includes credentials like passwords, API keys, and private keys
Values are accessible in code through the dlt.secrets dictionary
Should never be committed to version control

By default, the .gitignore file in your project prevents secrets.toml from being added to version control, while config.toml can be freely included.

File locations

The TOML provider loads files from the .dlt folder relative to your current working directory.

For example, if your working directory is my_dlt_project with this structure:

my_dlt_project:
  |
  pipelines/
    |---- .dlt/secrets.toml
    |---- google_sheets.py

When you run:

python pipelines/google_sheets.py

dlt will look for secrets in my_dlt_project/.dlt/secrets.toml and ignore my_dlt_project/pipelines/.dlt/secrets.toml.

If you change your working directory to pipelines and run:

python google_sheets.py

dlt will look for my_dlt_project/pipelines/.dlt/secrets.toml instead.
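The rule boils down to joining the current working directory with .dlt. A minimal sketch, assuming a hypothetical /home/me parent folder for the project and a hypothetical secrets_file helper:

```python
from pathlib import PurePosixPath


def secrets_file(working_dir: str) -> PurePosixPath:
    """Path of secrets.toml relative to the given working directory."""
    return PurePosixPath(working_dir) / ".dlt" / "secrets.toml"


# Running from the project root versus from the pipelines folder.
from_root = secrets_file("/home/me/my_dlt_project")
from_pipelines = secrets_file("/home/me/my_dlt_project/pipelines")
```

In a real script you would pass the actual working directory (for example, from pathlib.Path.cwd()) instead of a fixed string.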

Special locations

The TOML provider also reads configuration from special locations depending on your runtime environment:

Home directory: If available, dlt checks ~/.dlt/ for config.toml and secrets.toml. These values are merged with project-specific configurations, with project values taking precedence. This is useful for sharing global settings (like telemetry preferences) across all pipelines on a machine.
Google Colab: When running in Colab, you can use Colab Secrets named secrets.toml and config.toml. The provider reads these as if they were TOML files. This functionality is disabled if files exist in the .dlt folder.
Streamlit: When running in Streamlit without local .dlt/secrets.toml, the provider uses Streamlit secrets. You can add dlt secrets directly to your Streamlit secrets.
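One way to picture "merged, with project values taking precedence" is a recursive dictionary merge. The sketch below assumes a key-by-key merge of nested sections, which may not match dlt's exact behavior; merge is a hypothetical helper and the setting values are illustrative.

```python
from typing import Any, Dict


def merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)  # merge nested sections
        else:
            merged[key] = value
    return merged


# Values from ~/.dlt/config.toml (global) and .dlt/config.toml (project).
home_config = {"runtime": {"dlthub_telemetry": False, "log_level": "WARNING"}}
project_config = {"runtime": {"log_level": "INFO"}}

effective = merge(home_config, project_config)
```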

Custom providers

You can create and register your own configuration providers to customize how dlt accesses configuration values. The simplest approach is to write a function that returns a nested dictionary where keys correspond to sections and argument names.

This example demonstrates how to create a custom provider that loads configuration from a JSON file:

import dlt
from dlt.common import json
from dlt.common.configuration.providers import CustomLoaderDocProvider

# Create a function that loads a dictionary
def load_config():
    with open("config.json", "rb") as f:
        return json.load(f)

# Create the custom provider
provider = CustomLoaderDocProvider(
    "my_json_provider", 
    load_config, 
    supports_secrets=False
)

# Register the provider with dlt
dlt.config.register_provider(provider)
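For reference, here is a sketch of what a matching config.json could contain and the nested dictionary that a loader like load_config would return. The file contents are an assumption chosen to mirror sections used elsewhere on this page; the snippet writes to a temporary folder and uses only the standard library, so it does not register anything with dlt.

```python
import json
import os
import tempfile

# Keys correspond to sections; the innermost keys are argument names.
example = {
    "runtime": {"log_level": "INFO"},
    "destination": {"filesystem": {"bucket_url": "s3://[your_bucket_name]"}},
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example, f, indent=2)

    # Same loading pattern as load_config above, but with the standard json module.
    with open(path, "rb") as f:
        loaded = json.load(f)

bucket_url = loaded["destination"]["filesystem"]["bucket_url"]
```

With this shape, the nested keys destination, filesystem, and bucket_url correspond to the destination.filesystem.bucket_url path used in the TOML examples.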

tip

Check out our example YAML provider that supports switchable configuration profiles.

Examples

Configure both config and secrets

This example uses the Notion source and filesystem destination to demonstrate how to organize configuration in TOML files using the recommended section layout.

The Notion source is defined in a file named notion.py, so we use that module name in the configuration. We configure the api_key in our configuration while passing the list of database IDs explicitly in code. For the filesystem destination, we split configuration between config.toml (for bucket_url) and secrets.toml (for AWS credentials).

import dlt

@dlt.source
def notion_databases(
    database_ids=None,
    api_key: str = dlt.secrets.value,  # mark argument to be injected as secret
):
    ...

# Pass database_ids in code, let `dlt` inject api_key
sales_database = notion_databases(  # type: ignore
    database_ids=[
        {
            "id": "a94223535c674d33a24e313e7921ce15",
            "use_name": "sales_alias",
        }
    ]
)

TOML config provider:

config.toml

[runtime]
log_level="INFO"

# Do not compress files sent to the filesystem bucket
[normalize.data_writer]
disable_compression=true

# Recommended sections for the destination (destination.module)
[destination.filesystem]
bucket_url = "s3://[your_bucket_name]"

secrets.toml

# Recommended sections for sources (sources.module)
[sources.notion]
api_key = "your-notion-api-key"  # Will be injected to api_key argument

# Recommended sections for destination credentials
[destination.filesystem.credentials]
aws_access_key_id = "ABCDEFGHIJKLMNOPQRST"
aws_secret_access_key = "1234567890_access_key"

Environment variables:

# Environment variables for both config and secrets follow the same format
export RUNTIME__LOG_LEVEL="INFO"
export DESTINATION__FILESYSTEM__BUCKET_URL="s3://[your_bucket_name]"
export NORMALIZE__DATA_WRITER__DISABLE_COMPRESSION="true"
export SOURCES__NOTION__API_KEY="your-notion-api-key"
export DESTINATION__FILESYSTEM__CREDENTIALS__AWS_ACCESS_KEY_ID="ABCDEFGHIJKLMNOPQRST"
export DESTINATION__FILESYSTEM__CREDENTIALS__AWS_SECRET_ACCESS_KEY="1234567890_access_key"

In the code:

import os
import dlt
import botocore.session
from dlt.common.credentials import AwsCredentials

# Set configuration values directly in code
# Via environment variables
os.environ["RUNTIME__LOG_LEVEL"] = "INFO"
os.environ["DESTINATION__FILESYSTEM__BUCKET_URL"] = "s3://[your_bucket_name]"
os.environ["NORMALIZE__DATA_WRITER__DISABLE_COMPRESSION"] = "true"

# Or directly through dlt.config
dlt.config["runtime.log_level"] = "INFO"
dlt.config["destination.filesystem.bucket_url"] = "s3://[your_bucket_name]"
dlt.config["normalize.data_writer.disable_compression"] = "true"

# For secrets, avoid hardcoding - use existing environment variables
os.environ["SOURCES__NOTION__API_KEY"] = os.environ.get("NOTION_KEY")

# Or use credentials from third-party providers
credentials = AwsCredentials()
session = botocore.session.get_session()
credentials.parse_native_representation(session)
dlt.secrets["destination.filesystem.credentials"] = credentials

warning

While you can put all configuration and credentials in secrets.toml for convenience, sensitive information should never be placed in config.toml or other non-secure locations. dlt will raise an exception if it detects secrets in inappropriate locations.

Use different Google credentials for source and destination

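This example shows how to configure credentials for Google-based sources and destinations, either shared or separate.

Option 1: Share credentials between source and destination

If you want both the BigQuery destination and the Google Sheets source to use the same credentials, define them once in a top-level credentials section: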

TOML config provider:

[credentials]
client_email = "<client_email_for_both>"
private_key = "<private_key_for_both>"
project_id = "<project_id_for_both>"

Environment variables:

export CREDENTIALS__CLIENT_EMAIL="<client_email_for_both>"
export CREDENTIALS__PRIVATE_KEY="<private_key_for_both>"
export CREDENTIALS__PROJECT_ID="<project_id_for_both>"

In the code:

import os

# Avoid setting secrets directly in code
# Instead, use existing environment variables
os.environ["CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("GOOGLE_CLIENT_EMAIL")
os.environ["CREDENTIALS__PRIVATE_KEY"] = os.environ.get("GOOGLE_PRIVATE_KEY")
os.environ["CREDENTIALS__PROJECT_ID"] = os.environ.get("GOOGLE_PROJECT_ID")

Option 2: Use separate credentials for sources and destinations

To keep source and destination credentials separate:

TOML config provider:

# Google Sheets credentials
[sources.credentials]
client_email = "<sheets_client_email>"
private_key = "<sheets_private_key>"
project_id = "<sheets_project_id>"

# BigQuery credentials
[destination.credentials]
client_email = "<bigquery_client_email>"
private_key = "<bigquery_private_key>"
project_id = "<bigquery_project_id>"

Environment variables:

# Google Sheets credentials
export SOURCES__CREDENTIALS__CLIENT_EMAIL="<sheets_client_email>"
export SOURCES__CREDENTIALS__PRIVATE_KEY="<sheets_private_key>"
export SOURCES__CREDENTIALS__PROJECT_ID="<sheets_project_id>"

# BigQuery credentials
export DESTINATION__CREDENTIALS__CLIENT_EMAIL="<bigquery_client_email>"
export DESTINATION__CREDENTIALS__PRIVATE_KEY="<bigquery_private_key>"
export DESTINATION__CREDENTIALS__PROJECT_ID="<bigquery_project_id>"

In the code:

import dlt
import os

# For destination credentials, use existing environment variables
os.environ["DESTINATION__CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("BIGQUERY_CLIENT_EMAIL")
os.environ["DESTINATION__CREDENTIALS__PRIVATE_KEY"] = os.environ.get("BIGQUERY_PRIVATE_KEY")
os.environ["DESTINATION__CREDENTIALS__PROJECT_ID"] = os.environ.get("BIGQUERY_PROJECT_ID")

# For source credentials, set values in dlt.secrets
dlt.secrets["sources.credentials.client_email"] = os.environ.get("SHEETS_CLIENT_EMAIL")
dlt.secrets["sources.credentials.private_key"] = os.environ.get("SHEETS_PRIVATE_KEY")
dlt.secrets["sources.credentials.project_id"] = os.environ.get("SHEETS_PROJECT_ID")

With this setup, dlt looks for destination credentials in this order:

destination.bigquery.credentials --> Not found
destination.credentials --> Found

And for source credentials:

sources.google_sheets_module.google_sheets_function.credentials --> Not found
sources.google_sheets_module.credentials --> Not found
sources.google_sheets_function.credentials --> Not found
sources.credentials --> Found

Configure credentials for multiple sources and destinations

When working with multiple Google-based sources and destinations, you can use the recommended section layout:

TOML config provider:

# Google Sheets credentials
[sources.google_sheets.credentials]
client_email = "<sheets_client_email>"
private_key = "<sheets_private_key>"
project_id = "<sheets_project_id>"

# Google Analytics credentials
[sources.google_analytics.credentials]
client_email = "<analytics_client_email>"
private_key = "<analytics_private_key>"
project_id = "<analytics_project_id>"

# BigQuery credentials
[destination.bigquery.credentials]
client_email = "<bigquery_client_email>"
private_key = "<bigquery_private_key>"
project_id = "<bigquery_project_id>"

Environment variables:

# Google Sheets credentials
export SOURCES__GOOGLE_SHEETS__CREDENTIALS__CLIENT_EMAIL="<sheets_client_email>"
export SOURCES__GOOGLE_SHEETS__CREDENTIALS__PRIVATE_KEY="<sheets_private_key>"
export SOURCES__GOOGLE_SHEETS__CREDENTIALS__PROJECT_ID="<sheets_project_id>"

# Google Analytics credentials
export SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__CLIENT_EMAIL="<analytics_client_email>"
export SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PRIVATE_KEY="<analytics_private_key>"
export SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PROJECT_ID="<analytics_project_id>"

# BigQuery credentials
export DESTINATION__BIGQUERY__CREDENTIALS__CLIENT_EMAIL="<bigquery_client_email>"
export DESTINATION__BIGQUERY__CREDENTIALS__PRIVATE_KEY="<bigquery_private_key>"
export DESTINATION__BIGQUERY__CREDENTIALS__PROJECT_ID="<bigquery_project_id>"

In the code:

import os
import dlt

# For Analytics credentials
os.environ["SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("ANALYTICS_CLIENT_EMAIL")
os.environ["SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PRIVATE_KEY"] = os.environ.get("ANALYTICS_PRIVATE_KEY")
os.environ["SOURCES__GOOGLE_ANALYTICS__CREDENTIALS__PROJECT_ID"] = os.environ.get("ANALYTICS_PROJECT_ID")

# For BigQuery credentials
os.environ["DESTINATION__BIGQUERY__CREDENTIALS__CLIENT_EMAIL"] = os.environ.get("BIGQUERY_CLIENT_EMAIL")
os.environ["DESTINATION__BIGQUERY__CREDENTIALS__PRIVATE_KEY"] = os.environ.get("BIGQUERY_PRIVATE_KEY")
os.environ["DESTINATION__BIGQUERY__CREDENTIALS__PROJECT_ID"] = os.environ.get("BIGQUERY_PROJECT_ID")

# For Google Sheets credentials
dlt.secrets["sources.google_sheets.credentials.client_email"] = os.environ.get("SHEETS_CLIENT_EMAIL")
dlt.secrets["sources.google_sheets.credentials.private_key"] = os.environ.get("SHEETS_PRIVATE_KEY")
dlt.secrets["sources.google_sheets.credentials.project_id"] = os.environ.get("SHEETS_PROJECT_ID")

Configure multiple instances of the same source

If you need to extract data from the same source type with different configurations, you can run them in pipelines with different names:

TOML config provider:

[pipeline_name_1.sources.sql_database]
credentials="snowflake://user1:password1@service-account/database1?warehouse=warehouse_name&role=role1"

[pipeline_name_2.sources.sql_database]
credentials="snowflake://user2:password2@service-account/database2?warehouse=warehouse_name&role=role2"

Environment variables:

export PIPELINE_NAME_1__SOURCES__SQL_DATABASE__CREDENTIALS="snowflake://user1:password1@service-account/database1?warehouse=warehouse_name&role=role1"
export PIPELINE_NAME_2__SOURCES__SQL_DATABASE__CREDENTIALS="snowflake://user2:password2@service-account/database2?warehouse=warehouse_name&role=role2"

In the code:

import os
import dlt

# Use existing environment variables to set credentials
os.environ["PIPELINE_NAME_1__SOURCES__SQL_DATABASE__CREDENTIALS"] = os.environ.get("SQL_CREDENTIAL_STRING_1")

# Or set values directly in dlt.secrets
dlt.secrets["pipeline_name_2.sources.sql_database.credentials"] = os.environ.get("SQL_CREDENTIAL_STRING_2")

tip

You have additional options for using multiple instances of the same source:

Use the clone() method as explained in the sql_database documentation.
Create named destinations to use the same destination type with different configurations.

Troubleshoot configuration errors

If dlt can't find a required configuration value or secret, it raises a ConfigFieldMissingException that provides detailed information about what was searched for and where.

For example, running the chess.py example without providing the password:

$ CREDENTIALS="postgres://loader@localhost:5432/dlt_data" python chess.py
...
dlt.common.configuration.exceptions.ConfigFieldMissingException: Following fields are missing: ['password'] in configuration with spec PostgresCredentials
        for field "password" config providers and keys were tried in the following order:
                In Environment Variables key CHESS_GAMES__DESTINATION__POSTGRES__CREDENTIALS__PASSWORD was not found.
                In Environment Variables key CHESS_GAMES__DESTINATION__CREDENTIALS__PASSWORD was not found.
                In Environment Variables key CHESS_GAMES__CREDENTIALS__PASSWORD was not found.
                In secrets.toml key chess_games.destination.postgres.credentials.password was not found.
                In secrets.toml key chess_games.destination.credentials.password was not found.
                In secrets.toml key chess_games.credentials.password was not found.
                In Environment Variables key DESTINATION__POSTGRES__CREDENTIALS__PASSWORD was not found.
                In Environment Variables key DESTINATION__CREDENTIALS__PASSWORD was not found.
                In Environment Variables key CREDENTIALS__PASSWORD was not found.
                In secrets.toml key destination.postgres.credentials.password was not found.
                In secrets.toml key destination.credentials.password was not found.
                In secrets.toml key credentials.password was not found.
Please refer to https://dlthub.com/docs/general-usage/credentials/ for more information

This error message shows exactly:

Which field is missing (password in this case)
All the keys and locations dlt checked, in order of priority
That it first looked with the pipeline name (chess_games) prefix, then without it
That it searched environment variables first, then secrets.toml

Note that config.toml wasn't checked since it's not appropriate for storing secrets.
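The search order in the message follows a regular pattern, which the sketch below reproduces in plain Python: pipeline prefix first, then each provider, then rightmost-section elimination. The search_trace function is a hypothetical model written to match the output shown above; it is not dlt's code.

```python
from typing import List, Sequence


def search_trace(
    pipeline_name: str, sections: Sequence[str], required: Sequence[str], field: str
) -> List[str]:
    """Rebuild the order of lookups reported for a missing secret field."""
    trace = []
    # First with the pipeline name prefix, then without it.
    for prefix in ([pipeline_name], []):
        # Environment variables are tried before secrets.toml.
        for provider in ("Environment Variables", "secrets.toml"):
            remaining = list(sections)
            while True:
                parts = [*prefix, *remaining, *required, field]
                if provider == "Environment Variables":
                    key = "__".join(part.upper() for part in parts)
                else:
                    key = ".".join(parts)
                trace.append(f"In {provider} key {key} was not found.")
                if not remaining:
                    break
                remaining.pop()  # drop the rightmost section and retry
    return trace


trace = search_trace("chess_games", ["destination", "postgres"], ["credentials"], "password")
```

Joining trace with newlines yields the same twelve lookup lines as the exception above, which can help when reading longer messages for deeper section paths.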
