Skip to main content
Version: 1.5.0 (latest)

Setup

Need help deploying these sources or figuring out how to run them in your data stack?
Join our Slack community or Book a call with a dltHub Solutions Engineer.

To connect to your SQL database using dlt, follow these steps:

  1. Initialize a dlt project in the current working directory by running the following command:

    dlt init sql_database duckdb

    This will add necessary files and configurations for a dlt pipeline with SQL database as the source and DuckDB as the destination.

tip

If you'd like to use a different destination, simply replace duckdb with the name of your preferred destination.

  1. Add credentials for your SQL database

    To connect to your SQL database, dlt would need to authenticate using necessary credentials. To enable this, paste your credentials in the secrets.toml file created inside the .dlt/ folder in the following format:

    [sources.sql_database.credentials]
    drivername = "mysql+pymysql" # driver name for the database
    database = "Rfam" # database name
    username = "rfamro" # username associated with the database
    host = "mysql-rfam-public.ebi.ac.uk" # host address
    port = "4497" # port required for connection

    Alternatively, you can also authenticate using connection strings:

    [sources.sql_database.credentials]
    credentials="mysql+pymysql://rfamro@mysql-rfam-public.ebi.ac.uk:4497/Rfam"

    To learn more about how to add credentials into your sql_database pipeline, see here.

  2. Add credentials for your destination (if necessary)

    Depending on which destination you're loading into, you might also need to add your destination credentials. For more information, read the General Usage: Credentials.

  3. Install any necessary dependencies

    pip install -r requirements.txt
  4. Run the pipeline

    python sql_database_pipeline.py

    Executing this command will run the example script sql_database_pipeline.py created in step 1. In order for this to run successfully, you will need to pass the names of the databases and/or tables you wish to load. See the section on configuring the sql_database source for more details.

  1. Make sure everything is loaded as expected with

    dlt pipeline <pipeline_name> show
    note

    The pipeline_name for the above example is rfam, you may also use any custom name instead.

This demo works on codespaces. Codespaces is a development environment available for free to anyone with a Github account. You'll be asked to fork the demo repository and from there the README guides you with further steps.
The demo uses the Continue VSCode extension.

Off to codespaces!

DHelp

Ask a question

Welcome to "Codex Central", your next-gen help center, driven by OpenAI's GPT-4 model. It's more than just a forum or a FAQ hub – it's a dynamic knowledge base where coders can find AI-assisted solutions to their pressing problems. With GPT-4's powerful comprehension and predictive abilities, Codex Central provides instantaneous issue resolution, insightful debugging, and personalized guidance. Get your code running smoothly with the unparalleled support at Codex Central - coding help reimagined with AI prowess.