Load Data from Adobe Commerce (Magento) to Supabase with dlt in Python
We will be using the dlt PostgreSQL destination to connect to Supabase. You can get the connection string for your Supabase database as described in the Supabase Docs.
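As a minimal sketch of what that connection string looks like, the snippet below assembles a PostgreSQL URL and exposes it to dlt through the DESTINATION__POSTGRES__CREDENTIALS environment variable, which is one of the ways dlt resolves destination credentials. The helper name, host, and password are placeholders (assumptions, not values from this guide); copy the real values from your Supabase project settings.

```python
import os
from urllib.parse import quote


def supabase_connection_string(
    host: str,
    password: str,
    user: str = "postgres",
    database: str = "postgres",
    port: int = 5432,
) -> str:
    """Build a PostgreSQL URL, percent-encoding the user and password.

    Hypothetical helper for illustration; the defaults are assumptions
    about a typical Supabase project and may differ for yours.
    """
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{database}"


# Placeholder values only: replace them with your own project details.
conn = supabase_connection_string(
    host="db.example-project.supabase.co",
    password="p@ss/word",
)

# dlt can pick up the postgres credentials from this environment variable
# when a pipeline with destination="postgres" runs.
os.environ["DESTINATION__POSTGRES__CREDENTIALS"] = conn
```

Percent-encoding matters because characters such as @ or / in a password would otherwise be read as URL delimiters. In a real project you would keep the secret in .dlt/secrets.toml or your deployment's secret store rather than in source code.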
Join our Slack community or book a call with our support engineer Violetta.
Adobe Commerce (Magento) is a flexible and scalable commerce platform that lets you create uniquely personalized B2B and B2C experiences. This documentation explains how to load data from Adobe Commerce (Magento) to Supabase using the open-source Python library dlt. Supabase is an open-source Firebase alternative that provides a Postgres database, Authentication, instant APIs, Edge Functions, Realtime subscriptions, Storage, and Vector embeddings. By leveraging dlt, you can streamline the process of transferring your commerce data to Supabase, ensuring efficient and reliable data management. For more information on Adobe Commerce (Magento), visit this link.
dlt Key Features
- Pipeline Metadata: dlt pipelines leverage metadata to provide governance capabilities. This metadata includes load IDs, which consist of a timestamp and pipeline name. Load IDs enable incremental transformations and data vaulting by tracking data loads and facilitating data lineage and traceability. Learn more
- Schema Enforcement and Curation: dlt empowers users to enforce and curate schemas, ensuring data consistency and quality. Schemas define the structure of normalized data and guide the processing and loading of data. Learn more
- Scalability via Iterators, Chunking, and Parallelization: dlt offers scalable data extraction by leveraging iterators, chunking, and parallelization techniques. This approach allows for efficient processing of large datasets by breaking them down into manageable chunks. Learn more
- Implicit Extraction DAGs: dlt incorporates the concept of implicit extraction DAGs to handle the dependencies between data sources and their transformations automatically. This extraction DAG determines the optimal order for extracting the resources to ensure data consistency and integrity. Learn more
- Schema Evolution: dlt enables proactive governance by alerting users to schema changes. When modifications occur in the source data’s schema, such as table or column alterations, dlt notifies stakeholders, allowing them to take necessary actions. Learn more
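To make the iterator-and-chunking idea from the list above concrete, here is a small plain-Python sketch. It does not show dlt's internals; it only illustrates the general pattern of consuming a lazy iterator in fixed-size chunks so that a large dataset never has to sit in memory at once. The names chunked and fake_orders are hypothetical.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def fake_orders(count: int) -> Iterator[Dict[str, int]]:
    """Stand-in for a paginated API: yields one record at a time."""
    for order_id in range(count):
        yield {"order_id": order_id}


# Five records consumed two at a time.
chunks = list(chunked(fake_orders(5), 2))
chunk_sizes = [len(chunk) for chunk in chunks]
```

A generator like fake_orders is also the shape a dlt resource function usually takes, which is why this pattern fits extraction code well.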
Getting started with your pipeline locally
0. Prerequisites
dlt and dlt-init-openapi require Python 3.9 or higher. Additionally, you need to have the pip package manager installed, and we recommend using a virtual environment to manage your dependencies. You can learn more about preparing your computer for dlt in our installation reference.
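If you want to verify these prerequisites from a script, a quick check along the following lines may help. It is only a sketch: the function names are hypothetical, and the virtual environment test assumes a venv-style environment where sys.prefix differs from sys.base_prefix.

```python
import sys
from typing import Sequence, Tuple

MIN_PYTHON: Tuple[int, int] = (3, 9)  # minimum version stated in this guide


def python_is_supported(
    version_info: Sequence[int] = sys.version_info,
    minimum: Tuple[int, int] = MIN_PYTHON,
) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def in_virtual_environment() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("Inside a virtual environment:", in_virtual_environment())
```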
1. Install dlt and dlt-init-openapi
First, you need to install the dlt-init-openapi CLI tool.
pip install dlt-init-openapi
The dlt-init-openapi CLI is a generator that turns an OpenAPI spec into a dlt source for ingesting data from that API. The quality of the generated source depends on how well the API is designed and how accurate the OpenAPI spec is. You may need to tweak the generated code; you can learn more about this here.
# generate pipeline
# NOTE: add_limit adds a global limit, you can remove this later
# NOTE: you will need to select which endpoints to render, you
# can just hit Enter and all will be rendered.
dlt-init-openapi magento --url https://raw.githubusercontent.com/dlt-hub/openapi-specs/main/open_api_specs/Business/magento.yaml --global-limit 2
cd magento_pipeline
# install generated requirements
pip install -r requirements.txt
The last command installs the required dependencies for your pipeline. The dependencies are listed in requirements.txt:
dlt>=0.4.12
You now have the following folder structure in your project:
magento_pipeline/
├── .dlt/
│ ├── config.toml # configs for your pipeline
│ └── secrets.toml # secrets for your pipeline
├── rest_api/ # The rest api verified source
│ └── ...
├── magento/
│ └── __init__.py # TODO: possibly tweak this file
├── magento_pipeline.py # your main pipeline script
├── requirements.txt # dependencies for your pipeline
└── .gitignore # ignore files for git (not required)
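The file flagged with a TODO, magento/__init__.py, holds the generated source definition. The sketch below shows the rough shape of a declarative REST API configuration of the kind that file is expected to contain, so you know what to look for when tweaking it. It is an assumption-laden illustration, not the generator's actual output: the helper name, base URL, token, and endpoint paths are all hypothetical, and the keys follow the rest_api source configuration as commonly documented, so check them against your generated file.

```python
from typing import Any, Dict


def build_magento_config(
    base_url: str, token: str, endpoints: Dict[str, str]
) -> Dict[str, Any]:
    """Assemble a rest_api-style config dict (hypothetical helper).

    `endpoints` maps a resource (table) name to its relative API path.
    """
    return {
        "client": {
            "base_url": base_url,
            # Bearer-token auth is an assumption; use whatever your store requires.
            "auth": {"type": "bearer", "token": token},
        },
        "resources": [
            {"name": name, "endpoint": {"path": path}}
            for name, path in endpoints.items()
        ],
    }


# Placeholder values for illustration only.
config = build_magento_config(
    base_url="https://shop.example.com/rest/",
    token="replace-me",
    endpoints={"products": "V1/products", "orders": "V1/orders"},
)

resource_names = [resource["name"] for resource in config["resources"]]

# In the generated project, a dict of this shape would be handed to the
# rest_api source and then run by magento_pipeline.py.
```

When tweaking the generated file, the usual candidates are the authentication block and the list of resources, since those depend most on how your store's API is set up.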