Loading Data from Sentry to Azure Cloud Storage with dlt in Python
This documentation provides a guide for loading data from Sentry to Azure Cloud Storage using the open-source Python library dlt. Sentry is a developer-first application monitoring platform that supports over 100 languages and frameworks, offering comprehensive error monitoring for frontend, backend, mobile, and gaming applications. Azure Cloud Storage serves as a filesystem destination, allowing you to store data on Microsoft Azure and create data lakes with ease. Data can be uploaded in formats such as JSONL, Parquet, or CSV. This guide will help you set up and configure dlt to transfer your Sentry data to Azure Cloud Storage. For more information about Sentry, visit Sentry.io.
dlt Key Features
- Governance Support: dlt pipelines offer robust governance support through pipeline metadata, schema enforcement, and schema change alerts.
- Alerting: Set up monitoring and alerting for dlt pipelines to keep track of the health of your data product and receive actionable alerts.
- Schema Enforcement and Curation: Ensure data consistency and quality by enforcing and curating schemas in dlt pipelines.
- Scaling and Finetuning: Scale up and finetune dlt pipelines with parallel processing, memory buffers, and compression options.
- Provider Key Formats: Learn how dlt translates provider-specific formats for TOML and environment variables, making it easier to manage credentials and configurations.
Getting started with your pipeline locally
dlt-init-openapi
0. Prerequisites
dlt and dlt-init-openapi require Python 3.9 or higher. You also need the pip package manager installed, and we recommend using a virtual environment to manage your dependencies. You can learn more about preparing your computer for dlt in our installation reference.
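The prerequisites above can be checked from Python before installing anything. The following is a minimal sketch using only the standard library; the helper names `meets_minimum_python` and `in_virtual_environment` are illustrative and not part of dlt, and the check only recognizes environments created with venv or virtualenv.

```python
import sys

MINIMUM_PYTHON = (3, 9)  # minimum version stated in the prerequisites above


def meets_minimum_python(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtual_environment():
    """Return True when running inside a venv/virtualenv.

    Inside a virtual environment sys.prefix points at the environment,
    while sys.base_prefix still points at the base interpreter.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python 3.9+:", meets_minimum_python())
    print("Virtual environment active:", in_virtual_environment())
```

If the first check reports False, upgrade the interpreter before continuing; if the second reports False, consider creating and activating a virtual environment first.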
1. Install dlt and dlt-init-openapi
First, install the dlt-init-openapi CLI tool.
pip install dlt-init-openapi
The dlt-init-openapi CLI is a generator that turns any OpenAPI spec into a dlt source for ingesting data from that API. The quality of the generated source depends on how well the API is designed and how accurate the OpenAPI spec is. You may need to tweak the generated code; you can learn more about this in the dlt-init-openapi documentation.
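Because the generated source depends on the spec, it can help to see which endpoints a spec describes before running the generator. The sketch below is not how dlt-init-openapi works internally; it only walks the standard `paths` object of an OpenAPI document and lists the GET operations it finds. The function name `list_get_endpoints` and the tiny sample spec are hypothetical and exist only for illustration.

```python
def list_get_endpoints(spec):
    """Return sorted (path, operationId) pairs for GET operations in an OpenAPI spec dict."""
    endpoints = []
    for path, operations in spec.get("paths", {}).items():
        get_op = operations.get("get")
        if get_op is None:
            continue  # this path has no GET operation
        endpoints.append((path, get_op.get("operationId", "")))
    return sorted(endpoints)


# Hypothetical, heavily trimmed spec used only for illustration.
SAMPLE_SPEC = {
    "openapi": "3.0.1",
    "paths": {
        "/items/": {"get": {"operationId": "listItems"}},
        "/items/import/": {"post": {"operationId": "importItems"}},
    },
}

if __name__ == "__main__":
    for path, operation_id in list_get_endpoints(SAMPLE_SPEC):
        print(path, operation_id)
```

To inspect a real spec, the same function could be applied to a dictionary loaded with `json.load` from a downloaded spec file.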
# generate pipeline
# NOTE: add_limit adds a global limit, you can remove this later
# NOTE: you will need to select which endpoints to render, you
# can just hit Enter and all will be rendered.
dlt-init-openapi sentry --url https://raw.githubusercontent.com/getsentry/sentry-api-schema/main/openapi-derefed.json --global-limit 2
cd sentry_pipeline
# install generated requirements
pip install -r requirements.txt
The last command installs the required dependencies for your pipeline. The dependencies are listed in requirements.txt:
dlt>=0.4.12
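A requirement line such as `dlt>=0.4.12` sets a minimum version. As a rough sketch of what that constraint means, the snippet below compares an installed package's version against the minimum using only the standard library. The helper names are illustrative, and the parsing is deliberately simplified: it handles only `name>=X.Y.Z` lines and ignores pre-release suffixes, so a dedicated library such as `packaging` would be the better choice for real checks.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '0.4.12' into (0, 4, 12), keeping only the leading digits of each part."""
    parts = []
    for part in version.strip().split(".")[:3]:
        match = re.match(r"\d+", part)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)  # pad '1.5' to (1, 5, 0) so tuples compare consistently
    return tuple(parts)


def parse_minimum_requirement(line):
    """Split a 'name>=X.Y.Z' requirement line into (name, version_tuple)."""
    name, _, version = line.strip().partition(">=")
    return name.strip(), version_tuple(version)


def installed_satisfies(line):
    """Return True if the named package is installed at or above the minimum."""
    name, minimum = parse_minimum_requirement(line)
    try:
        installed = version_tuple(metadata.version(name))
    except metadata.PackageNotFoundError:
        return False  # package is not installed at all
    return installed >= minimum


if __name__ == "__main__":
    print("dlt requirement satisfied:", installed_satisfies("dlt>=0.4.12"))
```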
You now have the following folder structure in your project:
sentry_pipeline/
├── .dlt/
│ ├── config.toml # configs for your pipeline
│ └── secrets.toml # secrets for your pipeline
├── rest_api/ # The rest api verified source
│ └── ...