Python Guide: Load Zendesk Data to AWS S3 using dlt
This guide provides technical documentation on how to use the open-source Python library dlt to load data from Zendesk to AWS S3. Zendesk is an award-winning customer service software trusted by over 200,000 customers, providing various channels for customer interaction including text, mobile, phone, email, live chat, and social media. AWS S3, on the other hand, is a robust filesystem destination that lets you store data in formats such as JSONL, Parquet, or CSV, making it well suited to building data lakes. Further information about Zendesk can be found at https://www.zendesk.de.
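To make the JSONL format mentioned above concrete, here is a minimal sketch that uses only the Python standard library. It does not show how dlt writes files; the helper names `to_jsonl` and `from_jsonl` and the ticket records are hypothetical and invented for illustration.

```python
import io
import json


def to_jsonl(records):
    """Serialize an iterable of dicts as JSONL: one JSON object per line."""
    buffer = io.StringIO()
    for record in records:
        buffer.write(json.dumps(record, sort_keys=True) + "\n")
    return buffer.getvalue()


def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical ticket records, made up for this example only.
tickets = [
    {"id": 1, "status": "open", "subject": "Cannot log in"},
    {"id": 2, "status": "solved", "subject": "Billing question"},
]

payload = to_jsonl(tickets)
print(payload, end="")
```

Because every line is an independent JSON object, a JSONL file can be appended to and read back one record at a time, which is one reason the format is common in data lakes.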
dlt Key Features
- Zendesk Verified Source: Detailed guide on how to use the Zendesk verified source in a dlt pipeline, including information on data retrieval methods and setting up credentials.
- dlt init Command: Explanation of the dlt init command and how it can be used to initialize a dlt project with different sources and destinations.
- Importing Zendesk Data to Weaviate: Step-by-step guide on how to import ticket data from the Zendesk API to Weaviate, a vector database, for advanced analysis.
- Similarity Searching with Qdrant: Tutorial on how to use the dlt source Zendesk and the dlt destination Qdrant to run a similarity search on your ticket data.
- Filesystem Setup Guide: A guide on how to initialize a dlt project with a filesystem as the destination.
Getting started with your pipeline locally
0. Prerequisites
dlt requires Python 3.8 or higher. Additionally, you need to have the pip package manager installed, and we recommend using a virtual environment to manage your dependencies. You can learn more about preparing your computer for dlt in our installation reference.
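If you want to confirm these prerequisites before installing anything, the following sketch checks the interpreter version and whether a virtual environment is active. It is not part of dlt; the function names `meets_minimum` and `in_virtual_environment` are hypothetical, and the environment check only detects venv-style environments.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this guide


def meets_minimum(version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", meets_minimum(sys.version_info))
    print("Virtual environment active:", in_virtual_environment())
```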
1. Install dlt
First you need to install the dlt library with the correct extras for AWS S3:
pip install "dlt[filesystem]"
The dlt CLI has a useful command to get you started with any combination of source and destination. For this example, we want to load data from Zendesk to AWS S3. Run the following commands to create a starting point:
# create a new directory
mkdir zendesk_pipeline
cd zendesk_pipeline
# initialize a new pipeline with your source and destination
dlt init zendesk filesystem
# install the required dependencies
pip install -r requirements.txt
The last command installs the required dependencies for your pipeline. The dependencies are listed in requirements.txt:
dlt[filesystem]>=0.3.8
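To verify that the installed dlt version satisfies this minimum, a sketch along the following lines may help. It relies on the standard library's importlib.metadata; the helper names are hypothetical, and the version comparison is a simplification that looks only at the leading digits of each component rather than implementing full PEP 440 ordering (for example, pre-release suffixes are ignored).

```python
import re
from importlib import metadata

MIN_DLT = "0.3.8"  # minimum taken from the requirements.txt line above


def version_tuple(version):
    """Turn '0.3.8' into (0, 3, 8), keeping only the leading digits of each part."""
    parts = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies_minimum(installed, minimum=MIN_DLT):
    """Return True if `installed` is at least `minimum` (simplified comparison)."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_dlt_version():
    """Return the installed dlt version string, or None if dlt is not installed."""
    try:
        return metadata.version("dlt")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_dlt_version()
    if found is None:
        print("dlt is not installed in this environment")
    else:
        print(f"dlt {found} installed; meets >= {MIN_DLT}: {satisfies_minimum(found)}")
```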
You now have the following folder structure in your project:
zendesk_pipeline/
├── .dlt/
│ ├── config.toml # configs for your pipeline