Python Guide: Load Zendesk Data to AWS S3 using dlt
This guide provides technical documentation on how to use the open-source Python library dlt to load data from Zendesk to AWS S3. Zendesk is an award-winning customer service platform trusted by over 200,000 customers, offering channels for customer interaction including text, mobile, phone, email, live chat, and social media. AWS S3 is a robust filesystem destination that lets you store data in formats such as JSONL, Parquet, or CSV, making it well suited for building data lakes. Further information about Zendesk is available at https://www.zendesk.de.
dlt Key Features
- Zendesk Verified Source: Detailed guide on how to use the Zendesk verified source in a dlt pipeline, including data retrieval methods and how to set up credentials.
- dlt init Command: Explanation of the dlt init command and how it initializes a dlt project with different sources and destinations.
- Importing Zendesk Data to Weaviate: Step-by-step guide on importing ticket data from the Zendesk API into Weaviate, a vector database, for advanced analysis.
- Similarity Searching with Qdrant: Tutorial on using the Zendesk dlt source and the Qdrant dlt destination to run a similarity search on your ticket data (see the sketch after this list).
- Filesystem Setup Guide: A guide on initializing a dlt project with a filesystem as the destination.
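As a taste of the Qdrant workflow mentioned above, here is a minimal sketch of loading ticket-like rows into Qdrant with embedded text columns, which is what enables similarity search. It assumes the qdrant extra is installed (pip install "dlt[qdrant]") and that Qdrant credentials are configured in .dlt/secrets.toml; the ticket rows, pipeline name, and dataset name are placeholders, while the full tutorial referenced above uses the real Zendesk source instead.

import dlt
from dlt.destinations.adapters import qdrant_adapter

# placeholder rows standing in for tickets pulled from the Zendesk API
tickets = [
    {"id": 1, "subject": "Cannot log in", "description": "Password reset email never arrives."},
    {"id": 2, "subject": "Billing question", "description": "Charged twice for the same invoice."},
]

pipeline = dlt.pipeline(
    pipeline_name="zendesk_similarity",
    destination="qdrant",  # requires Qdrant credentials in .dlt/secrets.toml
    dataset_name="zendesk_tickets",
)

# mark the text columns to embed so Qdrant can answer similarity queries over them
load_info = pipeline.run(
    qdrant_adapter(tickets, embed=["subject", "description"]),
    table_name="tickets",
)
print(load_info)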
Getting started with your pipeline locally
0. Prerequisites
dlt requires Python 3.8 or higher. Additionally, you need to have the pip package manager installed, and we recommend using a virtual environment to manage your dependencies. You can learn more about preparing your computer for dlt in our installation reference.
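As a quick, optional check of the version requirement, you can ask the interpreter itself (an illustrative snippet, not part of the generated project):

import sys

# dlt requires Python 3.8 or higher
assert sys.version_info >= (3, 8), "dlt requires Python 3.8+"
print(sys.version)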
1. Install dlt
First you need to install the dlt library with the correct extras for AWS S3:
pip install "dlt[filesystem]"
The dlt CLI has a useful command to get you started with any combination of source and destination. For this example, we want to load data from Zendesk to AWS S3. Run the following commands to create a starting point for your pipeline:
# create a new directory
mkdir zendesk_pipeline
cd zendesk_pipeline
# initialize a new pipeline with your source and destination
dlt init zendesk filesystem
# install the required dependencies
pip install -r requirements.txt
The last command will install the required dependencies for your pipeline. The dependencies are listed in the requirements.txt:
dlt[filesystem]>=0.3.8
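Once the dependencies are installed and your S3 bucket_url plus AWS credentials are configured in the .dlt/ folder (see the project layout below), the generated pipeline script boils down to something like the following sketch. The zendesk_support source function and the pipeline and dataset names are illustrative and may differ slightly between versions of the verified source:

import dlt
from zendesk import zendesk_support  # source module created by `dlt init zendesk filesystem`

pipeline = dlt.pipeline(
    pipeline_name="zendesk_to_s3",
    destination="filesystem",  # reads bucket_url and AWS credentials from .dlt/ config
    dataset_name="zendesk_data",
)

# load tickets and related support tables to S3; Parquet needs the pyarrow extra
# (pip install "dlt[parquet]"), or switch to "jsonl" to skip that dependency
load_info = pipeline.run(zendesk_support(), loader_file_format="parquet")
print(load_info)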
You now have the following folder structure in your project:
zendesk_pipeline/
├── .dlt/
│ ├── config.toml # configs for your pipeline