Transforming data
If you'd like to transform your data after a pipeline load, you have three options:
- Using dbt - dlt provides a convenient dbt wrapper to make integration easier.
- Using the dlt SQL client - dlt exposes an SQL client to transform data on your destination directly using SQL.
- Using Python with DataFrames or Arrow tables - you can also transform your data using Arrow tables and DataFrames in Python.
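As a rough illustration of the SQL option, the general pattern is to run a statement on the destination that derives a new table from the loaded one. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a destination, so it does not show the dlt SQL client itself. The table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a destination (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Pretend this table was written by a pipeline load.
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10.0), ("alice", 5.0), ("bob", 7.5)],
)

# The transformation: derive an aggregated table from the loaded data,
# entirely inside the database.
conn.execute(
    "CREATE TABLE customer_totals AS "
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)

totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
```

The point of this approach is that the data never leaves the database: the transformation is expressed as SQL and executed where the data already lives.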
If you need to preprocess some of your data before it is loaded, you can learn about strategies to:
- Rename columns
- Pseudonymize columns
- Remove columns
This is particularly useful if you are trying to remove PII or other sensitive data, want to drop columns that are not needed for your use case, or are using a destination that does not support certain data types in your source data.
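These preprocessing strategies can be sketched as plain Python functions that each take one record (a dict) and return a modified copy. This is a minimal sketch, not dlt's own implementation: the function names, the sample column names, and the salt value are all hypothetical. In a dlt pipeline, functions of this shape would typically be attached to a resource (for example with `add_map`) so that they run before the data is loaded.

```python
import hashlib
import re


def rename_columns(record: dict) -> dict:
    """Replace every character that is not a letter, digit, or underscore with '_'."""
    return {re.sub(r"[^0-9A-Za-z_]", "_", key): value for key, value in record.items()}


def pseudonymize(record: dict, columns: list[str], salt: str = "example-salt") -> dict:
    """Replace the values of the given columns with a salted SHA-256 hex digest."""
    out = dict(record)
    for column in columns:
        if out.get(column) is not None:
            digest = hashlib.sha256((salt + str(out[column])).encode("utf-8"))
            out[column] = digest.hexdigest()
    return out


def remove_columns(record: dict, columns: list[str]) -> dict:
    """Drop the given columns from the record."""
    return {key: value for key, value in record.items() if key not in columns}


def preprocess(record: dict) -> dict:
    """Apply the three steps in order: rename, pseudonymize, remove."""
    record = rename_columns(record)
    record = pseudonymize(record, ["email"])
    return remove_columns(record, ["internal_note"])


# Hypothetical source rows with awkward column names and sensitive values.
rows = [{"user id": 1, "email": "a@example.com", "internal-note": "do not load"}]
cleaned = [preprocess(row) for row in rows]
```

Hashing with a fixed salt keeps the pseudonym deterministic, so the same input always maps to the same output and the column can still be used for joins. A real deployment would keep the salt secret rather than hard-coding it.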
Learn more
🗃️ Transforming data with dbt
📄️ Transforming data in Python with Arrow tables or DataFrames
Transforming data loaded by a dlt pipeline with pandas dataframes or arrow tables
📄️ Transforming data with SQL
Transforming the data loaded by a dlt pipeline with the dlt SQL client
📄️ Renaming columns
Renaming columns by replacing the special characters
📄️ Pseudonymizing columns
Pseudonymizing (or anonymizing) columns by replacing the special characters
📄️ Removing columns
Removing columns by passing a list of column names