Load Data from SAP HANA to BigQuery Using dlt in Python
Join our Slack community or book a call with our support engineer Violetta.
SAP HANA (High-performance ANalytic Appliance) is a multi-model database that stores data in main memory instead of on disk. This column-oriented, in-memory design allows for advanced analytics and high-speed transactions within a single system. BigQuery is a serverless, cost-effective enterprise data warehouse that operates across clouds and scales with your data. Using the open-source Python library dlt, you can efficiently load data from SAP HANA to BigQuery. This documentation provides a step-by-step guide to the data transfer process. For more details on SAP HANA, visit the SAP HANA Overview.
dlt Key Features
- **Easy to get started**: `dlt` is a Python library that is simple to use and easy to understand. Type `pip install dlt` and you are ready to go.
- **Google BigQuery Integration**: Seamlessly integrate with Google BigQuery for data loading and management. [Learn more](https://dlthub.com/docs/dlt-ecosystem/destinations/bigquery).
- **DuckDB Integration**: Efficiently load and manage data with DuckDB using `dlt`. [Learn more](https://dlthub.com/docs/dlt-ecosystem/destinations/duckdb).
- **Data Governance**: Robust governance support through pipeline metadata, schema enforcement, and schema change alerts. [Learn more](https://dlthub.com/docs/build-a-pipeline-tutorial).
- **Data Transformations**: Perform transformations after loading data using dbt, the `dlt` SQL client, or Pandas, as shown in the sketch after this list. [Learn more](https://dlthub.com/docs/build-a-pipeline-tutorial).
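To illustrate the transformation feature, here is a minimal sketch of a post-load transformation using the `dlt` SQL client. It assumes a pipeline that has already loaded data into BigQuery; the pipeline name, table, and column names are hypothetical placeholders.

```python
import dlt

# Attach to an already-run pipeline (the name is a placeholder).
pipeline = dlt.pipeline(pipeline_name="sql_database_hana", destination="bigquery")

# The SQL client runs statements directly against the destination dataset.
with pipeline.sql_client() as client:
    client.execute_sql(
        """
        UPDATE customers
        SET full_name = CONCAT(first_name, ' ', last_name)
        WHERE full_name IS NULL
        """
    )
```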
Getting started with your pipeline locally
0. Prerequisites
dlt requires Python 3.8 or higher. Additionally, you need to have the pip package manager installed, and we recommend using a virtual environment to manage your dependencies. You can learn more about preparing your computer for dlt in our installation reference.
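If you are starting from scratch, a typical setup on macOS or Linux looks like this (the directory name `.venv` is just a common convention):

```sh
# create an isolated environment using Python 3.8+
python -m venv .venv

# activate it (on Windows use: .venv\Scripts\activate)
source .venv/bin/activate
```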
1. Install dlt
First you need to install the `dlt` library with the correct extras for BigQuery:

```sh
pip install "dlt[bigquery]"
```
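You can verify that the installation succeeded by printing the installed version; the exact output depends on the version you installed:

```sh
dlt --version
```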
The dlt CLI has a useful command to get you started with any combination of source and destination. For this example, we want to load data from SAP HANA to BigQuery. Run the following commands to create a starting point for your pipeline:
```sh
# create a new directory
mkdir sql_database_hana_pipeline
cd sql_database_hana_pipeline

# initialize a new pipeline with your source and destination
dlt init sql_database bigquery

# install the required dependencies
pip install -r requirements.txt
```
The last command will install the required dependencies for your pipeline. The dependencies are listed in `requirements.txt`:

```text
sqlalchemy>=1.4
dlt[bigquery]>=0.4.7
```
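Note that `requirements.txt` does not include a driver for SAP HANA itself. Since the `sql_database` source connects over SQLAlchemy, you will also need a HANA dialect and client driver; one common choice is the `sqlalchemy-hana` dialect on top of SAP's `hdbcli` package (`pip install hdbcli sqlalchemy-hana`). Below is a minimal sketch of pointing the scaffolded source at SAP HANA; the host, port, credentials, and table names are placeholders, and the exact import path depends on the files `dlt init` generated for you.

```python
import dlt

# `dlt init sql_database bigquery` scaffolds a local `sql_database` package.
from sql_database import sql_database

# SQLAlchemy connection string for SAP HANA via the sqlalchemy-hana dialect.
# All values below are placeholders for your own system.
source = sql_database(
    credentials="hana://MY_USER:MY_PASSWORD@my-hana-host:39015",
    table_names=["customers", "orders"],  # hypothetical tables to load
)

pipeline = dlt.pipeline(
    pipeline_name="sql_database_hana",
    destination="bigquery",
    dataset_name="sql_database_data",
)

# BigQuery service account credentials are read from .dlt/secrets.toml.
load_info = pipeline.run(source)
print(load_info)
```

Destination credentials for BigQuery (a service account key) belong in `.dlt/secrets.toml`, as described in the BigQuery destination documentation linked above.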
You now have the following folder structure in your project:
sql_database_hana_pipeline/