
Snowflake Plus


This page is for dlt+, which requires a license. Join our early access program for a trial license.

Snowflake Plus is an experimental extension of the Snowflake destination that adds Apache Iceberg table creation and related features. It is available starting from dlt+ version 0.9.0 and fully supports all the functionality of the standard Snowflake destination, plus:

  1. The ability to create Iceberg tables in Snowflake by setting force_iceberg to true in your config.toml or dlt.yml file.
  2. Additional configuration for Iceberg tables in Snowflake via:
    • external_volume: The external volume name where Iceberg data is stored.
    • catalog: The catalog name in which Iceberg tables are created. Defaults to "SNOWFLAKE".
    • base_location: The base path that Snowflake uses for storing the table. If omitted, <dataset_name>/<table_name> is used.
    • catalog_sync: The name of a catalog integration configured for Snowflake Open Catalog. If specified, Snowflake syncs Snowflake-managed Iceberg tables in the database with an external catalog in your Snowflake Open Catalog account.

Installation

Install the dlt-plus package with the snowflake extra:

pip install "dlt-plus[snowflake]"

Once the snowflake extra is installed, you can configure a pipeline to use snowflake_plus exactly the same way you would use the snowflake destination.
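For example, a minimal pipeline from a Python script could look like the sketch below. It assumes dlt+ exposes the destination under the name snowflake_plus and that your Snowflake credentials are already configured in .dlt/secrets.toml; the resource, pipeline, and dataset names are illustrative:

import dlt

@dlt.resource(table_name="events")  # illustrative resource
def events():
    yield [{"id": 1, "name": "example"}]

pipeline = dlt.pipeline(
    pipeline_name="snowflake_plus_demo",  # illustrative names
    destination="snowflake_plus",
    dataset_name="demo_dataset",
)
info = pipeline.run(events())
print(info)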

Setup

  1. Configure your Snowflake credentials
  2. Set up a database user and permissions
  3. Configure an external volume in Snowflake
  4. Grant usage on the external volume to the role you are using to load data:
GRANT USAGE ON EXTERNAL VOLUME <external_volume_name> TO ROLE <role_name>;
  5. Configure the snowflake_plus destination, either in dlt.yml (for a dlt+ project) or in config.toml (for a Python script); both paths are shown below.

If you don't have a dlt+ project yet, initialize one in the current working directory. Replace sql_database with the source of your choice:

dlt project init sql_database snowflake_plus

This will create a Snowflake Plus destination in your dlt.yml file:

destinations:
  snowflake:
    type: snowflake_plus

To enable Iceberg table creation, set the force_iceberg option to true and external_volume to the name of the external volume you created in step 3.

destinations:
  snowflake:
    type: snowflake_plus
    external_volume: "<external_volume_name>"
    force_iceberg: true
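For the Python-script route from step 5, the same options go into config.toml under the [destination.snowflake] section (see the Configuration section below). Alternatively, dlt can pick them up from environment variables whose names mirror the config sections; a sketch of that convention:

import os

# dlt reads configuration from environment variables named after the
# config.toml path, with sections joined by double underscores.
os.environ["DESTINATION__SNOWFLAKE__EXTERNAL_VOLUME"] = "<external_volume_name>"
os.environ["DESTINATION__SNOWFLAKE__FORCE_ICEBERG"] = "true"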

Configuration

The snowflake_plus destination extends the standard Snowflake configuration with additional options:

| Option | Description | Required | Default |
|--------|-------------|----------|---------|
| force_iceberg | Whether to force the creation of Iceberg tables | No | False |
| external_volume | The external volume to store Iceberg metadata | Yes | None |
| catalog | The catalog to use for Iceberg tables | No | "SNOWFLAKE" |
| base_location | Custom base location for Iceberg data in the external volume | No | <dataset_name>/<table_name> |
| catalog_sync | The name of a catalog integration for syncing Iceberg tables to an external catalog in Snowflake Open Catalog | No | None |

Configure these options in your config.toml file under the [destination.snowflake] section, or in your dlt.yml file under the corresponding destination entry (destinations.snowflake in the examples above).

Write dispositions

All standard write dispositions (append, replace, and merge) are supported for both regular Snowflake tables and Iceberg tables.
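For example, a merge into an Iceberg table is declared exactly as for a regular Snowflake table; a sketch with illustrative names:

import dlt

# merge deduplicates on the primary key, for Iceberg and regular tables alike
@dlt.resource(primary_key="id", write_disposition="merge")
def users():
    yield [{"id": 1, "email": "user@example.com"}]

pipeline = dlt.pipeline(
    pipeline_name="merge_demo",
    destination="snowflake_plus",
    dataset_name="users_data",
)
pipeline.run(users())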

Data types

The Snowflake Plus destination supports all standard Snowflake destination data types, with additional type mappings for Iceberg tables:

| dlt Type | Iceberg Type |
|----------|--------------|
| text | string |
| bigint | long, int |
| double | double |
| bool | boolean |
| timestamp | timestamp |
| date | date |
| time | time |
| decimal | decimal |
| binary | binary |
| json | string |
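To pin a column to a particular dlt type (and therefore to a particular Iceberg type from the table above), you can pass explicit column hints on a resource. A sketch with illustrative column names:

import dlt

@dlt.resource(
    columns={
        "payload": {"data_type": "json"},    # lands as an Iceberg string
        "amount": {"data_type": "decimal"},  # lands as an Iceberg decimal
    }
)
def orders():
    yield [{"payload": {"sku": "A-1"}, "amount": "10.50"}]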

Syncing Snowflake-managed Iceberg tables to Snowflake Open Catalog

To enable querying of Snowflake-managed Iceberg tables by third-party engines (e.g., Apache Spark) via an external catalog (Snowflake Open Catalog), use the catalog_sync configuration option. This setting specifies a catalog integration that syncs Iceberg tables to the external catalog.

Setup

  1. Create an external catalog in Snowflake Open Catalog.

  2. Create a catalog integration in Snowflake. Example:

CREATE OR REPLACE CATALOG INTEGRATION my_open_catalog_int
  CATALOG_SOURCE = POLARIS
  TABLE_FORMAT = ICEBERG
  REST_CONFIG = (
    CATALOG_URI = 'https://<orgname>-<my-snowflake-open-catalog-account-name>.snowflakecomputing.com/polaris/api/catalog'
    CATALOG_NAME = 'myOpenCatalogExternalCatalogName'
  )
  REST_AUTHENTICATION = (
    TYPE = OAUTH
    OAUTH_CLIENT_ID = 'myClientId'
    OAUTH_CLIENT_SECRET = 'myClientSecret'
    OAUTH_ALLOWED_SCOPES = ('PRINCIPAL_ROLE:ALL')
  )
  ENABLED = TRUE;

Refer to the Snowflake documentation for detailed setup instructions.

  3. Configure the catalog_sync option:
[destination.snowflake]
# ... other configuration
catalog_sync = "my_open_catalog_int"
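From a Python script, the same option can also be supplied via the environment-variable convention sketched earlier:

import os

# mirrors the [destination.snowflake] config.toml section
os.environ["DESTINATION__SNOWFLAKE__CATALOG_SYNC"] = "my_open_catalog_int"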

Additional resources

For more information on basic Snowflake destination functionality, please refer to the Snowflake destination documentation.
