Version: devel

dlt.destinations.impl.databricks.databricks_adapter

databricks_adapter

def databricks_adapter(
data: Any,
cluster: Union[TColumnNames, Literal["AUTO"]] = None,
partition: TColumnNames = None,
table_format: Literal["DELTA", "ICEBERG"] = "DELTA",
table_comment: Optional[str] = None,
table_tags: Optional[List[Union[str, Dict[str, str]]]] = None,
table_properties: Optional[Dict[str, Union[str, int, bool, float]]] = None,
column_hints: Optional[TDatabricksTableSchemaColumns] = None
) -> DltResource


Prepares data for loading into Databricks.

This function takes data, which can be raw or already wrapped in a DltResource object, and prepares it for Databricks by optionally specifying clustering, partitioning, table format, a table comment, table tags, table properties, and column-level hints.

Arguments:

  • data Any - The data to be transformed. This can be raw data or an instance of DltResource. If raw data is provided, the function will wrap it into a DltResource object.
  • cluster Union[TColumnNames, Literal["AUTO"]], optional - A column name, list of column names, or "AUTO" to cluster the Databricks table by. Use "AUTO" to let Databricks automatically determine the best clustering.
  • partition TColumnNames, optional - A column name or list of column names to partition the Databricks table by. Partitioning divides the table into separate files based on the partition column values.
  • table_format Literal["DELTA", "ICEBERG"], optional - The table format to use. Defaults to "DELTA". Use "ICEBERG" to create Apache Iceberg tables for better schema evolution and time travel capabilities.
  • table_comment str, optional - A description for the Databricks table.
  • table_tags List[Union[str, Dict[str, str]]], optional - A list of tags for the Databricks table. Can contain a mix of plain strings and key-value pairs given as dictionaries. Example: ["production", {"environment": "prod"}, "employees"]
  • table_properties Dict[str, Union[str, int, bool, float]], optional - A dictionary of table properties to be added to the Databricks table using TBLPROPERTIES. These are key-value pairs for metadata and Delta Lake optimization settings. Example: {"delta.appendOnly": True, "delta.logRetentionDuration": "30 days"}
  • column_hints TDatabricksTableSchemaColumns, optional - A dictionary of column hints. Each key is a column name, and the value is a dictionary of hints. The supported hints are:
    • column_comment - adds a comment to the column. Supports basic Markdown syntax.
    • column_tags - adds tags to the column. Supports a list of strings and/or key-value pairs.

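As a hedged sketch of the column_hints shape described above (the column names and tag values below are hypothetical, chosen to match the employee example later on):

```python
# Hypothetical column hints; keys are column names, values are hint
# dictionaries using the supported column_comment / column_tags hints.
column_hints = {
    "name": {
        "column_comment": "Full employee name (supports *basic* Markdown)",
        "column_tags": ["pii", {"sensitivity": "high"}],
    },
    "description": {
        "column_comment": "Free-form role description",
    },
}

# The dictionary is then passed as-is, e.g.:
# databricks_adapter(data, column_hints=column_hints)
```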
Returns:

A DltResource object that is ready to be loaded into Databricks.

Raises:

  • ValueError - If any hint is invalid or none are specified.

Examples:

    data = [{"name": "Marcel", "description": "Raccoon Engineer", "date_hired": 1700784000}]
    databricks_adapter(
        data,
        cluster="date_hired",
        table_comment="Employee Data",
        table_tags=["production", {"environment": "prod"}, "employees"],
    )

    # Use AUTO clustering
    databricks_adapter(data, cluster="AUTO", table_comment="Auto-clustered table")

    # Use partitioning
    databricks_adapter(data, partition=["year", "month"], cluster="customer_id")

    # Create an Iceberg table
    databricks_adapter(data, table_format="ICEBERG", cluster="customer_id")
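The table_properties argument can be sketched the same way. The property keys below come directly from the argument description above; the adapter call is shown commented out for context only:

```python
# Delta Lake table properties as in the argument description above;
# each entry becomes a TBLPROPERTIES key-value pair on the created table.
table_properties = {
    "delta.appendOnly": True,
    "delta.logRetentionDuration": "30 days",
}

# databricks_adapter(data, table_properties=table_properties,
#                    table_comment="Employee Data")
```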
