dlt.destinations.impl.databricks.databricks_adapter
databricks_adapter
def databricks_adapter(
    data: Any,
    cluster: TColumnNames = None,
    table_comment: Optional[str] = None,
    table_tags: Optional[List[Union[str, Dict[str, str]]]] = None,
    column_hints: Optional[TDatabricksTableSchemaColumns] = None
) -> DltResource
Prepares data for loading into Databricks.
This function takes data, which can be raw or already wrapped in a DltResource object, and prepares it for Databricks by optionally specifying clustering, a table description, table tags, and column-level hints.
Arguments:
data
Any - The data to be transformed. This can be raw data or an instance of DltResource. If raw data is provided, the function wraps it in a DltResource object.
cluster
TColumnNames, optional - A column name or list of column names to cluster the Databricks table by.
table_comment
str, optional - A description for the Databricks table.
table_tags
List[Union[str, Dict[str, str]]], optional - A list of tags for the Databricks table. Can contain a mix of strings and key-value pairs as dictionaries. Example:
- ["production", {"environment": "prod"}, "employees"]
column_hints
TDatabricksTableSchemaColumns, optional - A dictionary of column hints. Each key is a column name, and the value is a dictionary of hints. The supported hints are:
- column_comment - adds a comment to the column. Supports basic Markdown syntax.
- column_tags - adds tags to the column. Supports a list of strings and/or key-value pairs.
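To make the accepted shapes concrete, here is a minimal sketch in plain Python (no dlt import) of values one might pass as table_tags and column_hints. The helper split_tags is hypothetical and exists only to show that plain strings and key-value dictionaries can be mixed in one list; it is not how dlt processes tags internally. The column names and tag values inside column_hints are likewise assumptions based on the example data.

```python
from typing import Dict, List, Optional, Tuple, Union

Tag = Union[str, Dict[str, str]]


def split_tags(tags: List[Tag]) -> List[Tuple[str, Optional[str]]]:
    """Hypothetical helper: flatten mixed tags into (key, value) pairs.

    Plain strings become (tag, None); each dictionary entry becomes (key, value).
    """
    pairs: List[Tuple[str, Optional[str]]] = []
    for tag in tags:
        if isinstance(tag, str):
            pairs.append((tag, None))
        elif isinstance(tag, dict):
            pairs.extend(tag.items())
        else:
            raise ValueError(f"Unsupported tag: {tag!r}")
    return pairs


# Same mixed list as in the table_tags example above.
table_tags: List[Tag] = ["production", {"environment": "prod"}, "employees"]

# Column hints keyed by column name (assumed columns); each value holds
# the supported hints, column_comment and/or column_tags.
column_hints = {
    "name": {"column_comment": "Employee **full** name", "column_tags": ["pii"]},
    "date_hired": {"column_tags": [{"unit": "epoch_seconds"}]},
}

print(split_tags(table_tags))
```

Both values can then be passed to databricks_adapter through the table_tags and column_hints arguments.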
Returns:
A DltResource object that is ready to be loaded into Databricks.
Raises:
ValueError - If any hint is invalid or none are specified.
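The two documented failure conditions (no hints given, or an invalid hint) can be pictured with a small stand-alone check. This is only a sketch under assumptions: check_adapter_hints is a hypothetical function, its error messages are invented, and the exact validation rules dlt applies may differ.

```python
from typing import Any, Dict, List, Optional, Sequence, Union


def check_adapter_hints(
    cluster: Optional[Union[str, Sequence[str]]] = None,
    table_comment: Optional[str] = None,
    table_tags: Optional[List[Union[str, Dict[str, str]]]] = None,
    column_hints: Optional[Dict[str, Dict[str, Any]]] = None,
) -> None:
    """Hypothetical pre-check mirroring the documented ValueError conditions."""
    # Condition 1: no hint was specified at all.
    if all(h is None for h in (cluster, table_comment, table_tags, column_hints)):
        raise ValueError("Specify at least one hint.")
    # Condition 2 (assumed rules): a hint has an unexpected type.
    if cluster is not None and not isinstance(cluster, (str, list, tuple)):
        raise ValueError("cluster must be a column name or a list of column names.")
    if table_comment is not None and not isinstance(table_comment, str):
        raise ValueError("table_comment must be a string.")


try:
    check_adapter_hints()  # no hints at all
except ValueError as exc:
    print(f"rejected: {exc}")

check_adapter_hints(cluster="date_hired")  # passes silently
```

When calling the real adapter, wrapping the call in a similar try/except block lets a pipeline script report a misconfigured hint before any load starts.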
Examples:
>>> from dlt.destinations.impl.databricks.databricks_adapter import databricks_adapter
>>> data = [{"name": "Marcel", "description": "Raccoon Engineer", "date_hired": 1700784000}]
>>> databricks_adapter(data, cluster="date_hired", table_comment="Employee Data",
...                    table_tags=["production", {"environment": "prod"}, "employees"])