dlt.destinations.impl.clickhouse.clickhouse_adapter
clickhouse_adapter
def clickhouse_adapter(data: Any,
                       table_engine_type: Optional[TTableEngineType] = None,
                       sort: Optional[TSQLExprOrColumnSeq] = None,
                       partition: Optional[TSQLExprOrColumnSeq] = None,
                       settings: Optional[TMergeTreeSettings] = None,
                       codecs: Optional[TColumnCodecs] = None) -> DltResource
Adapts the given data by applying ClickHouse-specific hints.
Arguments:
- data (Any) - The data to be transformed. It can be raw data or an instance of DltResource. If raw data, the function wraps it into a DltResource object.
- table_engine_type (TTableEngineType, optional) - The table engine type used when creating the ClickHouse table.
- sort (TSQLExprOrColumnSeq, optional) - Sorting key SQL expression or sequence of column names. Used to generate the ORDER BY clause of the table creation statement. If passing a SQL expression, use normalized column names when referring to columns.
- partition (TSQLExprOrColumnSeq, optional) - Partition key SQL expression or sequence of column names. Used to generate the PARTITION BY clause of the table creation statement. If passing a SQL expression, use normalized column names when referring to columns.
- settings (TMergeTreeSettings, optional) - Dictionary of MergeTree settings to apply to the table. Added to the SETTINGS clause of the table creation statement.
- codecs (TColumnCodecs, optional) - Dictionary of codecs to apply to the table's columns. Added as CODEC clauses in the column definitions of the table creation statement.
Returns:
- DltResource - A resource with ClickHouse-specific hints applied.
Raises:
- ValueError - If the input for table_engine_type is invalid.
- TypeError - If the input types for sort, partition, settings, or codecs are invalid.
Examples:
Set table engine type:
from dlt.destinations.impl.clickhouse.clickhouse_adapter import clickhouse_adapter

data = [{"name": "Alice", "description": "Software Developer"}]
clickhouse_adapter(data, table_engine_type="merge_tree")
Set sort and partition keys:
data = [{"date": "2024-01-01", "town": "Springfield", "street": "Evergreen Terrace"}]
clickhouse_adapter(
    data,
    sort=["town", "street"],  # can also be a SQL expression
    partition="toYYYYMM(date)",  # can also be a sequence of column names
)
Set MergeTree settings:
clickhouse_adapter(
    data,
    settings={"allow_nullable_key": True, "max_suspicious_broken_parts": 500},
)
Set column codecs:
clickhouse_adapter(
    data,
    codecs={"town": "LZ4HC", "street": "Delta, ZSTD(2)"},
)
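To illustrate how these hints relate to the table creation statement, the following is a rough sketch that renders the clause fragments named in the argument descriptions. It is not dlt's SQL generation: the helpers `render_key`, `render_setting`, and `sketch_clauses` are hypothetical, and the exact formatting (parentheses around column sequences, booleans written as 1/0) is an assumption made for illustration.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Hint = Union[str, Sequence[str]]


def render_key(hint: Hint) -> str:
    # A string is treated as a ready-made SQL expression; a sequence of
    # column names is joined into a tuple (assumed formatting).
    if isinstance(hint, str):
        return hint
    return "(" + ", ".join(hint) + ")"


def render_setting(value: Any) -> str:
    # Assumption: booleans are written as 1/0 and strings are quoted.
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, str):
        return "'" + value + "'"
    return str(value)


def sketch_clauses(
    sort: Optional[Hint] = None,
    partition: Optional[Hint] = None,
    settings: Optional[Dict[str, Any]] = None,
    codecs: Optional[Dict[str, str]] = None,
) -> Dict[str, List[str]]:
    # Table-level fragments: ORDER BY, PARTITION BY, SETTINGS.
    table: List[str] = []
    if sort is not None:
        table.append("ORDER BY " + render_key(sort))
    if partition is not None:
        table.append("PARTITION BY " + render_key(partition))
    if settings:
        pairs = ", ".join(f"{k} = {render_setting(v)}" for k, v in settings.items())
        table.append("SETTINGS " + pairs)
    # Column-level fragments: one CODEC clause per column (column type omitted).
    columns = [f"{name} CODEC({codec})" for name, codec in (codecs or {}).items()]
    return {"table": table, "columns": columns}


clauses = sketch_clauses(
    sort=["town", "street"],
    partition="toYYYYMM(date)",
    settings={"allow_nullable_key": True, "max_suspicious_broken_parts": 500},
    codecs={"town": "LZ4HC", "street": "Delta, ZSTD(2)"},
)
for line in clauses["table"] + clauses["columns"]:
    print(line)
```

The sketch only shows which hint feeds which clause. The statement dlt actually emits may differ in ordering, quoting, and identifier normalization.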
set_column_hints_from_table_hint
def set_column_hints_from_table_hint(
        columns: TTableSchemaColumns, hint: TSQLExprOrColumnSeq,
        hint_name: Literal["sort", "partition"]) -> TTableSchemaColumns
Sets column hints based on the provided table hint.
Modifies columns in place and returns the same object.
For a sort table hint, it sets sort column hints.
For a partition table hint, it sets partition column hints.
Principle: the table hint takes precedence over column hints.
Rules:
- sets/overrides the column hint to True for each column in the table hint, even if the user provided False
- sets nullability to False for each column in the table hint, unless the user provided nullable=True
- removes the column hint if it is set to True but the column is not in the table hint
- retains any user-provided nullability
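The rules above can be sketched with plain dictionaries. This is a minimal illustration, not the library's implementation: `apply_table_hint` is a hypothetical name, the table hint is assumed to be a sequence of column names (SQL expression hints are not handled), and creating an entry for a column that is missing from `columns` is an assumption.

```python
from typing import Any, Dict, Sequence

Columns = Dict[str, Dict[str, Any]]


def apply_table_hint(columns: Columns, hint: Sequence[str], hint_name: str) -> Columns:
    hinted = set(hint)
    for name in hint:
        # Assumption: a column named in the table hint but absent from
        # the schema gets a fresh entry.
        column = columns.setdefault(name, {"name": name})
        # The table hint wins, even over a user-provided False.
        column[hint_name] = True
        # Default to non-nullable only when the user gave no nullability.
        column.setdefault("nullable", False)
    for name, column in columns.items():
        # Drop a column-level hint that the table hint does not confirm.
        if name not in hinted and column.get(hint_name) is True:
            del column[hint_name]
    return columns


columns: Columns = {
    "town": {"name": "town", "sort": False},
    "street": {"name": "street", "nullable": True},
    "date": {"name": "date", "sort": True},
}
result = apply_table_hint(columns, ["town", "street"], "sort")
```

After the call, `town` has its `sort` hint overridden to True and becomes non-nullable, `street` gains `sort` while keeping its user-provided `nullable=True`, and `date` loses its `sort` hint because it is not part of the table hint.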