destinations.impl.athena.athena_adapter
PartitionTransformation Objects
class PartitionTransformation()
template
Template string of the transformation including column name placeholder. E.g. bucket(16, {column_name})
column_name
Column name to apply the transformation to
athena_partition Objects
class athena_partition()
Helper class to generate Iceberg partition transformations.
E.g. athena_partition.bucket(16, "id")
will return a transformation with the template bucket(16, {column_name}).
The Athena loader can then render this template correctly, with the column name escaped.
year
@staticmethod
def year(column_name: str) -> PartitionTransformation
Partition by year part of a date or timestamp column.
month
@staticmethod
def month(column_name: str) -> PartitionTransformation
Partition by month part of a date or timestamp column.
day
@staticmethod
def day(column_name: str) -> PartitionTransformation
Partition by day part of a date or timestamp column.
hour
@staticmethod
def hour(column_name: str) -> PartitionTransformation
Partition by hour part of a date or timestamp column.
bucket
@staticmethod
def bucket(n: int, column_name: str) -> PartitionTransformation
Partition by hashed value to n buckets.
truncate
@staticmethod
def truncate(length: int, column_name: str) -> PartitionTransformation
Partition by value truncated to length.
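The helpers above can be pictured as small factories that pair a template with a column name. The following is a minimal, self-contained sketch, not the library's implementation: `SketchTransformation`, `sketch_partition`, and `render` are hypothetical stand-ins, and the backtick escaping is an assumption about how a loader might quote identifiers. Only the `bucket` template is confirmed by the documentation above; the `year` and `truncate` templates are assumed to follow the same pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchTransformation:
    """Hypothetical stand-in for PartitionTransformation."""
    template: str      # e.g. "bucket(16, {column_name})"
    column_name: str   # column the transformation applies to


class sketch_partition:
    """Hypothetical stand-in for the athena_partition helper."""

    @staticmethod
    def year(column_name: str) -> SketchTransformation:
        # Assumed template, by analogy with the documented bucket template.
        return SketchTransformation("year({column_name})", column_name)

    @staticmethod
    def bucket(n: int, column_name: str) -> SketchTransformation:
        # Doubled braces keep the {column_name} placeholder in the result.
        return SketchTransformation(f"bucket({n}, {{column_name}})", column_name)

    @staticmethod
    def truncate(length: int, column_name: str) -> SketchTransformation:
        # Assumed template, by analogy with the documented bucket template.
        return SketchTransformation(f"truncate({length}, {{column_name}})", column_name)


def render(transformation: SketchTransformation) -> str:
    """Fill the placeholder with an escaped column name (backticks are an assumption)."""
    escaped = "`" + transformation.column_name.replace("`", "``") + "`"
    return transformation.template.format(column_name=escaped)


print(sketch_partition.bucket(16, "id").template)
print(render(sketch_partition.bucket(16, "id")))
```

Keeping the placeholder in the template until render time means the column name is escaped in one place, which is why passing a helper-built transformation is safer than hand-writing the SQL fragment.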
athena_adapter
def athena_adapter(
data: Any,
partition: Union[str, PartitionTransformation,
Sequence[Union[str, PartitionTransformation]]] = None
) -> DltResource
Prepares data for loading into Athena
Arguments:
data
- The data to be transformed. This can be raw data or an instance of DltResource. If raw data is provided, the function will wrap it into a DltResource object.
partition
- Column name(s) or instances of PartitionTransformation to partition the table by. To use a transformation, it's best to use the methods of the helper class athena_partition to generate correctly escaped SQL in the loader.
Returns:
A DltResource object that is ready to be loaded into Athena.
Raises:
ValueError
- If any hint is invalid or none are specified.
Examples:
data = [{"name": "Marcel", "department": "Engineering", "date_hired": "2024-01-30"}]
athena_adapter(data, partition=["department", athena_partition.year("date_hired"), athena_partition.bucket(8, "name")])
[DltResource with hints applied]
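To illustrate how a mixed `partition` argument like the one in the example might be handled, here is a rough, self-contained sketch. It is not the adapter's actual code: `SketchTransformation`, `normalize_partition`, and `partition_clause` are hypothetical names, treating a plain column name as an identity template is an assumption, and so is the backtick escaping in the rendered clause. The `ValueError` cases mirror the Raises section only loosely.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass(frozen=True)
class SketchTransformation:
    """Hypothetical stand-in for PartitionTransformation."""
    template: str
    column_name: str


Hint = Union[str, SketchTransformation]


def normalize_partition(
    partition: Optional[Union[Hint, Sequence[Hint]]]
) -> List[SketchTransformation]:
    """Normalize one hint or a sequence of hints into a list of transformations."""
    if partition is None:
        raise ValueError("A partition hint must be specified.")
    # A bare string or a single transformation is treated as a one-element list.
    if isinstance(partition, (str, SketchTransformation)):
        partition = [partition]
    result: List[SketchTransformation] = []
    for hint in partition:
        if isinstance(hint, str):
            # Assumption: a plain column name becomes an identity template.
            result.append(SketchTransformation("{column_name}", hint))
        elif isinstance(hint, SketchTransformation):
            result.append(hint)
        else:
            raise ValueError(f"Unsupported partition hint: {hint!r}")
    if not result:
        raise ValueError("A partition hint must be specified.")
    return result


def partition_clause(hints: List[SketchTransformation]) -> str:
    """Render a PARTITIONED BY clause; backtick escaping is an assumption."""
    rendered = [h.template.format(column_name=f"`{h.column_name}`") for h in hints]
    return "PARTITIONED BY (" + ", ".join(rendered) + ")"


hints = normalize_partition([
    "department",
    SketchTransformation("year({column_name})", "date_hired"),
    SketchTransformation("bucket(8, {column_name})", "name"),
])
print(partition_clause(hints))
```

Normalizing to a single list type first keeps the rendering step free of type checks, and it gives one obvious place to reject invalid or empty hints.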