dlt.pipeline.helpers
retry_load
def retry_load(
    retry_on_pipeline_steps: Sequence[TPipelineStep] = ("load",)
) -> Callable[[BaseException], bool]
A retry strategy for Tenacity that, with the default setting, repeats the load step for all exceptions that are not terminal.
Use this condition with Tenacity's retry_if_exception. Terminal exceptions are exceptions that will not go away when the operation is repeated.
Examples: missing configuration values, authentication errors, terminally failed job exceptions, etc.
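from tenacity import Retrying, retry_if_exception, stop_after_attempt
from dlt.pipeline.helpers import retry_load
data = source(...)
# try up to 3 times; with the default argument only the load step is repeated
for attempt in Retrying(stop=stop_after_attempt(3), retry=retry_if_exception(retry_load()), reraise=True):
    with attempt:
        p.run(data)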
Arguments:
retry_on_pipeline_steps (Tuple[TPipelineStep, ...], optional) - which pipeline steps are allowed to be repeated. Default: "load"
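To make the idea of a retry condition concrete, here is a minimal standard-library sketch of a predicate factory in the same spirit. It is not dlt's implementation: StepFailed, TerminalError, make_retry_predicate and run_with_retries are hypothetical names invented for this illustration, and the rule that a chained terminal error also blocks a retry is an assumption.

```python
from typing import Callable, Sequence


class StepFailed(Exception):
    """Hypothetical stand-in for an exception raised by a pipeline step."""

    def __init__(self, step: str, message: str = "") -> None:
        super().__init__(message or f"step {step!r} failed")
        self.step = step


class TerminalError(Exception):
    """Hypothetical stand-in for an error that repeating will not fix."""


def make_retry_predicate(
    retry_on_steps: Sequence[str] = ("load",),
) -> Callable[[BaseException], bool]:
    """Return a predicate that says whether an exception is worth retrying."""

    def _should_retry(ex: BaseException) -> bool:
        # A failure in a step that is not on the allow list is not retried.
        if isinstance(ex, StepFailed) and ex.step not in retry_on_steps:
            return False
        # Terminal errors are never retried, whether raised directly or chained
        # (assumption made for this sketch).
        if isinstance(ex, TerminalError) or isinstance(ex.__cause__, TerminalError):
            return False
        return True

    return _should_retry


def run_with_retries(
    operation: Callable[[], str],
    should_retry: Callable[[BaseException], bool],
    max_attempts: int = 3,
) -> str:
    """Call operation until it succeeds, the predicate declines, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as ex:
            # Re-raise on the last attempt or when the predicate declines a retry.
            if attempt == max_attempts or not should_retry(ex):
                raise
    raise ValueError("max_attempts must be at least 1")
```

A predicate built this way has the same shape as the callable that retry_if_exception expects: it takes the raised exception and returns a bool.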
pipeline_drop Objects
class pipeline_drop()
__init__
def __init__(
    pipeline: "Pipeline",
    resources: Union[Iterable[Union[str, TSimpleRegex]], Union[str, TSimpleRegex]] = (),
    schema_name: Optional[str] = None,
    state_paths: TAnyJsonPath = (),
    drop_all: bool = False,
    state_only: bool = False
) -> None
Prepares a pipeline drop for a particular source/schema and executes it when called. You can review the modifications collected during the dry run through the info property.
A drop operation is a special pipeline run with attached commands that drop the tables corresponding to the resources, drop the associated pipeline state, and replace schema versions in the destination.
Arguments:
pipeline - Pipeline to drop tables and state from
resources - List of resources to drop. If empty, no resources are dropped unless drop_all is True
schema_name - Name of the schema to drop tables from. If not specified, the default schema is used
state_paths - JSON path(s) relative to the source state to drop
drop_all - Drop all resources and tables in the schema (supersedes resources list)
state_only - Drop only state, not tables
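The "prepare first, execute when called" pattern described above can be pictured with a small in-memory sketch. This is not dlt's pipeline_drop: DropPlan, its dictionary-based tables and state, and the keys of its info dictionary are all hypothetical and exist only to illustrate how drop_all and state_only interact with the resources list.

```python
from typing import Dict, Iterable, List


class DropPlan:
    """Hypothetical prepare-then-execute helper; not dlt's pipeline_drop."""

    def __init__(
        self,
        tables: Dict[str, List[dict]],
        state: Dict[str, dict],
        resources: Iterable[str] = (),
        drop_all: bool = False,
        state_only: bool = False,
    ) -> None:
        self._tables = tables
        self._state = state
        # drop_all supersedes the explicit resources list.
        selected = sorted(tables) if drop_all else [r for r in resources if r in tables]
        # The plan is computed up front so it can be inspected before anything changes.
        self.info = {
            "tables": [] if state_only else selected,
            "state_keys": [r for r in selected if r in state],
        }

    def __call__(self) -> None:
        # Apply the plan collected in __init__.
        for name in self.info["tables"]:
            del self._tables[name]
        for key in self.info["state_keys"]:
            del self._state[key]


tables = {"users": [{"id": 1}], "orders": [{"id": 2}]}
state = {"users": {"cursor": 5}, "orders": {"cursor": 9}}

plan = DropPlan(tables, state, resources=["users"])
planned = dict(plan.info)  # inspect the dry run; nothing has been dropped yet
plan()                     # execute the drop
```

Keeping the plan in a plain attribute means a caller can review what would be dropped and simply never call the object if the plan looks wrong.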
prepare_refresh_source
def prepare_refresh_source(
    pipeline: "Pipeline",
    source: DltSource,
    refresh: TRefreshMode
) -> TLoadPackageDropTablesState
Prepare refresh mode on the given source, updating the provided schema and pipeline state to reflect tables being dropped/truncated and the associated resource state.
Behavior depends on the refresh mode:
drop_sources - all tables generated by resources in source will be dropped, and the source state will be wiped.
drop_resources - tables generated by resources that are actually selected in source will be dropped, and the associated resource state will be wiped.
drop_data - tables generated by resources that are actually selected in source will be truncated, and the associated resource state will be wiped.
Returns:
The new load package state containing tables that need to be dropped/truncated.
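The difference between the three refresh modes can be summarized in a few lines of plain Python. The sketch below only models the decision described above; plan_refresh, its arguments, and the keys of the returned dictionary are hypothetical and are not the actual TLoadPackageDropTablesState layout.

```python
from typing import Dict, List, Sequence


def plan_refresh(
    mode: str,
    tables_by_resource: Dict[str, List[str]],
    selected: Sequence[str],
) -> Dict[str, object]:
    """Hypothetical planner: decide which tables to drop or truncate for a mode."""
    if mode == "drop_sources":
        # Every resource in the source is affected, selected or not.
        resources = sorted(tables_by_resource)
    elif mode in ("drop_resources", "drop_data"):
        # Only resources that are actually selected are affected.
        resources = [r for r in sorted(tables_by_resource) if r in selected]
    else:
        raise ValueError(f"unknown refresh mode: {mode!r}")

    tables = [t for r in resources for t in tables_by_resource[r]]
    truncate = mode == "drop_data"
    return {
        "dropped_tables": [] if truncate else tables,
        "truncated_tables": tables if truncate else [],
        # drop_sources wipes the whole source state; the other modes wipe
        # only the state of the affected resources.
        "wipe_source_state": mode == "drop_sources",
        "wiped_resource_states": [] if mode == "drop_sources" else resources,
    }


source_tables = {"users": ["users", "users__emails"], "orders": ["orders"]}
drop_sources_plan = plan_refresh("drop_sources", source_tables, selected=["users"])
drop_data_plan = plan_refresh("drop_data", source_tables, selected=["users"])
```

In this model, drop_sources ignores the selection entirely, while drop_resources and drop_data differ only in whether the selected tables are dropped or truncated.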