Version: 1.5.0 (latest)

extract.utils

get_data_item_format

def get_data_item_format(items: TDataItems) -> TDataItemFormat

Detects the format of the data items passed in items.

Falls back to "object" for empty lists.

Returns:

The data file format.

resolve_column_value

def resolve_column_value(column_hint: TTableHintTemplate[TColumnNames],
                         item: TDataItem) -> Union[Any, List[Any]]

Extracts values from the data item according to a column hint. Returns a single value, or a list of values when the hint is composite (names several columns).
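A minimal sketch of the behaviour described above, not the library's implementation. It assumes a hint is either a single column name, a sequence of column names, or a callable that returns one of those for a given item. The name resolve_hint_sketch and the type aliases are hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence, Union

# Hypothetical aliases standing in for the library's hint types.
ColumnNames = Union[str, Sequence[str]]
Hint = Union[ColumnNames, Callable[[Dict[str, Any]], ColumnNames]]


def resolve_hint_sketch(column_hint: Hint, item: Dict[str, Any]) -> Union[Any, List[Any]]:
    # Assumption: a callable hint is evaluated against the item first.
    columns = column_hint(item) if callable(column_hint) else column_hint
    if isinstance(columns, str):
        # Single column name: return the bare value.
        return item[columns]
    # Composite hint: return one value per column, in hint order.
    return [item[name] for name in columns]


row = {"id": 7, "region": "eu", "updated_at": "2024-01-01"}
single = resolve_hint_sketch("id", row)
composite = resolve_hint_sketch(["id", "region"], row)
dynamic = resolve_hint_sketch(lambda r: "updated_at", row)
```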

ensure_table_schema_columns

def ensure_table_schema_columns(
        columns: TAnySchemaColumns) -> TTableSchemaColumns

Converts supported column schema types to a column dict that can be used in a resource schema.

Arguments:

  • columns - A dict of column schemas, a list of column schemas, or a pydantic model
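The dict and list cases can be illustrated with a small sketch. This is an assumption about the general shape of the conversion, not the library's code: the pydantic model case is left out, and columns_to_dict_sketch and ColumnSchema are hypothetical names.

```python
from typing import Any, Dict, List, Union

ColumnSchema = Dict[str, Any]  # hypothetical stand-in for a column schema


def columns_to_dict_sketch(
    columns: Union[Dict[str, ColumnSchema], List[ColumnSchema]]
) -> Dict[str, ColumnSchema]:
    if isinstance(columns, dict):
        # Assumption: a missing "name" is filled in from the dict key.
        return {name: {"name": name, **col} for name, col in columns.items()}
    # A list of column schemas is keyed by each schema's "name".
    return {col["name"]: dict(col) for col in columns}


normalized = columns_to_dict_sketch(
    [{"name": "id", "data_type": "bigint"}, {"name": "email", "data_type": "text"}]
)
from_dict = columns_to_dict_sketch({"id": {"data_type": "bigint"}})
```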

ensure_table_schema_columns_hint

def ensure_table_schema_columns_hint(
        columns: TTableHintTemplate[TAnySchemaColumns]
) -> TTableHintTemplate[TTableSchemaColumns]

Converts a column schema hint to a hint returning TTableSchemaColumns. A callable hint is wrapped in another function that converts the original result.
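The wrapping of a callable hint can be sketched as follows. Both to_column_dict (a simplified stand-in converter) and wrap_columns_hint_sketch are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List, Union

Columns = List[Dict[str, Any]]
ColumnDict = Dict[str, Dict[str, Any]]


def to_column_dict(columns: Columns) -> ColumnDict:
    # Hypothetical converter: key each column schema by its name.
    return {col["name"]: col for col in columns}


def wrap_columns_hint_sketch(
    hint: Union[Columns, Callable[[Any], Columns]]
) -> Union[ColumnDict, Callable[[Any], ColumnDict]]:
    if callable(hint):
        # Defer conversion: the wrapper converts whatever the original
        # hint returns for a given data item.
        def wrapper(item: Any) -> ColumnDict:
            return to_column_dict(hint(item))

        return wrapper
    # A static hint is converted immediately.
    return to_column_dict(hint)


static_hint = wrap_columns_hint_sketch([{"name": "id"}])
dynamic_hint = wrap_columns_hint_sketch(lambda item: [{"name": key} for key in item])
```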

reset_pipe_state

def reset_pipe_state(pipe: SupportsPipe,
                     source_state_: Optional[DictStrAny] = None) -> None

Resets the resource state for a pipe and all its parent pipes

simulate_func_call

def simulate_func_call(
        f: Union[Any, AnyFun], args_to_skip: int, *args: Any, **kwargs: Any
) -> Tuple[inspect.Signature, inspect.Signature, inspect.BoundArguments]

Simulates a call to a resource or transformer function before it is wrapped for later execution in the pipe.

Returns a tuple containing the signature of f, the modified signature (for transformers), and the bound arguments.
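One way to simulate a call without executing the function is to bind the arguments against its signature with the standard inspect module. The sketch below assumes that args_to_skip counts leading parameters (such as a transformer's data item) that are supplied later. simulate_call_sketch and enrich are hypothetical names.

```python
import inspect
from typing import Any, Callable, Tuple


def simulate_call_sketch(
    f: Callable[..., Any], args_to_skip: int, *args: Any, **kwargs: Any
) -> Tuple[inspect.Signature, inspect.Signature, inspect.BoundArguments]:
    sig = inspect.signature(f)
    # Drop the leading parameters that will be supplied later.
    trimmed = sig.replace(parameters=list(sig.parameters.values())[args_to_skip:])
    # bind() validates the arguments and raises TypeError on a mismatch,
    # all without running the body of f.
    bound = trimmed.bind(*args, **kwargs)
    return sig, trimmed, bound


def enrich(item, factor, suffix="x"):
    return f"{item * factor}{suffix}"


# Skip one leading parameter ("item"), as for a transformer.
sig, trimmed, bound = simulate_call_sketch(enrich, 1, 3, suffix="!")
```

A call with missing or unexpected arguments fails at bind time, so argument errors surface before the function is scheduled for execution.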

wrap_iterator

def wrap_iterator(gen: Iterator[TDataItems]) -> Iterator[TDataItems]

Wraps an iterator into a generator
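A minimal sketch of the idea, assuming the goal is simply to obtain a true generator object (which exposes methods such as close()) from a plain iterator. as_generator_sketch is a hypothetical name.

```python
import inspect
from typing import Generator, Iterator, TypeVar

T = TypeVar("T")


def as_generator_sketch(it: Iterator[T]) -> Generator[T, None, None]:
    # Delegating with `yield from` turns any iterator into a generator.
    yield from it


plain = iter([1, 2, 3])
wrapped = as_generator_sketch(plain)
plain_is_generator = inspect.isgenerator(plain)
wrapped_is_generator = inspect.isgenerator(wrapped)
items = list(wrapped)
```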

wrap_async_iterator

def wrap_async_iterator(
        gen: AsyncIterator[TDataItems]
) -> Generator[Awaitable[TDataItems], None, None]

Wraps an async generator into a generator of awaitables.
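The sketch below illustrates the general pattern of exposing an async generator as a synchronous generator that yields awaitables. It is a simplified illustration, not the library's code: it assumes the consumer awaits each awaitable before requesting the next one, and it uses None as an end-of-stream sentinel. All names are hypothetical.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Generator, List


def awaitables_from_sketch(
    agen: AsyncIterator[Any],
) -> Generator[Awaitable[Any], None, None]:
    exhausted = False

    async def fetch_next() -> Any:
        nonlocal exhausted
        try:
            return await agen.__anext__()
        except StopAsyncIteration:
            exhausted = True
            return None  # sentinel for "no more items" in this sketch

    # Assumption: each yielded awaitable is awaited before the next is requested.
    while not exhausted:
        yield fetch_next()


async def numbers() -> AsyncIterator[int]:
    for n in (1, 2, 3):
        yield n


async def consume() -> List[int]:
    collected = []
    for awaitable in awaitables_from_sketch(numbers()):
        item = await awaitable
        if item is not None:
            collected.append(item)
    return collected


result = asyncio.run(consume())
```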

wrap_parallel_iterator

def wrap_parallel_iterator(f: TAnyFunOrGenerator) -> TAnyFunOrGenerator

Wraps a generator for parallel extraction

wrap_compat_transformer

def wrap_compat_transformer(name: str, f: AnyFun, sig: inspect.Signature,
                            *args: Any, **kwargs: Any) -> AnyFun

Creates a compatible wrapper over a transformer function. A pure transformer function expects the data item as its first argument and one keyword argument called meta.
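A rough sketch of such a compatibility wrapper, assuming a "pure" transformer is one with exactly two parameters, one of them named meta. Any other function is wrapped so that it can be called as wrapper(item, meta=...) with its extra arguments pre-bound. compat_transformer_sketch, scale, and tag are hypothetical names.

```python
import inspect
from typing import Any, Callable


def compat_transformer_sketch(f: Callable[..., Any], *args: Any, **kwargs: Any) -> Callable[..., Any]:
    sig = inspect.signature(f)
    if len(sig.parameters) == 2 and "meta" in sig.parameters:
        # Already in the pure (item, meta) shape: no wrapper needed.
        return f
    accepts_meta = "meta" in sig.parameters

    def wrapper(item: Any, meta: Any = None) -> Any:
        call_kwargs = dict(kwargs)
        if accepts_meta:
            # Forward meta only to functions that declare it.
            call_kwargs["meta"] = meta
        return f(item, *args, **call_kwargs)

    return wrapper


def scale(item, factor):
    return item * factor


def tag(item, meta=None):
    return (item, meta)


scaled = compat_transformer_sketch(scale, 10)(4, meta={"source": "demo"})
unchanged = compat_transformer_sketch(tag)
```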

wrap_resource_gen

def wrap_resource_gen(name: str, f: AnyFun, sig: inspect.Signature, *args: Any,
                      **kwargs: Any) -> AnyFun

Wraps a generator or generator function so it is evaluated on extraction
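Deferred evaluation can be sketched by binding the arguments up front and postponing the call itself. This is an illustration under that assumption, not the library's implementation. defer_resource_sketch and rows are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterator, List

calls: List[str] = []


def defer_resource_sketch(f: Callable[..., Any], *args: Any, **kwargs: Any) -> Callable[[], Any]:
    def deferred() -> Any:
        # Arguments were captured earlier; the call happens only here.
        return f(*args, **kwargs)

    return deferred


def rows(limit: int) -> Iterator[Dict[str, int]]:
    calls.append("started")  # records when the generator body actually runs
    for i in range(limit):
        yield {"id": i}


deferred = defer_resource_sketch(rows, 2)
calls_before = list(calls)  # nothing has run yet
items = list(deferred())    # iteration triggers the generator body
```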
