common.storages.configuration
SchemaStorageConfiguration Objects
@configspec
class SchemaStorageConfiguration(BaseConfiguration)
schema_volume_path
path to volume with default schemas
import_schema_path
path from which to import a schema into storage
export_schema_path
path to which to export a schema from storage
external_schema_format
format in which to expect external schema
NormalizeStorageConfiguration Objects
@configspec
class NormalizeStorageConfiguration(BaseConfiguration)
normalize_volume_path
path to volume where normalized loader files will be stored
ensure_canonical_az_url
def ensure_canonical_az_url(bucket_url: str,
target_scheme: str,
storage_account_name: str = None,
account_host: str = None) -> str
Converts any of the forms of azure blob storage into canonical form of {target_scheme}://<container_name>@<storage_account_name>.{account_host}/path
storage_account_name may be omitted only if the account is already present in bucket_url; account_host defaults to "<storage_account_name>.dfs.core.windows.net".
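The conversion described above can be illustrated with a small standalone sketch. It is not the library's implementation: the function name canonical_az_url_sketch is hypothetical, and the assumed input form is az://<container_name>/<path>, with the container in the netloc position. Other input forms may need handling that this sketch omits.

```python
from typing import Optional
from urllib.parse import urlparse, urlunparse


def canonical_az_url_sketch(
    bucket_url: str,
    target_scheme: str,
    storage_account_name: Optional[str] = None,
    account_host: Optional[str] = None,
) -> str:
    """Hypothetical re-creation of the documented conversion (assumptions noted inline)."""
    parsed = urlparse(bucket_url)
    if parsed.username:
        # Already in <container>@<host> form: only the scheme is swapped.
        return urlunparse(parsed._replace(scheme=target_scheme))
    if not storage_account_name and not account_host:
        raise ValueError("storage_account_name or account_host is required")
    # Default host documented above; assumed to apply when account_host is not given.
    host = account_host or f"{storage_account_name}.dfs.core.windows.net"
    # Assumption: in az://<container>/<path> the container is the netloc.
    netloc = f"{parsed.netloc}@{host}"
    return urlunparse(parsed._replace(scheme=target_scheme, netloc=netloc))


if __name__ == "__main__":
    print(
        canonical_az_url_sketch(
            "az://my-container/data/file.parquet",
            "abfss",
            storage_account_name="myaccount",
        )
    )
```

A url that already carries the account in its netloc passes through with only the scheme replaced, which is why the account name can be left out in that case.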
make_fsspec_url
def make_fsspec_url(scheme: str, fs_path: str, bucket_url: str) -> str
Creates a url from fs_path and scheme, using bucket_url as a url template
Arguments:
- scheme (str) - scheme of the resulting url
- fs_path (str) - kind of absolute path that fsspec uses to locate resources for a particular filesystem
- bucket_url (str) - a url template; the structure of the url will be preserved if possible
FilesystemConfiguration Objects
@configspec
class FilesystemConfiguration(BaseConfiguration)
A configuration defining filesystem location and access credentials.
When the configuration is resolved, bucket_url is used to extract a protocol and request the corresponding credentials class. Protocols listed:
- s3
- gs, gcs
- az, abfs, adl, abfss, azure
- file, memory
- gdrive
- sftp
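As a rough illustration of the protocol extraction step, the sketch below parses the scheme from a bucket_url and looks it up in a table. The names protocol_of, credential_family and CREDENTIAL_FAMILY are hypothetical, the string labels merely stand in for real credentials classes, and treating a url without a scheme as "file" is an assumption.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup table; labels are placeholders, not real class names.
CREDENTIAL_FAMILY = {
    "s3": "aws",
    "gs": "gcp",
    "gcs": "gcp",
    "az": "azure",
    "abfs": "azure",
    "adl": "azure",
    "abfss": "azure",
    "azure": "azure",
    "file": None,    # assumed to need no credentials
    "memory": None,  # assumed to need no credentials
    "gdrive": "gdrive",
    "sftp": "sftp",
}


def protocol_of(bucket_url: str) -> str:
    """Extract the protocol; a url without a scheme is assumed to be a local file path."""
    return urlparse(bucket_url).scheme or "file"


def credential_family(bucket_url: str) -> Optional[str]:
    """Map the protocol of bucket_url to a placeholder credentials label."""
    protocol = protocol_of(bucket_url)
    if protocol not in CREDENTIAL_FAMILY:
        raise ValueError(f"unsupported protocol: {protocol}")
    return CREDENTIAL_FAMILY[protocol]


if __name__ == "__main__":
    for url in ("s3://bucket/data", "gcs://bucket/data", "/var/data"):
        print(url, protocol_of(url), credential_family(url))
```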
read_only
Indicates read-only filesystem access. Enables caching.
max_state_files
Maximum number of pipeline state files to keep; 0 or negative value disables cleanup.
protocol
@property
def protocol() -> str
The protocol of bucket_url.
fingerprint
def fingerprint() -> str
Returns a fingerprint of the bucket scheme and netloc.
Returns:
str - Fingerprint.
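One way to picture a fingerprint built only from scheme and netloc is sketched below. The hash function (SHA-256), the truncation to 16 hex characters, and the name bucket_fingerprint_sketch are all assumptions made for illustration; the actual digest used by the library is not stated here.

```python
import hashlib
from urllib.parse import urlparse


def bucket_fingerprint_sketch(bucket_url: str) -> str:
    """Hash only scheme and netloc, so paths inside the same bucket share a fingerprint."""
    parsed = urlparse(bucket_url)
    base = f"{parsed.scheme}://{parsed.netloc}"
    # Assumption: SHA-256 truncated to 16 hex characters; chosen only for the example.
    return hashlib.sha256(base.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    a = bucket_fingerprint_sketch("s3://bucket/a")
    b = bucket_fingerprint_sketch("s3://bucket/b/c")
    print(a == b)
```

Because the path is excluded from the hashed string, two locations in the same bucket produce the same value, while a different bucket or scheme produces a different one.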
make_url
def make_url(fs_path: str) -> str
Makes a full url (with scheme) from fs_path, which is the kind-of absolute path used by fsspec to identify resources.
This method uses bucket_url to infer the original form of the url.
__str__
def __str__() -> str
Return displayable destination location
is_local_path
@staticmethod
def is_local_path(url: str) -> bool
Checks if url is a local path, without a scheme
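A check of this kind has one subtlety worth showing: a Windows path such as C:\data parses as a url with scheme "c". The sketch below is a hypothetical standalone version (is_local_path_sketch is not the library function) that tests for absolute paths before looking at the scheme. Using both posixpath and ntpath keeps the result independent of the platform running the example, which is a choice made for this sketch.

```python
import ntpath
import posixpath
from urllib.parse import urlparse


def is_local_path_sketch(url: str) -> bool:
    """Return True for plain local paths, False for urls that carry a scheme."""
    # Absolute POSIX or Windows paths are local even if they look like "<letter>:".
    if posixpath.isabs(url) or ntpath.isabs(url):
        return True
    # Anything else is local only when no scheme can be parsed from it.
    return not urlparse(url).scheme


if __name__ == "__main__":
    for candidate in ("/var/data", "data/file.csv", "C:\\data\\file.csv", "s3://bucket/key"):
        print(candidate, is_local_path_sketch(candidate))
```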
make_local_path
@staticmethod
def make_local_path(file_url: str) -> str
Gets a valid local filesystem path from file:// scheme. Supports POSIX/Windows/UNC paths
Returns:
str - local filesystem path
make_file_url
@staticmethod
def make_file_url(local_path: str) -> str
Creates a normalized file:// url from a local path
netloc is never set. UNC paths are represented as file://host/path
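The round trip between a local path and a file:// url, as described for make_local_path and make_file_url, can be sketched for the simplest case. The two helper names below are hypothetical, and the sketch handles absolute POSIX paths only; Windows and UNC paths, which the methods above are documented to support, are deliberately left out.

```python
from urllib.parse import quote, unquote, urlparse


def posix_path_to_file_url(local_path: str) -> str:
    """Build a file:// url with an empty netloc from an absolute POSIX path."""
    if not local_path.startswith("/"):
        raise ValueError("this sketch only handles absolute POSIX paths")
    # quote() percent-encodes spaces and other unsafe characters but keeps "/".
    return "file://" + quote(local_path)


def file_url_to_posix_path(file_url: str) -> str:
    """Recover the POSIX path from a file:// url."""
    parsed = urlparse(file_url)
    if parsed.scheme != "file":
        raise ValueError("expected a file:// url")
    return unquote(parsed.path)


if __name__ == "__main__":
    url = posix_path_to_file_url("/tmp/my data/x.csv")
    print(url)
    print(file_url_to_posix_path(url))
```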