Version: 1.4.0 (latest)

destinations.impl.weaviate.weaviate_adapter

TTokenizationSetting

Maps column names to tokenization types supported by Weaviate

weaviate_adapter

def weaviate_adapter(data: Any,
                     vectorize: TColumnNames = None,
                     tokenization: TTokenizationSetting = None) -> DltResource


Prepares data for the Weaviate destination by specifying which columns should be vectorized and which tokenization method to use.

Vectorization is performed by Weaviate's vectorizer modules. The vectorizer module can be configured in the dlt configuration file under [destination.weaviate.vectorizer] and [destination.weaviate.module_config]. The default vectorizer module is text2vec-openai. See also: https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules

Arguments:

  • data Any - The data to be transformed. It can be raw data or an instance of DltResource. If raw data, the function wraps it into a DltResource object.
  • vectorize TColumnNames, optional - Specifies columns that should be vectorized. Can be a single column name as a string or a list of column names.
  • tokenization TTokenizationSetting, optional - A dictionary mapping column names to tokenization methods supported by Weaviate. Each method must be one of the values in TOKENIZATION_METHODS:
    • 'word',
    • 'lowercase',
    • 'whitespace',
    • 'field'.

Returns:

  • DltResource - A resource with applied Weaviate-specific hints.

Raises:

  • ValueError - If the input for vectorize or tokenization is invalid, or if neither is specified.
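
The argument rules described above can be illustrated with a small standalone sketch. It is not dlt's implementation: the helper name check_adapter_args and the shape of the returned dictionary are assumptions made for illustration only. It mirrors the documented behavior, where vectorize accepts a string or a list of column names, tokenization maps column names to one of the four listed methods, and a ValueError is raised when the input is invalid or when neither argument is given.

```python
from typing import Dict, Optional, Sequence, Union

# Tokenization methods listed in the documentation above.
TOKENIZATION_METHODS = ("word", "lowercase", "whitespace", "field")


def check_adapter_args(
    vectorize: Union[str, Sequence[str], None] = None,
    tokenization: Optional[Dict[str, str]] = None,
) -> Dict[str, Dict[str, object]]:
    """Hypothetical helper (not part of dlt): validate the arguments and
    return the per-column settings they describe."""
    if not vectorize and not tokenization:
        raise ValueError("Specify at least one of 'vectorize' or 'tokenization'.")

    columns: Dict[str, Dict[str, object]] = {}

    if vectorize:
        # A single column name is treated as a one-element list.
        if isinstance(vectorize, str):
            vectorize = [vectorize]
        if not isinstance(vectorize, (list, tuple)) or not all(
            isinstance(name, str) for name in vectorize
        ):
            raise ValueError("'vectorize' must be a column name or a list of column names.")
        for name in vectorize:
            columns.setdefault(name, {})["vectorize"] = True

    if tokenization:
        if not isinstance(tokenization, dict):
            raise ValueError("'tokenization' must map column names to methods.")
        for name, method in tokenization.items():
            if method not in TOKENIZATION_METHODS:
                raise ValueError(
                    f"Unknown tokenization method {method!r} for column {name!r}."
                )
            columns.setdefault(name, {})["tokenization"] = method

    return columns
```

In this sketch, calling check_adapter_args with no arguments, or with a method outside TOKENIZATION_METHODS, raises ValueError, which matches the failure cases the reference lists.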

Examples:

    from dlt.destinations.adapters import weaviate_adapter

    data = [{"name": "Alice", "description": "Software developer"}]
    weaviate_adapter(data, vectorize="description", tokenization={"description": "word"})

[DltResource with hints applied]
