destinations.impl.lancedb.lancedb_client
write_records
def write_records(records: DATA,
*,
db_client: DBConnection,
table_name: str,
write_disposition: Optional[TWriteDisposition] = "append",
merge_key: Optional[str] = None,
remove_orphans: Optional[bool] = False,
filter_condition: Optional[str] = None) -> None
Inserts records into a LanceDB table with automatic embedding computation.
Arguments:
records
- The data to be inserted as payload.
db_client
- The LanceDB client connection.
table_name
- The name of the table to insert into.
merge_key
- Keys for update/merge operations.
write_disposition
- The write disposition - one of 'skip', 'append', 'replace', 'merge'.
remove_orphans
bool - Whether to remove orphans after insertion or not (only merge disposition).
filter_condition
str - If None, then all such rows will be deleted. Otherwise, the condition will be used as an SQL filter to limit what rows are deleted.
Raises:
ValueError
- If the write disposition is unsupported, or id_field_name is not provided for update/merge operations.
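The four write dispositions and the orphan-removal flag are easier to compare on a small in-memory model. The sketch below is not the library's implementation and does not touch LanceDB: `apply_write_disposition` and its `orphan_filter` argument are hypothetical names, rows are plain dictionaries, and a Python predicate stands in for the SQL `filter_condition`. It assumes that an "orphan" is a row already in the table whose merge key does not appear in the incoming records.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def apply_write_disposition(
    existing: List[Row],
    incoming: List[Row],
    write_disposition: str = "append",
    merge_key: Optional[str] = None,
    remove_orphans: bool = False,
    orphan_filter: Optional[Callable[[Row], bool]] = None,  # stand-in for filter_condition
) -> List[Row]:
    """Return the table contents after a write (toy model, assumption-based)."""
    if write_disposition == "skip":
        return list(existing)  # nothing is written
    if write_disposition == "append":
        return list(existing) + list(incoming)
    if write_disposition == "replace":
        return list(incoming)  # old rows are discarded
    if write_disposition == "merge":
        if merge_key is None:
            raise ValueError("merge requires a merge key")
        incoming_by_key = {row[merge_key]: row for row in incoming}
        result: List[Row] = []
        matched = set()
        for row in existing:
            key = row[merge_key]
            if key in incoming_by_key:
                result.append(incoming_by_key[key])  # matched: update in place
                matched.add(key)
            elif remove_orphans and (orphan_filter is None or orphan_filter(row)):
                continue  # orphan: dropped (all orphans when no filter is given)
            else:
                result.append(row)  # unmatched row is kept
        # Incoming rows whose key was not already in the table are inserted.
        result.extend(r for k, r in incoming_by_key.items() if k not in matched)
        return result
    raise ValueError(f"Unsupported write disposition: {write_disposition}")


table = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
batch = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
merged = apply_write_disposition(table, batch, "merge", merge_key="id", remove_orphans=True)
# merged == [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
```

With `remove_orphans` left at False, the same merge keeps the row with id 1 and only updates id 2 and inserts id 3.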
LanceDBClient Objects
class LanceDBClient(JobClientBase, WithStateSync)
LanceDB destination handler.
model_func
The embedder callback used for each chunk.
create_table
@lancedb_error
def create_table(table_name: str,
schema: TArrowSchema,
mode: str = "create") -> "lancedb.table.Table"
Create a LanceDB Table from the provided LanceModel or PyArrow schema.
Arguments:
schema
- The table schema to create.
table_name
- The name of the table to create.
mode
str - The mode to use when creating the table. Can be either "create" or "overwrite". By default, if the table already exists, an exception is raised. If you want to overwrite the table, use mode="overwrite".
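The difference between the two modes can be illustrated with a toy catalog. This is a minimal sketch under stated assumptions, not LanceDB code: `ToyCatalog` is a hypothetical in-memory class, the schema is reduced to a list of column names, and `ValueError` is used as a stand-in for whatever exception the real client raises when the table already exists.

```python
from typing import Any, Dict, List


class ToyCatalog:
    """Hypothetical in-memory stand-in for a database holding named tables."""

    def __init__(self) -> None:
        self.tables: Dict[str, Dict[str, List[Any]]] = {}

    def create_table(
        self, table_name: str, schema: List[str], mode: str = "create"
    ) -> Dict[str, List[Any]]:
        if mode not in ("create", "overwrite"):
            raise ValueError(f"Unknown mode: {mode}")
        if mode == "create" and table_name in self.tables:
            # Default behaviour: refuse to touch an existing table.
            raise ValueError(f"Table {table_name!r} already exists")
        # "overwrite" (or a fresh name) starts from an empty table.
        table: Dict[str, List[Any]] = {"schema": list(schema), "rows": []}
        self.tables[table_name] = table
        return table


catalog = ToyCatalog()
docs = catalog.create_table("docs", ["id", "text"])
docs["rows"].append({"id": 1, "text": "hello"})

# Re-creating with the default mode fails; "overwrite" replaces schema and data.
fresh = catalog.create_table("docs", ["id", "text", "vector"], mode="overwrite")
```

After the second call, `fresh` has the new three-column schema and no rows, because overwriting discards the previous table contents in this model.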
delete_table
def delete_table(table_name: str) -> None
Delete a LanceDB table.
Arguments:
table_name
- The name of the table to delete.
query_table
def query_table(
table_name: str,
query: Union[List[Any], NDArray, Array, ChunkedArray, str, Tuple[Any],
None] = None
) -> LanceQueryBuilder
Query a LanceDB table.
Arguments:
table_name
- The name of the table to query.
query
- The targeted vector to search for.
Returns:
A LanceDB query builder.
drop_storage
@lancedb_error
def drop_storage() -> None
Drop the dataset from the LanceDB instance.
Deletes all tables in the dataset, all their data, and the sentinel table associated with them.
If the dataset name wasn't provided, it deletes all the tables in the current schema.
extend_lancedb_table_schema
@lancedb_error
def extend_lancedb_table_schema(table_name: str,
field_schemas: List[pa.Field]) -> None
Extend LanceDB table schema with empty columns.
Arguments:
table_name
- The name of the table to create the fields on.
field_schemas
- The list of PyArrow Fields to create in the target LanceDB table.
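What "extending with empty columns" means for existing rows can be sketched without PyArrow. The example below is an illustration only: `extend_schema` is a hypothetical helper, a `(name, type)` tuple stands in for a `pa.Field`, and skipping fields that already exist is an assumption of this sketch rather than documented behaviour of the method.

```python
from typing import Any, Dict, List, Tuple

Field = Tuple[str, str]  # (column name, type name); stand-in for pa.Field
Row = Dict[str, Any]


def extend_schema(
    schema: List[Field], rows: List[Row], field_schemas: List[Field]
) -> Tuple[List[Field], List[Row]]:
    """Return a new schema and rows with the added columns set to None."""
    known = {name for name, _ in schema}
    # Assumption: fields whose name is already in the schema are ignored.
    added = [field for field in field_schemas if field[0] not in known]
    new_schema = list(schema) + added
    # Existing rows keep their values; every new column starts out empty.
    new_rows = [{**row, **{name: None for name, _ in added}} for row in rows]
    return new_schema, new_rows


schema = [("id", "int64"), ("text", "string")]
rows = [{"id": 1, "text": "hello"}]
new_schema, new_rows = extend_schema(schema, rows, [("lang", "string")])
# new_rows == [{"id": 1, "text": "hello", "lang": None}]
```

The original `schema` and `rows` lists are left unmodified; the function returns fresh copies.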
get_stored_state
@lancedb_error
def get_stored_state(pipeline_name: str) -> Optional[StateInfo]
Retrieves the latest completed state for a pipeline.
get_stored_schema
@lancedb_error
def get_stored_schema(schema_name: str = None) -> Optional[StorageSchemaInfo]
Retrieves the newest schema from destination storage.