satorbis_kit.vector_operation.wherobots_vector_data_ingestion module

Wherobots Vector Data Ingestion Operations.

This module runs a client-side S3 size check, then submits a Wherobots Cloud job that ingests vector data into the managed vector catalog. When wait_for_completion is True, it also streams the job logs and waits for the job to complete.

satorbis_kit.vector_operation.wherobots_vector_data_ingestion.vector_data_ingestion(s3_path: str, database: str, table: str, partition_column: str | None, unique_columns: List[str], region: str | None = None, column_renames: List[str] | None = None, zorder_columns: List[str] | None = None, format_version: str = '3', geohash_precision: int = 2, wait_for_completion: bool = False, poll_interval: int = 20, log_page_size: int = 200, job_name_prefix: str = 'vector-data-ingestion') → dict

Submit a vector data ingestion job to Wherobots Cloud.

Parameters:
  • s3_path – S3 prefix containing shapefile components.

  • database – Destination database name. Must be vector_catalog.

  • table – Destination table name within vector_catalog.

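  • partition_column – Column to partition by. If None, geohash partitioning is used (see geohash_precision). This parameter has no default, so None must be passed explicitly.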

  • unique_columns – Columns used as the MERGE key.

  • region – Wherobots region override (defaults to configured region).

  • column_renames – Optional list of column renames, each a string in key=value format.

  • zorder_columns – Optional columns for Z-order rewrite.

  • format_version – Iceberg table format version.

  • geohash_precision – Precision for geohash partitioning.

  • wait_for_completion – If True, stream logs and wait for completion.

  • poll_interval – Poll interval in seconds for status/logs.

  • log_page_size – Log page size per API call.

  • job_name_prefix – Prefix for the Wherobots job name.
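
The key=value format of column_renames can be checked before a job is submitted. The sketch below is only an illustration: parse_column_renames is a hypothetical helper, not part of satorbis_kit, and this reference does not say which side of the = is the existing column name, so the helper simply returns the pairs as written.

```python
from typing import Dict, List, Optional


def parse_column_renames(column_renames: Optional[List[str]]) -> Dict[str, str]:
    """Split 'key=value' strings into a dict.

    Hypothetical helper for illustration; not part of satorbis_kit.
    """
    renames: Dict[str, str] = {}
    for item in column_renames or []:
        # partition() splits on the first '=' only, so values may contain '='.
        key, sep, value = item.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        renames[key.strip()] = value.strip()
    return renames
```

Running a check like this locally surfaces malformed entries before a cloud job is started.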

Returns:

Response dictionary from the Wherobots API.
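
A minimal usage sketch based on the signature above. The bucket path, table name, and column names are placeholders, not values from this project, and the submit wrapper is a hypothetical convenience that also lets the call be exercised with a stand-in function.

```python
from typing import Any, Callable, Dict, Optional

# Placeholder arguments; replace with real values for your data.
ingestion_kwargs: Dict[str, Any] = {
    "s3_path": "s3://example-bucket/shapefiles/parcels/",  # hypothetical prefix
    "database": "vector_catalog",     # documented as the required value
    "table": "parcels",               # hypothetical table name
    "partition_column": None,         # no default; None selects geohash partitioning
    "unique_columns": ["parcel_id"],  # hypothetical MERGE key
    "wait_for_completion": True,      # stream logs and wait for completion
    "poll_interval": 20,              # seconds, same as the default
}


def submit(ingest: Optional[Callable[..., dict]] = None) -> dict:
    """Call vector_data_ingestion, or a stand-in passed as `ingest`."""
    if ingest is None:
        # Imported lazily so this sketch can be loaded without the package.
        from satorbis_kit.vector_operation.wherobots_vector_data_ingestion import (
            vector_data_ingestion as ingest,
        )
    return ingest(**ingestion_kwargs)
```

Calling submit() with no argument submits the job through satorbis_kit and returns the response dictionary described above.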