satorbis_kit.vector_operation.wherobots_merge module

Wherobots Vector Merge Operations

Functions for merging vector datasets on Wherobots Cloud.

satorbis_kit.vector_operation.wherobots_merge.merge_vectors_wherobots(input_base_paths: List[str], output_base_path: str, vector_types: List[str] | None = None, api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'vector-merge') → dict

Submit a vector merge job to Wherobots Cloud with a simplified interface.

This function abstracts away API configuration details. Users only need to provide input/output paths.

Parameters:
  • input_base_paths – List of base path patterns for input data (e.g., [“s3://bucket/path1/*/*/*”, “s3://bucket/path2/*/*/*”])

  • output_base_path – Base output path (e.g., “s3://bucket/output/”)

  • vector_types – List of vector types to process (e.g., [“building”, “habitation”, “imaged_area”]). If None, all types are processed.

  • api_key – Wherobots API key. If None, uses hardcoded default.

  • region – Wherobots region. If None, uses hardcoded default.

  • script_base_uri – Base URI where merge scripts are stored. If None, uses hardcoded default.

  • runtime – Runtime size. If None, defaults to “large”.

  • timeout_seconds – Job timeout in seconds. If None, defaults to 14400 (4 hours).

  • job_name_prefix – Prefix for job name. Default: “vector-merge”
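
Because the optional parameters fall back to defaults when left as None, one way to supply overrides without editing code is to read them from the environment. The following is a minimal sketch under stated assumptions: overrides_from_env is a hypothetical helper that is not part of satorbis_kit, and the environment variable names are invented for illustration.

```python
import os
from typing import Any, Dict, Mapping, Optional


def overrides_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect optional keyword arguments from environment variables.

    The variable names below are hypothetical. Unset or empty variables are
    skipped, so the merge function's own defaults still apply to them.
    """
    env = os.environ if env is None else env
    overrides: Dict[str, Any] = {}

    # String-valued parameters map one-to-one onto keyword arguments.
    for kwarg, var in (
        ("api_key", "WHEROBOTS_API_KEY"),
        ("region", "WHEROBOTS_REGION"),
        ("runtime", "WHEROBOTS_RUNTIME"),
    ):
        value = env.get(var)
        if value:
            overrides[kwarg] = value

    # timeout_seconds is documented as an int, so convert it explicitly.
    timeout = env.get("WHEROBOTS_TIMEOUT_SECONDS")
    if timeout:
        overrides["timeout_seconds"] = int(timeout)

    return overrides


# Illustrative values only; in practice the mapping would be os.environ.
sample_env = {"WHEROBOTS_REGION": "example-region", "WHEROBOTS_TIMEOUT_SECONDS": "3600"}
print(overrides_from_env(sample_env))
```

The resulting dictionary could then be unpacked into the call, for example merge_vectors_wherobots(input_base_paths=..., output_base_path=..., **overrides_from_env()).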

Returns:

Dictionary with job submission result, including ‘id’ (run_id)

Example

>>> result = merge_vectors_wherobots(
...     input_base_paths=[
...         "s3://bucket/QC_PASSED/matched/*/*/*",
...         "s3://bucket/QC_PASSED/unmatched/*/*/*",
...     ],
...     output_base_path="s3://bucket/merged/",
... )
>>> run_id = result["id"]
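
The returned dictionary is documented to include 'id'. A minimal sketch of reading it defensively is shown here; extract_run_id is a hypothetical helper, not part of satorbis_kit, and the sample dictionary stands in for a real submission result.

```python
from typing import Any, Mapping


def extract_run_id(result: Mapping[str, Any]) -> str:
    """Return the run ID from a job submission result, or raise if it is missing."""
    run_id = result.get("id")
    if not run_id:
        raise ValueError(f"Job submission result has no 'id': {dict(result)!r}")
    return str(run_id)


# Stand-in for the dictionary returned by merge_vectors_wherobots (made-up value).
example_result = {"id": "run-123"}
print(extract_run_id(example_result))
```

Failing early on a missing 'id' keeps a malformed submission result from surfacing later as a KeyError in unrelated code.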

satorbis_kit.vector_operation.wherobots_merge.merge_vectors_wherobots_simple(input_paths: List[str], output_path: str, api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'vector-merge') → dict

Submit a vector merge job to Wherobots Cloud with any list of input paths.

This function accepts any list of file paths (including regex/wildcards) and merges them.

Parameters:
  • input_paths – List of input file paths/patterns (can include wildcards/regex) (e.g., [“s3://bucket/path1/*.parquet”, “s3://bucket/path2/*/*.parquet”])

  • output_path – Full output path where merged result will be saved

  • api_key – Wherobots API key (optional, uses hardcoded default if None)

  • region – Wherobots region (optional, uses hardcoded default if None)

  • script_base_uri – Base URI for scripts (optional, uses hardcoded default if None)

  • runtime – Runtime size (optional, defaults to “medium”)

  • timeout_seconds – Job timeout in seconds (optional, defaults to 7200, i.e. 2 hours)

  • job_name_prefix – Job name prefix (optional)

Returns:

Dictionary with job submission result

Example

>>> result = merge_vectors_wherobots_simple(
...     input_paths=[
...         "s3://bucket/path1/*/*/*_building.parquet",
...         "s3://bucket/path2/*/*/*_building.parquet",
...     ],
...     output_path="s3://bucket/merged/building_footprint_polygon/",
... )
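
The input patterns in the example above share one shape: a base prefix, two wildcard directory levels, and a file name ending in the vector type. A minimal sketch of generating such a list is given here; build_input_patterns is a hypothetical helper, not part of satorbis_kit, and the directory depth and the `_<type>.parquet` suffix are assumptions taken from that example.

```python
from typing import List


def build_input_patterns(prefixes: List[str], vector_type: str, depth: int = 2) -> List[str]:
    """Build one wildcard pattern per prefix for files of a given vector type.

    depth is the number of wildcard directory levels between the prefix and
    the file name (assumed to be 2, as in the documented example).
    """
    file_glob = f"*_{vector_type}.parquet"
    patterns = []
    for prefix in prefixes:
        # Drop any trailing slash so the joined path has no empty segment.
        parts = [prefix.rstrip("/")] + ["*"] * depth + [file_glob]
        patterns.append("/".join(parts))
    return patterns


patterns = build_input_patterns(
    ["s3://bucket/path1/", "s3://bucket/path2"],
    vector_type="building",
)
for pattern in patterns:
    print(pattern)
```

The resulting list could be passed as input_paths to merge_vectors_wherobots_simple.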