satorbis_kit.vector_operation package
Vector Operations Module
This module contains Wherobots Cloud operations for vector data processing.
- satorbis_kit.vector_operation.buffer_points_wherobots(input_path: str, output_path: str, buffer_distance: float, distance_unit: str = 'meter', api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'buffer-points') -> dict
Submit a buffer generation job to Wherobots Cloud.
- Parameters:
input_path – Input GeoParquet file path (must contain point geometries)
output_path – Output GeoParquet file path
buffer_distance – Buffer distance in the specified unit
distance_unit – Unit for buffer distance (meter, kilometer, foot, mile)
api_key – Wherobots API key (optional, uses hardcoded default if None)
region – Wherobots region (optional, uses hardcoded default if None)
script_base_uri – Base URI for scripts (optional, uses hardcoded default if None)
runtime – Runtime size (optional, defaults to “tiny”)
timeout_seconds – Job timeout in seconds (optional, defaults to 3600)
job_name_prefix – Job name prefix (optional)
- Returns:
Dictionary with job submission result
Example
>>> result = buffer_points_wherobots(
...     input_path="s3://bucket/points.parquet",
...     output_path="s3://bucket/buffered.parquet",
...     buffer_distance=1000,
...     distance_unit="meter",
... )
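The four distance_unit values differ by fixed conversion factors. As a minimal sketch (not the package's own implementation), the hypothetical helper `to_meters` below validates a unit name and converts a distance to meters, which can help sanity-check a buffer size before a job is submitted.

```python
# Hypothetical helper, not part of satorbis_kit: converts a buffer distance
# to meters using the four unit names listed for distance_unit.
METERS_PER_UNIT = {
    "meter": 1.0,
    "kilometer": 1000.0,
    "foot": 0.3048,     # international foot
    "mile": 1609.344,   # international mile
}


def to_meters(distance: float, unit: str = "meter") -> float:
    """Return `distance` in meters; raise ValueError for an unknown unit."""
    try:
        factor = METERS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(
            f"Unsupported distance_unit {unit!r}; "
            f"expected one of {sorted(METERS_PER_UNIT)}"
        ) from None
    return distance * factor
```

For example, `to_meters(2, "kilometer")` gives 2000.0, and an unlisted unit such as "yard" raises ValueError before anything is sent to Wherobots Cloud.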
- satorbis_kit.vector_operation.cancel_job(api_key: str, run_id: str) -> None
Cancel a Wherobots job run.
- Parameters:
api_key – Wherobots API key
run_id – Job run ID
- Raises:
ImportError – If requests library is not available
requests.HTTPError – If the API request fails
- satorbis_kit.vector_operation.dissolve_simplify_wherobots(input_path: str, output_path: str, dissolve_by: str | None = None, simplify_tolerance: float | None = None, api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'dissolve-simplify') -> dict
Submit a dissolve and/or simplify job to Wherobots Cloud.
- Parameters:
input_path – Input GeoParquet file path
output_path – Output GeoParquet file path
dissolve_by – Column name to dissolve by (optional)
simplify_tolerance – Simplify tolerance (optional, Douglas-Peucker algorithm)
api_key – Wherobots API key (optional, uses hardcoded default if None)
region – Wherobots region (optional, uses hardcoded default if None)
script_base_uri – Base URI for scripts (optional, uses hardcoded default if None)
runtime – Runtime size (optional, defaults to “tiny”)
timeout_seconds – Job timeout in seconds (optional, defaults to 3600)
job_name_prefix – Job name prefix (optional)
- Returns:
Dictionary with job submission result
Example
>>> result = dissolve_simplify_wherobots(
...     input_path="s3://bucket/input.parquet",
...     output_path="s3://bucket/output.parquet",
...     dissolve_by="region_name",
...     simplify_tolerance=0.001,
... )
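simplify_tolerance is described as a Douglas-Peucker tolerance. For intuition about what the value controls, here is a minimal pure-Python sketch of the Douglas-Peucker algorithm on a 2D polyline. It is illustrative only: the Wherobots job uses its own implementation, and this sketch assumes the tolerance is expressed in the same units as the coordinates (for example degrees for longitude/latitude data). It measures distance to the chord segment rather than to the infinite line.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment a-b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto a-b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points: List[Point], tolerance: float) -> List[Point]:
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior vertex farthest from the chord first-last.
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = douglas_peucker(points[: max_idx + 1], tolerance)
    right = douglas_peucker(points[max_idx:], tolerance)
    return left[:-1] + right
```

A larger tolerance removes more vertices: a small wiggle of 0.0005 units disappears at a tolerance of 0.001, while a spike of 1 unit survives until the tolerance reaches 1.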
- satorbis_kit.vector_operation.geojson_to_geoparquet_wherobots(input_paths: List[str], output_path: str, api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'geojson-to-geoparquet') -> dict
Submit a GeoJSON to GeoParquet conversion job to Wherobots Cloud.
- Parameters:
input_paths – List of input GeoJSON file paths (can include wildcards)
output_path – Output GeoParquet file path
api_key – Wherobots API key (optional, uses hardcoded default if None)
region – Wherobots region (optional, uses hardcoded default if None)
script_base_uri – Base URI for scripts (optional, uses hardcoded default if None)
runtime – Runtime size (optional, defaults to “tiny”)
timeout_seconds – Job timeout in seconds (optional, defaults to 3600)
job_name_prefix – Job name prefix (optional)
- Returns:
Dictionary with job submission result
Example
>>> result = geojson_to_geoparquet_wherobots(
...     input_paths=["s3://bucket/input.geojson"],
...     output_path="s3://bucket/output.parquet",
... )
- satorbis_kit.vector_operation.get_job_logs(api_key: str, run_id: str, cursor: int = 0, size: int = 100) -> dict
Get logs for a Wherobots job run.
- Parameters:
api_key – Wherobots API key
run_id – Job run ID
cursor – Pagination cursor
size – Number of log entries to fetch
- Returns:
Logs dictionary with items, current_page, and next_page
- Raises:
ImportError – If requests library is not available
requests.HTTPError – If the API request fails
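Because get_job_logs returns one page at a time, reading a full log means following the pagination fields. The sketch below shows one way to do that. It assumes that `next_page` holds the cursor value for the following call and is None (or unchanged) when no further entries are available; that behaviour is not confirmed by this page. The fetch function is passed in as a parameter, so the hypothetical helper `iter_job_logs` can be tried without network access.

```python
from typing import Any, Callable, Dict, Iterator


def iter_job_logs(
    fetch_logs: Callable[..., Dict[str, Any]],
    api_key: str,
    run_id: str,
    page_size: int = 100,
    max_pages: int = 1000,
) -> Iterator[Any]:
    """Yield log entries page by page until no further page is reported."""
    cursor = 0
    for _ in range(max_pages):  # hard cap so a misbehaving API cannot loop forever
        page = fetch_logs(api_key, run_id, cursor=cursor, size=page_size)
        yield from page.get("items", [])
        next_cursor = page.get("next_page")
        # Assumption: None or a non-advancing cursor means "no more logs for now".
        if next_cursor is None or next_cursor == cursor:
            break
        cursor = next_cursor
```

With the real function this would be called as `iter_job_logs(get_job_logs, api_key, run_id)`.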
- satorbis_kit.vector_operation.get_job_status(api_key: str, run_id: str) -> dict
Get the status of a Wherobots job run.
- Parameters:
api_key – Wherobots API key
run_id – Job run ID
- Returns:
Job run details dictionary
- Raises:
ImportError – If requests library is not available
requests.HTTPError – If the API request fails
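A common use of get_job_status is to poll until a run finishes. The following is a minimal sketch of such a loop, not part of the package. It assumes the returned dictionary carries a "status" key and that "COMPLETED", "FAILED" and "CANCELLED" are the terminal values; both should be checked against real responses. The status and cancel functions are injected, so the hypothetical helper `wait_for_job` can be exercised with stand-ins.

```python
import time
from typing import Any, Callable, Dict, Optional

# Assumed terminal states; verify against actual API responses.
TERMINAL_STATUSES = frozenset({"COMPLETED", "FAILED", "CANCELLED"})


def wait_for_job(
    get_status: Callable[[str, str], Dict[str, Any]],
    api_key: str,
    run_id: str,
    poll_interval: float = 20.0,
    max_wait: float = 3600.0,
    cancel: Optional[Callable[[str, str], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the run reaches a terminal status or max_wait is exceeded."""
    waited = 0.0
    while True:
        details = get_status(api_key, run_id)
        if details.get("status") in TERMINAL_STATUSES:
            return details
        if waited >= max_wait:
            # Optionally cancel the run so it does not keep consuming compute.
            if cancel is not None:
                cancel(api_key, run_id)
            raise TimeoutError(f"Run {run_id} not finished after {waited:.0f}s")
        sleep(poll_interval)
        waited += poll_interval
```

With the functions documented on this page, a call would look like `wait_for_job(get_job_status, api_key, run_id, cancel=cancel_job)`.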
- satorbis_kit.vector_operation.list_jobs(api_key: str, region: str | None = None, status: List[str] | None = None, name: str | None = None, size: int = 50) -> dict
List Wherobots job runs.
- Parameters:
api_key – Wherobots API key
region – Filter by region (optional)
status – Filter by status list (optional)
name – Filter by name pattern (optional)
size – Number of results per page
- Returns:
Dictionary with items list and pagination info
- Raises:
ImportError – If requests library is not available
requests.HTTPError – If the API request fails
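The dictionary returned by list_jobs can be summarized locally. As a small sketch, the hypothetical helper `count_runs_by_status` tallies runs per status. It assumes each entry in "items" is a dictionary with a "status" key, which this page does not confirm; entries without one are counted as "UNKNOWN".

```python
from collections import Counter
from typing import Any, Dict


def count_runs_by_status(listing: Dict[str, Any]) -> Counter:
    """Count job runs per status in a list_jobs-style response."""
    # Assumption: each item is a dict that may carry a "status" key.
    return Counter(
        item.get("status", "UNKNOWN") for item in listing.get("items", [])
    )
```

A typical call would be `count_runs_by_status(list_jobs(api_key, name="vector-merge"))`.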
- satorbis_kit.vector_operation.merge_vectors_wherobots(input_base_paths: List[str], output_base_path: str, vector_types: List[str] | None = None, api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'vector-merge') -> dict
Submit a vector merge job to Wherobots Cloud with a simplified interface.
This function abstracts away API configuration details. Users only need to provide input/output paths.
- Parameters:
input_base_paths – List of base path patterns for input data (e.g., [“s3://bucket/path1/*/*/*”, “s3://bucket/path2/*/*/*”])
output_base_path – Base output path (e.g., “s3://bucket/output/”)
vector_types – List of vector types to process, e.g. [“building”, “habitation”, “imaged_area”]. If None, all types are processed.
api_key – Wherobots API key. If None, uses hardcoded default.
region – Wherobots region. If None, uses hardcoded default.
script_base_uri – Base URI where merge scripts are stored. If None, uses hardcoded default.
runtime – Runtime size. Default: “large”
timeout_seconds – Job timeout in seconds. Default: 14400 (4 hours)
job_name_prefix – Prefix for job name. Default: “vector-merge”
- Returns:
Dictionary with job submission result, including ‘id’ (run_id)
Example
>>> result = merge_vectors_wherobots(
...     input_base_paths=[
...         "s3://bucket/QC_PASSED/matched/*/*/*",
...         "s3://bucket/QC_PASSED/unmatched/*/*/*",
...     ],
...     output_base_path="s3://bucket/merged/",
... )
>>> run_id = result["id"]
- satorbis_kit.vector_operation.merge_vectors_wherobots_simple(input_paths: List[str], output_path: str, api_key: str | None = None, region: str | None = None, script_base_uri: str | None = None, runtime: str | None = None, timeout_seconds: int | None = None, job_name_prefix: str = 'vector-merge') -> dict
Submit a vector merge job to Wherobots Cloud with any list of input paths.
This function accepts any list of file paths (including regex/wildcards) and merges them.
- Parameters:
input_paths – List of input file paths/patterns (can include wildcards/regex) (e.g., [“s3://bucket/path1/*.parquet”, “s3://bucket/path2/*/*.parquet”])
output_path – Full output path where merged result will be saved
api_key – Wherobots API key (optional, uses hardcoded default if None)
region – Wherobots region (optional, uses hardcoded default if None)
script_base_uri – Base URI for scripts (optional, uses hardcoded default if None)
runtime – Runtime size (optional, defaults to “medium”)
timeout_seconds – Job timeout in seconds (optional, defaults to 7200)
job_name_prefix – Job name prefix (optional)
- Returns:
Dictionary with job submission result
Example
>>> result = merge_vectors_wherobots_simple(
...     input_paths=[
...         "s3://bucket/path1/*/*/*_building.parquet",
...         "s3://bucket/path2/*/*/*_building.parquet",
...     ],
...     output_path="s3://bucket/merged/building_footprint_polygon/",
... )
- satorbis_kit.vector_operation.submit_job(api_key: str, region: str, script_uri: str, script_args: List[str] | None = None, runtime: str = 'tiny', name: str = 'wherobots-job', version: str = 'latest', timeout_seconds: int = 3600, dependencies: List[dict] | None = None, spark_configs: dict | None = None) -> dict
Submit a Python job to Wherobots Cloud.
- Parameters:
api_key – Wherobots API key
region – Compute region (e.g., ‘aws-ap-south-1’, ‘aws-us-west-2’)
script_uri – S3 URI to the Python script
script_args – List of command-line arguments for the script
runtime – Runtime size (tiny, small, medium, large, etc.)
name – Job run name
version – Wherobots version (‘latest’ or ‘preview’)
timeout_seconds – Job timeout in seconds
dependencies – List of dependency objects (PyPI or FILE)
spark_configs – Dictionary of Spark configuration key-value pairs
- Returns:
Response dictionary from Wherobots API
- Raises:
ImportError – If requests library is not available
requests.HTTPError – If the API request fails
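script_args is a flat list of command-line arguments, so callers usually need to turn named options into that form. The sketch below shows one way to do so. It assumes the target script accepts `--flag value` pairs with dash-separated names, which depends entirely on the script being run; `to_script_args` is a hypothetical helper, not part of the package.

```python
from typing import Any, List, Mapping


def to_script_args(options: Mapping[str, Any]) -> List[str]:
    """Flatten {"input_path": "x"} into ["--input-path", "x"]."""
    args: List[str] = []
    for name, value in options.items():
        if value is None:
            continue  # leave unset options out entirely
        # Assumption: the script uses dash-separated long flags.
        args += ["--" + name.replace("_", "-"), str(value)]
    return args
```

The result could then be passed as `submit_job(api_key, region, script_uri, script_args=to_script_args({...}))`.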
- satorbis_kit.vector_operation.vector_data_ingestion(s3_path: str, database: str, table: str, partition_column: str | None, unique_columns: List[str], region: str | None = None, column_renames: List[str] | None = None, zorder_columns: List[str] | None = None, format_version: str = '3', geohash_precision: int = 2, wait_for_completion: bool = False, poll_interval: int = 20, log_page_size: int = 200, job_name_prefix: str = 'vector-data-ingestion') -> dict
Submit a vector data ingestion job to Wherobots Cloud.
- Parameters:
s3_path – S3 prefix containing shapefile components.
database – Destination database name. Must be vector_catalog.
table – Destination table name within vector_catalog.
partition_column – Column to partition by. If None, geohash is used.
unique_columns – Columns used as the MERGE key.
region – Wherobots region override (defaults to configured region).
column_renames – Optional column renames in key=value format.
zorder_columns – Optional columns for Z-order rewrite.
format_version – Iceberg table format version.
geohash_precision – Precision for geohash partitioning.
wait_for_completion – If True, stream logs and wait for completion.
poll_interval – Poll interval in seconds for status/logs.
log_page_size – Log page size per API call.
job_name_prefix – Prefix for the Wherobots job name.
- Returns:
Response dictionary from the Wherobots API.
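column_renames takes a list of strings in key=value format. As a minimal sketch, the hypothetical helper `format_column_renames` builds that list from a dictionary and rejects entries that would produce an ambiguous string. Which side of the "=" is the existing column name and which is the new one is not stated on this page, so the mapping direction used here is an assumption.

```python
from typing import Dict, List


def format_column_renames(renames: Dict[str, str]) -> List[str]:
    """Turn {"NAME": "name"} into ["NAME=name"] for column_renames."""
    entries: List[str] = []
    for key, value in renames.items():
        # An empty side or an embedded "=" would make the entry ambiguous.
        if not key or not value or "=" in key or "=" in value:
            raise ValueError(f"Invalid rename entry: {key!r} -> {value!r}")
        entries.append(f"{key}={value}")
    return entries
```

The returned list can be passed directly as the column_renames argument.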
Submodules
- satorbis_kit.vector_operation.wherobots_config module
- satorbis_kit.vector_operation.wherobots_geojson module
- satorbis_kit.vector_operation.wherobots_geometry module
- satorbis_kit.vector_operation.wherobots_merge module
- satorbis_kit.vector_operation.wherobots_status module
- satorbis_kit.vector_operation.wherobots_vector_data_ingestion module