satorbis_kit.raster package

satorbis_kit.raster.generate_patches(input_s3_file: str, aws_account_id: str = '686585748973', tile_width: int = 1024, tile_height: int = 1024, overlap: int = 256, output_resolution_cm: int | None = None, s3_intermediate_path: str = 'sentinel_patches/50cm', s3_bucket: str = 'satsure-airflow-pipelines', compress: str = 'LZW', output_format: str = 'tif', job_title: str | None = None, wait_for_completion: bool = False, cost_customer: str = 'agriculture-team', cost_project: str = 'crop-monitoring') → openeo.rest.job.BatchJob

Generate raster patches from large GeoTIFF files using the OpenEO backend with Dask support.

This function splits a large raster file into smaller, manageable patches (tiles) for processing. It leverages the OpenEO backend with Dask-enabled parallel processing for improved performance and supports S3 storage for both input and output.

Parameters:
  • input_s3_file (str) – S3 URI to input raster file. Example: “s3://satsure-geoscape-staging/GEOSCAPE_URBAN_BFP/M0631_Perth_Block1A_85mm_UCEM2f210_MGA2020z50_28-Oct-2024/20250109_M0631_Perth_Block1A_85mm_UCEM2f210_MGA2020z50_28-Oct-2024_tile_0001.tif”

  • aws_account_id (str, optional) – AWS account ID used by the backend for S3 access. Defaults to “686585748973”.

  • tile_width (int, optional) – Width of each patch in pixels. Defaults to 1024.

  • tile_height (int, optional) – Height of each patch in pixels. Defaults to 1024.

  • overlap (int, optional) – Overlapping pixel count between adjacent tiles. Provides context at tile boundaries for ML/AI tasks. Defaults to 256.

  • output_resolution_cm (int, optional) – Output resolution in centimeters. Use 1000 for 10m, 500 for 5m, 50 for 50cm. Can be None to preserve original resolution. Defaults to None.

  • s3_intermediate_path (str, optional) – S3 path prefix for output patches. Defaults to “sentinel_patches/50cm”.

  • s3_bucket (str, optional) – S3 bucket name for output storage. Defaults to “satsure-airflow-pipelines”.

  • compress (str, optional) – Compression method (LZW, DEFLATE, NONE). Defaults to “LZW”.

  • output_format (str, optional) – Output file format (“tif”, “jpeg”, “png”). Defaults to “tif”.

  • job_title (str, optional) – Custom job title. If None, a title is generated from the input filename and notes that Dask support is enabled.

  • wait_for_completion (bool, optional) – If True, blocks until job completes. If False, returns immediately after starting. Defaults to False.

  • cost_customer (str, optional) – Customer for cost tracking. Defaults to “agriculture-team”.

  • cost_project (str, optional) – Project for cost tracking. Defaults to “crop-monitoring”.
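
To get a feel for how tile_width, tile_height and overlap interact, the sketch below lists tile start offsets along one axis and the ground footprint of a tile. It assumes adjacent tiles advance by the tile size minus the overlap and that the last tile is shifted inward to stay inside the raster. That stride and edge rule is an assumption for illustration only, since the backend's tiling scheme is not described here. tile_origins and tile_footprint_m are hypothetical helpers, not part of satorbis_kit.

```python
def tile_origins(raster_size: int, tile_size: int = 1024, overlap: int = 256) -> list[int]:
    """Return tile start offsets (in pixels) along one axis.

    Assumption: stride = tile_size - overlap, last tile flush with the edge.
    """
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_size")
    if raster_size <= tile_size:
        return [0]  # a single tile covers the whole axis
    stride = tile_size - overlap
    origins = list(range(0, raster_size - tile_size, stride))
    origins.append(raster_size - tile_size)  # final tile ends at the raster edge
    return origins


def tile_footprint_m(tile_size: int, output_resolution_cm: int) -> float:
    """Ground extent of one tile side in metres for a given resolution in cm."""
    return tile_size * output_resolution_cm / 100


print(tile_origins(4096))          # [0, 768, 1536, 2304, 3072]
print(tile_footprint_m(1024, 50))  # 512.0
```

Under these assumptions a 4096-pixel axis with the default settings yields five tiles, and a 1024-pixel tile at 50 cm covers 512 m on a side.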

Returns:

OpenEO batch job object that can be monitored and managed. Use job.status() to check status.

Return type:

openeo.rest.job.BatchJob
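
When wait_for_completion is False, the returned job can be polled by hand. The following is a minimal sketch of such a loop. It relies only on job.status() returning one of the status strings defined by the openEO API ("created", "queued", "running", "finished", "error", "canceled"). wait_for_job is a hypothetical helper, not part of satorbis_kit, and the poll interval and timeout are arbitrary choices.

```python
import time

# Terminal job states as defined by the openEO API.
TERMINAL_STATES = {"finished", "error", "canceled"}


def wait_for_job(job, poll_seconds: float = 30.0, timeout_seconds: float = 3600.0) -> str:
    """Poll job.status() until a terminal state is reached or the timeout expires.

    `job` is any object with a status() method, e.g. the returned BatchJob.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = job.status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} s")
        time.sleep(poll_seconds)
```

A caller would typically check the returned string and inspect job.logs() when it is "error".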

Raises:
  • ConnectionError – If unable to connect to OpenEO backend.

  • ValueError – If S3 paths are invalid.
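
To catch a malformed input URI before submitting a job, a caller can run a local check first. The sketch below splits an S3 URI into bucket and key using only the standard library. split_s3_uri is a hypothetical helper, and the rules it applies (scheme must be s3, bucket and key must be non-empty) are assumptions about what counts as invalid, not the library's actual validation.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key); raise ValueError if malformed."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError(f"S3 URI has no object key: {uri!r}")
    return parsed.netloc, key


print(split_s3_uri("s3://bucket/input.tif"))  # ('bucket', 'input.tif')
```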

Examples

Basic usage with defaults and Dask support:

>>> from satorbis_kit.raster import generate_patches
>>> job = generate_patches(
...     input_s3_file="s3://satsure-geoscape-staging/GEOSCAPE_URBAN_BFP/.tif"
... )
>>> print(job.status())

Custom tile size with preserved resolution:

>>> job = generate_patches(
...     input_s3_file="s3://bucket/input.tif",
...     tile_width=2048,
...     tile_height=2048,
...     overlap=512,
...     output_resolution_cm=None,  # Preserve original resolution
...     s3_intermediate_path="patches/original_res",
...     wait_for_completion=True
... )

Monitor job progress with Dask:

>>> job = generate_patches(input_s3_file="s3://bucket/input.tif")
>>> print(job.status())  # Check status
>>> logs = job.logs()    # View logs
>>> job.start_and_wait() # Wait for completion

Using OpenEO directly with Dask-enabled process:

>>> import openeo
>>> from satorbis_kit.constants import Constants
>>> con = openeo.connect(Constants.DEV_URL)
>>> cube = con.datacube_from_process(
...     process_id="generate_patches",  # Uses Dask-enabled function
...     input_s3_file="s3://bucket/input.tif",
...     tile_width=1024,
...     tile_height=1024,
...     overlap=256,
...     output_resolution_cm=None,
...     s3_intermediate_path="sentinel_patches/50cm",
...     s3_bucket="satsure-airflow-pipelines",
...     compress="LZW"
... )
>>> job = cube.create_job(title="Patch Generation Job with Overlap and Dask",
...                       job_options={
...                           "cost_customer": "agriculture-team",
...                           "cost_project": "crop-monitoring",
...                       })
>>> job.start_and_wait()

Note

  • Requires AWS credentials configured for S3 access

  • Requires the OpenEO backend to be reachable at Constants.DEV_URL

  • Output patches are stored in S3 at: s3://{s3_bucket}/{s3_intermediate_path}/

  • Uses Dask for parallel processing to improve performance

  • Cost tracking is enabled through job_options
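
The output location pattern noted above can be computed locally, for example to pass the result on as an input directory for a later step. This is a minimal sketch: patch_output_prefix is a hypothetical helper, and stripping leading and trailing slashes from the path is an assumption about how the backend joins the two parts.

```python
def patch_output_prefix(
    s3_bucket: str = "satsure-airflow-pipelines",
    s3_intermediate_path: str = "sentinel_patches/50cm",
) -> str:
    """Build s3://{s3_bucket}/{s3_intermediate_path}/ with exactly one trailing slash."""
    path = s3_intermediate_path.strip("/")  # assumption: surrounding slashes are ignored
    return f"s3://{s3_bucket}/{path}/"


print(patch_output_prefix())
# s3://satsure-airflow-pipelines/sentinel_patches/50cm/
```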

satorbis_kit.raster.merge_patches(input_s3_path: str, output_s3_path: str, compress: str = 'LZW', job_title: str | None = None, wait_for_completion: bool = False) → openeo.rest.job.BatchJob

Merge previously generated raster patches back into a single GeoTIFF file.

This function takes a directory of raster patches (tiles) and merges them into a single continuous raster file using the OpenEO backend. It handles proper georeferencing and seamless stitching of patches.

Parameters:
  • input_s3_path (str) – S3 URI to directory containing patches to merge. Must end with ‘/’. Example: “s3://bucket/patches/”

  • output_s3_path (str) – S3 URI for merged output file. Example: “s3://bucket/merged.tif”

  • compress (str, optional) – Compression method for output (LZW, DEFLATE, NONE). Defaults to “LZW”.

  • job_title (str, optional) – Custom job title. If None, auto-generated from output filename.

  • wait_for_completion (bool, optional) – If True, blocks until job completes. If False, returns immediately after starting. Defaults to False.

Returns:

OpenEO batch job object that can be monitored and managed. Use job.status() to check status.

Return type:

openeo.rest.job.BatchJob

Raises:
  • ConnectionError – If unable to connect to OpenEO backend.

  • ValueError – If S3 paths are invalid or patches directory is empty.
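
The path conventions for merge_patches (a directory URI ending in '/' for input, a file URI for output) can be checked before a job is submitted. The sketch below is one way to do that locally. check_merge_paths is a hypothetical helper, and its rules are assumptions drawn from the parameter descriptions above rather than the library's own validation. It cannot detect an empty patches directory, which would require listing the bucket.

```python
def check_merge_paths(input_s3_path: str, output_s3_path: str) -> None:
    """Raise ValueError if the paths break the documented conventions."""
    for name, value in (("input_s3_path", input_s3_path),
                        ("output_s3_path", output_s3_path)):
        if not value.startswith("s3://"):
            raise ValueError(f"{name} must be an s3:// URI, got {value!r}")
    if not input_s3_path.endswith("/"):
        raise ValueError("input_s3_path must point to a directory and end with '/'")
    if output_s3_path.endswith("/"):
        raise ValueError("output_s3_path must name a file, not a directory")


# Passes silently for the documented example values.
check_merge_paths("s3://bucket/patches/", "s3://bucket/merged.tif")
```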

Examples

Basic merge operation:

>>> from satorbis_kit.raster import merge_patches
>>> job = merge_patches(
...     input_s3_path="s3://satsure-airflow-pipelines/sentinel_patches/100cm/",
...     output_s3_path="s3://satsure-airflow-pipelines/outputs/merged.tif"
... )
>>> print(job.status())

Wait for completion:

>>> job = merge_patches(
...     input_s3_path="s3://bucket/patches/",
...     output_s3_path="s3://bucket/merged.tif",
...     compress="DEFLATE",
...     wait_for_completion=True
... )
>>> print(f"Merge complete: {job.status()}")

Complete workflow (generate + merge):

>>> from satorbis_kit.raster import generate_patches, merge_patches
>>>
>>> # Generate patches
>>> gen_job = generate_patches(
...     input_s3_file="s3://bucket/large_image.tif",
...     tile_width=1024,
...     tile_height=1024,
...     s3_intermediate_path="temp_patches/",
...     wait_for_completion=True
... )
>>>
>>> # Merge patches back
>>> merge_job = merge_patches(
...     input_s3_path="s3://satsure-airflow-pipelines/temp_patches/",
...     output_s3_path="s3://bucket/final_output.tif",
...     wait_for_completion=True
... )

Using OpenEO directly:

>>> import openeo
>>> from satorbis_kit.constants import Constants
>>> con = openeo.connect(Constants.DEV_URL)
>>> cube = con.datacube_from_process(
...     process_id="merge_patches",
...     input_s3_path="s3://bucket/patches/",
...     output_s3_path="s3://bucket/merged.tif",
...     compress="LZW"
... )
>>> job = cube.create_job(title="Patch Merge Job")
>>> job.start_and_wait()

Note

  • Requires AWS credentials configured for S3 access

  • Requires the OpenEO backend to be reachable at Constants.DEV_URL

  • Input directory must contain valid raster patches

  • Patches must have compatible georeferencing and overlap
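
What "compatible georeferencing" means is not spelled out here. As a rough illustration, the sketch below treats patches as compatible when they share a CRS and a pixel size, and reports the ones that differ from the first patch. PatchInfo and incompatible_patches are hypothetical names, the metadata is assumed to have been read elsewhere (for example from the GeoTIFF headers), and the backend may apply different or stricter rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchInfo:
    """Hypothetical per-patch metadata, gathered by the caller."""
    name: str
    crs: str           # e.g. "EPSG:7850"
    pixel_size: float  # ground size of one pixel, in CRS units


def incompatible_patches(patches: list[PatchInfo], tol: float = 1e-9) -> list[str]:
    """Return names of patches whose CRS or pixel size differs from the first patch."""
    if not patches:
        return []
    ref = patches[0]
    return [
        p.name
        for p in patches[1:]
        if p.crs != ref.crs or abs(p.pixel_size - ref.pixel_size) > tol
    ]


tiles = [
    PatchInfo("tile_0001.tif", "EPSG:7850", 0.5),
    PatchInfo("tile_0002.tif", "EPSG:7850", 0.5),
    PatchInfo("tile_0003.tif", "EPSG:4326", 0.5),
]
print(incompatible_patches(tiles))  # ['tile_0003.tif']
```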

Submodules