satorbis_kit.pgstac.manager.ingestion_manager module

Core STAC Ingestion Manager implementation.

This module contains the core BaseSTACIngestionManager class that provides the base implementation for submitting raster ingestion jobs to STAC via Airflow or compute engine.

class satorbis_kit.pgstac.manager.ingestion_manager.BaseSTACIngestionManager(client: AbstractSTACClient, upload_handler: AbstractRasterUploadHandler | None = None)

Bases: object

Base manager for STAC raster ingestion workflows.

This is the base class that provides the core implementation for STAC ingestion. End users should use STACIngestionManager instead, which adds factory methods for creating manager instances.

This class provides:
  • Input validation and batch processing
  • Cloud storage configuration
  • Raster asset uploads with STAC-compliant naming
  • Job submission and status tracking

client

STAC client instance for API communication (Airflow or OpenEO)

Note

This is a base class. To create instances, use the STACIngestionManager factory methods from_airflow() or from_spatial_engine().

build_stac_remote_path(collection: str, filename: str, subfolder: str | None = None) → str

Return the STAC-compliant remote path for a raster.
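The exact path layout is not documented here. As a rough illustration only, the sketch below assumes a layout of collection, optional subfolder, then filename, joined with forward slashes. The function name sketch_remote_path and the layout itself are assumptions, not the library's implementation.

```python
import posixpath
from typing import Optional


def sketch_remote_path(collection: str, filename: str,
                       subfolder: Optional[str] = None) -> str:
    """Hypothetical layout: <collection>/[<subfolder>/]<filename>."""
    parts = [collection.strip("/")]
    if subfolder:
        # Nested subfolders such as "2024/03" are kept as-is.
        parts.append(subfolder.strip("/"))
    parts.append(filename)
    # posixpath guarantees forward slashes regardless of the local OS.
    return posixpath.join(*parts)


print(sketch_remote_path("ndvi", "scene.tif"))             # ndvi/scene.tif
print(sketch_remote_path("ndvi", "scene.tif", "2024/03"))  # ndvi/2024/03/scene.tif
```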

configure_storage(provider: str, **kwargs: Any) → CloudObjectStore

Initialize cloud storage credentials for uploading local rasters.

Parameters:
  • provider – Either “aws” or “azure”.

  • **kwargs – Provider-specific keyword arguments.

Returns:

Configured CloudObjectStore instance.

get_job_status(job_id: str) → Dict[str, Any]

Get the status of a submitted ingestion job.

Parameters:

job_id – Job ID returned from ingest_rasters

Returns:

Dictionary with job status information

Raises:

APIError – If the API request fails
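A common way to use a status call like this is to poll until the job reaches a terminal state. The sketch below shows that pattern against any callable with the same shape as get_job_status. The "state" key and the terminal values "success" and "failed" are assumptions made for illustration; the keys actually present in the returned dictionary are not documented here.

```python
import time
from typing import Any, Callable, Dict, Tuple


def wait_for_job(get_status: Callable[[str], Dict[str, Any]],
                 job_id: str,
                 *,
                 status_key: str = "state",                       # assumed key
                 terminal_states: Tuple[str, ...] = ("success", "failed"),  # assumed values
                 interval_s: float = 5.0,
                 max_polls: int = 60) -> Dict[str, Any]:
    """Poll get_status(job_id) until a terminal state or max_polls is reached."""
    for attempt in range(max_polls):
        status = get_status(job_id)
        if status.get(status_key) in terminal_states:
            return status
        if attempt < max_polls - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")


# Stand-in for manager.get_job_status, so the sketch runs without a backend.
_states = iter(["queued", "running", "success"])


def fake_get_status(job_id: str) -> Dict[str, Any]:
    return {"job_id": job_id, "state": next(_states)}


result = wait_for_job(fake_get_status, "job-123", interval_s=0)
print(result["state"])  # success
```

With a real manager, the first argument would be manager.get_job_status, and the key and state names should be adjusted to whatever the configured client returns.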

ingest_rasters(raster_s3_urls: List[str], collection: str, ingestion_batch_size: int | None = None, convert_to_cog: bool | None = None, cog_profile: str | None = None, cog_profile_options: Dict[str, Any] | None = None, cog_overview_level: int | None = None, lineage: Any | None = None, ttl: Any = <object object>) → str | List[str]

Submit a raster stacking and STAC ingestion job.

This method validates all inputs, builds the configuration, and submits the job to the configured client. It returns a job ID that can be used to track progress.

If the number of raster S3 URLs exceeds 1000, the list is split into chunks of 1000, and multiple jobs are submitted. In this case, a list of job IDs is returned.
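The splitting rule described above can be sketched as follows. The helper name chunk_urls and the example bucket are illustrative and not part of the library; only the chunk size of 1000 comes from the description.

```python
from typing import List

MAX_URLS_PER_JOB = 1000  # chunk size stated in the description above


def chunk_urls(urls: List[str], size: int = MAX_URLS_PER_JOB) -> List[List[str]]:
    """Split urls into consecutive chunks of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


urls = [f"s3://example-bucket/scene_{i}.tif" for i in range(2500)]
chunks = chunk_urls(urls)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```

Under this rule, 2500 URLs would produce three jobs and therefore three job IDs.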

Parameters:
  • raster_s3_urls – List of S3 URLs to raster files

  • collection – STAC collection name

  • ingestion_batch_size – Batch size for ingestion (optional, default: 100)

  • convert_to_cog – Whether to convert to COG format (optional)

  • cog_profile – COG profile name (e.g., ‘lzw’, ‘deflate’) (optional)

  • cog_profile_options – Profile options for cog_translate (optional)

  • cog_overview_level – Number of overview levels (optional)

  • lineage – Optional lineage information

  • ttl – Time to live in days, as an integer (e.g., ttl=10 means 10 days). If omitted, defaults to 30 days. If None, no TTL is set and the item is permanent. The value is stored in STAC metadata as an expiry date string in properties[“ss:ttl”], computed as today’s date + ttl days (ISO format YYYY-MM-DD).
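The three ttl cases (omitted, None, integer) can be sketched as below. The sentinel _UNSET, the helper name ttl_properties, and the integer check are assumptions for illustration; only the 30-day default, the "ss:ttl" key, and the date arithmetic come from the parameter description.

```python
from datetime import date, timedelta
from typing import Any, Dict, Optional

_UNSET = object()      # hypothetical sentinel: "ttl argument not supplied"
DEFAULT_TTL_DAYS = 30  # default stated in the parameter description


def ttl_properties(ttl: Any = _UNSET, today: Optional[date] = None) -> Dict[str, str]:
    """Return the STAC properties fragment implied by a ttl argument."""
    today = today or date.today()
    if ttl is None:
        return {}  # permanent item: no expiry date is written
    days = DEFAULT_TTL_DAYS if ttl is _UNSET else ttl
    # Assumption: only plain integers are accepted as a number of days.
    if isinstance(days, bool) or not isinstance(days, int):
        raise TypeError("ttl must be an integer number of days or None")
    expiry = today + timedelta(days=days)
    return {"ss:ttl": expiry.isoformat()}  # ISO format YYYY-MM-DD


print(ttl_properties(10, today=date(2024, 1, 1)))    # {'ss:ttl': '2024-01-11'}
print(ttl_properties(today=date(2024, 1, 1)))        # {'ss:ttl': '2024-01-31'}
print(ttl_properties(None, today=date(2024, 1, 1)))  # {}
```

A dedicated sentinel is what lets a function tell "argument omitted" apart from an explicit None, which is consistent with the `<object object>` default shown in the signature.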

Returns:

Unique job ID(s) for tracking the ingestion workflow. Returns a single string if one job is submitted, or a list of strings if multiple jobs are submitted due to large input size (>1000 URLs).

Return type:

Union[str, List[str]]
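Because the return type is either a single string or a list of strings, calling code may find it convenient to normalize the result before tracking jobs. A minimal sketch, with the helper name as_job_id_list chosen for illustration:

```python
from typing import List, Union


def as_job_id_list(result: Union[str, List[str]]) -> List[str]:
    """Normalize an ingest_rasters-style return value to a list of job IDs."""
    # Check for str first: a string is itself iterable, so list("job-1")
    # would wrongly split it into characters.
    return [result] if isinstance(result, str) else list(result)


print(as_job_id_list("job-1"))             # ['job-1']
print(as_job_id_list(["job-1", "job-2"]))  # ['job-1', 'job-2']
```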

Raises:

set_upload_handler(handler: AbstractRasterUploadHandler) → None

Override the default STAC upload path builder.

upload_raster_asset(local_path: str | Path, *, collection: str, acquisition_date: str | date | datetime, tile: str | None = None, subfolder: str | None = None, overwrite: bool = False, ensure_unique: bool = True) → str

Upload a raster by deriving the STAC filename from acquisition data.

Parameters:
  • local_path – Path to local raster file

  • collection – STAC collection name

  • acquisition_date – Date used to build the filename

  • tile – Optional tile identifier appended to the filename

  • subfolder – Optional nested folder under the collection

  • overwrite – Whether to overwrite existing remote objects

  • ensure_unique – When True (default), a UUID is appended to the filename to avoid collisions

Returns:

Remote cloud URL of uploaded file

Return type:

str
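The naming scheme itself is not spelled out in this reference. The sketch below illustrates one plausible way a filename could be derived from the acquisition date, optional tile, and optional UUID suffix. The pattern collection_YYYYMMDD[_tile][_uuid] and the helper name derive_filename are assumptions, not the library's actual convention.

```python
import uuid
from datetime import date, datetime
from pathlib import Path
from typing import Optional, Union


def derive_filename(local_path: Union[str, Path],
                    collection: str,
                    acquisition_date: Union[str, date, datetime],
                    tile: Optional[str] = None,
                    ensure_unique: bool = True) -> str:
    """Hypothetical pattern: <collection>_<YYYYMMDD>[_<tile>][_<uuid>]<ext>."""
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(acquisition_date, datetime):
        day = acquisition_date.date()
    elif isinstance(acquisition_date, date):
        day = acquisition_date
    else:
        day = date.fromisoformat(acquisition_date)  # expects YYYY-MM-DD
    parts = [collection, day.strftime("%Y%m%d")]
    if tile:
        parts.append(tile)
    if ensure_unique:
        parts.append(uuid.uuid4().hex)  # random suffix to avoid collisions
    return "_".join(parts) + Path(local_path).suffix


print(derive_filename("/data/scene.tif", "ndvi", "2024-03-05",
                      tile="T31TCJ", ensure_unique=False))
# ndvi_20240305_T31TCJ.tif
```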