satorbis_kit.pgstac package

STAC ingestion API for satorbis_kit.

This module provides functionality for submitting raster stacking and STAC ingestion workflows, either directly to Airflow or through a spatial engine API. It supports:

  • Input validation with detailed error messages

  • COG (Cloud Optimized GeoTIFF) conversion with customizable parameters

  • Batch processing configuration

  • Job tracking and status monitoring

  • TTL (Time To Live) support for STAC items (specified in days)

## Quick Start

### Simple Usage (Function-based API)

Submit via Airflow:

```python
from satorbis_kit import stack_rasters_and_ingest_via_airflow

job_id = stack_rasters_and_ingest_via_airflow(
    s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection_name="TEST_COL",
    airflow_base_url="https://airflow.example.com",
    airflow_username="admin",
    airflow_password="secret",
    ttl=10,  # TTL in days (defaults to 30 if not provided)
)
```

Submit via Spatial Engine:

```python
from satorbis_kit import stack_rasters_and_ingest_via_spatial_engine

job_id = stack_rasters_and_ingest_via_spatial_engine(
    s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection_name="TEST_COL",
    base_url="https://api.example.com",
    api_key="your-key",
    ttl=10,  # TTL in days (defaults to 30 if not provided)
)
```

### Advanced Usage (Class-based API)

```python
from satorbis_kit import STACIngestionManager

# Build the manager through the documented Airflow factory method;
# the class constructor itself expects a client object.
manager = STACIngestionManager.from_airflow(
    airflow_base_url="https://airflow.example.com",
    airflow_username="admin",
    airflow_password="secret",
)

# With default TTL (30 days)
job_id = manager.ingest_rasters(
    raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection="TEST_COL",
    convert_to_cog=True,
    cog_profile="lzw",
    cog_profile_options={
        "blockxsize": 512,
        "blockysize": 512,
        "predictor": 2,
    },
)

# With custom TTL (10 days)
job_id = manager.ingest_rasters(
    raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection="TEST_COL",
    ttl=10,  # TTL in days
)

# Skip TTL for permanent items
job_id = manager.ingest_rasters(
    raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection="TEST_COL",
    ttl=None,  # Skip TTL - item will never expire
)
```

### TTL (Time To Live) Support

TTL allows you to specify how long STAC items should be retained. The TTL value is specified in days, but is stored in STAC metadata as an expiry date string in properties["ss:ttl"].

  • Default TTL: If ttl is not provided, items default to a 30-day TTL (today + 30 days)

  • Custom TTL: Provide an integer value (e.g., ttl=10 for 10 days), stored as today's date + 10 days in ISO date format (YYYY-MM-DD)

  • Skip TTL: Pass ttl=None for items that should never expire (permanent items). In this case properties["ss:ttl"] is omitted.
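The TTL rules above can be illustrated with a minimal sketch. The helper name `ttl_properties` and its `today` parameter are hypothetical and not part of satorbis_kit; the sketch only mirrors the documented behaviour (days in, ISO expiry date out, key omitted for `ttl=None`) and makes no claim about how the library computes the date internally, for example which time zone it uses.

```python
from datetime import date, timedelta
from typing import Any, Dict, Optional

DEFAULT_TTL_DAYS = 30  # default stated in the TTL section above


def ttl_properties(
    ttl: Optional[int] = DEFAULT_TTL_DAYS,
    today: Optional[date] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the TTL part of a STAC item's properties."""
    if ttl is None:
        # Permanent item: the "ss:ttl" key is left out entirely.
        return {}
    base = today or date.today()
    # Expiry is stored as an ISO date string (YYYY-MM-DD), not as a day count.
    return {"ss:ttl": (base + timedelta(days=ttl)).isoformat()}


print(ttl_properties(10, today=date(2024, 4, 1)))    # {'ss:ttl': '2024-04-11'}
print(ttl_properties(today=date(2024, 4, 1)))        # {'ss:ttl': '2024-05-01'}
print(ttl_properties(None, today=date(2024, 4, 1)))  # {}
```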

class satorbis_kit.pgstac.STACIngestionManager(client: AbstractSTACClient, upload_handler: AbstractRasterUploadHandler | None = None)

Bases: BaseSTACIngestionManager

Manager for STAC raster ingestion workflows.

This is the main public API for STAC ingestion. It provides factory methods for creating manager instances configured for different backends:

  • Airflow: Direct DAG triggering for smaller jobs

  • Spatial Engine: SQS-based queueing for large-scale jobs

The manager handles:

  • Input validation

  • Batch submission and chunking

  • Cloud storage configuration

  • Raster asset uploads with STAC-compliant naming
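As an illustration of what batch chunking means here, the following sketch splits a list of raster URLs into fixed-size batches. The function `chunk_urls` is hypothetical and not part of satorbis_kit; the default of 100 is taken from the documented default for ingestion_batch_size, and the manager's actual chunking logic may differ.

```python
from typing import Iterator, List, Sequence

DEFAULT_BATCH_SIZE = 100  # documented default for ingestion_batch_size


def chunk_urls(
    urls: Sequence[str],
    batch_size: int = DEFAULT_BATCH_SIZE,
) -> Iterator[List[str]]:
    """Hypothetical helper: yield consecutive batches of at most batch_size URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(urls), batch_size):
        yield list(urls[start:start + batch_size])


urls = [f"s3://bucket/TEST_COL/path/{i:04d}.tif" for i in range(250)]
print([len(batch) for batch in chunk_urls(urls)])  # [100, 100, 50]
```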


classmethod from_airflow(airflow_base_url: str, airflow_username: str, airflow_password: str, upload_handler: AbstractRasterUploadHandler | None = None) → STACIngestionManager

Create manager for direct Airflow DAG triggering.

Use this method when you want to trigger Airflow DAGs directly. Suitable for smaller batch jobs.

Parameters:
  • airflow_base_url – Base URL for Airflow API (required)

  • airflow_username – Username for Airflow authentication (required)

  • airflow_password – Password for Airflow authentication (required)

  • upload_handler – Optional custom path builder for uploads

Returns:

STACIngestionManager configured for Airflow

Example

>>> manager = STACIngestionManager.from_airflow(
...     airflow_base_url="https://airflow.example.com",
...     airflow_username="admin",
...     airflow_password="secret",
... )
>>> job_id = manager.ingest_rasters(
...     raster_s3_urls=["s3://bucket/COL/20240401.tif"],
...     collection="TEST_COL",
... )

classmethod from_spatial_engine(base_url: str | None = None, api_key: str | None = None, timeout: int = 30, upload_handler: AbstractRasterUploadHandler | None = None) → STACIngestionManager

Create manager for spatial engine API with SQS queueing.

Use this method when you want to submit jobs through a spatial engine API (e.g., OpenEO) which handles SQS queueing. Better for large batch jobs and scalability.

Parameters:
  • base_url – Base URL for spatial engine API (defaults to https://dev.openeo.satsure.co)

  • api_key – Optional API key for Bearer token authentication

  • timeout – Request timeout in seconds (default: 30)

  • upload_handler – Optional custom path builder for uploads

Returns:

STACIngestionManager configured for spatial engine

Example

>>> manager = STACIngestionManager.from_spatial_engine(
...     base_url="https://api.example.com",
...     api_key="your-api-key",
... )
>>> job_id = manager.ingest_rasters(
...     raster_s3_urls=["s3://bucket/COL/20240401.tif"],
...     collection="TEST_COL",
... )

satorbis_kit.pgstac.stack_rasters_and_ingest_via_airflow(s3_urls: List[str], collection_name: str, airflow_base_url: str, airflow_username: str, airflow_password: str, convert_to_cog: bool | None = None, cog_profile: str | None = None, cog_profile_options: Dict[str, Any] | None = None, cog_overview_level: int | None = None, ingestion_batch_size: int | None = None, ttl: int | None = None, **kwargs: Any) → str | List[str]

Convenience function to submit raster ingestion via Airflow.

This function provides a simple, function-based interface for basic use cases using direct Airflow DAG triggering.

For OpenEO/SQS-based submission, use stack_rasters_and_ingest_via_spatial_engine(). For more control and advanced features, use STACIngestionManager class directly.
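The documented return type is either a single job ID or a list of job IDs, so calling code may want to normalize it before tracking jobs. A minimal sketch, assuming nothing beyond that return type; the helper `as_job_id_list` is hypothetical and not part of satorbis_kit:

```python
from typing import List, Union


def as_job_id_list(result: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: always return job IDs as a list."""
    if isinstance(result, str):
        return [result]
    return list(result)


print(as_job_id_list("job-1"))             # ['job-1']
print(as_job_id_list(["job-1", "job-2"]))  # ['job-1', 'job-2']
```

Downstream code can then iterate over the result without checking which form the submission function returned.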

Parameters:
  • s3_urls – List of S3 URLs to raster files

  • collection_name – STAC collection name

  • airflow_base_url – Airflow API URL (required)

  • airflow_username – Username for authentication (required)

  • airflow_password – Password for authentication (required)

  • convert_to_cog – Whether to convert to COG format (optional)

  • cog_profile – COG profile name (e.g., 'lzw', 'deflate') (optional)

  • cog_profile_options – Profile options for cog_translate (optional)

  • cog_overview_level – Number of overview levels (optional)

  • ingestion_batch_size – Batch size for ingestion (optional, default: 100)

  • ttl – Time To Live in days (optional). Integer representing days (e.g., ttl=10 means 10 days). If not provided, defaults to 30 days. Note: In this function API, passing ttl=None will also default to 30 days. To explicitly skip TTL for permanent items, use the class-based API with ttl=None. The value is stored as an expiry date string in properties["ss:ttl"] in STAC metadata, computed as today's date + ttl days (ISO format YYYY-MM-DD).

  • **kwargs – Additional keyword arguments (for future extensions)

Returns:

Job ID(s) for tracking the ingestion workflow.

Return type:

Union[str, List[str]]

Raises:

satorbis_kit.pgstac.stack_rasters_and_ingest_via_spatial_engine(s3_urls: List[str], collection_name: str, base_url: str | None = None, api_key: str | None = None, convert_to_cog: bool | None = None, cog_profile: str | None = None, cog_profile_options: Dict[str, Any] | None = None, cog_overview_level: int | None = None, ingestion_batch_size: int | None = None, timeout: int = 30, ttl: int | None = None, **kwargs: Any) → str | List[str]

Convenience function to submit raster ingestion via spatial engine API.

This function provides a simple, function-based interface for submitting jobs through a spatial engine API (e.g., OpenEO) with SQS queueing. Better for large batch jobs.

For direct Airflow triggering, use stack_rasters_and_ingest_via_airflow(). For more control and advanced features, use the STACIngestionManager class directly.

Parameters:
  • s3_urls – List of S3 URLs to raster files

  • collection_name – STAC collection name

  • base_url – Spatial engine API base URL (defaults to https://dev.openeo.satsure.co)

  • api_key – Optional API key for Bearer token authentication

  • convert_to_cog – Whether to convert to COG format (optional)

  • cog_profile – COG profile name (e.g., 'lzw', 'deflate') (optional)

  • cog_profile_options – Profile options for cog_translate (optional)

  • cog_overview_level – Number of overview levels (optional)

  • ingestion_batch_size – Batch size for ingestion (optional, default: 100)

  • timeout – Request timeout in seconds (default: 30)

  • ttl – Time To Live in days (optional). Integer representing days (e.g., ttl=10 means 10 days). If not provided, defaults to 30 days. Note: In this function API, passing ttl=None will also default to 30 days. To explicitly skip TTL for permanent items, use the class-based API with ttl=None. The value is stored as an expiry date string in properties["ss:ttl"] in STAC metadata.

  • **kwargs – Additional keyword arguments (for future extensions)

Returns:

Job ID(s) for tracking the ingestion workflow.

Return type:

Union[str, List[str]]

Raises:

Subpackages