satorbis_kit.pgstac package
STAC ingestion API for satorbis_kit.
This module provides functionality for submitting raster stacking and STAC ingestion workflows to Airflow. It supports:
- Input validation with detailed error messages
- COG (Cloud Optimized GeoTIFF) conversion with customizable parameters
- Batch processing configuration
- Job tracking and status monitoring
- TTL (Time To Live) support for STAC items (specified in days)
## Quick Start
### Simple Usage (Function-based API)
Submit via Airflow:
```python
from satorbis_kit import stack_rasters_and_ingest_via_airflow

job_id = stack_rasters_and_ingest_via_airflow(
    s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection_name="TEST_COL",
    airflow_base_url="https://airflow.example.com",
    airflow_username="admin",
    airflow_password="secret",
    ttl=10,  # TTL in days (defaults to 30 if not provided)
)
```
Submit via Spatial Engine:
```python
from satorbis_kit import stack_rasters_and_ingest_via_spatial_engine

job_id = stack_rasters_and_ingest_via_spatial_engine(
    s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection_name="TEST_COL",
    base_url="https://api.example.com",
    api_key="your-key",
    ttl=10,  # TTL in days (defaults to 30 if not provided)
)
```
### Advanced Usage (Class-based API)
```python
from satorbis_kit import STACIngestionManager

# The constructor takes a client object, so build the manager
# through the from_airflow() factory documented below.
manager = STACIngestionManager.from_airflow(
    airflow_base_url="https://airflow.example.com",
    airflow_username="admin",
    airflow_password="secret",
)

# With default TTL (30 days)
job_id = manager.ingest_rasters(
    raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection="TEST_COL",
    convert_to_cog=True,
    cog_profile="lzw",
    cog_profile_options={
        "blockxsize": 512,
        "blockysize": 512,
        "predictor": 2,
    },
)

# With custom TTL (10 days)
job_id = manager.ingest_rasters(
    raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection="TEST_COL",
    ttl=10,  # TTL in days
)

# Skip TTL for permanent items
job_id = manager.ingest_rasters(
    raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
    collection="TEST_COL",
    ttl=None,  # Skip TTL - item will never expire
)
```
### TTL (Time To Live) Support
TTL allows you to specify how long STAC items should be retained. The TTL value is specified in days but is stored in STAC metadata as an expiry date string in properties["ss:ttl"].
- Default TTL: If ttl is not provided, items default to a 30-day TTL (today + 30 days).
- Custom TTL: Provide an integer value (e.g., ttl=10 for 10 days), stored as today's date + 10 days in ISO date format (YYYY-MM-DD).
- Skip TTL: Pass ttl=None through the class-based API for items that should never expire (permanent items). In this case properties["ss:ttl"] is omitted. The function-based API treats ttl=None as "not provided" and applies the 30-day default.
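The mapping from a TTL in days to the stored expiry string can be illustrated with a minimal sketch. This is not the package's implementation: `ttl_to_expiry` and `DEFAULT_TTL_DAYS` are hypothetical names, and the `today` parameter exists only to make the example deterministic.

```python
from __future__ import annotations

from datetime import date, timedelta

DEFAULT_TTL_DAYS = 30  # default described above; the constant name is hypothetical


def ttl_to_expiry(ttl_days: int | None, today: date | None = None) -> str | None:
    """Return the ISO expiry date for a TTL in days, or None when TTL is skipped."""
    if ttl_days is None:
        # Permanent item: no expiry string would be written to the properties.
        return None
    start = today or date.today()
    return (start + timedelta(days=ttl_days)).isoformat()


print(ttl_to_expiry(10, today=date(2024, 4, 1)))                # 2024-04-11
print(ttl_to_expiry(DEFAULT_TTL_DAYS, today=date(2024, 4, 1)))  # 2024-05-01
print(ttl_to_expiry(None))                                      # None
```

Under these assumptions, the returned string is what would appear in properties["ss:ttl"], and a `None` result corresponds to omitting the property.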
- class satorbis_kit.pgstac.STACIngestionManager(client: AbstractSTACClient, upload_handler: AbstractRasterUploadHandler | None = None)
Bases: BaseSTACIngestionManager
Manager for STAC raster ingestion workflows.
This is the main public API for STAC ingestion. It provides factory methods for creating manager instances configured for different backends:
- Airflow: Direct DAG triggering for smaller jobs
- Spatial Engine: SQS-based queueing for large-scale jobs
The manager handles:
- Input validation
- Batch submission and chunking
- Cloud storage configuration
- Raster asset uploads with STAC-compliant naming
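As an illustration of what batch chunking could look like, here is a minimal sketch. It is an assumption about the general technique, not the manager's actual code: `chunk_urls` is a hypothetical helper, and the batch size of 100 mirrors the documented default of `ingestion_batch_size`.

```python
from typing import Iterator, List


def chunk_urls(urls: List[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


# Hypothetical input: 250 raster URLs in one collection.
urls = [f"s3://bucket/TEST_COL/path/{i:04d}.tif" for i in range(250)]
batches = list(chunk_urls(urls, batch_size=100))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

If each batch is submitted as its own job, a single call would plausibly produce several job IDs, which may be why the function-based APIs are typed to return either a string or a list of strings.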
See also
- from_airflow(): Create manager for Airflow backend
- from_spatial_engine(): Create manager for spatial engine backend
- stack_rasters_and_ingest_via_airflow(): Function-based API for Airflow
- stack_rasters_and_ingest_via_spatial_engine(): Function-based API for spatial engine
- classmethod from_airflow(airflow_base_url: str, airflow_username: str, airflow_password: str, upload_handler: AbstractRasterUploadHandler | None = None) -> STACIngestionManager
Create manager for direct Airflow DAG triggering.
Use this method when you want to trigger Airflow DAGs directly. Suitable for smaller batch jobs.
- Parameters:
airflow_base_url – Base URL for Airflow API (required)
airflow_username – Username for Airflow authentication (required)
airflow_password – Password for Airflow authentication (required)
upload_handler – Optional custom path builder for uploads
- Returns:
STACIngestionManager configured for Airflow
Example
>>> manager = STACIngestionManager.from_airflow(
...     airflow_base_url="https://airflow.example.com",
...     airflow_username="admin",
...     airflow_password="secret",
... )
>>> job_id = manager.ingest_rasters(
...     raster_s3_urls=["s3://bucket/COL/20240401.tif"],
...     collection="TEST_COL",
... )
- classmethod from_spatial_engine(base_url: str | None = None, api_key: str | None = None, timeout: int = 30, upload_handler: AbstractRasterUploadHandler | None = None) -> STACIngestionManager
Create manager for spatial engine API with SQS queueing.
Use this method when you want to submit jobs through a spatial engine API (e.g., OpenEO) which handles SQS queueing. Better for large batch jobs and scalability.
- Parameters:
base_url – Base URL for spatial engine API (defaults to https://dev.openeo.satsure.co)
api_key – Optional API key for Bearer token authentication
timeout – Request timeout in seconds (default: 30)
upload_handler – Optional custom path builder for uploads
- Returns:
STACIngestionManager configured for spatial engine
Example
>>> manager = STACIngestionManager.from_spatial_engine(
...     base_url="https://api.example.com",
...     api_key="your-api-key",
... )
>>> job_id = manager.ingest_rasters(
...     raster_s3_urls=["s3://bucket/COL/20240401.tif"],
...     collection="TEST_COL",
... )
- satorbis_kit.pgstac.stack_rasters_and_ingest_via_airflow(s3_urls: List[str], collection_name: str, airflow_base_url: str, airflow_username: str, airflow_password: str, convert_to_cog: bool | None = None, cog_profile: str | None = None, cog_profile_options: Dict[str, Any] | None = None, cog_overview_level: int | None = None, ingestion_batch_size: int | None = None, ttl: int | None = None, **kwargs: Any) -> str | List[str]
Convenience function to submit raster ingestion via Airflow.
This function provides a simple, function-based interface for basic use cases using direct Airflow DAG triggering.
For OpenEO/SQS-based submission, use stack_rasters_and_ingest_via_spatial_engine(). For more control and advanced features, use the STACIngestionManager class directly.
- Parameters:
s3_urls – List of S3 URLs to raster files
collection_name – STAC collection name
airflow_base_url – Airflow API URL (required)
airflow_username – Username for authentication (required)
airflow_password – Password for authentication (required)
convert_to_cog – Whether to convert to COG format (optional)
cog_profile – COG profile name (e.g., ‘lzw’, ‘deflate’) (optional)
cog_profile_options – Profile options for cog_translate (optional)
cog_overview_level – Number of overview levels (optional)
ingestion_batch_size – Batch size for ingestion (optional, default: 100)
ttl – Time To Live in days (optional). An integer number of days (e.g., ttl=10 means 10 days). If not provided, defaults to 30 days. Note: in this function API, passing ttl=None also defaults to 30 days. To explicitly skip TTL for permanent items, use the class-based API with ttl=None. The value is stored as an expiry date string in properties["ss:ttl"] in STAC metadata, computed as today's date + ttl days (ISO format YYYY-MM-DD).
**kwargs – Additional keyword arguments (for future extensions)
- Returns:
Job ID(s) for tracking the ingestion workflow.
- Return type:
Union[str, List[str]]
- Raises:
ValidationError – If any input parameters are invalid
APIError – If Airflow API request fails
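Because the return type is Union[str, List[str]], calling code may want to normalize the result before tracking jobs. The following is a small sketch of one way to do that; `as_job_id_list` is a hypothetical helper and not part of satorbis_kit.

```python
from typing import List, Union


def as_job_id_list(result: Union[str, List[str]]) -> List[str]:
    """Wrap a single job ID in a list; copy an existing list of job IDs."""
    if isinstance(result, str):
        return [result]
    return list(result)


# The job IDs below are made-up placeholders, not real identifiers.
print(as_job_id_list("job-123"))             # ['job-123']
print(as_job_id_list(["job-1", "job-2"]))    # ['job-1', 'job-2']
```

The `isinstance` check on `str` comes first because a string is itself iterable, so calling `list()` on it directly would split it into characters.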
- satorbis_kit.pgstac.stack_rasters_and_ingest_via_spatial_engine(s3_urls: List[str], collection_name: str, base_url: str | None = None, api_key: str | None = None, convert_to_cog: bool | None = None, cog_profile: str | None = None, cog_profile_options: Dict[str, Any] | None = None, cog_overview_level: int | None = None, ingestion_batch_size: int | None = None, timeout: int = 30, ttl: int | None = None, **kwargs: Any) -> str | List[str]
Convenience function to submit raster ingestion via spatial engine API.
This function provides a simple, function-based interface for submitting jobs through a spatial engine API (e.g., OpenEO) with SQS queueing. Better for large batch jobs.
For direct Airflow triggering, use stack_rasters_and_ingest_via_airflow(). For more control and advanced features, use the STACIngestionManager class directly.
- Parameters:
s3_urls – List of S3 URLs to raster files
collection_name – STAC collection name
base_url – Spatial engine API base URL (defaults to https://dev.openeo.satsure.co)
api_key – Optional API key for Bearer token authentication
convert_to_cog – Whether to convert to COG format (optional)
cog_profile – COG profile name (e.g., ‘lzw’, ‘deflate’) (optional)
cog_profile_options – Profile options for cog_translate (optional)
cog_overview_level – Number of overview levels (optional)
ingestion_batch_size – Batch size for ingestion (optional, default: 100)
timeout – Request timeout in seconds (default: 30)
ttl – Time To Live in days (optional). An integer number of days (e.g., ttl=10 means 10 days). If not provided, defaults to 30 days. Note: in this function API, passing ttl=None also defaults to 30 days. To explicitly skip TTL for permanent items, use the class-based API with ttl=None. The value is stored as an expiry date string in properties["ss:ttl"] in STAC metadata, computed as today's date + ttl days (ISO format YYYY-MM-DD).
**kwargs – Additional keyword arguments (for future extensions)
- Returns:
Job ID(s) for tracking the ingestion workflow.
- Return type:
Union[str, List[str]]
- Raises:
ValidationError – If any input parameters are invalid
APIError – If spatial engine API request fails
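A caller may want to catch obviously malformed inputs before submitting a job. The sketch below is a client-side pre-check under stated assumptions; it does not reproduce the package's validation rules. `find_invalid_s3_urls` is a hypothetical helper, and the accepted suffixes are an assumption based on the `.tif` paths used in the examples.

```python
from typing import List
from urllib.parse import urlparse

# Assumption: rasters are GeoTIFF files; the library may accept other formats.
RASTER_SUFFIXES = (".tif", ".tiff")


def find_invalid_s3_urls(urls: List[str]) -> List[str]:
    """Return the URLs that are not of the form s3://<bucket>/<key>.tif[f]."""
    invalid = []
    for url in urls:
        parsed = urlparse(url)
        key = parsed.path.lstrip("/")
        has_raster_key = key.lower().endswith(RASTER_SUFFIXES)
        if parsed.scheme != "s3" or not parsed.netloc or not has_raster_key:
            invalid.append(url)
    return invalid


candidates = [
    "s3://bucket/TEST_COL/path/20240401.tif",
    "https://bucket/TEST_COL/path/20240401.tif",  # wrong scheme
    "s3://bucket/TEST_COL/path/readme.txt",       # not a raster suffix
]
print(find_invalid_s3_urls(candidates))
```

A pre-check like this only reports formatting problems early; the library's own ValidationError remains the authoritative signal for rejected inputs.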
Subpackages
- satorbis_kit.pgstac.clients package
- satorbis_kit.pgstac.exceptions package
- satorbis_kit.pgstac.manager package
- satorbis_kit.pgstac.models package
- satorbis_kit.pgstac.uploader package
- satorbis_kit.pgstac.validations package