satorbis_kit.pgstac.validations.inputs module

Input validation for STAC ingestion workflows.

class satorbis_kit.pgstac.validations.inputs.IngestionInput(*args: Any, **kwargs: Any)

Bases: BaseModel

Validated input parameters for STAC ingestion.

This Pydantic model validates all user inputs before submitting to Airflow. It performs comprehensive validation including:

- S3 URL format and structure
- Filename patterns matching STAC GenericMetaExtractor
- Date validation in filenames
- Collection name format
- COG profile and options
- Batch size constraints
- TTL (Time To Live) validation

All validation errors are raised with detailed, actionable messages.

raster_s3_urls

List of S3 URLs to raster files (required)

Type:

List[str]

collection

STAC collection name (required)

Type:

str

ingestion_batch_size

Batch size for ingestion (optional, default: 100)

Type:

int | None

convert_to_cog

Whether to convert to COG format (optional)

Type:

bool | None

cog_profile

COG compression profile (optional)

Type:

str | None

cog_profile_options

Profile options for cog_translate (optional)

Type:

Dict[str, Any] | None

cog_overview_level

Number of overview levels (optional)

Type:

int | None

lineage

Lineage information (optional)

Type:

Any | None

ttl

Time To Live in days (optional). If explicitly set to None, TTL is skipped and the item is treated as permanent. If not provided, defaults to 30 days. When set, the value must be a positive integer.

Type:

int | None

Example

>>> validated = IngestionInput(
...     raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
...     collection="TEST_COL",
...     convert_to_cog=True,
...     cog_profile="lzw",
... )
>>> # Also accepts .tiff extension
>>> validated = IngestionInput(
...     raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tiff"],
...     collection="TEST_COL",
... )
cog_overview_level: int | None = None
cog_profile: str | None = None
cog_profile_options: Dict[str, Any] | None = None
collection: str = Ellipsis
convert_to_cog: bool | None = None
ingestion_batch_size: int | None = None
lineage: Any | None = None
model_config = {'extra': 'forbid', 'str_strip_whitespace': True}
raster_s3_urls: List[str] = Ellipsis
ttl: int | None = None
validate_cog_profile()

Validate COG profile name.

Parameters:

v – COG profile name to validate

Returns:

Validated COG profile name

Raises:

ValueError – If profile is invalid
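
A minimal sketch of what a profile-name check along these lines could look like. It is not the package's implementation: the helper name `check_cog_profile` and the `KNOWN_COG_PROFILES` set are hypothetical, and the set assumes the accepted names mirror the built-in profiles of rio-cogeo (the library that provides `cog_translate`). The package may accept only a subset.

```python
from __future__ import annotations

# Assumption: accepted names mirror rio-cogeo's built-in COG profiles.
# The real validator may allow fewer (or differently spelled) names.
KNOWN_COG_PROFILES = frozenset(
    {
        "jpeg", "webp", "zstd", "lzw", "deflate", "packbits",
        "lzma", "lerc", "lerc_deflate", "lerc_zstd", "raw",
    }
)


def check_cog_profile(value: str | None) -> str | None:
    """Hypothetical helper: return the profile name if known, else raise."""
    if value is None:
        # The field is optional, so a missing profile passes through.
        return None
    if value not in KNOWN_COG_PROFILES:
        allowed = ", ".join(sorted(KNOWN_COG_PROFILES))
        raise ValueError(f"Unknown COG profile {value!r}; expected one of: {allowed}")
    return value
```

Listing the allowed names in the error text is one way to produce the kind of actionable message the class description mentions.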

validate_cog_profile_options()

Validate COG profile options.

Parameters:

v – COG profile options dictionary to validate

Returns:

Validated options dictionary

Raises:

ValueError – If any option is invalid

validate_collection_name()

Validate collection name format.

Parameters:

v – Collection name to validate

Returns:

Validated collection name

Raises:

ValueError – If collection name is invalid

validate_s3_urls()

Validate each S3 URL for format, structure, and filename patterns.

Parameters:

v – List of S3 URLs to validate

Returns:

Validated list of S3 URLs

Raises:

ValueError – If any URL fails validation
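
The three checks named above (format, structure, filename pattern) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the helper names are hypothetical, and the filename pattern `YYYYMMDD.tif` / `YYYYMMDD.tiff` is inferred only from the examples in this reference, so the real pattern may be broader.

```python
from __future__ import annotations

import re
from datetime import datetime
from urllib.parse import urlparse

# Assumption: filenames are an 8-digit date followed by .tif or .tiff,
# as in the examples in this reference.
_FILENAME_RE = re.compile(r"^(?P<date>\d{8})\.tiff?$")


def check_s3_url(url: str) -> str:
    """Hypothetical helper: return the URL if it passes, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"{url!r}: expected an s3:// URL")
    if not parsed.netloc:
        raise ValueError(f"{url!r}: missing bucket name")

    # The last path segment is the object's filename.
    filename = parsed.path.rsplit("/", 1)[-1]
    match = _FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"{url!r}: filename must look like YYYYMMDD.tif or YYYYMMDD.tiff")

    # Reject strings that have eight digits but are not real calendar dates.
    date_text = match.group("date")
    try:
        datetime.strptime(date_text, "%Y%m%d")
    except ValueError:
        raise ValueError(f"{url!r}: {date_text} is not a valid calendar date") from None
    return url


def check_s3_urls(urls: list[str]) -> list[str]:
    """Validate every URL; the first failure raises."""
    if not urls:
        raise ValueError("at least one S3 URL is required")
    return [check_s3_url(u) for u in urls]
```

Including the offending URL in each message keeps a failure traceable when a long list is submitted.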

validate_ttl()

Validate TTL value.

Parameters:

v – TTL value in days to validate (before type coercion)

Returns:

Validated TTL value

Raises:

ValueError – If TTL is invalid
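
Because the validator sees the value before type coercion, it can reject inputs that would otherwise be silently converted. The sketch below shows one way to do that. The helper name `check_ttl` is hypothetical, and rejecting booleans, strings, and floats outright is an assumption about strictness rather than documented behavior.

```python
from __future__ import annotations

from typing import Any


def check_ttl(value: Any) -> int | None:
    """Hypothetical helper: accept None or a positive integer number of days."""
    if value is None:
        # None means no expiry is applied.
        return None
    # bool is a subclass of int, so it has to be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(
            f"ttl must be an integer number of days, got {type(value).__name__}"
        )
    if value <= 0:
        raise ValueError(f"ttl must be a positive number of days, got {value}")
    return value
```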

satorbis_kit.pgstac.validations.inputs.validate_inputs(**kwargs) → IngestionInput

Validate ingestion inputs using Pydantic.

This convenience function wraps IngestionInput validation and converts Pydantic validation errors into the custom ValidationError.

Parameters:

**kwargs – Keyword arguments for ingestion (see IngestionInput for fields)

Returns:

Validated IngestionInput model

Raises:

ValidationError – If validation fails; the exception carries a detailed error message

Example

>>> validated = validate_inputs(
...     raster_s3_urls=["s3://bucket/TEST_COL/path/20240401.tif"],
...     collection="TEST_COL",
...     ingestion_batch_size=100,
... )