This document explains the fields of a SpatioTemporal Asset Catalog (STAC) Item
. Each is a GeoJSON feature,
plus a few required fields that identify the time range and assets of the item. An Item
is the core
granular entity in a STAC, containing the core metadata that enables any client to search or crawl online
catalogs of spatial 'assets' - satellite imagery, derived data, DEM's, etc.
The same Item
definition is used in both 'static catalogs' and 'catalog apis'. Static catalogs are simply
sets of items that are linked online, generally served by simple web servers and used for crawling data.
Catalog API's enable dynamic queries, for example selecting all Item
s in Hawaii on June 3, 2015, but the
results they return are FeatureCollections of items.
Item
s are very flexible - any JSON object that contains all the required fields is a valid STAC Item
. There
are emerging best practices, which in time will evolve in to specification extensions for particular domains or
uses.
This is still an early version of the STAC spec, expect that there may be some changes before everything is finalized.
Implementations are encouraged, however, as good effort will be made to not change anything too drastically. Using the specification now will ensure that needed changes can be made before everything is locked in. So now is an ideal time to implement, as your feedback will be directly incorporated.
element | type info | name | description |
---|---|---|---|
id | string | Provider ID | Provider ID for the item |
geometry | geojson | Geometry | Bounding Box + Footprint of the item in lat/long (EPSG 4326) |
start | date and time | Date & Time Start | First date/time in UTC (Combined date and time representation) |
end | date and time | Date & Time End | Last date/time in UTC (Combined date and time representation) |
links | array | Resource Links | Array of link objects to resources and related URLs (self and thumbnail required) |
assets | array | Assets | Array of asset objects that can be be download (at least one required) |
provider | string | Provider (optional) | Provider name |
license | string | Data License (optional) | Item's license name based on SPDX License List or following guidelines for non-SPDX licenses |
The Item
is a JSON object, specifically a GeoJSON Feature
, that validates against the stac_item
JSON Schema document. Listed below are a set of instructions to validate implementations, but any JSON Schema validation using the
STAC Item schema works great.
id uses the GeoJSON id field. As most geospatial assets are already defined by some identification scheme by the data provider it is recommended to simply use that ID. In time there may be a best practice recommendation for creating new ID's, or a method for globally unique identifiers, but for now data providers are advised to include sufficient information to make their ID's globally unique, including things like unique satellite id's.
geometry defines the full footprint of the asset represented by this item, formatted according to RFC7946 - GeoJSON. The footprint should be the default GeoJSON geometry, though additional geometries can be included. All geometries should
be either Polygons or MultiPolygons, as assets represent an area, not a line or point. Bounding Boxes are required, on the 'Feature'
level in GeoJSON, and most software can easily generate BBOX's for footprints. This is to enable more naive clients to easily index
and search geospatially. GeoJSON is specified in Long/Latitude - EPSG code 4326, and the 'geometry' element of all STAC Items
should be the same.
start and end are the date-times that represent the beginning and end times of the capture of the asset. They are formatted
according to RFC 3339, section 5.6. It is acceptable to have start
and end
to
be the same as one another, generally representing a single capture. Many satellites do record the beginning and end of their
capture, down to milliseconds, so if those are available then it is recommended to use them. But things like drone captures are often
composites over several hours, and mosaics can use data from months or even years to make a coherent capture.
links are used primarily to represent relationships with other entities, relying on the rel
attribute on links to describe relationships. Link objects require an 'href' to provide the actual link - relative and
absolute links are both allowed. And they also require a 'rel' field, which describes the relationship of the link.
This can be filled with any string, and best practices are emerging and will be defined in the future. Implementors are
advised to be liberal with the links section, to describe things like the catalog an item is in, related items, parent or
child items (modeled in different ways, like an 'acquisition' or derived data). The 'self' link is required, to represent the
location that the Item
can be found online. This is particularly useful when in a download package that includes Item
metadata, so that the downstream user can know where the data has come from.
thumbnail is a special required link
that is a downsampled image of the core asset, for direct display online in a web page or
interactive search application. Even assets that are less easily translated in to a visual image should provide some visual representation,
that users can browse. For example a SAR asset can render an elevation mask or hillshade for display.
assets consists of links to data associated with the Item
that can be downloaded. This should include the main asset, as well
as any 'sidecar' files that are related and help a client make sense of the data. Examples of this include extended metadata (in XML,
JSON, etc), unusable data masks, satellite ephemeris data, etc. Some assets (like Landsat data) are represented by multiple files -
all should be linked to. It is generally recommended that different processing levels or formats are not exhaustively listed in an
Item
, but instead are represented by related Items
that are linked to, but the best practices around this are still emerging.
An asset object currently just requires a href field, which can be a relative and absolute links, to provide a link to the
asset for download or streaming access. The 'name' field is optional, and is generally the display name for clients & users.
The asset definition will likely evolve soon to be able to explain itself more.
provider is an optional field that lists the name of the provider. This field will likely evolve a bit to be useful for querying and filtering, but for now is just the name.
license specifies the license that the data is available under. For public data licenses this should be an identifier from the SPDX License list. The practice for non-public data is still evolving, but it is recommended to link to the actual license text the data is available under.
Currently the JSON schema for links does not require a URI formatting, to give the option for implementors to provide relative links. In general Catalog API's should aim for absolute links whenever possible. But Static Catalogs are potentially more portable if they can be implemented with relative links, so that every link doesn't need to be rewritten when the data is copied.
See the sample.json for a minimal example, as well as sample-full.json for a more fleshed example that contains a number of current best practices. There are also a few more real world inspired samples in the examples/ folder.
Any JSON Schema validation tool can be used, just run the json data to test against the stac-item.json schema, and be sure to include geojson.json schema in the testing.
See the README for instructions on how to install and run a javascript validator.
Warning - Not all validation is fully complete. The validator does not yet check for self links. The href checking is probably too loose right now, it just checks for a string, see the 'relative vs absolute links' section above for reasons why.