
3. Latimer Core Schemes

Kate Webbink edited this page Aug 23, 2024 · 7 revisions

LtC does not impose a single structure on its implementations. Instead, it provides a set of classes that give users a standardised way to document the model that they choose to use.

3.1 Introduction to Latimer Core Schemes

When describing large collections, it is anticipated that the same collection can be described using different schemes for different purposes. For instance, a museum collection may be described based on “famous named” collections or collectors (e.g., Darwin, Spruce) if an aggregator needs to “find” lost specimens from previously formed collections. The same collection may be described in whole or in part based on taxonomic or geographic properties for the purpose of environmental or taxonomic research or funding. For potential examples of existing use cases that could be defined as Latimer Core Schemes, see the Use Cases section later in this document.

The ltc:LatimerCoreScheme class, and the supporting ltc:SchemeTerm and ltc:SchemeMeasurementOrFact classes are intended to provide some parameters around the purpose and expectations of the descriptions and to indicate if objects within the descriptions are assigned attributes that will cause errors in metrics if not explicitly noted.

Using these three classes enables you to build a ‘profile’ for your LtC implementation, so that you may:

  1. describe the purpose of your Latimer Core Scheme (using the LatimerCoreScheme.basisOfScheme property)
  2. define whether the ObjectGroups within the scheme overlap (i.e., a single object might be represented in more than one ObjectGroup) or are distinct (using the LatimerCoreScheme.isDistinctObjects property)
  3. apply restrictions on which terms within the overall LtC standard can be included, and which are mandatory (using the SchemeTerm class)
  4. define the metrics that you want to be included in the scheme via the MeasurementOrFact class (using the SchemeMeasurementOrFact class)

Essentially, LtC is a fairly broad and flexible standard which can be applied in multiple ways. While this allows it to support a broad range of collection description use cases, it also presents a risk that if its use isn’t constrained appropriately to fit the use case, data coherency and usability may be compromised. In particular, defining 1. common metrics and 2. controlled vocabularies for appropriate terms are vital steps for making sure that the data are consistent and interoperable. The Latimer Core Scheme concept and related LtC standard terms are intended to help to support this process.

3.2 Defining a Latimer Core Scheme - an example process

Below is an example of steps that you can take to begin defining a new Latimer Core Scheme in the LtC standard, using the LatimerCoreScheme, SchemeTerm and SchemeMeasurementOrFact classes and properties.

Step 1: Use LatimerCoreScheme for the basic definition of your scheme

{
    "@context": {
        "ltc": "http://rs.tdwg.org/ltc/terms/"
    },

    "@type": "ltc:LatimerCoreScheme",
    "schemeName": "NHM London departmental collections",
    "basisOfScheme": "Collections inventory",
    "isDistinctObjects": true
}

This provides a name for the scheme, allowing it to be distinguished from other schemes that might be in the same dataset, and what the scheme is intended to be for. It also dictates that no single object is expected to be represented in more than one ObjectGroup within the scheme, so it should be safe to aggregate metrics within the scheme without the risk of, for example, counting the same object multiple times.
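As a rough illustration of how a consumer might act on isDistinctObjects, the sketch below only sums per-ObjectGroup counts when the scheme declares its ObjectGroups to be distinct. The function name total_object_count and the sample counts are hypothetical, and the scheme is represented as a plain Python dictionary rather than parsed JSON-LD.

```python
from typing import Iterable


def total_object_count(scheme: dict, object_counts: Iterable[int]) -> int:
    """Sum per-ObjectGroup counts, refusing when ObjectGroups may overlap."""
    # Treat a missing flag as "not distinct" (an assumption, chosen to be cautious).
    if not scheme.get("isDistinctObjects", False):
        raise ValueError(
            "ObjectGroups may overlap; a simple sum could double-count objects"
        )
    return sum(object_counts)


scheme = {
    "@type": "ltc:LatimerCoreScheme",
    "schemeName": "NHM London departmental collections",
    "basisOfScheme": "Collections inventory",
    "isDistinctObjects": True,
}

# Hypothetical object counts for three ObjectGroups in the scheme.
print(total_object_count(scheme, [1200, 350, 48]))
```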

Step 2: Add SchemeTerm to define the terms that are allowed in the dataset

{
    "@context": {
        "ltc": "http://rs.tdwg.org/ltc/terms/"
    },

    "@type": "ltc:LatimerCoreScheme",
    "schemeName": "NHM London departmental collections",
    "basisOfScheme": "Collections inventory",
    "isDistinctObjects": true,

    "ltc:hasSchemeTerm": [
        {
            "@type": "ltc:SchemeTerm",
            "termName": "ObjectGroup.collectionName",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "ObjectGroup.preservationMethod",
            "isMandatoryTerm": false,
            "isRepeatableTerm": true
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "Taxon",
            "isMandatoryTerm": false,
            "isRepeatableTerm": true
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "ObjectGroup.Identifier",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "ObjectGroup.Identifier.identifierValue",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "OrganisationalUnit",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "OrganisationalUnit.organisationalUnitName",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "OrganisationalUnit.organisationalUnitType",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "StorageLocation",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "StorageLocation.locationName",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        },{
            "@type": "ltc:SchemeTerm",
            "termName": "StorageLocation.locationType",
            "isMandatoryTerm": true,
            "isRepeatableTerm": false
        }
    ]
}

In human-readable terms, this says:

"We want the collection to be broken down by department and building, so each subcollection must have one, and only one, department (OrganisationalUnit) and building (StorageLocation) attached. We need to have the type and name for each of those things so that we know what they are. Each subcollection should also have a single short name (ObjectGroup.collectionName) and an identifier (ObjectGroup.Identifier) so that humans and machines can tell them apart. It may be useful, but not critical, to get an idea of the taxa represented and preservation methods used in each of those subcollections. However not everyone will have the time to add that data, so we’ll make that option available but not force it at this stage."

This has implications for the structure of the data, as the isMandatoryTerm = “true” and isRepeatableTerm = “false” values for OrganisationalUnit and StorageLocation dictate that there should be one ObjectGroup created for every combination of (in this example) department and building. More information on LtC standard modeling approaches can be found in the ObjectGroups and relationships section.
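The SchemeTerm rules above could be checked mechanically. The following is a minimal sketch, not part of the LtC standard: it assumes each ObjectGroup has been flattened into a dictionary that maps a termName to a list of values, and the function name check_terms is hypothetical. Only three of the SchemeTerms from the example are used, to keep it short.

```python
from typing import Dict, List


def check_terms(scheme_terms: List[dict], record: Dict[str, list]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    allowed = {term["termName"] for term in scheme_terms}
    for term in scheme_terms:
        name = term["termName"]
        values = record.get(name, [])
        if term["isMandatoryTerm"] and not values:
            problems.append(f"missing mandatory term: {name}")
        if not term["isRepeatableTerm"] and len(values) > 1:
            problems.append(f"term is not repeatable: {name}")
    # Terms that the scheme does not list are reported as not allowed.
    for name in record:
        if name not in allowed:
            problems.append(f"term not allowed by scheme: {name}")
    return problems


scheme_terms = [
    {"termName": "ObjectGroup.collectionName",
     "isMandatoryTerm": True, "isRepeatableTerm": False},
    {"termName": "ObjectGroup.preservationMethod",
     "isMandatoryTerm": False, "isRepeatableTerm": True},
    {"termName": "ObjectGroup.Identifier.identifierValue",
     "isMandatoryTerm": True, "isRepeatableTerm": False},
]

# Hypothetical flattened ObjectGroup with no identifier value supplied.
record = {
    "ObjectGroup.collectionName": ["Entomology, Building A"],
    "ObjectGroup.preservationMethod": ["Pinned", "Fluid preserved"],
}

print(check_terms(scheme_terms, record))
```

Here the only reported problem is the missing identifier value; the two preservation methods are accepted because that term is repeatable.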

As shown in the example above, dot notation may be used in the ltc:termName values to be explicit about a. which class a property belongs to (“<class name>.<property name>”) and b. which class a generic class should be attached to (“<class name>.<class name>”).

For example, the Identifier class may be attached to multiple other classes within LtC. By using “ObjectGroup.Identifier” and “ObjectGroup.Identifier.identifierValue” when defining those terms as being mandatory in the LatimerCoreScheme above, we can be clear that these SchemeTerms are setting rules about identifiers that are linked to an ObjectGroup, not identifiers that are linked to (for example) a Person or a Taxon.
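One possible way to interpret these dot-notation values in code is sketched below. It relies on an assumption drawn only from the naming pattern in the examples above, namely that class names begin with an upper-case letter and property names with a lower-case letter; the helper name split_term_name is hypothetical.

```python
from typing import List, Optional, Tuple


def split_term_name(term_name: str) -> Tuple[List[str], Optional[str]]:
    """Split a termName into its class path and, if present, its property."""
    parts = term_name.split(".")
    # Assumption: a trailing lower-case segment is a property, not a class.
    if parts[-1][:1].islower():
        return parts[:-1], parts[-1]
    return parts, None


print(split_term_name("ObjectGroup.Identifier.identifierValue"))
print(split_term_name("ObjectGroup.Identifier"))
print(split_term_name("Taxon"))
```

The first call separates the class path (ObjectGroup, then Identifier) from the identifierValue property, while the other two return a class path with no property.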

Step 3: Add SchemeMeasurementOrFact to define the quantitative and qualitative measures that we want to include in the dataset

{
    "@context": {
        "ltc": "http://rs.tdwg.org/ltc/terms/"
    },

    "@type": "ltc:LatimerCoreScheme",
    "schemeName": "NHM London departmental collections",
    "basisOfScheme": "Collections inventory",
    "isDistinctObjects": true,

    "ltc:hasSchemeTerm": [...],

    "ltc:hasSchemeMeasurementOrFact": [
        {
            "@type": "ltc:SchemeMeasurementOrFact",
            "schemeMeasurementType": "Object count",
            "isMandatoryMetric": true,
            "isRepeatableMetric": false
        },{
            "@type": "ltc:SchemeMeasurementOrFact",
            "schemeMeasurementType": "Percentage barcoded",
            "isMandatoryMetric": true,
            "isRepeatableMetric": false
        },{
            "@type": "ltc:SchemeMeasurementOrFact",
            "schemeMeasurementType": "Historical narrative",
            "isMandatoryMetric": false,
            "isRepeatableMetric": true
        }
    ]
}

In human-readable terms, this says:

"For every subcollection, we expect people to provide one, and only one, estimate or count of the number of objects in that subcollection, and the same for the percentage of those objects that have been barcoded. If people have the time, we’ll also provide the facility to add a historical narrative to describe the subcollection, but make that optional."

The implication of this is that we would expect to see two instances of the MeasurementOrFact class, one with measurementType of “Object count” and one of “Percentage barcoded”, for every ObjectGroup linked to the LatimerCoreScheme, and can validate against that expectation.
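A validation of that expectation might look like the sketch below. It assumes the MeasurementOrFact instances for one ObjectGroup are available as a list of dictionaries keyed by measurementType; the measurementValue key, the sample values and the function name check_metrics are illustrative assumptions.

```python
from collections import Counter
from typing import List


def check_metrics(scheme_metrics: List[dict], measurements: List[dict]) -> List[str]:
    """Compare one ObjectGroup's measurements with the scheme's metric rules."""
    counts = Counter(m["measurementType"] for m in measurements)
    problems = []
    for metric in scheme_metrics:
        metric_type = metric["schemeMeasurementType"]
        n = counts[metric_type]  # Counter returns 0 for absent types
        if metric["isMandatoryMetric"] and n == 0:
            problems.append(f"missing mandatory metric: {metric_type}")
        if not metric["isRepeatableMetric"] and n > 1:
            problems.append(f"metric is not repeatable: {metric_type}")
    return problems


scheme_metrics = [
    {"schemeMeasurementType": "Object count",
     "isMandatoryMetric": True, "isRepeatableMetric": False},
    {"schemeMeasurementType": "Percentage barcoded",
     "isMandatoryMetric": True, "isRepeatableMetric": False},
    {"schemeMeasurementType": "Historical narrative",
     "isMandatoryMetric": False, "isRepeatableMetric": True},
]

# Hypothetical measurements: two object counts and no barcoding percentage.
measurements = [
    {"measurementType": "Object count", "measurementValue": 125000},
    {"measurementType": "Object count", "measurementValue": 130000},
]

print(check_metrics(scheme_metrics, measurements))
```

This sample reports two problems: the repeated “Object count” and the absent “Percentage barcoded”. The optional “Historical narrative” raises nothing.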

The principle of defining a new Latimer Core Scheme using the LatimerCoreScheme classes as demonstrated in the example above is similar to (and simpler than) constructs such as JSON Schema and RDF Schema. JSON Schema (json-schema.org) and RDF Schema (www.w3.org/TR/rdf-schema) constructs include their own respective "built-in" methods for defining mandatory and repeatable terms in a given record structure, so LtC implementations in JSON or RDF could use either of those respective Schema formats. However, the LatimerCoreScheme class enables schema-definitions in different data serialisation formats that do not have "built-in" schema validation methods.

Examples

In the example below, an LtC description record for the Insects and Invertebrate Zoology collections at the Field Museum is created and its three-term LatimerCoreScheme is included.



Figure 2: An example record structure that might be a useful scheme for an institution’s contribution to a global registry of collections like GRSciColl.



Figure 3: Another example record structure of a way to describe all of the “famous” collections within a larger collection. In this instance, the OrganisationalUnit class is omitted, as it’s implicit that this is a Field Museum of Natural History (FMNH) Latimer Core Scheme and so all collections in the dataset belong to the same institution.

In both of the above examples the isDistinctObjects term is ‘true’, because there is no overlap in objects between the two ObjectGroups, and so we can be sure that if the ‘Specimen count’ metric is being aggregated, nothing would be counted twice. However, if the two examples (“FMNH Collections” and “FMNH recognized Named Collections”) were to be combined as a single Latimer Core Scheme in the same dataset, the LatimerCoreScheme.isDistinctObjects term would need to be ‘false’ (Figure 4).



Figure 4: An example of a record-structure that combines ObjectGroups from the above examples, and has overlapping “Specimen count” measurements.

The isDistinctObjects term becomes ‘false’ because the “Darwin Beetles” and “Strecker Collection” ObjectGroups are actually contained within the “Insect Collections” ObjectGroup, and so if we aggregate the ‘Specimen count’ metric across all of the ObjectGroups in the Latimer Core Scheme, the objects in those two smaller ObjectGroups would end up being counted twice. isDistinctObjects being set to false provides a warning that this is the case.

For this reason, combining the two schemes into one is likely to be detrimental in this example, as it’s no longer easy to extract accurate aggregations of metrics from the dataset. It’s possible to maintain multiple LatimerCoreSchemes within the same LtC dataset, and if queries incorporate the LatimerCoreScheme into their logic (e.g., aggregate ‘Specimen count’ for all ObjectGroups in the “FMNH Collections” LatimerCoreScheme), then the issues with double-counting can still be avoided.
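A query that incorporates the scheme into its logic could be sketched as follows. The flattened record layout (keys scheme, name and specimenCount), the second ObjectGroup name in “FMNH Collections” and all of the counts are hypothetical placeholders, not real Field Museum figures.

```python
from collections import defaultdict
from typing import Dict, List


def specimen_count_by_scheme(object_groups: List[dict]) -> Dict[str, int]:
    """Aggregate 'Specimen count' separately for each LatimerCoreScheme."""
    totals: Dict[str, int] = defaultdict(int)
    for group in object_groups:
        totals[group["scheme"]] += group["specimenCount"]
    return dict(totals)


# Placeholder data: the two named collections are subsets of "Insect Collections".
object_groups = [
    {"scheme": "FMNH Collections",
     "name": "Insect Collections", "specimenCount": 1000},
    {"scheme": "FMNH Collections",
     "name": "Invertebrate Zoology Collections", "specimenCount": 500},
    {"scheme": "FMNH recognized Named Collections",
     "name": "Darwin Beetles", "specimenCount": 20},
    {"scheme": "FMNH recognized Named Collections",
     "name": "Strecker Collection", "specimenCount": 80},
]

print(specimen_count_by_scheme(object_groups))
```

With these placeholder numbers, a sum across every ObjectGroup would give 1600 and count the 100 named-collection specimens twice, whereas the per-scheme totals (1500 and 100) stay internally consistent.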

In addition, as shown in Figure 5, this makes it possible to add semantic relationships between ObjectGroups across the two schemes using the ResourceRelationship class for more detailed reporting - for example, to assert that the “Darwin Beetles” ObjectGroup in the “FMNH recognized Named Collections” LatimerCoreScheme ‘is part of’ the “Insect Collections” ObjectGroup in the “FMNH Collections” LatimerCoreScheme. More information on linking ObjectGroups can be found in the Linking ObjectGroups section.



Figure 5: Example of maintaining two Latimer Core Schemes in parallel, with ObjectGroup to ObjectGroup relationships across schemes using the ResourceRelationship class.