-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.qmd
165 lines (118 loc) · 9.5 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
---
title: CIDOC-CRM in RDF Application Profile
---
## Background and Motivation
The [CIDOC Conceptual Reference Model (CRM)](https://www.cidoc-crm.org/) is a conceptual data model used in the cultural heritage domain. The [Resource Description Framework (RDF)](https://www.w3.org/TR/rdf11-concepts/) is a graph-based data format. Both CRM and RDF have been created independently for integration of information. RDF is a good fit to express CRM in data and it has been used to do so. The expression of CRM in RDF is not trivial though, so some guidelines are needed. This document is going to provide an application profile to best use CRM in RDF for integration with other RDF data within NFDI4Objects. The document is going to be compared and aligned with similar recommendations such as @Doerr2020.
The document is managed in a git repository at <https://github.com/nfdi4objects/crm-rdf-ap>. Contributions and feedback is very welcome!
### Difficulties in expressing CRM and RDF
CRM defines abstract types of entities (CRM classes) such as events, measurements, places, and actors with relationship types (CRM properties) to connect instances of these entity types. RDF and its most common extensions define how to identify entities (resources), entity types (RDF classes) and relationship types (CRM properties) with URIs and values with Unicode strings (RDF literals) optionally having a language or a data type to encode values such as numbers and dates. RDF is used with ontologies that define RDF classes, properties, and constraints. CRM looks like an ontology or like it could directly be mapped to an RDF ontology, but this is not the case. CRM is agnostic to data formats: CRM classes are not RDF classes and CRM has no concept of data types and values, so any expression of CRM in RDF comes with choices of design. It is possible to express the same information modeled with CRM in different forms of RDF, so data cannot be integrated flawlessly.
## Guidelines
### Primitive values
[E59 Primitive Value] and its subclasses are not expressed as RDF classes. Instead
- instances of [E62 String](http://www.cidoc-crm.org/cidoc-crm/E62) are expressed as RDF literals with optional language tag, and
- instances of [E60 Number](http://www.cidoc-crm.org/cidoc-crm/E60) are expressed as RDF literals with numeric data type such as `xsd:integer`
The CRM classes [E61 Time Primitive], [E94 Space Primitive], and [E95 Spacetime Primitive] are both subclasses of [E59 Primitive Value] and of [E41 Appellation], so [the latter](#e41-appelation) can be used when a mapping to established RDF data types is not applicable.
Instances of **[E61 Time Primitive] and [E52 Time-Span]** are better expressed as RDF literals of type `xsd:date`, `xsd:time`, `xsd:dateTime`, or `xsd:dateTimeStamp` if applicable. More complex time values should be expressed using the [Extended Date/Time Format (EDTF)](https://www.loc.gov/standards/datetime/) but there is no established method to calculate with dates in RDF yet.^[See discussion to extend SPARQL [for simple dates](https://github.com/w3c/sparql-dev/blob/main/SEP/SEP-0002/sep-0002.md) and [EDTF in RDF](https://periodo.github.io/edtf-ontology/).] CRM includes its own classes and properties to model more complex temporal values so this has not been decided yet.
~~~ttl
@prefix edtf: <http://id.loc.gov/datatypes/edtf/>
@prefix unit: <http://qudt.org/vocab/unit/> .
<TitanticSinking> a crm:E81_Transformation ;
crm:P124_transformed <RMSTitanic> ;
crm:P123_resulted_in <TitanticWreck> .
crm:P4_has_time-span
"1912-04-15"^^xsd:date , # or ^^edtf:EDTF (subsumes xsd:date)
[
a crm:E52_Time-Span ;
crm:P82_at_some_time_within "1912-04-15"^^xsd:date
] ;
# TODO: add exact time of sinking (02:38–05:18 GMT)
.
~~~
Instances of **[E94 Space Primitive]** should be expressed using [GeoSPARQL] Ontology as instance of `geo:hasGeometry`, compatible with various geographic data formats (WKT, GeoJSON, GML...).^[See also CRM Geo draft at <http://www.cidoc-crm.org/extensions/crmgeo/>, defining superclasses of `geo:Geometry`.] CRM Property [P168 place is defined by] should be expressed with RDF property `geo:hasGeometry`. CRM Properties [P171 at some place within], and [P172 contains] can be used as RDF properties to link places ([E53 Place]) to outer and inner geometries but `geo:hasBoundingBox` and `geo:hasCentroid` should be preferred, if applicable.
~~~ttl
<TitanticWreckLocation> a crm:E53_Place ;
crm:P89_falls_within <AtlanticOcean> ;
geo:hasGeometry [ a geo:Geometry ;
geo:asGeoJSON '{"type": "Point","coordinates": [-49.946944,41.7325,-3803]}' ;
geo:asWKT "POINT (-49.946944 41.7325 -3803)" ;
] .
~~~
GeoSPARQL properties `geo:hasMetricSpatialResolution` and/or `geo:hasSpatialAccuracy` can be used to indicate level of detail.
Instances of [E95 Spacetime Primitive] ... (TODO)
[GeoSPARQL]: https://www.ogc.org/de/publications/standard/geosparql/
[P171 at some place within]: http://www.cidoc-crm.org/cidoc-crm/P171
[P172 contains]: http://www.cidoc-crm.org/cidoc-crm/P172
[P168 place is defined by]: http://www.cidoc-crm.org/cidoc-crm/P168
### Authority files and types
CRM class [E32 Authority Document] and CRM property [P71 lists] should not be used in RDF but corresponding SKOS RDF classes `skos:ConceptScheme` and `skos:inScheme`. Applications may define `skos:ConceptScheme` as subclass of [E31 Document] and `skos:inScheme` as subproperty of [P67 refers to] if needed.
CRM also defines class [E55 Type] with properties [P127 has broader term] and [P127i has narrower term]. These can be mapped to `skos:Concept` and `skos:broader`/`skos:narrower` or to individual RDF classes, connected with `rdfs:subClassOf`, depending on use case (types are used with CRM property P2, P137, P177, P135, P125, P32, P42).
[P127 has broader term]: http://www.cidoc-crm.org/cidoc-crm/P127
[P127i has narrower term]: http://www.cidoc-crm.org/cidoc-crm/P127i_has_narrower_term
[P71 lists]: http://www.cidoc-crm.org/cidoc-crm/P71
[P67 refers to]: http://www.cidoc-crm.org/cidoc-crm/P67
### CRM Classes to use with caution
#### E58 Measurement Unit
Defintion of instances of [E58 Measurement Unit] should be avoided but either taken from an established vocabulary of units such as QUDT or expressed as RDF value with UCUM datatype.^[See [cdt:ucum](https://ci.mines-stetienne.fr/lindt/v4/custom_datatypes#ucum) and [QUDT](https://qudt.org/).]
~~~ttl
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix cdt: <https://w3id.org/cdt/> .
<TitanticSinking>
crm:P191_had_duration [ a crm:E54_Dimension ;
crm:P90_has_value 160 ; crm:P91_has_unit unit:MIN ; # value and QUDT unit
rdf:value "7 min"^^cdt:ucum # UCUM string
] .
~~~
#### E41 Appellation
**[E41 Appellation]** and its subclasses ([E35 Title] and [E42 Identifier]) should be avoided (see [above](#primitive-values) for additional subclasses [E61 Time Primitive], [E94 Space Primitive], and [E94 Space Primitive]), unless a name cannot uniquely be identified with a sequence of Unicode characters and an optional language tag:
~~~ttl
<RMSTitantic>
crm:P102_has_title "RMS Titanic"@en ;
crm:P1_is_identified_by [
rdfs:value "MGY" ;
crm:P2_has_type <http://www.wikidata.org/entity/Q353659> # call sign
] .
~~~
If there are multiple names with one preferred name per language and optional name alias, use `skos:prefLabel` and `skos:altLabel`:
~~~ttl
<RMSTitantic>
skos:prefLabel "RMS Titanic"@en ;
skos:altLabel "Titanic"@en, "Royal Mail Steamship Titanic"@en .
~~~
The RDF property `skos:prefLabel` should not be confused with [P48 has preferred identifier] to be used for identifiers only.
If information about the act of naming is required, use [E13 Attribute Assignment] for simple appelations or [E15 Identifier Assignment] for identifiers.
If an identifier **[E42 Identifier] is an URI** meant to identify an RDF resource, dont use plain strings but resource URIs in RDF. If a resource happens to have multiple equivalent URIs, choose a preferred URI and use `owl:sameAs` to record aliases:
~~~ttl
<RMSTitantic> a crm:E18_Physical Thing ;
owl:sameAs
<http://www.wikidata.org/entity/Q3018259> ,
<http://kbpedia.org/kko/rc/RMS-Titanic-TheShip> .
~~~
instead of
~~~ttl
<RMSTitanic> a crm:E18_Physical Thing .
crm:P1_is_identified_by
[ a crm:E42_Identifier ;
crm:P190_has_symbolic_content "http://www.wikidata.org/entity/Q3018259" ] ,
[ a crm:E42_Identifier ;
crm:P190_has_symbolic_content "http://kbpedia.org/kko/rc/RMS-Titanic-TheShip" ] .
~~~
### Deprecated CRM classes
CRM is constantly evolving, so some CRM classes have been renamed or replaced. Outdated classes and properties MUST be supported nevertheless.
## References
::: {#refs}
:::
[E13 Attribute Assignment]: http://www.cidoc-crm.org/cidoc-crm/E13
[E15 Identifier Assignment]: http://www.cidoc-crm.org/cidoc-crm/E15
[E31 Document]: http://www.cidoc-crm.org/cidoc-crm/E31
[E32 Authority Document]: http://www.cidoc-crm.org/cidoc-crm/E32
[E35 Title]: http://www.cidoc-crm.org/cidoc-crm/E35
[E42 Identifier]: http://www.cidoc-crm.org/cidoc-crm/E42
[E55 Type]: http://www.cidoc-crm.org/cidoc-crm/E55
[E58 Measurement Unit]: http://www.cidoc-crm.org/cidoc-crm/E58
[E52 Time-Span]: http://www.cidoc-crm.org/cidoc-crm/E52
[E53 Place]: http://www.cidoc-crm.org/cidoc-crm/E53
[E41 Appellation]: http://www.cidoc-crm.org/cidoc-crm/E41
[E59 Primitive Value]: http://www.cidoc-crm.org/cidoc-crm/E59
[E61 Time Primitive]: http://www.cidoc-crm.org/cidoc-crm/E61
[E94 Space Primitive]: http://www.cidoc-crm.org/cidoc-crm/E94
[E95 Spacetime Primitive]: http://www.cidoc-crm.org/cidoc-crm/E95