Multilinguality in CIM? #73

Open
VladimirAlexiev opened this issue Sep 19, 2024 · 2 comments
Labels
ontology Pertains to ontology representation typography whitespace, HTML etc in definitions and values

This section was provoked by pondering the difference between cim:String and profcim:StringFixedLanguage.

AFAIK, CIM does not allow multilinguality (and may never have considered it).

E.g., cim:IdentifiedObject.name doesn't allow multiple values:

ido:IdentifiedObject.name-cardinality
        rdf:type        sh:PropertyShape;
        sh:description  "This constraint validates the cardinality of the property (attribute).";
        sh:group        ido:CardinalityIO;
        sh:message      "Missing required property (attribute).";
        sh:maxCount     1;
        sh:minCount     1;
        sh:name         "IdentifiedObject.name-cardinality";
        sh:order        0.1;
        sh:path         cim:IdentifiedObject.name;
        sh:severity     sh:Violation .

I think it would be better to allow multiple values
but impose a sh:uniqueLang constraint, i.e. at most one value per language (skos:prefLabel has the same restriction).
In that way CIM data could accommodate multilinguality.
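
A relaxed shape along these lines might look like the sketch below. It is not taken from any CIM profile: the shape name, the ex: prefix and the cim: namespace IRI are placeholders, and only the sh: terms are standard SHACL.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix cim: <http://example.org/cim#> .     # placeholder, not the real CIM namespace
@prefix ex:  <http://example.org/shapes#> .  # hypothetical namespace for the sketch

ex:IdentifiedObject.name-multilingual        # hypothetical shape name
        rdf:type        sh:PropertyShape;
        sh:path         cim:IdentifiedObject.name;
        sh:minCount     1;                   # the name is still required
        # no sh:maxCount: several values are allowed
        sh:uniqueLang   true;                # no two values may share a language tag
        sh:severity     sh:Violation .
```

Note that sh:uniqueLang only looks at language-tagged values, so on its own this shape would not limit the number of untagged strings.
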
E.g., looking at a few arbitrarily chosen properties, this is what each should accept:

  • cim:IdentifiedObject.mRID: always string
  • cim:IdentifiedObject.description: string or langString
  • cim:IdentifiedObject.name: string or langString
  • nc:AssessedElementWithContingency.mRID: always string
  • nc:AssessedElement.normalTargetRemainingAvailableMarginJustification: string or langString

Unfortunately, cim:String is used even for props that should not allow langString,
i.e. no distinction is made between these two cases:

  • Names/descriptions could be string or langString
  • But identifiers should only be string
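
If the distinction were made, it could be expressed in SHACL roughly as below. This is only a sketch: the shape names and the ex: prefix are hypothetical, the cim: namespace IRI is a placeholder, and the shapes would still need to be attached to a node shape via sh:property.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix cim: <http://example.org/cim#> .     # placeholder, not the real CIM namespace
@prefix ex:  <http://example.org/shapes#> .  # hypothetical namespace for the sketch

# Identifiers: plain strings only, no language tag
ex:IdentifiedObject.mRID-datatype
        rdf:type     sh:PropertyShape;
        sh:path      cim:IdentifiedObject.mRID;
        sh:datatype  xsd:string .

# Names/descriptions: either a plain string or a language-tagged string
ex:IdentifiedObject.description-datatype
        rdf:type     sh:PropertyShape;
        sh:path      cim:IdentifiedObject.description;
        sh:or        ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString ] ) .
```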

So for the time being I think CIM implicitly forbids the use of langString:
if a property can hold only one value, you cannot supply one value per language, so there's not much use for lang tags.
Also, allowing lang tags may break receiving systems that expect plain strings.

So I'll map cim:String to xsd:string.

rdf:PlainLiteral

The EU eProcurement Ontology allows multilingual data, and used rdfs:Literal for it.
But that datatype is way too broad, so I raised an issue:
OP-TED/ted-rdf-mapping#407

The datatype hierarchy is like this: rdfs:Literal > rdf:PlainLiteral > (xsd:string, rdf:langString).
What a text field needs to be mapped to depends on its nature:

  • xsd:string is appropriate for codes that are never translated to multiple langs
  • rdf:langString is appropriate for texts that are always translated to multiple langs (if not now, then in the future): so a lang tag is required
  • rdf:PlainLiteral is appropriate for texts that may but don't have to be translated, i.e. a lang tag is optional. It is defined at https://w3.org/TR/rdf-plain-literal and means string or langString.
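
To make the three cases concrete, here is how such values would look in instance data. The resource, its values and the namespace IRIs are all invented for illustration; only the literal syntax is standard Turtle.

```turtle
@prefix cim: <http://example.org/cim#> .   # placeholder, not the real CIM namespace
@prefix ex:  <http://example.org/data#> .  # hypothetical namespace for the sketch

ex:Substation1
        # a code that is never translated: no language tag, i.e. xsd:string in RDF 1.1
        cim:IdentifiedObject.mRID         "substation-1";
        # a text given in several languages: each value is an rdf:langString
        cim:IdentifiedObject.name         "West Substation"@en, "Umspannwerk West"@de;
        # a text that may or may not carry a language tag: the "string or langString" case
        cim:IdentifiedObject.description  "Feeds the western district" .
```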

If you want cim:String to allow langStrings, then we should map it to rdf:PlainLiteral.

@VladimirAlexiev VladimirAlexiev added ontology Pertains to ontology representation typography whitespace, HTML etc in definitions and values labels Sep 19, 2024
@VladimirAlexiev

Hi @aries2004-bit, is this an honest mistake, or are you some bot leading us to some malware?
