Change term - individualCount #285

Open
danstowell opened this issue Sep 21, 2020 · 18 comments
Labels
Class - Occurrence, Controversial, normative, Term - change

Comments

@danstowell

danstowell commented Sep 21, 2020

Change term

Current Term definition: https://dwc.tdwg.org/terms/#dwc:individualCount

Proposed new attributes of the term:

  • Term name (in lowerCamelCase): individualCount
  • Organized in Class (e.g. Location, Taxon): Occurrence
  • Definition of the term: (unchanged): The number of individual Organisms sharing the properties of an Occurrence.
  • Usage comments (recommendations regarding content, etc.): This term is not meant to indicate the number of individual Organisms from a Collecting Event present in a collection except insofar as that number happens to be the same number collected at the time and place of the Occurrence. An individualCount of 0 can be used to indicate the absence of any Organism with the characteristics given in the Occurrence, however, recommended best practice in this case is to populate the term occurrenceStatus with absent. To distinguish between the numbers of individuals with distinct characteristics within an Event, it is recommended to separate these into distinct Occurrences. For example, if 3 females and one male were observed at a given place and time, then recommended best practice is to provide one Occurrence record for males with an individualCount of 1 and another Occurrence record for females with an individualCount of 3. A number in the individualCount field is the equivalent of populating the terms organismQuantity with that same number and organismQuantityType with the value "individuals". For a non-numeric indication of the quantity of individuals (such as "many"), it is recommended to use the term organismQuantity.
  • Examples: 0 (a recorded absence), 1, 25
  • Refines (identifier of the broader term this term refines, if applicable): None
  • Replaces (identifier of the existing term that would be deprecated and replaced by this term, if applicable): http://rs.tdwg.org/dwc/terms/version/individualCount-2017-10-06
  • ABCD 2.06 (XPATH of the equivalent term in ABCD or EFG, if applicable): Not in ABCD
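
The split-by-characteristic recommendation and the organismQuantity equivalence described in the usage comments can be sketched roughly as follows. This is only an illustration, not part of the proposal: representing records as plain dictionaries, grouping by `sex`, and the helper names `split_by_sex` and `as_organism_quantity` are assumptions made for the example. Only the Darwin Core term names come from the text above.

```python
from collections import Counter


def split_by_sex(event_id, observed_sexes):
    """Build one Occurrence-like dict per distinct sex (hypothetical helper).

    observed_sexes holds one entry per individual seen,
    e.g. ["female", "female", "female", "male"].
    """
    counts = Counter(observed_sexes)
    return [
        {"eventID": event_id, "sex": sex, "individualCount": n}
        for sex, n in sorted(counts.items())
    ]


def as_organism_quantity(record):
    """Add the organismQuantity pair that mirrors a numeric individualCount."""
    out = dict(record)
    out["organismQuantity"] = str(record["individualCount"])
    out["organismQuantityType"] = "individuals"
    return out


# 3 females and 1 male observed at the same place and time.
records = split_by_sex("event-1", ["female", "female", "female", "male"])
for rec in records:
    print(as_organism_quantity(rec))
```

A non-numeric quantity such as "many" has no individualCount in this sketch; following the usage comments, it would go in organismQuantity alone.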

From the original comment:

Change term

  • Submitter: Dan Stowell
  • Justification: In Audubon Core, we are considering importing the term dwc:individualCount. However, we discussed whether it represents the number of individuals present, or the number of individuals represented. For example: I have a group of 4 birds, and I record their vocal activity, which happens to be the vocal interaction of the 2 singing males. Should the value of dwc:individualCount be 2 or 4? The specific problem is that the definition,

"The number of individuals represented present at the time of the Occurrence"

is not grammatical according to the everyday linguistic use of those two adjectives. Should it be interpreted to mean "represented AND present", "represented", "represented AS BEING present", "represented COMMA present" or something else?

Proposed new attributes of the term:

  • Definition of the term: "The number of individuals represented as being present at the time of the Occurrence"

(This proposed rewording is very much open to whatever the DWC Maintenance Group's consensus is on the proper interpretation of the term.)

@debpaul

debpaul commented Sep 22, 2020 via email

@MattBlissett
Member

GBIF is now using this field to assert "ABSENCE" (which means the
specimen is absent?)

I don't know what that would mean. We have about 4,000 occurrences where the basis of record is PreservedSpecimen/FossilSpecimen/Specimen, individualCount is zero, and occurrenceStatus is absent. Perhaps an attempt to say the specimen is lost? (This should use disposition=lost.)

Interpreting the two fields individualCount and occurrenceStatus is a recent addition. We are interpreting individualCount as a whole number and occurrenceStatus as either "present" or "absent" (usual flexibility on language, spelling etc). Our interpretation considers individualCount=0 ⇔ occurrenceStatus=ABSENT.

This revealed some specimen datasets [258k results today] where individualCount=0, and occurrenceStatus was not provided. For the bases of record PreservedSpecimen/FossilSpecimen/LivingSpecimen, we intend to set occurrenceStatus=Present and add a minor issue flag. For HumanObservation/MachineObservation/MaterialSample/Occurrence/Unknown we will set occurrenceStatus=Absent and add a different minor issue flag.
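
The rule described above can be sketched as a small function. This is a hedged illustration rather than GBIF's pipeline code: the function name, the two flag names, and the fallback for records with a non-zero or missing count are assumptions, since the comment does not name the flags or cover those cases.

```python
SPECIMEN_BASES = {"PreservedSpecimen", "FossilSpecimen", "LivingSpecimen"}

# Hypothetical flag names; the comment above does not name the real flags.
FLAG_SPECIMEN_ZERO_COUNT = "ZERO_COUNT_TREATED_AS_PRESENT"
FLAG_INFERRED_ABSENT = "ABSENT_INFERRED_FROM_ZERO_COUNT"


def infer_occurrence_status(basis_of_record, individual_count, occurrence_status=None):
    """Return (status, issues) for one record, following the rules above."""
    issues = []
    if occurrence_status:
        # Assumption: a supplied status is kept, only normalised to upper case.
        return occurrence_status.strip().upper(), issues
    if individual_count == 0:
        if basis_of_record in SPECIMEN_BASES:
            # A specimen with a zero count is still treated as present.
            issues.append(FLAG_SPECIMEN_ZERO_COUNT)
            return "PRESENT", issues
        # Observations, material samples, etc. are treated as absences.
        issues.append(FLAG_INFERRED_ABSENT)
        return "ABSENT", issues
    # Assumption: any other record without a status defaults to present.
    return "PRESENT", issues


print(infer_occurrence_status("PreservedSpecimen", 0))
print(infer_occurrence_status("HumanObservation", 0))
```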

(Further comments on GBIF's changes to interpretation for this are best made here: gbif/pipelines#392 )

@danstowell
Author

I'd say it shouldn't be a required term to populate: in some items it is not possible to discern the number of individuals represented/present, other than to establish it is more than zero.

(Related to this, in our Audubon Core discussion, someone raised whether there could be any way of tagging individualCount=many. However, that should probably be a separate issue, it's beyond this current request, which is about clarifying the definition and perhaps usage documentation.)

@tucotuco
Member

tucotuco commented Sep 23, 2020

The complete record of the currently recommended individualCount term can be found at http://rs.tdwg.org/dwc/terms/version/individualCount-2017-10-06.htm. There you will see the definition "The number of individuals represented present at the time of the Occurrence" and no comments. If you look back through the replaced term versions you'll see this has been the definition for the term since before ratification.

The ancient history of the term may not be relevant, but just in case, it used to refer specifically to the number of individuals under a given cataloged item in a collection, "The number of individuals present in the lot or container. Not to be used for observations."

The change in the standard version was specifically to make the term relevant to all Occurrence records, including observations, and there the meaning changed to be the number of individuals that were present at the time and place the Occurrence record represents.
The current wording is poor, and the term would benefit greatly from a relevant comment in addition, because the examples must be valid values for the term, and as just numbers, they don't convey any semantics. It would be good if the comments said what to do for an absence record as well.

This issue can probably be fixed as an erratum, with a definition amendment to something like, "The number of individual Organisms of the given Taxon present at the Location when the Event occurred."

@baskaufs

It may be that there needs to be a dwciri: term that points to a controlled value term rather than a literal number. That would allow for creating a richer concept scheme that could deal with the complexities like "absent", "one", "more than one", "many", etc.

@tucotuco
Member

@debpaul @danstowell In what context are you talking about it being required? The Darwin Core standard doesn't "do" requirements.

@tucotuco changed the title from 'Change term - clarify individualCount definition "represented present"' to 'Change term - individualCount' on Apr 17, 2021
@tucotuco added this to the "The Rush of the April Fools" milestone on Apr 17, 2021
@tucotuco
Member

I have changed the title of the issue and prepended a templated term change request to the original comment so as not to have to make a separate issue and relate it to the discussion in this one.

@baskaufs

Currently, dwc:individualCount is expected to only have a literal value, so a dwciri: analog does not apply here.

I believe that a proposal to have a controlled vocabulary for values like "many", "more than one", etc., while discussed here, was not actually included in this proposal, so I don't see that there should be a dwciri: analog unless we create a separate proposal for a term expected to use a controlled vocabulary.

@ekrimmel

ekrimmel commented May 28, 2021

Would it be possible to amend the suggested new definition to be "The number of individuals sharing the properties of an Occurrence"? I.e., remove the word "organisms" from the definition?

For paleo collections, asserting that individual objects belonging to a single occurrence record are part of a single organism is not always straightforward and not always true, e.g., a paleobotany specimen consisting of multiple plant parts on a single slab. The terms organismQuantity and organismQuantityType are intrinsically defined as being related to an organism, but individualCount is not and we would prefer to keep it that way. For some additional context to this discussion, see comments from @eclites on #185. For a very nice overview of counting things in paleo collections, see these slide decks put together by @RogerBurkhalter and Margaret Landis: Part I, Part II. We talk explicitly about these terms in Part II.

Erica Krimmel, Holly Little (@hollyel), and Talia Karim (@tkarim) (on behalf of the Paleo Data Working Group)

@EstebanMH-SiB

We endorse this proposal on behalf of @SiBColombia

@deepreef

This is a very important point. The term Organism has a very specific meaning in DwC. When that term was being discussed, one of the key questions was whether it should be called "Individual". Indeed, the original impetus for such a class in DwC sprang from a now-deprecated term individualID (which was organized within the Occurrence class).

I believe the "individual" aspect of dwc:individualCount stems from the same idea that individualID came from; and hence, would logically be synonymous with organismCount. However, as noted by @ekrimmel (and also during the Organism class discussions), there are subtle but potentially important differences between the word "organism" and the word "individual".

In our implementation, "Organism" is a subclass of "Individual", the latter encompassing more than just instances of living things (e.g., vehicles and other non-living objects). But I think the distinction suggested here by @ekrimmel is slightly different (i.e., that it's often not possible to accurately enumerate the count of dwc:Organism units when all you have are assorted parts of what once was one or more organisms).

Taking all of this into account, I agree with @ekrimmel that the definition should incorporate the word "individual" instead of "organism". This also keeps the term itself consistent with its definition (i.e., "individual").

This also raises another issue, which is that logically organismQuantity and organismQuantityType really ought to be organized in the Organism class -- but that's a non-normative change, and probably best delegated to the broader forthcoming discussion on MaterialSample/etc.

@tucotuco added the normative and Controversial labels and removed the non-normative label on May 28, 2021
@tucotuco
Member

Would it be possible to amend the suggested new definition to be "The number of individuals sharing the properties of an Occurrence"? I.e., remove the word "organisms" from the definition?

It can. The primary purpose of the proposal was to fix the grammatical ambiguity in the original definition from
The number of individuals represented present at the time of the Occurrence.
to
The number of individuals present at the time of the Occurrence.

This much is certainly just an erratum and therefore non-normative. Your proposed revision is closer to that corrected definition than the one actually proposed. The problem is that the one proposed is the one being evaluated so far. The other problem is that in practice the term is being used for multiple disjunct purposes, and therefore attempts to clarify the correct use meet with a sustainability issue, because it would suddenly become obvious that some usages are incorrect. Based on that, the actual proposal should be normative rather than non-normative, because though it was meant to clarify, it actually has semantic implications. I have changed the label from non-normative to normative to reflect these implications. I suspect that the usage notes amendments are not all satisfactory based on the usage you describe either.

For paleo collections, asserting that individual objects belonging to a single occurrence record are part of a single organism is not always straightforward and not always true, e.g., a paleobotany specimen consisting of multiple plant parts on a single slab. The terms organismQuantity and organismQuantityType are intrinsically defined as being related to an organism, but individualCount is not and we would prefer to keep it that way. For some additional context to this discussion, see comments from @eclites on #185. For a very nice overview of counting things in paleo collections, see these slide decks put together by @RogerBurkhalter and Margaret Landis: Part I, Part II. We talk explicitly about these terms in Part II.

Erica Krimmel, Holly Little (@hollyel), and Talia Karim (@tkarim) (on behalf of the Paleo Data Working Group)

In summary then, the counter proposal is to keep the definition vague (not qualifying what individual refers to) so that it can continue to be used in multiple ways. That is less clarity than what I was hoping for, but at least it has no adverse effect on current practices. I have labeled the proposal as controversial to indicate that there isn't consensus about it as proposed. The non-normative erratum will be included in the new release if no further resolution is reached.

@tucotuco
Member

tucotuco commented Jun 2, 2021

This proposal has been labeled as 'Controversial'. It will remain open for public review in pursuit of a consensus solution for another 30 days, but will not be included in the release to be prepared from the public review of 2021-05-01/2021-05-31.

@tucotuco
Member

tucotuco commented Jul 1, 2021

Public review of this issue has now concluded with objections to the proposed change. The issue will remain open for discussion and potential resolution.

@dshorthouse

dshorthouse commented Aug 2, 2021

The number of individuals present at the time of the Occurrence.

As in members of the collecting party?

@deepreef

deepreef commented Aug 2, 2021

As in members of the collecting party?

Actually... yeah. I haven't fully committed to that implementation yet; but I can see a clean pathway to representing a count of organisms for each taxon present at an event. Some number of organisms of an insect taxon, feeding on some number of organisms of a plant taxon, observed and recorded by some number of organisms of a primate taxon.

As I've always said, it started out as a joke... but over time it's less and less funny, and more and more in the realm of "Huh... I wonder...."

@tucotuco removed this from the "Public Review 2021-05-01" milestone on Aug 25, 2021
@tucotuco added this to the "Post TDWG 2021 Public Review" milestone on Nov 3, 2021
@Jegelewicz

See our dilemmas - ArctosDB/arctos#4032

The bottom line: we don't have a "catalog of occurrences", so our number of individuals will actually be "the number of individuals represented by this catalog record", and we have no way to record "The number of individual Organisms sharing the properties of an Occurrence".

@albenson-usgs

@Jegelewicz Maybe organismQuantity and organismQuantityType, or measurement or facts, would work better for what you're trying to capture?
