Missing link to the CF area type table #550

Open
jesusff opened this issue Oct 10, 2024 · 5 comments · May be fixed by #564
Labels
CF1.12?: We might conclude this issue in time for CF1.12
defect: Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors

Comments

jesusff commented Oct 10, 2024

Title

Missing link to the CF area type table

Moderator

TBD

Requirement Summary

Include link to CF area type table and make explicit the need to use standardized area type strings in Section 7.3.3

Benefits

  • Easier understanding of the area types
  • Direct access from the conventions to the table of permitted values.

Associated pull request

No pull request yet, but could be generated from branch https://github.com/jesusff/cf-conventions/tree/area-types-text

Detailed Proposal

Section 7.3.3 introduces two conventions to indicate that a statistic has been applied only to a portion of a cell. The first convention mentions the:

strings permitted for a variable with a standard_name of area_type

which are only detailed in the second convention (next paragraph):

typevar is a string-valued auxiliary coordinate variable or string-valued scalar coordinate variable (see Section 6.1, "Labels") with a standard_name of area_type.

Here, it does not specify that only a standardized set of string values is permitted. To learn that, one needs to check the area_type standard name:

A variable with the standard_name of area_type contains either strings which indicate the nature of the surface e.g. land, sea, sea_ice, or flags which can be translated to strings using flag_values and flag_meanings attributes. These strings are standardised. Values must be taken from the area_type table.

This table is not linked anywhere in the CF conventions text, unlike others, such as the standardized region names, which are linked in Section 6.1.1. Its existence is mentioned in Section 3.3 Standard Name, though:

Some standard names (e.g. region and area_type) are used to indicate quantities which are permitted to take only certain standard values. This is indicated in the definition of the quantity in the standard name table, accompanied by a list or a link to a list of the permitted values.

The proposal is to make explicit that, under either of the cell_methods conventions for applying a statistic to a portion of a cell, area type strings must be selected from the standardized set defined in the CF Area Type Table, and to add the missing link to that table.

A potential rewording of Section 7.3.3 would be (see jesusff@3b7d263):

By default, the statistical method indicated by cell_methods is assumed to have been evaluated over the entire horizontal area of the cell. Sometimes, however, it is useful to limit consideration to only a portion of a cell (e.g. a mean over the sea-ice area). Cell portions are referred to by means of standardised area_type strings (Section 3.3), maintained in the CF Area Type Table, using one of two conventions.

The first convention is a method that can be used for the common case of a single area-type. In this case, the cell_methods attribute may include a string of the form "name: method where type". Here name could, for example, be area and type may be any of the standardised area_type strings. As an example, if the method were mean and the area_type were sea_ice, then the data would represent a mean over only the sea ice portion of the grid cell. If the data writer expects type to be interpreted as one of the standard area_type strings, then none of the variables in the netCDF file should be given a name identical to that of the string (because the second convention, described in the next paragraph, takes precedence).
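(Illustration only, not part of the proposed text.) The first convention could be checked roughly as follows. This is a minimal sketch: `AREA_TYPES` holds only the few area types named in this issue rather than the full CF Area Type Table, and `parse_where` and `uses_first_convention` are hypothetical helper names, not part of any CF library.

```python
import re

# Assumption: an illustrative subset of the CF Area Type Table,
# limited to area types mentioned in this issue.
AREA_TYPES = {"land", "sea", "sea_ice", "vegetation", "bare_ground", "snow"}

# Matches a single cell_methods entry of the form "name: method where type".
_ENTRY = re.compile(r"^\s*(\w+):\s*(\w+)\s+where\s+(\w+)\s*$")


def parse_where(entry):
    """Split 'name: method where type' into (name, method, type)."""
    match = _ENTRY.match(entry)
    if match is None:
        raise ValueError(f"not of the form 'name: method where type': {entry!r}")
    return match.groups()


def uses_first_convention(entry, variable_names, area_types=AREA_TYPES):
    """Return True if 'type' is read as a standardized area_type string.

    If a variable in the file shares the name, the second convention
    takes precedence and 'type' refers to that variable instead.
    """
    _, _, type_ = parse_where(entry)
    if type_ in variable_names:
        return False
    if type_ not in area_types:
        raise ValueError(f"{type_!r} is not a standardized area_type string")
    return True
```

For instance, `uses_first_convention("area: mean where sea_ice", {"tas", "lat"})` treats `sea_ice` as an area type, while the same entry in a file that contains a variable named `sea_ice` falls under the second convention.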

The second convention is the more general. In this case, the cell_methods entry is of the form "name: method where typevar". Here typevar is a string-valued auxiliary coordinate variable or string-valued scalar coordinate variable (see Section 6.1, "Labels") with a standard_name of area_type. The variable typevar contains the name(s) of the selected portion(s) of the grid cell to which the method is applied. These name(s) must be a subset of the standardised area_type strings. This convention can accommodate cases in which a method is applied to more than one area type and the result is stored in a single data variable (with a dimension which ranges across the various area types). It provides a convenient way to store output from land surface models, for example, since they deal with many area types within each surface gridbox (e.g., vegetation, bare_ground, snow, etc.).
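(Illustration only, not part of the proposed text.) The subset requirement in the second convention could be verified along these lines. This is a sketch under stated assumptions: `AREA_TYPES` is a small sample rather than the full CF Area Type Table, the typevar contents are passed in as a plain list of strings, and `invalid_area_types` and `check_typevar` are hypothetical names.

```python
# Assumption: an illustrative subset of the CF Area Type Table,
# limited to area types mentioned in this issue.
AREA_TYPES = frozenset(
    {"land", "sea", "sea_ice", "vegetation", "bare_ground", "snow"}
)


def invalid_area_types(typevar_values, area_types=AREA_TYPES):
    """Return, in order, the values that are not standardized area_type strings."""
    return [value for value in typevar_values if value not in area_types]


def check_typevar(standard_name, typevar_values, area_types=AREA_TYPES):
    """Raise ValueError unless the variable is a valid area_type typevar."""
    if standard_name != "area_type":
        raise ValueError("typevar must have a standard_name of area_type")
    bad = invalid_area_types(typevar_values, area_types)
    if bad:
        raise ValueError(f"not standardized area_type strings: {bad}")
```

A land surface model output with a typevar holding `["vegetation", "bare_ground", "snow"]` would pass this check, whereas any string outside the table would be reported.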

jesusff added the defect label on Oct 10, 2024
@taylor13

Thanks @jesusff for these suggestions. I support adding links to the current list of "area_type" strings.

@JonathanGregory
Contributor

I support this proposal too. Thanks, @jesusff.

I think sect 7.3.3 is the primary reference for area types. Hence, I don't think it's necessary to refer to sect 3.3 in your new sentence, "Cell portions are referred to by means of standardised area_type strings ...", because sect 3.3 doesn't add any more information. On the other hand, it might well be useful to make sect 3.3 refer to sect 7.3.3 e.g. "Some standard names (e.g. region, <<geographic-regions>>, and area_type, <<statistics-applying-portions>>) are used to indicate quantities ...", where I've inserted the corresponding ref to section 6.1.1 as well. Do you think that would be useful?

Best wishes, Jonathan

JonathanGregory added the CF1.12? label on Oct 20, 2024
@JonathanGregory
Contributor

This proposal has enough support to be included in CF 1.12. It will be accepted unless someone raises concerns in the next three weeks (by 30th November). @jesusff, do you have time to make a PR?

@JonathanGregory
Contributor

I gather from his autoreply that @jesusff is away at the moment. In order to get this change into CF1.12, I have copied his branch, made the minor changes I suggested above, and created PR #564 to implement it.

@jesusff
Author

jesusff commented Nov 20, 2024

Great, thank you very much. The PR looks good to me.

3 participants