Clarification of time coordinates, especially leap seconds, define utc and tai calendars and leap_seconds in units_metadata #542

Open
JonathanGregory opened this issue Sep 16, 2024 · 29 comments · May be fixed by #541
Labels
  • CF1.12? (We might conclude this issue in time for CF1.12)
  • enhancement (Proposals to add new capabilities, improve existing ones in the conventions, improve style or format)
  • new contributor (This issue was worked on by new contributors to the CF conventions)

Comments

@JonathanGregory
Contributor

Summary

This proposal aims to reorganise and clarify the existing text, mostly in section 4.4, about time coordinates, with no change in meaning. It includes a new subsection on leap seconds and their implications for the CF standard calendar, with examples and a diagram, and defines a new use of the units_metadata attribute to remove ambiguity in the interpretation of leap seconds in the standard calendar. It introduces two new CF calendars: utc for UTC with leap seconds properly accounted for, and tai for atomic clock time, used for some satellite data.

Benefits

Several previous lengthy but inconclusive CF discussions have shown that the treatment of leap seconds is unclear and unsatisfactory. In this proposal we hope to provide an acceptable solution to these difficulties.

Moderator

None yet

Associated pull request

#541

Detailed Proposal

A huge amount of hard thought has been spent on previous long discussions about CF calendars and leap seconds (including #148, discuss issue #297, Discussion #304). The last of these went quiet in April.

Since then, we (@davidhassell and @JonathanGregory) have been working on a proposal, on which we'd now like to invite comments. If you are interested, please look at our modified text, especially section 4.4 on time coordinates. You can find this in any of the following:

The main changes are these:

  • Reorganisation and clarification of the existing text, with no change in its meaning. We have put the text about units into its own subsection, including writing down the format of the reference date/time and time zone, which wasn't shown except by an example. We have put the detailed text and examples concerning the none and paleoclimate calendars into their own subsections as well, so that the subsection on calendars is limited to giving the definition of each calendar.

  • Opening statements defining date/times and time coordinates, and an explanation in the subsection on calendars of how they relate to time intervals. These points have been contentious in the past, so we feel it's best to state plainly how they should be understood in CF (according to this proposal).

  • A new subsection on leap seconds, which explains in detail their implications for the CF standard calendar. Difficulties arise because that calendar is, and has always been, used in practice both for data that truly does not have UTC leap seconds in its time axis (e.g. a model which uses the real-world Gregorian calendar with every day having 86400 seconds) and for data which does, or should, have leap seconds but they are ignored in the time coordinates (e.g. observational data recorded with UTC time). Rather than deprecating or prohibiting one or other of these variants, we propose a new convention for the units_metadata attribute to distinguish them, so that they can be handled correctly by the data-user. The units_metadata attribute was recently added to CF to handle the difficulty of degrees_celsius being used in two different ways that require different treatment by data-users, after a very long and difficult discussion. We are hoping that it can work the same magic with leap seconds.

  • A worked example and a diagram for leap seconds. The diagram was inspired by the graph posted by @ChrisBarker-NOAA. We've also produced a table illustrating how a selection of date/times and coordinates are related across many CF calendars, inspired by Lars's table. We propose to put this in an appendix to the convention, if this proposal is accepted. Thanks, Lars and Chris, for the ideas.

  • Two new calendars: utc for UTC with leap seconds properly accounted for, and tai for atomic clock time, used for some satellite data. The latter has been requested in previous discussions. The former hasn't explicitly been requested, but many comments imply that it would be preferred to standard for some purposes.

Previous discussions on these matters have evoked disagreements on principle which turned out to be irreconcilable by discussion in the issue, and no conclusion was reached. To avoid that outcome, we'd like to try a different method with the present proposal. If you find something in this proposal which you feel you couldn't possibly accept, even with modification, please say so in this issue. If anyone feels like that, we will convene a group to discuss the disagreements by video meeting, like we've done with a couple of other difficult issues. The group would be charged with reaching a resolution soon enough for some version of this proposal to be accepted for the next release, probably with a deadline in November. If that can't be done, we'll have to start again when someone has a new idea in future.

On the other hand, any suggestions, comments or concerns on clarity, presentation and details of the convention can probably be resolved by discussion in the usual way on this issue. We look forward to hearing what you think!

@JonathanGregory and @davidhassell

@JonathanGregory
Contributor Author

In discussion 304, @ChrisBarker-NOAA has given his support to this proposal (thanks, Chris). He writes:

My only real concern is that the UTC calendar is an "attractive nuisance", and there is very little software that handles it properly, and many people use "UTC" imprecisely. But the text is very clear about the leap seconds, so buyer beware, I guess.

Please could anyone who wants to comment on this proposal do so here in this issue, rather than in discussion 304. Thanks.

@JonathanGregory
Contributor Author

@ChrisBarker-NOAA has also made some comments on the PR (#541). I'm copying them here, because discussion of "substantive" points in a PR is awkward to follow subsequently. It's easier to have a single record in the issue. Marking typos etc. in a PR is fine, because they don't need discussion or reply.

I've usually seen this spelled datetime or date-time, rather than date/time. I think those forms are a little better. I'm not sure why, but date/time reads to me a bit like date or time, rather than a compound word.

I agree that "date/time" isn't ideal because "/" means "or", but I don't have a strong view on what we should write. We used "date/time" because it appears like that elsewhere in the convention document, especially chapter 7. If there is a consensus on a preferred way to write it, or a different term to use, we could change it throughout the document.

Regarding the sentence, "To mark this distinction, the canonical unit given for quantities used for time coordinates is s since 1958-1-1", just curious -- why 1958? I actually saw this in a file in the wild recently, and was wondering where in the heck it came from! I guess I'd expect 1970-1-1 [as that's the most common epoch used] as canonical, but it's not vital.

UTC and TAI have a complicated history, as described by wikipedia. My understanding is that, to summarise it simply, TAI began in 1958-1-1, with the modern definition of a second in terms of the caesium atomic clock. In 1972 UTC was rebased on TAI, in such a way that they were treated as coincident at 1958-1-1, with 10 leap seconds having been added by 1972. Hence it's convenient to regard UTC as beginning in 1958 as well as TAI. There is a sentence of explanation elsewhere in the CF text, which Chris discovered later. I will put something at the point where this remark was made as well.

[Where we discuss the definition of year and month: insert] "A day is exactly 24 hours (86400 sec). It is not a calendar day." I suggest this because in, e.g. the Python datetime library, a day is a calendar day, rather than 24 hours. I think that only makes a difference during a DST transition, which CF doesn't allow anyway (I hope!) -- but it wouldn't hurt to be extra clear here.

That's fine, thanks. I will insert it. The time zone definitions are plus/minus numbers of hours (and minutes), not names - no automatic transitions are implied by them!

[Where we discuss time zones, replace "time zone" with] "time zone offset" -- time zone is the administrative thing, and has a name, and maybe DST transitions -- the time zone offset is clear and simple.

OK, thanks.

[Concerning the new utc calendar, we have proposed "Date/times in the future are not allowed in this calendar, because it is unknown when future leap seconds will occur." Chris comments: ] I think some warning is given before a leap second is introduced -- so we could go a bit in the future (wikipedia says " leap seconds are announced only six months in advance.") -- but I can't find a formal reference for that -- so I guess ruling out the future altogether is probably wise.

In practice I'm sure it's OK if data-writers produce data for the future which they know will be correct because of advance warning. The checker will give an error if it finds a date which is in the future when the checker is run, but the future becomes the past at the rate of 1 second per second, and the same file will not give an error once that has happened! Should this be a recommendation not to write future UTC, rather than a prohibition?
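The point about the checker can be sketched in a few lines of Python. This is only an illustration, not the CF checker: the function name `utc_datetime_allowed` and the explicit `now` argument are hypothetical, and Python's `datetime` has no knowledge of leap seconds.

```python
from datetime import datetime, timezone

def utc_datetime_allowed(value: datetime, now: datetime) -> bool:
    """Return True if `value` is not in the future relative to `now`.

    Hypothetical check; both arguments must be timezone-aware.
    """
    return value <= now

stamp = datetime(2025, 1, 1, tzinfo=timezone.utc)      # datetime in a file
run_early = datetime(2024, 10, 1, tzinfo=timezone.utc)  # checker run before it
run_late = datetime(2025, 6, 1, tzinfo=timezone.utc)    # checker run after it

print(utc_datetime_allowed(stamp, run_early))  # False
print(utc_datetime_allowed(stamp, run_late))   # True
```

The same file is rejected by the earlier run and accepted by the later one, which is the behaviour described above.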

Thanks for these comments, Chris. I have resolved them in the PR.

@JonathanGregory
Contributor Author

Dear Chris

I have made changes (in the PR, html and pdf) following your suggestions. Two of them were more complicated than I had expected. Here are the new versions of various paragraphs:

In 4.4.1

UDUNITS defines a minute as 60 seconds, an hour as 3600 seconds and a day as 86400 seconds. These are not calendar units. When civil clock time changes at the start and end of summer in many countries, the day according to its calendar date lasts for 23 or 25 hours, but the UDUNITS and CF day is always 24 hours. When a leap second is inserted into UTC, the minute, hour and day affected differ by one second from their usual durations according to clock time, but the UDUNITS and CF minute, hour and day do not; they are fixed units of measure.
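A minimal sketch of the distinction between fixed units of measure and clock-time durations, using only the values stated in the paragraph above. The names `SECONDS_PER` and `to_seconds` are illustrative and are not part of UDUNITS or CF.

```python
# Fixed UDUNITS/CF durations in seconds (values taken from the text above).
SECONDS_PER = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

def to_seconds(value, unit):
    """Convert a duration expressed in a fixed unit of measure to seconds."""
    return value * SECONDS_PER[unit]

# One CF day is always 86400 s ...
cf_day = to_seconds(1, "day")
# ... whereas a UTC calendar day that ends with a positive leap second
# lasts one second longer according to clock time.
utc_day_with_leap_second = cf_day + 1

print(cf_day, utc_day_with_leap_second)  # 86400 86401
```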

The default time zone offset is zero. In a time zone with zero offset, time (approximately) equals mean solar time for 0 degrees_east of longitude. (Although this may be exact in a model, in reality the time with zero time zone offset differs by some seconds from mean solar time; see the discussion of UTC and leap seconds in <<4.4.2>>.) If both time and time zone offset are omitted the time is 00:00:00 (midnight, the start of the day). Thus, units = "days since 1990-1-1" means the same as units = "days since 1990-1-1 0:0:0".

For example, seconds since 1992-10-8 15:15:42.5 -6:00 indicates seconds since October 8th, 1992 at 3 hours, 15 minutes and 42.5 seconds in the afternoon, in a time zone where the date/time is six hours behind the default. Subtracting the time zone offset from a given date/time converts it to the equivalent date/time with zero time zone offset e.g. 1989-12-31 18:00:00 -6 identifies the same instant as 1990-1-1 0:0:0.
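The two offset examples above can be checked with Python's standard library, as a rough sketch. Note the assumption: Python's `datetime` uses the proleptic Gregorian calendar without leap seconds, so this illustrates only the offset arithmetic, not any particular CF calendar. The helper name `to_zero_offset` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_zero_offset(dt: datetime) -> datetime:
    """Express an offset-aware datetime with zero time zone offset."""
    return dt.astimezone(timezone.utc)

minus_six = timezone(timedelta(hours=-6))

# 1989-12-31 18:00:00 -6 identifies the same instant as 1990-1-1 0:0:0.
local = datetime(1989, 12, 31, 18, 0, 0, tzinfo=minus_six)
print(to_zero_offset(local))  # 1990-01-01 00:00:00+00:00

# Reference datetime of "seconds since 1992-10-8 15:15:42.5 -6:00".
ref = datetime(1992, 10, 8, 15, 15, 42, 500000, tzinfo=minus_six)
print(to_zero_offset(ref))  # 1992-10-08 21:15:42.500000+00:00

# An omitted time means 00:00:00, and the default offset is zero.
assert datetime(1990, 1, 1, tzinfo=timezone.utc) == to_zero_offset(local)
```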

In 4.4.2

In the real world, the international basis of civil timekeeping is Coordinated Universal Time (UTC). Leap seconds are adjustments occasionally made in UTC, in order to keep it close to mean solar time at 0 degrees_east i.e. the time zone with the default (zero) time zone offset in UDUNITS and CF (see <<4.4.1>>).

Do they look OK?

Cheers

Jonathan

@ChrisBarker-NOAA
Contributor

These look great -- thanks!

Where are we at with:

I agree that "date/time" isn't ideal because "/" means "or", but I don't have a strong view on what we should write. We used "date/time" because it appears like that elsewhere in the convention document, especially chapter 7. If there is a consensus on a preferred way to write it, or a different term to use, we could change it throughout the document.

I vote for either "datetime" or "date-time" -- but yes, it should be the same everywhere, so if this is too much churn, we can leave it as is.

Maybe wait to see if anyone else has a preference?

@JonathanGregory
Contributor Author

Dear @chris-little

Thanks for reviewing the PR. I am glad you found it clear. You commented

You might want to consider removing the word midnight, or replace it with midnight at 0 degrees longitude. It is a bit UK-centric. The ISO 8601 standard removed that word from its content some years ago.

Thanks for this point. I have qualified "midnight" with "at 0 degrees_east" in all the places I could find. It's updated in the PR, but I haven't updated the HTML and PDF.

Best wishes

Jonathan

@JonathanGregory
Contributor Author

Enough support has been expressed for this proposal to be accepted, and more than three weeks have passed without any further concern being raised. It would be really good to have this enhancement in CF 1.12, since we've been needing a solution to this issue for years. However, we're keen that there should be a consensus. Would anyone else like to comment?

@JonathanGregory JonathanGregory added the CF1.12? We might conclude this issue in time for CF1.12 label Oct 20, 2024
@ChrisBarker-NOAA
Contributor

I'd still like to see "date/time" replaced with either "date-time" or "datetime" -- anyone else have a preference?

One data point: apparently SQL uses "DATETIME" -- for what that's worth. Oh, two: Python uses datetime. Of course, in both those cases, "-" and "/" are disallowed.

As for "midnight":

As of ISO 8601-1:2019/Amd 1:2022, "00:00:00" may be used to refer to midnight corresponding to the instant at the beginning of a calendar day; and "24:00:00" to refer to midnight corresponding to the instant at the end of a calendar day.[1] ISO 8601-1:2019 as originally published removed "24:00:00" as a representation for the end of day although it had been permitted in earlier versions of the standard.

So midnight is clearly defined -- though it still could be confusing (midnight at the beginning or end of the day?). Why not "zero hours" or "0" or, at least, "midnight at the beginning of the day"? I'm not sure the "at 0 degrees_east" is needed.

@sethmcg
Contributor

sethmcg commented Oct 24, 2024

No strong preference, but I agree that either datetime or date-time reads more smoothly than date/time. I'd lean marginally towards datetime because it's pythonic.

@sethmcg
Contributor

sethmcg commented Oct 24, 2024

(Sorry to be late commenting; I've been too swamped to keep up on this issue until now.)

I think the text does an admirable job of sorting out all the complicated details; kudos to the authors.

The one thing that is missing is that I think it needs a high-level overview and summary at the beginning. I would venture that a majority of readers are going to get a short ways into this section, become overwhelmed, and skip over the rest of it. Most users just want to know what, if anything, they should do about leap seconds, so we should provide that guidance up front.

I would suggest something along these lines (perhaps phrased a little more formally), if others think it summarizes things as accurately as it can while glossing over all of the details.

The real-world UTC calendar has leap-seconds. They are added at irregular and unpredictable intervals to adjust for slight variations in the Earth's rotation speed. 27 leap seconds have been added to the calendar since 1958, most recently in 2016, and the practice will be abandoned by or before 2035.

Most people are unaware of leap seconds and ignore their existence; this includes many data producers and the CF standard itself before version 1.12. As a result, the time coordinates of two real-world observational datasets could disagree with one another by up to 27 seconds if they differ in leap-second awareness.

Practically speaking, this means that if you are working with real-world data, and if temporal accuracy at the sub-minute timescale is important, you need to care about leap-seconds; this subsection covers how to address them properly. Otherwise, the only thing you need to know is that it's possible for a dataset to have data at regular minutely / hourly / daily intervals that aren't spaced exactly 60 / 3600 / 86400 seconds apart, if it has taken leap-seconds into account.
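The last point can be illustrated with a short Python sketch. The leap-second table below is an illustrative subset (2012 onwards only), and the function names are hypothetical; a real application would take the table from an authoritative source.

```python
from datetime import datetime

# UTC midnights immediately following an inserted positive leap second.
# Illustrative subset only; valid for intervals from 2012 onwards.
LEAP_SECONDS_AFTER = [
    datetime(2012, 7, 1),
    datetime(2015, 7, 1),
    datetime(2017, 1, 1),
]

def naive_seconds(start: datetime, end: datetime) -> float:
    """Difference ignoring leap seconds (every day has 86400 s)."""
    return (end - start).total_seconds()

def elapsed_seconds(start: datetime, end: datetime) -> float:
    """Elapsed duration, counting leap seconds inserted between the two."""
    leaps = sum(1 for t in LEAP_SECONDS_AFTER if start < t <= end)
    return naive_seconds(start, end) + leaps

noon_before = datetime(2016, 12, 31, 12, 0, 0)
noon_after = datetime(2017, 1, 1, 12, 0, 0)

print(naive_seconds(noon_before, noon_after))    # 86400.0
print(elapsed_seconds(noon_before, noon_after))  # 86401.0
```

Two daily samples taken at the same UTC time of day are therefore 86401 s apart when the leap second at the end of 2016 falls between them.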

@chris-little

@sethmcg And perhaps to lay it on a bit thicker, add at the end something along the lines of:
Consequently, even though time labels may allow data to be correctly ordered, any calculations of durations may be wrong.

@ChrisBarker-NOAA
Contributor

Consequently, even though time labels may allow data to be correctly ordered, any calculations of durations may be wrong.

slight word smithing:

Consequently, even though time labels allow data to be correctly ordered, any calculations of durations may be inaccurate by a few seconds.

(is "labels" the right term? -- I don't think I see it in the other text)

Related NOTE:

Over on: https://github.com/orgs/cf-convention/discussions/383

We are discussing the use of floating point types in time variables -- I think that if you want to be accurate to the leap-seconds, you really should be using integer seconds (or less) as your time unit -- using a floating point type makes even less sense when you care about the precision that much.

In terms of this issue, that means we should have the examples use appropriate units / data types (if data types are part of the example). Looking at the PR, I see:

the canonical unit given for quantities used for time coordinates is s since 1958-1-1.

perfect -- seconds is a good unit to use. (can we spell it "seconds" or is that fixed elsewhere?)

"""
In both kinds of time units attribute (with or without since), any unit for measuring time can be used i.e. any unit which is physically equivalent to the SI base unit of time, namely the second.
"""

All good -- but do we want to add anything about appropriate unit/data type combinations? E.g. "days since" with float is not going to get you second precision for very long. And "days since" with double will, for a huge range, but it also has variable precision, depending on where you are on the timescale -- so while calculating to leap-second precision, you may be off by a nanosecond or two -- but I suppose that's all the usual caveats with floating point types.

(I thought I saw a "days since" in there somewhere, but can't find it now -- so I guess all good?)
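To put rough numbers on the unit/data type question, here is a sketch that estimates the gap between adjacent representable values of a time coordinate, expressed in seconds. It assumes IEEE 754 single and double precision (23 and 52 explicit mantissa bits); the function name `spacing_in_seconds` and the sample coordinate values are illustrative choices.

```python
import math

SECONDS_PER = {"second": 1.0, "day": 86400.0}

def spacing_in_seconds(value: float, unit: str, mantissa_bits: int) -> float:
    """Gap between adjacent representable values near `value`, in seconds.

    mantissa_bits is 23 for IEEE 754 single precision and 52 for double.
    """
    _, exponent = math.frexp(value)  # value = m * 2**exponent, 0.5 <= m < 1
    ulp = math.ldexp(1.0, exponent - 1 - mantissa_bits)
    return ulp * SECONDS_PER[unit]

days = 24000.0    # roughly 1958 to 2023, in days
seconds = 2.1e9   # roughly the same span, in seconds

print(spacing_in_seconds(days, "day", 23))        # 168.75
print(spacing_in_seconds(days, "day", 52))        # well below a microsecond
print(spacing_in_seconds(seconds, "second", 23))  # 128.0
print(spacing_in_seconds(seconds, "second", 52))  # well below a microsecond
```

Under these assumptions, single precision cannot resolve individual seconds this far from the reference datetime with either unit, while double precision resolves well below a microsecond.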

But a recommendation may be good: Maybe add something to the conformance doc under:

=== 4.4.1 Time Coordinate Units

Recommendations:

Or is there somewhere else a recommendation could be added?

@chris-little

@ChrisBarker-NOAA I'm happy with your word-smithing. I used label as a generic term just in case saying timestamp pressed the wrong button!

@JonathanGregory
Contributor Author

Dear all

Thanks for your careful reading of and comments on the draft. David and I have considered these comments, and I have made consequent changes in PR #541. The updated conventions document can be seen as HTML and PDF. Below is a description of the changes. Please let us know any further concerns or suggestions for improvement.

Best wishes

Jonathan

  • Following Chris Barker's suggestion, with which Seth agreed, I have changed "date/time" to "datetime" throughout, including in sect 7, which was where I got "date/time" from in the first place.

  • I have changed "midnight" to "the beginning of the day i.e. midnight", and "the beginning of the day i.e. midnight on 1st January 1958" when explaining the canonical unit "s since 1958-1-1". ISO 8601 prefers "beginning of the day", so I've put that first, but I think stating "midnight" as well for clarity is useful. In some cultures, the "day" does not begin at midnight, despite ISO 8601. I have kept "at 0 degrees_east" as an explicit statement of the default time zone, for clarity. The unit of measure in the canonical units is the second, but stated as s because we use unit abbreviations when giving canonical units in the standard name table.

  • Following a suggestion of Lars's, I have moved a sentence from 4.4.2 to the preamble of 4.4.

  • Lars has pointed out that the UDUNITS syntax, which CF follows, differs from ISO 8601 by assuming that the time is midnight if it's not included in a datetime. Therefore in 4.4.1 "Time coordinate units", I've inserted "Note that this interpretation of omitted time, which is an aspect of UDUNITS syntax, is different from ISO 8601, in which omitted time implies lack of precision."

  • At the start of 4.4.2 "Calendar" I have inserted, "Note that the CF meaning of 'calendar' refers to datetimes, whereas the ISO 8601 definition refers only to dates." I rearranged the order of a few other sentences in 4.4.2. I have added a brief CF definition of calendar to 1.3 "Terminology": A CF calendar defines an ordered set of valid datetimes with integer seconds.

  • We already had a paragraph in 4.4.2 about leap seconds. I have merged Seth's first paragraph into it, to produce the following:

    Leap seconds are adjustments made at irregular and unpredictable intervals in Coordinated Universal Time (UTC), the international basis of civil timekeeping in the real world. In response to slight variations in the Earth's rotation speed, positive or negative leap seconds are inserted in order to keep UTC close to mean solar time at 0 degrees_east i.e. the time zone with the default (zero) time zone offset in UDUNITS and CF (see 4.4.1). When a positive leap second is introduced at the end of a minute, that minute contains 61 seconds; 27 leap seconds have been added to UTC since 1958, most recently at the end of 2016. A minute would contain 59 seconds if it included a negative leap second, but none have been introduced up to now. It has been decided that after 2035 no further leap seconds will be added. The CF calendars differ in their treatment of leap seconds (see 4.4.3).

    I know that the decision is "by or before 2035". My simpler statement is a subset but correct as such, and results from my modifying it to be clear that the decision is that no more leap seconds will be added. Some "headline" reports of this decision say things like "leap seconds to be eliminated", which could be misunderstood to mean that the ones which have been added will be removed.

    I have put text based on Seth's other two paragraphs and comments from Chris Barker and Chris Little at the start of 4.4.3 "Leap seconds":

    This section describes how to deal properly with leap seconds. Most people ignore the existence of leap seconds, including many data producers and the CF standard before version 1.12. As a result, the time coordinates of two real-world observational datasets could disagree by up to 27 seconds if one has taken leap seconds into account and the other has not. Practically speaking, this means that if you are working with real-world data, and if it's important for your time coordinates to be accurate to the second, you need to care about leap seconds. Otherwise, you need only to be aware that the difference between two time coordinates might not exactly equal the duration of the time interval between the two instants, but could be inaccurate by up to 27 seconds, if leap seconds are involved. Relatedly, two instants with the same time of day on different days, which would always be separated by a multiple of 86400 seconds if there were no leap seconds, will have a few more seconds between them if leap seconds intervene.

    Thanks for your text, Seth.

  • Lars has made some comments on possibly misleading or confusing sentences in 4.4.3, which have led to our elaborating the text, especially when describing the example and the figure.

  • The reference I give to ISO 8601 shows only definitions. That's useful, but you can't read the rest of the standard at ISO unless you buy it. Does anyone else make it publicly available? It's good for the geoscientific community that the CF standard isn't behind a paywall.

  • We have prepared a table of examples to show how a given datetime can identify a different instant in different calendars. This table could be added as an appendix. However, it will take time (and possibly dates) to convert to asciidoc, and I expect the discussion of it might be long. Therefore I've removed the reference to it. If the present change goes into 1.12, we will propose it subsequently as an addition.

  • Chris Barker suggested we could offer guidance about the choice of units and data type of the time coordinate variable, in view of the desired precision and the range of datetimes to be covered. I think that's a good idea, but I would rather not tackle it in this issue, because I can imagine it taking a significant amount of discussion.
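As a rough illustration of the statement in the 4.4.2 paragraph above that a minute with a positive leap second contains 61 seconds, the sketch below treats a seconds field of 60 as valid only in the utc calendar and only in the final minute of a day that ends with a leap second. This reading of the proposal, the helper name `valid_time_of_day` and the truncated table of dates are all assumptions made for the example.

```python
from datetime import datetime

# Assumed subset of UTC dates whose final minute holds a positive leap second.
LEAP_SECOND_DATES = {(2012, 6, 30), (2015, 6, 30), (2016, 12, 31)}

def valid_time_of_day(calendar, year, month, day, hour, minute, second):
    """Return True if the time of day is valid on that date.

    Hypothetical helper: only the seconds field is calendar-dependent here,
    and the date itself is not validated.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and second >= 0):
        return False
    if second <= 59:
        return True
    # second == 60 exists only in utc, in the minute holding a leap second.
    return (calendar == "utc" and second == 60
            and (hour, minute) == (23, 59)
            and (year, month, day) in LEAP_SECOND_DATES)

print(valid_time_of_day("utc", 2016, 12, 31, 23, 59, 60))       # True
print(valid_time_of_day("standard", 2016, 12, 31, 23, 59, 60))  # False

# Python's standard library cannot represent the leap second itself.
try:
    datetime(2016, 12, 31, 23, 59, 60)
except ValueError as err:
    print("datetime rejects second=60:", err)
```

The final lines show one reason why software support for a utc calendar needs care: the standard `datetime` type raises `ValueError` for a seconds value of 60.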

@larsbarring
Contributor

Thanks Jonathan,

Here are first a couple of comments of more technical nature that I think are uncontroversial:

  1. Ch 4.4.1 4th para: Adding "civil time" (or "standard time" as used in ISO 8601) to the text just adds yet another layer of complication. Normal time (= "winter time") and summer time belong to two different calendars, as is evident from what you have to write in the reference time for each (i.e. different numbers for Z). All other calendars either relate to a model world (the 360_day and 365_day calendars) or are various attempts to keep the calendar in phase with the sun-earth geometry (on both an annual and a daily basis). Normal and summer time can be better understood as two calendars that run in parallel all the time, differing only by one hour (usually). Civil time is nothing more than our governments (or whoever it is) finding it meaningful to shift between these two calendars at certain well-specified points in time every year, which I suggest is beyond the scope, and interest, of CF. I suggest deleting everything related to summer and winter time (or expanding the text to present the concept properly, which is a rather deep rabbit hole).

  2. Two para further down: After the four bullet points specifying allowed time zone information, add a fifth point (for clarity):

    • Time zone acronyms, which are commonly used, are not allowed.
  3. Ch. 4.4.2, Description of the tai calendar, second sentence: I find the word "later" easy to misunderstand. I suggest changing to:

    The tai time is ahead of the utc time by the net number of leap seconds introduced since 1958-01-01, which is the same instant in utc and tai, when both of them began. This means that a datetime in the tai calendar represents an instant that is later than the same datetime in the utc calendar.

And here are two further comments of technical nature, but I do not have a concrete suggestion how to fix them:

  1. Description of the utc calendar and the tai calendar: The current difference between the two is 37 seconds, but there have been only 27 leap seconds. Somewhere (=here?) this should be explained. See next point.

  2. Ch. 4.4.3, 1st para: Here the difference of 27 leap seconds appears again. The 37 second difference between utc and tai should be noted here.

@davidhassell
Contributor

Dear Jonathan,

Thank you for these new changes. I'm happy with all of them, with the exception of the new paragraph in 4.4.2 about leap seconds: I don't think we should be so explicit about the numbers of leap seconds by which TAI and UTC differ.

UTC is currently 37 seconds (not 27) behind TAI. 27 seconds have been added to UTC since 1972, in increments of 1 second; but 10 seconds were also added to UTC over the period 1958-1971, in various increments of (much) less than 1 second (these were not called leap seconds at the time, but "rubber seconds").
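The arithmetic (10 + 27 = 37) can be sketched in Python. The table of leap-second dates below is written out for illustration and should be checked against the official IERS list before any real use; the names `LEAP_SECOND_DATES`, `INITIAL_OFFSET` and `tai_minus_utc` are hypothetical.

```python
from datetime import date

# UTC dates ending with a positive leap second, 1972 onwards.
# Assumption: transcribed for illustration; verify against the IERS list.
_JUNE = [1972, 1981, 1982, 1983, 1985, 1992, 1993, 1994, 1997, 2012, 2015]
_DEC = [1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1987, 1989, 1990,
        1995, 1998, 2005, 2008, 2016]
LEAP_SECOND_DATES = sorted(
    [date(y, 6, 30) for y in _JUNE] + [date(y, 12, 31) for y in _DEC]
)

INITIAL_OFFSET = 10  # seconds accumulated by 1972-01-01, as described above

def tai_minus_utc(day: date) -> int:
    """TAI - UTC in whole seconds at the start of `day` (1972-01-01 or later)."""
    if day < date(1972, 1, 1):
        raise ValueError("only integer-second adjustments are modelled")
    return INITIAL_OFFSET + sum(1 for d in LEAP_SECOND_DATES if d < day)

print(len(LEAP_SECOND_DATES))           # 27
print(tai_minus_utc(date(1972, 1, 1)))  # 10
print(tai_minus_utc(date(2017, 1, 1)))  # 37
```

The result for dates after 2016 holds only until another leap second is introduced, which is one reason for not hardwiring the number.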

The current difference between TAI and UTC is likely to change, so I don't think we should hardwire it into the conventions.

Similarly, the 2035 no-more-leap seconds date is already a bit vague (as you noted), and doesn't preclude the possibility of 60 leap seconds being applied at one instant in the future (i.e. when we've drifted by 1 minute).

I propose a new version of this paragraph (original / new text):


Leap seconds are adjustments made at irregular and unpredictable intervals in Coordinated Universal Time (UTC), the international basis of civil timekeeping in the real world. In response to slight variations in the Earth’s rotation speed, positive or negative leap seconds are inserted in order to keep UTC close to mean solar time at 0 degrees_east i.e. the time zone with the default (zero) time zone offset in UDUNITS and CF (see Section 4.4.1, "Time Coordinate Units"). When a positive leap second is introduced at the end of a minute, that minute contains 61 seconds. [deleted: "; 27 leap seconds have been added to UTC since 1958, most recently at the end of 2016."] A minute would contain 59 seconds if it included a negative leap second. [deleted: ", but none have been introduced up to now. It has been decided that after 2035 no further leap seconds will be added."] [new: "An adjustment to UTC can, however, be any amount of seconds, not just +1 or -1. For instance, prior to 1972-01-01, a total of 10 seconds were added to UTC over multiple adjustments of at most 0.2 seconds. After this date, all adjustments are a positive or negative integer number of seconds."] The CF calendars differ in their treatment of leap seconds (see Section 4.4.3, "Leap Seconds").


Thanks,
David

@davidhassell
Contributor

(whilst writing mine, I missed Lars's last post; he also picked up on the 27/37 seconds discrepancy!)

@ChrisBarker-NOAA
Contributor

I think it's OK as is, but I generally agree with David and Lars -- it could be trimmed down; we only need so much detail, e.g.:

Clearly stating that leap seconds are a thing, and how the calendars handle them (or don't), is enough -- folks can go find the details if it matters to them.

Similarly for DST -- just note that it's a thing to keep in mind, that's really all we need.

-CHB

@sethmcg
Contributor

sethmcg commented Oct 28, 2024

I agree with @ChrisBarker-NOAA and @larsbarring. Where we can get away with saying less, I think we should.

For instance, we say that adding 1 leap second makes a minute 61 seconds long. Do we really need to also say that subtracting a second makes it 59 seconds long, especially when that case has never actually happened? I think it's safe to leave that unsaid, and that it will make the document easier to absorb and understand if we do.

Likewise, I think we can just say "UTC" without an in-line explanation of what it is. Maybe it would make sense to link it to the Wikipedia page on UTC?

@chris-little

@JonathanGregory @davidhassell Just so that everyone is clear about what was agreed by the BIPM/IERS/ITU global conference (CGPM). Leap seconds, positive or negative, have not been abolished, but the criterion for declaring one will be (much) looser. The current criterion is (UT1-UTC) ~ 0.9s. The CGPM:

  • decides that the maximum value for the difference (UT1-UTC) will be increased in, or before, 2035
  • propose a new maximum value for the difference (UT1-UTC) that will ensure the continuity of UTC for at least a century
  • prepare a plan to implement by, or before, 2035 the proposed new maximum value for the difference (UT1-UTC)
  • propose a time period for the review by the CGPM of the new maximum value following its implementation, so that it can maintain control on the applicability and acceptability of the value implemented
  • draft a resolution including these proposals for agreement at the 28th meeting of the CGPM (2026)

So you might want to fine tune your wording to hedge your bets in 2026 and to avoid this issue in ten years' time, or a hundred! ;-)

@JonathanGregory
Contributor Author

Dear all

Thanks for reading this again and for further comments. I have updated the PR #541. By some good magic of Antonio @cofinoa, the corresponding PDF and HTML have been generated automatically by GitHub, also the conformance document PDF and HTML.

  • I have a question about the list of calendars in 4.4.2: should it be in alphabetical order? It's quite haphazard at present, except there's a reason for putting standard first, that it's the default.

  • At Lars's suggestion, I have deleted the sentence, "When civil clock time changes at the start and end of summer in many countries, the day according to its calendar date lasts for 23 or 25 hours, but the UDUNITS and CF day is always 24 hours." The purpose of this sentence is to explain that day is a fixed unit of time, not a calendar unit. Lars's argument is that the change of time is a shift of timezone, and in either timezone the day is 24 hours when it happens.

  • Instead of Lars's suggestion of a fifth bullet in the list of time zone formats, I have elaborated the bullet where the time zone is introduced: "Z is the time zone offset. This is an interval of time, specified in one of the formats described below. Time zone names or acronyms are not allowed."

  • Regarding the tai calendar, Lars suggested a rewording, which I had to think about a lot. After having done so, I agree that the tai calendar is ahead of utc, in the same sense that CET is ahead of GMT, and Lars is ahead of Jonathan. This shows that I find "ahead" quite hard to understand in this context! But I agree it's good to say it two ways. How about this: "For any given instant, the tai datetime is ahead of the utc datetime, where "ahead" means the same as it does when describing a timezone to the east as being ahead of one to the west. The difference between the two datetimes for a given instant of time is the net number of leap seconds introduced since 1958-01-01. The difference was zero on that instant, when both calendars began. This means that a given datetime in the tai calendar represents an instant that is later than the same datetime in the utc calendar." There is a symmetrical phrase under utc, which becomes: "For any given instant, the utc datetime is behind the tai datetime, where "behind" means the same as it does when describing a timezone to the west as being behind one to the east. The difference between the two datetimes for a given instant of time is the net number of leap seconds introduced since 1958-01-01. The difference was zero on that instant, when both calendars began. This means that a given datetime in the utc calendar represents an instant that is earlier than the same datetime in the tai calendar."

  • Given what David, Seth, Lars and Chris Little have said about the text concerning UTC and the number of leap seconds, I agree that it is better to say less. I propose even less than David's version of the paragraph, adopting Seth's suggestion to refer to Wikipedia for UTC and not to mention 59-second minutes, thus:

    Leap seconds are adjustments made at irregular and unpredictable intervals in Coordinated Universal Time (UTC). In response to slight variations in the Earth's rotation speed, positive or negative leap seconds are inserted in order to keep UTC close to mean solar time at 0 degrees_east i.e. the time zone with the default (zero) time zone offset in UDUNITS and CF (see 4.4.1). When a single positive leap second is introduced at the end of a minute, that minute contains 61 seconds. The net number of leap seconds added to UTC between 1958-1-1 and 2025-1-1 is 37. The CF calendars differ in their treatment of leap seconds (see 4.4.3).

    In section 4.4.3, I have replaced "27 seconds" with "some/a number of seconds". Is that sufficient?
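
As a rough illustration of the "ahead"/"behind" wording proposed above, here is a minimal Python sketch. It assumes an instant at which the net leap-second count is 37 (the figure quoted for 2025-1-1 in the paragraph above), and it uses naive `datetime` objects purely as labels, since Python's `datetime` does not model leap seconds. The function name `utc_to_tai_label` is hypothetical and not part of CF or UDUNITS.

```python
from datetime import datetime, timedelta

# Net leap seconds between 1958-01-01 and 2025-01-01, as quoted in the
# proposed text. Assumption: this constant is only valid for instants at
# which the net count really is 37.
NET_LEAP_SECONDS = 37


def utc_to_tai_label(utc_dt: datetime) -> datetime:
    """Return the tai datetime that labels the same instant as utc_dt."""
    return utc_dt + timedelta(seconds=NET_LEAP_SECONDS)


utc_label = datetime(2025, 1, 1, 0, 0, 0)
tai_label = utc_to_tai_label(utc_label)

# For the same instant, the tai label is ahead of the utc label.
print(utc_label.isoformat())  # 2025-01-01T00:00:00
print(tai_label.isoformat())  # 2025-01-01T00:00:37
```

Read the other way round, the datetime 2025-01-01T00:00:37 in the tai calendar and the datetime 2025-01-01T00:00:00 in the utc calendar label the same instant, so a given datetime in tai represents an instant later than the same datetime in utc.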

Best wishes

Jonathan

@sethmcg
Contributor

sethmcg commented Oct 29, 2024

Looks good to me! Thanks for synthesizing all our comments.

@ChrisBarker-NOAA
Contributor

Agreed -- looking good:

"Z is the time zone offset. This is an interval of time, specified in one of the formats described below. Time zone names or acronyms are not allowed."

Folks do often get confused about "time zone" vs "time zone offset". A time zone offset is simple and clear, whereas a "time zone" is a designation of a region that follows certain rules for determining the offset -- CF doesn't deal with those at all.

I think this captures that OK, unless we want to nail the point home:

"Z is the time zone offset. This is an interval of time, specified in one of the formats described below. It is simply the offset from UTC, and not the regional time zone -- thus names or acronyms are not allowed."

or not ....

-CHB

@JonathanGregory
Contributor Author

Thanks, @sethmcg and @ChrisBarker-NOAA. For Chris Barker's most recent comment, I have changed the time zone description to read

Z is the time zone offset with respect to UTC. This is an interval of time, specified in one of the formats described below. Only numbers (digits, +, - and :) are allowed in Z, not time zone names or acronyms.
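
The character restriction in that wording could be checked roughly as follows. This is only a sketch: it tests the permitted character set (digits, +, - and :) and does not validate the specific offset formats listed in the document, which are not reproduced here. The helper name `uses_only_offset_characters` is hypothetical.

```python
import re

# Hypothetical helper: checks only the permitted character set quoted above
# (digits, "+", "-" and ":"), not the full set of offset formats CF allows.
_OFFSET_CHARS = re.compile(r"[0-9+\-:]+")


def uses_only_offset_characters(z: str) -> bool:
    """Return True if z is non-empty and contains only digits, +, - and :."""
    return _OFFSET_CHARS.fullmatch(z) is not None


print(uses_only_offset_characters("+05:30"))  # True
print(uses_only_offset_characters("CET"))     # False
```

A check like this rejects time zone names and acronyms, but a string such as "::" would still pass, so a real validator would also need the format rules themselves.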

The PR #541 HTML and PDF have been updated.

If no further concerns are raised, this change can be accepted in three weeks, on 26th November. @chris-little should be added to the list of contributors to the CF convention. Thanks, Chris.

@JonathanGregory added the "new contributor" label Nov 5, 2024

@ChrisBarker-NOAA
Contributor

I'm not suggesting a blocker at this point, but I just noticed something:

Is the UDUNITS time format ISO 8601? (It's close, but is it exactly?) If so, maybe we should say that in the doc somewhere.

The references I see to ISO 8601 at this point are:

  • An item in the bibliography

  • ch04.adoc:Note that this interpretation of omitted time, which is an aspect of UDUNITS syntax, is different from <<ISO_8601>>, in which omitted time implies lack of precision.

  • ch04.adoc: Note that the CF meaning of "calendar" refers to datetimes, whereas the <<ISO_8601>> definition refers only to dates.

It seems a bit odd to talk about the differences without having stated the similarities first.

@JonathanGregory
Contributor Author

Lars suggested the first point of comparison in ch04, and I added the second, having looked at the ISO 8601 definitions. We don't need these comparisons for the CF standard, and we should remove them if they're not helpful. I included them for the sake of anyone who is familiar with ISO 8601, to prevent them from making an assumption about what we mean, or to point out that we already know it's different (to forestall concerns or objections that we've made a mistake). I haven't done a thorough comparison.

@chris-little

@JonathanGregory @ChrisBarker-NOAA I suggest minimising the references to ISO 8601 and making sure that they are purely informative, as there are superseded versions, and the latest is behind a paywall. ISO is just about to issue Amendment 1 to ISO 8601-1, and the whole standard is slated for full review. Part 3 (semantics) is in the pipeline too, and they are working on lots of overwhelming details on how to do calendrical calculations, possibly in conjunction with IETF people. The standard is probably too flexible with too many options, so that is why I would prefer references to a very restrictive profile of it, such as IETF RFC 3339, W3C's Webtime or even the US National Library profile. The relevant ISO committee apparently met two weeks ago. HTH

@ChrisBarker-NOAA
Contributor

Good reasons all.

It's not a big deal, but I think we can remove the ISO 8601 references altogether, and simply define what a datetime string means in CF by itself. For instance:

I think we can remove this altogether:

"Note that this interpretation of omitted time, which is an aspect of UDUNITS syntax, is different from <<ISO_8601>>, in which omitted time implies lack of precision."

-- It's already clear that YYYY-MM-DD means: YYYY-MM-DDT00:00:00

and:

"Note that the CF meaning of "calendar" refers to datetimes, whereas the <<ISO_8601>> definition refers only to dates."

Can simply be -- "Note that the CF meaning of "calendar" refers to datetimes, not only dates"

(I presume that's because with TAI vs UTC, we do have different times in different calendars....)

This isn't a deal breaker for me -- folks are familiar enough with the ISO standard (though not its intricate details), so it may be clearer to provide the contrast.

Note that I can't find any documentation in the UDUNITS pages about how datetime strings are formatted -- maybe it's there, but I can't find it quickly. If it is there, we could put a link in ch04.

If not, we should do something about that -- but that's another topic.

@JonathanGregory
Contributor Author

I've deleted the references to ISO 8601 as suggested by @ChrisBarker-NOAA.

"the CF meaning of "calendar" refers to datetimes, not only dates" ... because with TAI vs UTC, we. do have different times in different calendars

That's right.

I can't find any documentation in the UDUNITS pages about how datetime strings are formatted

I couldn't find it either. I don't think it's there. That's why I described it briefly. Since we take the UDUNITS syntax as definitive, it would be helpful if UDUNITS documented it.

@ChrisBarker-NOAA
Contributor

and thus: #562

If I get a chance, I'll add a note to that specifically about datetime strings...
