The frequency in the intake catalog does not account for snapshot data #188

Open
minghangli-uni opened this issue Aug 26, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@minghangli-uni

Is your feature request related to a problem? Please describe.

Snapshot diagnostics lack the time_bnds variable, causing the start_date and end_date from the intake-esm datastore to be incorrect.

One example is shown below, for the restart file access-om3.mom6.r.1915-01-01-00000.nc, where the output frequency is 2 hours and the total run length is 30 hours. The start_date is reported as 1915-01-01, 01:00:00 and the end_date as 1915-01-02, 07:00:00, instead of 1915-01-01, 00:00:00 and 1915-01-02, 06:00:00.

| path | realm | variable | frequency | start_date | end_date |
| --- | --- | --- | --- | --- | --- |
| xx | ocean | xx | 2hr | 1915-01-01, 01:00:00 | 1915-01-02, 07:00:00 |

Describe the feature you'd like

Parse the date information correctly, so that the catalog entry reads:

| path | realm | variable | frequency | start_date | end_date |
| --- | --- | --- | --- | --- | --- |
| xx | ocean | xx | 2hr | 1915-01-01, 00:00:00 | 1915-01-02, 06:00:00 |

@minghangli-uni added the enhancement New feature or request label Aug 26, 2024
@marc-white
Collaborator

@minghangli-uni could you provide the location of that file so I can have a play with it?

@dougiesquire
Collaborator

dougiesquire commented Aug 28, 2024

I stupidly didn't think about snapshot data when writing get_timeinfo. There's some logic in there to try and guess the start and end dates when time bounds are not present, but I realise now that this will produce incorrect start and end dates for snapshot data. I guess this logic should probably just be removed (i.e. remove this), though that will mean the start and end dates will be wrong for non-snapshot output that doesn't provide time bounds...
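To illustrate how a bounds-guessing step can shift snapshot dates, here is a minimal sketch. It is not the actual get_timeinfo code: the helper `guess_bounds_as_midpoints` is hypothetical, and it assumes (a) the guess treats each timestamp as the midpoint of an averaging interval and (b) the snapshots are written every 2 hours from 02:00 on the first day to 06:00 on the second. Under those assumptions the guessed dates match the ones reported in the issue.

```python
from datetime import datetime, timedelta


def guess_bounds_as_midpoints(times, step):
    """Hypothetical guess: treat each timestamp as the centre of an interval.

    This is an assumed stand-in for the guessing logic, not the real code.
    """
    half = step / 2
    return times[0] - half, times[-1] + half


step = timedelta(hours=2)

# Assumed snapshot times for a 30-hour run with 2-hourly output:
# 1915-01-01 02:00, 04:00, ..., 1915-01-02 06:00 (15 snapshots).
first = datetime(1915, 1, 1, 2)
snapshots = [first + i * step for i in range(15)]

guessed_start, guessed_end = guess_bounds_as_midpoints(snapshots, step)
print(guessed_start, guessed_end)
# 1915-01-01 01:00:00 1915-01-02 07:00:00
```

For time-averaged output the half-step padding recovers the interval edges, but for snapshots it pushes both dates one hour away from any instant that was actually sampled.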

@marc-white marc-white self-assigned this Aug 29, 2024
@marc-white
Collaborator

@minghangli-uni could you please point me towards the catalog this file is a part of?

@minghangli-uni
Author

Hi @marc-white, I generated those files myself. You can find those at /scratch/tm70/ml0072/access-om3/archive/expt14-perturb-1052c5ae/output003 for your reference.

@rbeucher
Member

My take on this is that we shouldn't try to guess the frequency if the time_bounds are missing, so I would suggest removing the logic @dougiesquire is pointing out.

I would default to snapshot frequency if the time bounds are missing. I understand this will wrongly flag some datasets as snapshots, but that is a data definition problem.
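A rough sketch of that suggestion, for discussion only: the function `time_span` and its `(start, end, is_snapshot)` return shape are hypothetical and are not part of the existing code. It assumes bounds arrive as a list of `(lower, upper)` pairs and timestamps as a list of datetimes.

```python
from datetime import datetime


def time_span(times, bounds=None):
    """Return (start, end, is_snapshot) without guessing missing bounds.

    Hypothetical helper: `bounds` is a list of (lower, upper) pairs, or None.
    """
    if bounds:
        # Bounds present: use the outer edges of the first and last intervals.
        return bounds[0][0], bounds[-1][1], False
    # No bounds: take the timestamps at face value and flag as snapshot.
    return times[0], times[-1], True


# Time-averaged data with bounds: two 2-hour means.
avg_times = [datetime(2000, 1, 1, 1), datetime(2000, 1, 1, 3)]
avg_bounds = [
    (datetime(2000, 1, 1, 0), datetime(2000, 1, 1, 2)),
    (datetime(2000, 1, 1, 2), datetime(2000, 1, 1, 4)),
]
avg_span = time_span(avg_times, avg_bounds)

# Data without bounds: dates are the first and last timestamps, unpadded.
snap_times = [datetime(2000, 1, 1, 2), datetime(2000, 1, 1, 4)]
snap_span = time_span(snap_times)
```

The trade-off is the one described above: averaged output written without bounds would be labelled as snapshot data, but no dates are invented.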

@marc-white @charles-turner-1 .

@rbeucher
Member

Looks like we may need to handle restart files in a specific way though.

@charles-turner-1
Collaborator

This also poses a problem for MOM6 data by the looks of it: see e.g. annual data test and #292.

Projects
Status: Backlog

5 participants