Time series in EVA? #166
Comments
We're interested (in a parasitic sort of way) for sure, @Dooruk. We run EVA every cycle as well, but I can't say our plots are fabulous yet!
@Dooruk There actually is some capability for time series plotting in eva. The As far as data output from eva, the DA monitoring effort will require something like that eventually. We have been focused so far on using eva to create plots from the legacy DA monitor data, leaving the existing data extraction mechanisms in place for now. Replacing the data extraction will be our last step and we haven't yet charted that out, but would certainly be interested in collaboration if possible.
Thanks @Dooruk. I've thought about this in the past. I was thinking something like the following:

Basically, as part of a workflow, the IODA diagnostic files would go into an IODA reader, which computes counts, mean O-F, std dev, and whatever else is needed, and writes out a file that has a "cycle" dimension instead of nlocs. This can either be one file per cycle or a single file appended with ncrcat or something like that. Then EVA plots variables from this file on a line plot with minimal changes needed in EVA (maybe only the reader, since IODA requires "nlocs", although I guess we can still use that dimension and then have a cycle/time variable?).

I guess what I am saying is that I always saw this more as an IODA problem, not an EVA problem. If we want to do the preprocessing as part of EVA, I'm open to that approach; I just figured compiled code here would be faster.
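As a rough illustration of the per-cycle reduction described above, here is a minimal Python sketch. It is not the proposed IODA/OOPS implementation (which was suggested as compiled code); `cycle_stats`, the dictionary keys, and the sample cycle labels and values are all hypothetical and only show the shape of the computation.

```python
import math

def cycle_stats(obs, hofx):
    """Reduce one cycle's observations to count, mean O-F and std dev of O-F.

    obs and hofx are equal-length sequences over nlocs; pairs with a
    missing (None or NaN) value are skipped.
    """
    omf = [
        o - h
        for o, h in zip(obs, hofx)
        if o is not None and h is not None
        and not math.isnan(o) and not math.isnan(h)
    ]
    count = len(omf)
    if count == 0:
        return {"count": 0, "mean_omf": math.nan, "std_omf": math.nan}
    mean = sum(omf) / count
    # Population standard deviation (divide by count, not count - 1).
    std = math.sqrt(sum((d - mean) ** 2 for d in omf) / count)
    return {"count": count, "mean_omf": mean, "std_omf": std}

# One record per cycle: the "cycle" dimension replaces nlocs.
# Cycle labels and values below are made-up sample data.
cycles = {
    "20210701T00Z": ([1.0, 2.0, 3.0], [0.5, 2.5, 2.0]),
    "20210701T06Z": ([4.0, float("nan")], [3.0, 1.0]),
}
series = {cycle: cycle_stats(obs, hofx) for cycle, (obs, hofx) in cycles.items()}
```

Each entry of `series` is a handful of scalars, so the resulting file stays small no matter how many locations each cycle has.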
Thank you for the comments. @EdwardSafford-NOAA, IODA files have a single time step at each cycle so I'm not sure if that would work, but I will keep 'MonDataSpace' in mind. @CoryMartin-NOAA, I really like this idea; it would save significant time in terms of creating/writing files. I am not familiar with the inner workings of IODA, so I'm not sure how much effort is required to tackle it. I would be interested in helping if/when it comes to that.
@AndrewEichmann-NOAA kindly "volunteered" to help put together the oops application, @Dooruk.
I like the idea of having an oops application to perform the time series for observations. Thanks for volunteering effort on this @AndrewEichmann-NOAA and @guillaumevernieres. To have more generic time series capability in eva I think we need to have it at a level above the read and transforms. I started working on this at one point but didn't have time to finish and then wanted to wait until after all of Akira's refactoring, which is now complete. The kind of YAML structure I had in mind was along the lines of:

    timeseries:
      times: [time1, time2, ...]
      filenames: [file1, file2, ...]
      dataset_template:
        name: ...
        type: ...
        dataset_specific_things: ....
      # List here the variable you want to compute at each time and keep
      transform_and_keep:
        - transform: minmaxmean
          metrics: [mean, max]
          along_dimension: []
          variable: collection::group::variable
      # List any variables that you want to keep the entire thing each time. Otherwise everything is deleted
      # Somehow add a time dimension, though TBD exactly how.
      keep_variables: []
    # More transforms if you like
    transforms:
    # Graphics
    ...

I was thinking that if you had The behaviour would also trigger deleting everything except what is created in

Let me know what you think on this approach. I can work on it while EMC work on the OOPS level approach. The advantage of the Eva way would be that it could be applied to any of the data we can read in Eva. For example accumulating all the convergence rates over many cycles or to make Hovmöller type plots.
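To make the intended behaviour of the `timeseries` block more concrete, here is a minimal Python sketch of the loop it implies: read one dataset per time, reduce the requested variable, and keep only the reduced values. This is not eva code. `build_timeseries`, `METRICS`, the `read_dataset` callable, and the in-memory stand-in files are all assumptions made for illustration.

```python
from statistics import fmean

# Hypothetical reducers keyed by the metric names used in the YAML sketch.
METRICS = {"mean": fmean, "max": max, "min": min}

def build_timeseries(times, filenames, read_dataset, variable, metrics):
    """Accumulate reduced values of one variable over many files.

    read_dataset is any callable returning {variable_name: list_of_values}
    for a file. Only the reduced scalars are stored; the full dataset for
    each time is not retained in the result.
    """
    if len(times) != len(filenames):
        raise ValueError("times and filenames must have the same length")
    series = {"time": [], **{m: [] for m in metrics}}
    for time, filename in zip(times, filenames):
        values = read_dataset(filename)[variable]
        series["time"].append(time)
        for m in metrics:
            series[m].append(METRICS[m](values))
    return series

# Stand-in reader so the sketch runs without real files.
fake_files = {
    "file1": {"collection::group::variable": [1.0, 2.0, 3.0]},
    "file2": {"collection::group::variable": [4.0, 6.0]},
}
result = build_timeseries(
    times=["time1", "time2"],
    filenames=["file1", "file2"],
    read_dataset=fake_files.__getitem__,
    variable="collection::group::variable",
    metrics=["mean", "max"],
)
```

Because the reader is passed in as a callable, the same loop would apply to any dataset type eva can read, which is the advantage argued for above.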
@danholdaway thanks for chiming in. In terms of specifying times in the YAML, are you thinking this
I would like to start a discussion regarding `EVA` handling time series and creating plots with time axes. Creating time series plots is something I started doing with JEDI outputs on our end, but I would like to do it in a generic way if there is demand/need. I had some brief discussions here with @danholdaway and @asewnath (she suggested that I create an issue) and I would really appreciate EMC's input on this, so I'm tagging all who might be interested: @CoryMartin-NOAA @EdwardSafford-NOAA @kevindougherty-noaa @guillaumevernieres @ADCollard.

A simple example plot is the mean (obs - bkgr) over multiple time steps in observation space. The `IodaObsSpace` class is already handling JEDI outputs (location, channel dimensions) in observation space, so this would be only a matter of adding a `dateTime` dimension. The issue with EVA is that there is no time series handling, and an improvement to `EVA` in this regard would benefit everyone in terms of DA monitoring.

`EVA` currently reads data, makes the necessary transforms, and makes the plots inside a certain folder, say `geos_ocean` in our case, at a single cycle.

I have issues in terms of cycling and file storage, so I have to erase JEDI outputs frequently for high resolution simulations that span a couple of months. Hence, ideally `EVA` needs to handle files during the active cycle before they get erased. This may or may not be relevant for the other developers.

Below is what I'm suggesting (based on our workflow):
For the Swell workflow, our `run` directory currently looks like this (for ocean-only DA cycling):

The `forecast` directory contains GEOS related files/outputs whereas `geos_ocean` has JEDI related configs/outputs. `EVA` runs inside `geos_ocean` at each cycle and produces fabulous plots.

My suggestion is having an extra folder (call it diagnostics or holding) on the same level as time:
So `EVA` would process `IODA` outputs and create netcdf file(s) within the `diagnostics` folder with time (and channels etc. if needed) dimension(s). Afterwards, it may append to and update these files, or create a new one every so often; you get the idea. This would require some capabilities within EVA in addition to datasets, transforms, and graphics, such as writing to output (which may be a time sink unless handled in parallel). It would be a great addition alongside the `EVA` interactive tool as part of DA monitoring.

I'm open to suggestions, thoughts, criticisms.
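The append-as-you-cycle idea can be sketched as follows. This is only an illustration under stated assumptions: it writes CSV with the Python standard library as a stand-in for the netcdf file(s) proposed above, and `append_cycle`, `read_series`, the `timeseries.csv` filename, the field names, and the sample records are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical set of per-cycle fields; a real file would carry whatever
# statistics the monitoring needs.
FIELDS = ["cycle", "count", "mean_omf"]

def append_cycle(diag_dir, record, filename="timeseries.csv"):
    """Append one cycle's reduced statistics to a running file in diag_dir."""
    diag_dir = Path(diag_dir)
    diag_dir.mkdir(parents=True, exist_ok=True)
    path = diag_dir / filename
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # header only once, when the file is created
        writer.writerow(record)
    return path

def read_series(path):
    """Read the accumulated records back as a list of dicts (values are strings)."""
    with Path(path).open(newline="") as f:
        return list(csv.DictReader(f))

# A temporary directory stands in for the run directory.
with tempfile.TemporaryDirectory() as run_dir:
    diag = Path(run_dir) / "diagnostics"
    append_cycle(diag, {"cycle": "cycle1", "count": 120, "mean_omf": 0.31})
    path = append_cycle(diag, {"cycle": "cycle2", "count": 98, "mean_omf": 0.27})
    rows = read_series(path)
```

Since each cycle only appends a small record, the original JEDI outputs can be erased as soon as the cycle's record has been written.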