AWIESM2 output data went to "unknown" folder #1239

Open
Mccino opened this issue Oct 23, 2024 · 19 comments
Labels
bug Something isn't working

Comments

@Mccino
Collaborator

Mccino commented Oct 23, 2024

Describe the bug
ECHAM output data went to the "unknown" folder for some simulations. I also have some related questions.

To Reproduce
The correct simulations can be found at: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/Pre_HIST
The broken simulations can be found at: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/RES_HIST

System (please complete the following information):

  • Supercomputer: Levante
  • Version: esm_tools, version 6.29.0

Additional context
There is another problem with the broken simulations. ECHAM is a restart simulation, and the output generates two files with similar names. For example, for the year 2010, you can find "RES_HIST_201001.01_echam.nc" and "RES_HIST_201001.02_echam.nc" in the unknown folder. However, only the file with "02" can be read using CDO or ncdump. I’m wondering if anyone can provide some hints on this, or if I did something incorrectly when restarting ECHAM.

@Mccino Mccino added the bug Something isn't working label Oct 23, 2024
@pgierz
Member

pgierz commented Oct 23, 2024

Hi @Mccino,

output being moved to the unknown folder happens if the file name is not matched by one of the output patterns. If you have a look at the finished_config.yaml (in the config folder), you'll see which files it knows about. For me, it looks something like this:

$ yq .echam.outdata_targets basic-002_finished_config.yaml
aclcim_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_aclcim.nc # no provenance info
g3b1hi_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_g3b1hi.nc # no provenance info
g3bday_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_g3bday.nc # no provenance info
g3bid_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_g3bid.nc # no provenance info
g3bim_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_g3bim.nc # no provenance info
glday_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_glday.nc # no provenance info
glim_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_glim.nc # no provenance info
jsbid_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_jsbid.nc # no provenance info
sp6h_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_sp6h.nc # no provenance info
spim_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_spim.nc # no provenance info
sp_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_sp.nc # no provenance info
gl_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_gl.nc # no provenance info
g3b_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_g3b.nc # no provenance info
scm_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_scm.nc # no provenance info
ma_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_ma.nc # no provenance info
surf_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_surf.nc # no provenance info
cfdiag_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_cfdiag.nc # no provenance info
aeropt_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_aeropt.nc # no provenance info
co2_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_co2.nc # no provenance info
tdiag_nc: /albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-002/outdata/echam/basic-002_185901.01_tdiag.nc # no provenance info

More specifically, you can see what sorts of files it knows about in general:

$ yq '.echam | keys | .[] | select(contains("targets"))' basic-002_finished_config.yaml
bin_targets
config_targets
forcing_targets
input_targets
log_targets
outdata_targets
restart_in_targets
restart_out_targets
ignore_targets

There's some more info in the handbook here
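The sorting rule described above can be illustrated with a small sketch. This is not the esm-tools implementation, only a minimal model of the idea that any file not matched by a known output pattern ends up in unknown. The function name `sort_files` and the file `basic-002_185901.01_mystream.nc` are hypothetical.

```python
from fnmatch import fnmatchcase


def sort_files(filenames, known_patterns):
    """Split filenames into those matching any known pattern and the rest."""
    matched, unknown = [], []
    for name in filenames:
        if any(fnmatchcase(name, pattern) for pattern in known_patterns):
            matched.append(name)
        else:
            unknown.append(name)
    return matched, unknown


# Patterns modelled on the outdata_targets listing above (only two shown).
patterns = ["basic-002_185901.01_g3bday.nc", "basic-002_185901.01_glday.nc"]

# Hypothetical contents of a work directory after a run.
files = [
    "basic-002_185901.01_glday.nc",
    "basic-002_185901.02_glday.nc",
    "basic-002_185901.01_mystream.nc",
]

matched, unknown = sort_files(files, patterns)
print("outdata:", matched)
print("unknown:", unknown)
```

In this model, both the `.02` file and the file from the unregistered stream fall through to unknown, which is the behaviour reported in this issue.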

@mandresm
Contributor

Yes @pgierz. But there is more depth to the problem than it seems:

If the only issue were that some variables land in unknown, it would be a matter of adding those variables to the streams. The problem here is that, for some variables generated in a yearly chunk, only the first month is copied to outdata. To me this means that the default pattern used for copying output is too restrictive with respect to the start date, and that is why only the first month is copied.

I'm sure that, at least for some variables, all months would be in outdata if monthly chunks were run instead of yearly chunks. So the problem is probably a combination of ESM-Tools running yearly chunks while ECHAM writes monthly output.

@mandresm
Contributor

I think this is it:

outdata_sources:
        "[[streams-->STREAM]]": ${general.expid}_${start_date!syear}*.${start_date!sday}_STREAM
        "[[streams-->STREAM]]_codes": ${general.expid}_${start_date!syear}*.${start_date!sday}_STREAM.codes
        "[[streamsnc-->STREAM]]_nc": ${general.expid}_${start_date!syear!smonth}.${start_date!sday}_STREAM.nc

That probably explains part of the problem, right @pgierz?
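To see why the NetCDF pattern is so narrow, here is a minimal sketch that fills in the variables by hand and tests a few file names. It assumes that `${start_date!syear!smonth}` expands to year followed by month (for example `188001`), which is consistent with the file names in this thread but is an assumption here. The third file name is hypothetical.

```python
from fnmatch import fnmatchcase

# Start date of a yearly chunk beginning on 1880-01-01.
expid, syear, smonth, sday = "RES_HIST", "1880", "01", "01"
stream = "glday"

# Default NetCDF pattern from above, with the variables substituted manually.
default_pattern = f"{expid}_{syear}{smonth}.{sday}_{stream}.nc"

work_files = [
    "RES_HIST_188001.01_glday.nc",
    "RES_HIST_188001.02_glday.nc",
    "RES_HIST_188002.01_glday.nc",  # hypothetical file for a later month
]

hits = [name for name in work_files if fnmatchcase(name, default_pattern)]
for name in work_files:
    print(name, "->", "outdata" if name in hits else "unknown")
```

Because the pattern contains no wildcard at all, only the file that carries the exact start month and day is matched.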

@Mccino
Collaborator Author

Mccino commented Oct 23, 2024

I checked in detail, and what stands out even more is that the daily output in the outdata folder for ECHAM, for example, /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/RES_HIST/outdata/echam/RES_HIST_188001.01_glday.nc, only contains data for "1880-01-01." The rest of the year, from "1880-01-02" to "1880-12-31," is saved in the unknown folder at /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/RES_HIST/unknown/RES_HIST_188001.02_glday.nc.

@pgierz
Member

pgierz commented Oct 23, 2024

@mandresm, yes that's the right place. We instead need something that would glob over all months there.

@mandresm
Contributor

mandresm commented Oct 23, 2024

I think we need to replace the lines above with this:

 outdata_sources: 
         "[[streams-->STREAM]]": ${general.expid}_${start_date!syear}*_STREAM 
         "[[streams-->STREAM]]_codes": ${general.expid}_${start_date!syear}*_STREAM.codes 
         "[[streamsnc-->STREAM]]_nc": ${general.expid}_${start_date!syear}*_STREAM.nc 

or even removing the year:

 outdata_sources: 
         "[[streams-->STREAM]]": ${general.expid}*_STREAM 
         "[[streams-->STREAM]]_codes": ${general.expid}*_STREAM.codes 
         "[[streamsnc-->STREAM]]_nc": ${general.expid}*_STREAM.nc 

@pgierz, do you remember why being so specific about the timestamp was important?
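The two proposals can be compared with a quick sketch. The variables are substituted by hand, and the file from 1881 is hypothetical. It only shows what each glob would match; it says nothing about why the original pattern was written the way it was.

```python
from fnmatch import fnmatchcase

expid, syear, stream = "RES_HIST", "1880", "glday"

# First proposal: keep the year, glob over everything after it.
year_glob = f"{expid}_{syear}*_{stream}.nc"
# Second proposal: drop the year as well.
any_glob = f"{expid}*_{stream}.nc"

work_files = [
    "RES_HIST_188001.01_glday.nc",
    "RES_HIST_188001.02_glday.nc",
    "RES_HIST_188101.01_glday.nc",  # hypothetical file from another year
]


def matches(pattern):
    """Return the work files matched by a glob pattern."""
    return [name for name in work_files if fnmatchcase(name, pattern)]


print("with year:   ", matches(year_glob))
print("without year:", matches(any_glob))
```

Keeping the year picks up both the `.01` and `.02` files of the current chunk, while dropping it would also pick up files from other years if any are present in the work directory. Whether that matters in practice is not settled in this thread.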

@pgierz
Member

pgierz commented Oct 23, 2024

There was probably a reason, but I can't remember

@chrisdane
Contributor

Just a comment, your echam produces nc files (out_filetype = 2), not grib files (out_filetype = 1). Maybe streams were never really properly configured for the nc file case since the default echam output is grib files.

Also, maybe the data you need is not lost since the 02 file seems to have saved monthly output for the whole year:

cdo showtimestamp RES_HIST_201001.02_echam.nc
  2010-01-31T23:52:30  2010-02-28T23:52:30  2010-03-31T23:52:30  2010-04-30T23:52:30
  2010-05-31T23:52:30  2010-06-30T23:52:30  2010-07-31T23:52:30  2010-08-31T23:52:30
  2010-09-30T23:52:30  2010-10-31T23:52:30  2010-11-30T23:52:30  2010-12-31T23:52:30

@Mccino
Collaborator Author

Mccino commented Oct 24, 2024

Just a comment, your echam produces nc files (out_filetype = 2), not grib files (out_filetype = 1). Maybe streams were never really properly configured for the nc file case since the default echam output is grib files.

Thanks! This could be one reason, I guess...

Also, maybe the data you need is not lost since the 02 file seems to have saved monthly output for the whole year:

cdo showtimestamp RES_HIST_201001.02_echam.nc
  2010-01-31T23:52:30  2010-02-28T23:52:30  2010-03-31T23:52:30  2010-04-30T23:52:30
  2010-05-31T23:52:30  2010-06-30T23:52:30  2010-07-31T23:52:30  2010-08-31T23:52:30
  2010-09-30T23:52:30  2010-10-31T23:52:30  2010-11-30T23:52:30  2010-12-31T23:52:30

Yes, the monthly output is not lost, but the daily output has been split across two files... maybe also due to the "out_filetype" issue.

@pgierz
Member

pgierz commented Oct 24, 2024

See also #1240. I adjusted it so that the cleanup function should (hopefully) automatically pick up any custom streams you define.

I'm testing it now.

@chrisdane
Contributor

That would be great! If you want you could try this non-default echam namelist (on levante):

/home/a/a270073/esm/namelists/echam/6.3.04p1/HIST-CMIP6/namelist.echam

(note some commented out lines in this file ...)

@pgierz
Member

pgierz commented Oct 24, 2024

Alright, it should work as expected. The new (or rather, the unregistered) tags get added to a globbing pattern as:

mvstream_dict = {tag: f"{expid}*{tag}" for tag in mvstream_tags}
if namelist["runctl"].get("out_filetype") == 2:
    # Using NetCDF Outputs:
    mvstream_dict = {k: v + ".nc" for k, v in mvstream_dict.items()}
config['echam']["outdata_sources"].update(mvstream_dict)

where mvstream_tags is extracted from every group in namelist.echam that is named mvstreamctl and defines a filetag.
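As a rough, self-contained illustration of that extraction step, the sketch below pulls the filetag values out of a namelist string with regular expressions and builds the same kind of globbing dictionary. It is not the code on the branch: the helper names `mvstream_filetags` and `build_patterns` are hypothetical, the regular expressions handle only simple single-quoted values, and the sample namelist sets `out_filetype = 2` purely for demonstration.

```python
import re


def mvstream_filetags(namelist_text):
    """Return the filetag of every &mvstreamctl group in a namelist string."""
    tags = []
    # A group runs from '&mvstreamctl' to a line that holds only '/'.
    groups = re.findall(
        r"&mvstreamctl\b(.*?)^\s*/\s*$",
        namelist_text,
        flags=re.S | re.M | re.I,
    )
    for group in groups:
        match = re.search(r"filetag\s*=\s*'([^']*)'", group, flags=re.I)
        if match:
            tags.append(match.group(1))
    return tags


def build_patterns(expid, tags, out_filetype):
    """Mirror the snippet above: one glob per tag, '.nc' added for NetCDF."""
    patterns = {tag: f"{expid}*{tag}" for tag in tags}
    if out_filetype == 2:
        patterns = {k: v + ".nc" for k, v in patterns.items()}
    return patterns


# Hypothetical, heavily shortened namelist.
NAMELIST = """\
&runctl
    out_filetype = 2
/

&mvstreamctl
    filetag = 'paul_custom'
    source = 'g3b'
/
"""

tags = mvstream_filetags(NAMELIST)
patterns = build_patterns("test-007-pgierz", tags, out_filetype=2)
print(patterns)
```

A real implementation would more likely rely on a proper namelist parser than on regular expressions; the regex version is used here only to keep the sketch free of third-party dependencies.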

Please test it out on the branch feat/echam-streams.

@pgierz
Member

pgierz commented Oct 24, 2024

This is confirmed to work as expected. To test, I have a minimised namelist as follows:

&runctl
    dt_start = 1850, 1, 1
    dt_stop = 1850, 2, 1
    putrerun = 1, 'months', 'first', 0
    lfractional_mask = .false.
    lresume = .false.
    out_datapath = '/work/ab0246/a270077/SciComp/Projects/esm-tools-echam-streams/experiments/test-007-pgierz/run_18500101-18500131/work/'
    out_expname = 'test-007-pgierz'
    rerun_filetype = 4
    delta_time = 450
    putdata = 1, 'months', 'last', 0
    nproma = 8
    npromar = 0
    lcouple = .true.
    getocean = 1, 'days', 'last', 0
    putocean = 1, 'days', 'last', 0
    lcouple_co2 = .true.
    default_output = .false.
    dt_resume = 1850, 1, 1
/

&parctl
    nproca = 24
    nprocb = 24
    nprocar = 0
    nprocbr = 0
/

&submodelctl
    lmethox = .true.
    licb = .false.
/

&submdiagctl
    vphysc_lpost = .false.
/

&radctl
    iaero = 3
    io3 = 4
    isolrad = 6
    ich4 = 3
    in2o = 3
    co2vmr = 0.000284316986084
    ch4vmr = 8.082490234375e-07
    n2ovmr = 2.730210571289e-07
    yr_perp = 1850
    lrad_async = .false.
    lrestart_from_old = .false.
/

&mvstreamctl
    filetag = 'paul_custom'
    source = 'g3b'
    variables = 'temp2:mean>temp2=167'
    interval = 1, 'months', 'last', 0
/

&wisoctl
    lwiso_rerun = .false.
    lwiso = .false.
    nwiso = 0
/

Note, I only have one single output stream, all others are removed, and it is a name that does not appear anywhere in the esm-tools defaults. After the run, in outdata/echam:

a270077 in 🌐 levante5 in test-007-pgierz/outdata/echam via 🐍 v3.10.10 (python-3.10.10)
❯ ls
test-007-pgierz_185001.01_accw        test-007-pgierz_185001.01_paul_custom  test-007-pgierz_185002.01_accw.codes
test-007-pgierz_185001.01_accw.codes  test-007-pgierz_185002.01_accw         test-007-pgierz_185002.01_paul_custom

a270077 in 🌐 levante5 in test-007-pgierz/outdata/echam via 🐍 v3.10.10 (python-3.10.10)
❯ module load cdo

a270077 in 🌐 levante5 in test-007-pgierz/outdata/echam via 🐍 v3.10.10 (python-3.10.10)
❯ cdo -f nc -t echam6 copy test-007-pgierz_185001.01_paul_custom ../../analysis/lala.nc
cdo    copy: Processed 18432 values from 1 variable over 1 timestep [0.06s 22MB].

a270077 in 🌐 levante5 in test-007-pgierz/outdata/echam via 🐍 v3.10.10 (python-3.10.10)
❯ cd ../../analysis/

a270077 in 🌐 levante5 in experiments/test-007-pgierz/analysis via 🐍 v3.10.10 (python-3.10.10)
❯ ncdump -h lala.nc
netcdf lala {
dimensions:
	time = UNLIMITED ; // (1 currently)
	lon = 192 ;
	lat = 96 ;
variables:
	double time(time) ;
		time:standard_name = "time" ;
		time:units = "day as %Y%m%d.%f" ;
		time:calendar = "proleptic_gregorian" ;
		time:axis = "T" ;
	double lon(lon) ;
		lon:standard_name = "longitude" ;
		lon:long_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
	double lat(lat) ;
		lat:standard_name = "latitude" ;
		lat:long_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
	float temp2(time, lat, lon) ;
		temp2:long_name = "2m temperature" ;
		temp2:units = "K" ;
		temp2:code = 167 ;
		temp2:table = 128 ;
		temp2:CDI_grid_type = "gaussian" ;
		temp2:CDI_grid_num_LPE = 48 ;

// global attributes:
		:CDI = "Climate Data Interface version 2.0.5 (https://mpimet.mpg.de/cdi)" ;
		:Conventions = "CF-1.6" ;
		:source = "ECHAM6" ;
		:institution = "Max Planck Institute for Meteorology" ;
		:history = "Thu Oct 24 16:19:21 2024: cdo -f nc -t echam6 copy test-007-pgierz_185001.01_paul_custom ../../analysis/lala.nc" ;
		:CDO = "Climate Data Operators version 2.0.5 (https://mpimet.mpg.de/cdo)" ;
}

@Mccino
Collaborator Author

Mccino commented Oct 24, 2024

Thanks, Paul! I'm now trying my own namelist with 'out_filetype = 2' to see if the changes work. If not, I will also try 'out_filetype = 1' to identify the problem.

@Mccino
Collaborator Author

Mccino commented Oct 25, 2024

I tried the new branch feat/echam-streams with setting either out_filetype = 2 or out_filetype = 1. Both of them now worked well for the output files with a filetag. Thanks, @pgierz !

For anyone who is interested, the previous incorrect separation of the output files with 01 and 02 was due to the setting (trigfiles) in the namelist.echam. If one sets it to default, then the output files are correct.

However, a new question arises: the ECHAM output is now renamed to ***1850_185001.01_.01_echam, when it should be ***_185001.01_echam. The affected files are in the following paths:

  1. with 'out_filetype = 2', path: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/t6_nc_rmvtrig/outdata/echam
  2. with 'out_filetype = 1', path: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/t5_rmv_trig/outdata/echam

@pgierz
Member

pgierz commented Oct 25, 2024

Hmm... that looks like a problem in one of the source/target renaming steps. If one looks in config/t6_nc_rmvtrig_filelist_18500101-18501231:

...
Source: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP//t6_nc_rmvtrig/run_18500101-18501231/work/t6_nc_rmvtrig_185011.01_echam.nc
Exp Tree: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/t6_nc_rmvtrig/run_18500101-18501231/outdata/echam/t6_nc_rmvtrig_1850_185011.01_.01_echam
Target: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/t6_nc_rmvtrig/outdata/echam/t6_nc_rmvtrig_1850_185011.01_.01_echam

I'll see if I can find out where the extra .01_ is being added to the filename.
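One way to list every renamed file in such a file list is sketched below. It assumes the list consists of `Source:`, `Exp Tree:` and `Target:` lines as in the excerpt above; the helper name `renamed_entries` and the shortened sample paths are hypothetical. A differing name is not necessarily an error, so the output is meant for manual review.

```python
import posixpath


def renamed_entries(filelist_text):
    """Yield (source_name, target_name) pairs whose base names differ."""
    source = None
    for line in filelist_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Source":
            source = posixpath.basename(value)
        elif key == "Target" and source is not None:
            target = posixpath.basename(value)
            if target != source:
                yield source, target
            source = None


# Hypothetical, shortened file list modelled on the excerpt above.
SAMPLE = """\
Source: /exp/run/work/t6_nc_rmvtrig_185011.01_echam.nc
Exp Tree: /exp/run/outdata/echam/t6_nc_rmvtrig_1850_185011.01_.01_echam
Target: /exp/outdata/echam/t6_nc_rmvtrig_1850_185011.01_.01_echam
Source: /exp/run/work/t6_nc_rmvtrig_185011.01_glday.nc
Exp Tree: /exp/run/outdata/echam/t6_nc_rmvtrig_185011.01_glday.nc
Target: /exp/outdata/echam/t6_nc_rmvtrig_185011.01_glday.nc
"""

renamed = list(renamed_entries(SAMPLE))
for source_name, target_name in renamed:
    print(f"{source_name} -> {target_name}")
```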

@pgierz
Member

pgierz commented Oct 25, 2024

I think this line is the problem, but I'm not entirely sure.

"[[streams-->STREAM]]": ${general.expid}_${start_date!syear}*.${start_date!sday}_STREAM

Could you locally try out this instead?

- "[[streams-->STREAM]]": ${general.expid}_${start_date!syear}*.${start_date!sday}_STREAM
+ "[[streams-->STREAM]]": ${general.expid}_${start_date!syear}*STREAM

@mandresm, any ideas?

@Mccino
Collaborator Author

Mccino commented Oct 25, 2024

I do not think the above guess is the reason for the problem.
I ran the simulation with different versions of esm_tools using the same namelist.echam, and the older version did not add the extra .01_ to the filename.

  1. results using the old version (version 6.29.0), path: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/ttt_oldesm/outdata/echam

  2. results for the new branch ('origin/feat/echam-streams'), path: /home/a/a270198/work_ab0246/AWIEMS2/HI_EXP/tt_new_esmtool/outdata/echam

It is a bit odd: only the filenames ending in "echam" are affected, while the others are fine.
I am guessing this might be due to the recent changes at the beginning, something like:

mvstream_dict = {tag: f"{expid}*{tag}" for tag in mvstream_tags}
if namelist["runctl"].get("out_filetype") == 2:
    # Using NetCDF Outputs:
    mvstream_dict = {k: v + ".nc" for k, v in mvstream_dict.items()}
config['echam']["outdata_sources"].update(mvstream_dict)

@pgierz
Member

pgierz commented Oct 25, 2024

Yeah, that's certainly incomplete. It should be something like this instead:

tmp = {}
for k, v in mvstream_dict.items():
    tmp[k] = v if v.endswith(".nc") else f"{v}.nc"
mvstream_dict = tmp
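The effect of that guard can be checked in isolation with a small sketch. The function name `with_nc_suffix` and the sample patterns are hypothetical. The sketch only shows that the suffix is added at most once; whether this also removes the extra `.01_` in the file names is not confirmed in this thread.

```python
def with_nc_suffix(patterns):
    """Append '.nc' to each pattern unless it already ends with it."""
    return {
        key: value if value.endswith(".nc") else f"{value}.nc"
        for key, value in patterns.items()
    }


# Hypothetical patterns: one without the suffix, one that already has it.
patterns = {"paul_custom": "exp*paul_custom", "echam": "exp*echam.nc"}

once = with_nc_suffix(patterns)
twice = with_nc_suffix(once)  # applying it again changes nothing

print(once)
print(once == twice)
```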
