diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md
index 92a888c5..2a10c92d 100644
--- a/_episodes/09-cmorization.md
+++ b/_episodes/09-cmorization.md
@@ -2,7 +2,7 @@
 title: "CMORization: adding new datasets to ESMValTool"
 teaching: 15
 exercises: 45
-compatibility: ESMValTool v2.6.0
+compatibility: ESMValTool v2.11.0
 
 questions:
 - "CMORization: what is it and why do we need it?"
@@ -123,6 +123,12 @@ run the CMORizer scripts:
 esmvaltool data format --config_file
 ```
 
+The options `--start` and `--end` can be added to the command above to restrict the
+formatting of raw data to a time range. They will be ignored if a specific
+dataset does not support this option (e.g. because all the data is provided as a single file).
+Valid formats are `YYYY`, `YYYYMM`, and `YYYYMMDD`. The same options also apply to
+the `esmvaltool data download` command.
+
 The ``config-user.yml`` is the file in which we define the different data paths,
 see the episode on [Configuration]({{ page.root }}{% link _episodes/03-configuration.md %}).
 In the ``rootpath`` of your ``config-user.yml``, make sure to add the right
@@ -141,38 +147,52 @@ name that was created to store the raw observation data files, i.e.
 If everything is okay, the output should look something like this:
 
 ~~~
-...
-... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC
+... Writing program log files to:
+/scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/run/main_log.txt
+/scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/run/main_log_debug.txt
+... Starting the CMORization Tool at time: 2024-05-27 13:24:48 UTC
 ... ----------------------------------------------------------------------
-... input_dir = /home/peter/data/RAWOBS
-... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216
+... input_dir = /work/bd0854/DATA/ESMValTool2/RAWOBS
+... output_dir = /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448
 ... ----------------------------------------------------------------------
 ... Running the CMORization scripts.
 ... Processing datasets ['FLUXCOM']
-... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM
-... Output will be written to: /home/peter/esmvaltool_output/
- data_formatting_20220726_140216/Tier3/FLUXCOM
-... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/
- site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom
-... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/
- esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/
- datasets/fluxcom.py
-... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc'
+... Input data from: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM
+... Output will be written to: /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448
+ /Tier3/FLUXCOM
+... Reformat script: /home/b/username/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/
+ datasets/fluxcom
+... CMORizing dataset FLUXCOM using Python script /home/b/username/ESMValTool/ESMValTool/esmvaltool/
+ cmorizers/data/formatters/datasets/fluxcom.py
+... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.
+ *.nc'
 ... CMORizing variable 'gpp'
 ... Lmon
 ... Var is gpp
-... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1'
+... WARNING /work/bd0854/username/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/
+ iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid
+units 'gC m-2 day-1' on netCDF variable 'GPP'.
+ warnings.warn(
 ... Fixing time...
 ... Fixing latitude...
 ... Fixing longitude...
 ... Flipping dimensional coordinate latitude...
 ... Saving file
-... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/
- FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc
+... Saving: /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/
+ OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc
 ... Cube has lazy data [lazy is preferred]
+... WARNING /work/bd0854/username/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/
+ iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style
+attribute handling for backwards compatibility.
+This mode is deprecated since Iris 3.8, and will eventually be removed.
+Please consider enabling the new split-attributes handling mode, by setting 'iris.FUTURE.
+save_split_attrs = True'.
+ warn_deprecated(message)
+
 ... CMORization of dataset FLUXCOM finished!
 ... Formatting successful for dataset FLUXCOM
+
 ~~~
 {: .output}
 
@@ -193,6 +213,12 @@ You can also see the path where ESMValTool stores the reformatting script:
 have a look at this file if you want. The script also uses a configuration file:
 `~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml`.
 
+To get help on CMORizer commands, run the tool with:
+
+```bash
+esmvaltool data --help
+```
+
 ## Make a test recipe
 
 To verify that the data is correctly CMORized, we will make a simple test
@@ -617,17 +643,23 @@ If we now run the test recipe on our newly 'CMORized' data,
 esmvaltool run recipe_check_fluxcom.yml --config_file --log_level debug
 ```
 
-it should be able to find the correct file, but it does not succeed yet. The first
-thing that the ESMValTool CMOR checker brings up is:
+it should be able to find the correct file, but it does not succeed yet. The ESMValTool CMOR checker
+reports:
 
 ~~~
-iris.exceptions.UnitConversionError: Cannot convert from unknown units. The
-"units" attribute may be set directly.
+esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
+ GPP: units should be kg m-2 s-1, not unknown
+ lon: standard_name should be longitude, not None
+ lat: standard_name should be latitude, not None
+ lon: units should be degrees_east, not unknown
+ lon: has values < valid_min = 0.0
+ lat: units should be degrees_north, not unknown
+ GPP: does not match coordinate rank
 ~~~
 {: .error}
 
-If you look closely at the error messages, you can see that this error concerns
-the units of the coordinates. ESMValTool tries to fix them automatically,
+If you look closely at the error messages, you can see the reasons for these errors,
+e.g. the units of the coordinates. ESMValTool tries to fix them automatically,
 but since no units are defined on the coordinates, this fails.
 
 The cmorizer utilities also include a function called `fix_coords`, but before
@@ -684,7 +716,7 @@ The next error is:
 
 ~~~
 esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
-Variable GPP units unknown can not be converted to kg m-2 s-1 in cube:
+ GPP: units should be kg m-2 s-1, not unknown
 ~~~
 {: .error}