+
+
+
+
+ Beta
+
+ This lesson is in the beta phase, which means that it is ready for teaching by instructors outside of the original author team.
+
+
+
+
This tutorial is a first introduction to ESMValTool. Before diving
+into the technical steps, let’s talk about what ESMValTool is all
+about.
+
+
+
+
+
+
What is ESMValTool?
+
+
What do you already know about or expect from ESMValTool?
+
+
+
+
+
+
+
+
+
ESMValTool is many things, but in this tutorial we will focus on the
+following traits:
+
✓ A tool to analyse climate data
+
✓ A collection of diagnostics for reproducible climate
+science
+
✓ A community effort
+
+
+
+
+
A tool to analyse climate data
+
ESMValTool takes care of finding, opening, checking, fixing,
+concatenating, and preprocessing CMIP data and several other supported
+datasets.
+
The central component of ESMValTool that we will see in this tutorial
+is the recipe. Any ESMValTool recipe is basically a set
+of instructions to reproduce a certain result. The basic structure of a
+recipe is as follows:
+
+Documentation with relevant (citation)
+information
+
+Datasets that should be analysed
+
+Preprocessor steps that must be applied
+
+Diagnostic scripts performing more specific
+evaluation steps
+
An example recipe could look like this:
+
+
YAML
+
+
documentation:
  title: This is an example recipe.
  description: Example recipe
  authors:
    - lastname_firstname

datasets:
  - {dataset: HadGEM2-ES, project: CMIP5, exp: historical, mip: Amon,
     ensemble: r1i1p1, start_year: 1960, end_year: 2005}

preprocessors:
  global_mean:
    area_statistics:
      operator: mean

diagnostics:
  hockeystick_plot:
    description: plot of global mean temperature change
    variables:
      temperature:
        short_name: tas
        preprocessor: global_mean
    scripts: hockeystick.py
+
+
+
+
+
+
+
Understanding the different sections of the recipe
+
+
Try to figure out the meaning of the different dataset keys. Hint:
+they can be found in the documentation of ESMValTool.
A collection of diagnostics for reproducible climate science
+
More than a tool, ESMValTool is a collection of publicly available
+recipes and diagnostic scripts. This makes it possible to easily
reproduce important results.

A community effort
ESMValTool is built and maintained by an active community of
+scientists and software engineers. It is an open source project to which
+anyone can contribute. Many of the interactions take place on GitHub.
+Here, we briefly introduce you to some of the most important pages.
+
+
+
+
+
+
Meet the ESMValGroup
+
+
Go to github.com/ESMValGroup. This
+is the GitHub page of our ‘organization’. Have a look around. How many
+collaborators are there? Do you know any of them?
+
Near the top of the page there are 2 pinned repositories: ESMValTool
+and ESMValCore. Visit each of the repositories. How many people have
+contributed to each of them? Can you also find out how many people have
+contributed to this tutorial?
+
+
+
+
+
+
+
+
+
Issues and pull requests
+
+
Go back to the repository pages of ESMValTool or ESMValCore. There
+are tabs for ‘issues’ and ‘pull requests’. You can use the labels to
+navigate them a bit more. How many open issues are about enhancements of
+ESMValTool? And how many bugs have been fixed in ESMValCore? There is
+also an ‘insights’ tab, where you can see a summary of recent activity.
+How many issues have been opened and closed in the past month?
+
+
+
+
Conclusion
+
This concludes the introduction of the tutorial. You now have a basic
+knowledge of ESMValTool and its community. The following episodes will
+walk you through the installation, configuration and running your first
+recipes.
+
+
+
+
+
+
Key Points
+
+
ESMValTool provides a reliable interface to analyse and evaluate
+climate data
+
A large collection of recipes and diagnostic scripts is already
+available
+
ESMValTool is built and maintained by an active community of
+scientists and developers
+
+
diff --git a/01-quickstart.html b/01-quickstart.html
new file mode 100644
index 00000000..1bc9a7e2
--- /dev/null
+++ b/01-quickstart.html
@@ -0,0 +1,616 @@
+
+ESMValTool Tutorial: Quickstart guide
+
+
+
+
+
+
+
+
+
+
+
How do I load and check the ESMValTool environment?
+
How do I configure ESMValTool?
+
How do I run a recipe?
+
+
+
+
+
+
+
Objectives
+
Understand the purpose of the quickstart guide
+
Load and check the ESMValTool environment
+
Configure ESMValTool
+
Run a recipe
+
+
+
+
+
+
+
+
+
+
+
What is the purpose of the quickstart guide?
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible by making the bare
+minimum number of changes.
+
+
+
+
+
+
+
+
+
How do I load and check the ESMValTool environment?
+
+
For this quickstart guide, an assumption is made that ESMValTool
+has already been installed at the site where ESMValTool will be run. If
+this is not the case, see the [Installation][lesson-installation]
+episode in this tutorial.
+
Load the ESMValTool environment by following the instructions at
+[ESMValTool: Pre-installed versions on HPC clusters / other
+servers][activate-environment].
+
+
Check the ESMValTool environment by accessing the help for
+ESMValTool:
+
+
BASH
+
+
esmvaltool --help
+
+
+
+
+
+
+
+
+
+
+
How do I configure ESMValTool?
+
+
+
Create the ESMValTool user configuration file (the file is
+written by default to ~/.esmvaltool/config-user.yml):
+
+
BASH
+
+
esmvaltool config get_config_user
+
+
+
Edit the ESMValTool user configuration file using your favourite
+text editor to uncomment the lines relating to the site where ESMValTool
+will be run.
+
For more details about the ESMValTool user configuration file see
+the [Configuration][lesson-configuration] episode in this
+tutorial.
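For orientation, the site-specific lines you uncomment typically set the input data paths and directory structure. A sketch is shown below; it is illustrative only, and the exact keys and paths for your site may differ (the JASMIN-like values here are assumptions):

YAML

# Illustrative excerpt of site-specific settings in config-user.yml
rootpath:
  CMIP6: /badc/cmip6/data/CMIP6   # assumed path; check your site's entries
drs:
  CMIP6: BADC                     # assumed directory structure for this site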
+
+
+
+
+
+
+
+
+
How do I run a recipe?
+
+
+
Run the example Python recipe:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
+
+
Wait for the recipe to complete. If the recipe completes
+successfully, the last line printed to screen at the end of the log will
+look something like:
+
+
BASH
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Run was successful
+
+
+
+
View the output of the recipe by opening the HTML file produced
+by ESMValTool (the location of this file is printed to screen near the
+end of the log):
+
+
BASH
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Wrote recipe output to:
+file:///$HOME/esmvaltool_output/recipe_python_<date>_<time>/index.html
+
+
+
For more details about running recipes see the [Running your
+first recipe][lesson-recipe] episode in this tutorial.
+
+
+
+
+
+
+
+
+
Key Points
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible without having to go
+through the whole tutorial
+
Use the module load command to load the ESMValTool
environment; see the [Installation][lesson-installation] episode for more
details, and use esmvaltool --help to check the ESMValTool
+environment
+
Use esmvaltool config get_config_user to create the
+ESMValTool user configuration file
+
+
diff --git a/02-installation.html b/02-installation.html
new file mode 100644
index 00000000..a6a625cd
--- /dev/null
+++ b/02-installation.html
@@ -0,0 +1,748 @@
+
+ESMValTool Tutorial: Installation
+
+
+
+
+
+
+
+
+
+
+
What are the prerequisites for installing ESMValTool?
+
How do I confirm that the installation was successful?
+
+
+
+
+
+
+
Objectives
+
Install ESMValTool
+
Demonstrate that the installation was successful
+
+
+
+
+
+
Overview
+
The instructions help with the installation of ESMValTool on
+operating systems like Linux/MacOSX/Windows. We use the Mamba
+package manager to install the ESMValTool. Other installation methods
+are also available; they can be found in the documentation.
+We will first install Mamba, and then ESMValTool. We end this chapter by
+testing that the installation was successful.
+
Before we begin, here are all the possible ways in which you can use
+ESMValTool depending on your level of expertise or involvement with
+ESMValTool and associated software such as GitHub and Mamba.
+
If you have access to a server where ESMValTool is already installed
as a module, e.g. the [CEDA JASMIN](https://help.jasmin.ac.uk/article/4955-community-software-esmvaltool) server, you can simply load the
+module with the following command:
+
+
BASH
+
+
module load esmvaltool
+
+
After loading esmvaltool, we can start using ESMValTool
right away. Please see the next lesson.

2. If you would like to install ESMValTool as a mamba package, then this
lesson will tell you how!

3. If you would like to start experimenting with existing diagnostics or
contributing to ESMValTool, please see the instructions for source
installation in the lesson Development and contribution and in the
documentation.
+
+
+
+
+
+
Install ESMValTool on Windows
+
+
ESMValTool does not directly support Windows, but successful usage
+has been reported through the Windows Subsystem
for Linux (WSL), available in Windows 10. To install the WSL please
+follow the instructions on the
+Windows Documentation page. After installing the WSL, installation
+can be done using the same instructions for Linux/MacOSX.
+
+
+
+
Install ESMValTool on Linux/MacOSX
+
+
Install Mamba
+
ESMValTool is distributed using Mamba. To
+install mamba on Linux or MacOSX, follow the
+instructions below:
+
Please download the installation file for the latest Mamba
+version here.
+
Next, run the installer from the place where you downloaded
+it:
+
On Linux:
+
+
BASH
+
+
bash Mambaforge-Linux-x86_64.sh
+
+
On MacOSX:
+
+
BASH
+
+
bash Mambaforge-MacOSX-x86_64.sh
+
+
Follow the instructions in the installer. The defaults should
+normally suffice.
+
You will need to restart your terminal for the changes to have
+effect.
+
We recommend updating mamba before the esmvaltool installation.
+To do so, run:
+
+
BASH
+
+
mamba update --name base mamba
+
+
Verify you have a working mamba installation by:
+
+
BASH
+
+
which mamba
+
+
This should show the path to your mamba executable,
+e.g. ~/mambaforge/bin/mamba.
+
For more information about installing mamba, see [the mamba
installation documentation](https://docs.esmvaltool.org/en/latest/quickstart/installation.html#mamba-installation).
+
+
+
Install the ESMValTool package
+
The ESMValTool package contains diagnostic scripts in four
+languages: R, Python, Julia and NCL. This introduces a lot of
+dependencies, and therefore the installation can take quite long. It is,
+however, possible to install ‘subpackages’ for each of the languages.
+The following (sub)packages are available:
+
esmvaltool-python
+
esmvaltool-ncl
+
esmvaltool-r
+
esmvaltool: the complete package, i.e. the
+combination of the above.
+
For the tutorial, we will install the complete package. Thus, to
+install the ESMValTool package, run
+
+
BASH
+
+
mamba create --name esmvaltool esmvaltool
+
+
On MacOSX ESMValTool functionalities in Julia, NCL, and R are not
+supported. To install a Mamba environment on MacOSX, please refer to
+specific information.
Some ESMValTool diagnostics are written in the Julia programming
+language. If you want a full installation of ESMValTool including Julia
+diagnostics, you need to make sure Julia is installed before installing
+ESMValTool.
+
In this tutorial, we will not use Julia, but for reference, we have
+listed the steps to install Julia below. Complete instructions for
+installing Julia can be found on the Julia
+installation page.
+
+
+
+
+
+
First, open a bash terminal and activate the newly created
+esmvaltool environment.
+
+
BASH
+
+
conda activate esmvaltool
+
+
Next, to install Julia via mamba, you can use the
+following command:
+
+
BASH
+
+
mamba install julia
+
+
To check that the Julia executable can be found, run
+
+
BASH
+
+
which julia
+
+
to display the path to the Julia executable, it should be
+
+
OUTPUT
+
+
~/mambaforge/envs/esmvaltool/bin/julia
+
+
To test that Julia is installed correctly, run
+
+
BASH
+
+
julia
+
+
to start the interactive Julia interpreter. Press Ctrl+D
+to exit.
+
+
+
+
+
+
+
Test that the installation was successful
+
To test that the installation was successful, run
+
+
BASH
+
+
conda activate esmvaltool
+
+
to activate the conda environment called esmvaltool. In
+the shell prompt the active conda environment should have been changed
+from (base) to (esmvaltool).
+
Next, run
+
+
BASH
+
+
esmvaltool --help
+
+
to display the command line help.
+
+
+
+
+
+
Version of ESMValTool
+
+
Can you figure out which version of ESMValTool has been
+installed?
+
+
+
+
+
+
+
+
+
The esmvaltool --help command lists version
as a subcommand that can be used to get the version.
+
When you run
+
+
BASH
+
+
esmvaltool version
+
+
The version of ESMValTool installed should be displayed on the screen
+as:
+
+
OUTPUT
+
+
ESMValCore: 2.11.0
+ESMValTool: 2.11.0
+
+
Note that on HPC servers such as JASMIN, sometimes a more recent
development version may be displayed for ESMValTool, e.g.
+ESMValTool: 2.11.0.dev71+g2c60b4d97
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
All the required packages can be installed using mamba.
+
You can find more information about installation in the documentation.
+
+
diff --git a/03-configuration.html b/03-configuration.html
new file mode 100644
index 00000000..1a51f093
--- /dev/null
+++ b/03-configuration.html
@@ -0,0 +1,934 @@
+
+ESMValTool Tutorial: Configuration
+
+
+
+
+
+
+
+
+
+
+
What is the user configuration file and how should I use it?
+
+
+
+
+
+
+
Objectives
+
Understand the contents of the config-user.yml file

Prepare a personalized config-user.yml file
+
Configure ESMValTool to use some settings
+
+
+
+
+
+
The configuration file
+
For the purposes of this tutorial, we will create a directory in our
+home directory called esmvaltool_tutorial and use that as
+our working directory. The following steps should do that:
+
+
BASH
+
+
mkdir esmvaltool_tutorial
+cd esmvaltool_tutorial
+
+
The config-user.yml configuration file contains all the
+global level information needed by ESMValTool to run. This is a YAML file.
+
You can get the default configuration file by running the
esmvaltool config get_config_user command introduced in the Quickstart
episode. The default configuration file will be written to the directory
specified with the --path option. For instance, you can provide the path
to your working directory as the target directory. If this option is not
used, the file will be saved to the default location:
~/.esmvaltool/config-user.yml, where ~ is the path to your home
directory. Note that files and directories starting with a period are
“hidden”; to see the .esmvaltool directory in the terminal, use
ls -la ~. Also note that if a configuration file by that name already
exists in the default location, the get_config_user command will not
overwrite it. You will have to move the existing file first if you want
an updated copy of the user configuration file.
+
We run a text editor called nano to have a look inside
+the configuration file and then modify it if needed:
+
+
BASH
+
+
nano ~/.esmvaltool/config-user.yml
+
+
Any other editor can be used, e.g.vim.
+
This file contains the information for:
+
Output settings
+
Destination directory
+
Auxiliary data directory
+
Number of tasks that can be run in parallel
+
Rootpath to input data
+
Directory structure for the data from different projects
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
Output settings
+
The configuration file starts with output settings that inform
ESMValTool about your preferences for output. You can turn each setting
on or off with true or false values. Most of these settings are fairly
self-explanatory.
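For illustration, these output settings can look something like the sketch below; the exact set of keys depends on your ESMValTool version, so check your own config-user.yml (the values shown are assumptions, not defaults you must use):

YAML

# Illustrative output settings (not an exhaustive list)
save_intermediary_cubes: false
remove_preproc_dir: true
output_file_type: png
log_level: info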
+
+
+
+
+
+
Saving preprocessed data
+
+
Later in this tutorial, we will want to look at the contents of the
+preproc folder. This folder contains preprocessed data and
+is removed by default when ESMValTool is run. In the configuration file,
+which settings can be modified to prevent this from happening?
+
+
+
+
+
+
+
+
+
If the option remove_preproc_dir is set to
+false, then the preproc/ directory contains
+all the pre-processed data and the metadata interface files. If the
+option save_intermediary_cubes is set to true
+then data will also be saved after each preprocessor step in the folder
+preproc. Note that saving all intermediate results to file
+will result in a considerable slowdown, and can quickly fill your
+disk.
+
+
+
+
+
Destination directory
+
The destination directory is the rootpath where ESMValTool will store
+its output folders containing e.g. figures, data, logs, etc. With every
+run, ESMValTool automatically generates a new output folder determined
+by recipe name, and date and time using the format: YYYYMMDD_HHMMSS.
+
+
+
+
+
+
Set the destination directory
+
+
Let’s name our destination directory esmvaltool_output
+in the working directory. ESMValTool should write the output to this
+path, so make sure you have the disk space to write output to this
+directory. How do we set this in the config-user.yml?
+
+
+
+
+
+
+
+
+
We use the output_dir entry in the
+config-user.yml file as:
+
+
YAML
+
+
output_dir: ./esmvaltool_output
+
+
If the esmvaltool_output does not exist, ESMValTool will
+generate it for you.
+
+
+
+
+
Rootpath to input data
+
ESMValTool uses several categories (in ESMValTool, this is referred
+to as projects) for input data based on their source. The current
+categories in the configuration file are mentioned below. For example,
+CMIP is used for a dataset from the Climate Model Intercomparison
+Project whereas OBS may be used for an observational dataset. More
+information about the projects used in ESMValTool is available in the
[documentation](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html). When using ESMValTool on your own machine,
+you can create a directory to download climate model data or observation
+data sets and let the tool use data from there. It is also possible to
+ask ESMValTool to download climate model data as needed. This can be
+done by specifying a download directory and by setting the option to
+download data as shown below.
+
+
YAML
+
+
# Directory for storing downloaded climate data
+download_dir: ~/climate_data
+search_esgf: always
+
+
If you are working offline or do not want to download the data then
+set the option above to never. If you want to download data
+only when the necessary files are missing at the usual location, you can
+set the option to when_missing.
+
The rootpath specifies the directories where ESMValTool
+will look for input data. For each category, you can define either one
+path or several paths as a list. For example:
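For instance, a sketch of such a rootpath block is shown below; all paths are placeholders for wherever your data actually lives:

YAML

rootpath:
  CMIP6: ~/climate_data/CMIP6        # a single path for one project
  CMIP5:                             # or several paths as a list
    - ~/climate_data/cmip5
    - /scratch/shared/cmip5
  OBS: ~/climate_data/obs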
These are typically available in the default configuration file you
+downloaded, so simply removing the machine specific lines should be
+sufficient to access input data.
+
+
+
+
+
+
Set the correct rootpath
+
+
In this tutorial, we will work with data from CMIP5 and CMIP6. How can we
+modify the rootpath to make sure the data path is set
+correctly for both CMIP5 and CMIP6? Note: to get the data, check the
+instructions in Setup.
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to add the root
+path of the folder where the data is available to the
+config-user.yml file as:
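For example, a sketch assuming you placed the tutorial data in ~/esmvaltool_tutorial/data (adjust the path to wherever your data actually is):

YAML

rootpath:
  CMIP5: ~/esmvaltool_tutorial/data   # assumed download location from Setup
  CMIP6: ~/esmvaltool_tutorial/data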
Are you working on your local machine and have downloaded data using
+ESMValTool? You need to add the root path of the folder where the data
+has been downloaded to as specified in the
+download_dir.
Are you working on a computer cluster like Jasmin or DKRZ?
Site-specific paths to the data for JASMIN/DKRZ/ETH/IPSL are already
+listed at the end of the config-user.yml file. You need to
+uncomment the related lines. For example, on JASMIN:
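A sketch of what the uncommented JASMIN entries can look like; the exact paths in your config-user.yml may differ, so treat these as illustrative:

YAML

rootpath:
  CMIP6: /badc/cmip6/data/CMIP6            # assumed BADC archive path
  CMIP5: /badc/cmip5/data/cmip5/output1    # assumed BADC archive path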
Directory structure for the data from different projects
+
Input data can be from various models, observations and reanalysis
+data that adhere to the CF/CMOR
+standard. The drs setting describes the file
+structure.
+
The drs setting describes the file structure for several
+projects (e.g. CMIP6, CMIP5, obs4mips, OBS6, OBS) on several key
+machines (e.g. BADC, CP4CDS, DKRZ, ETHZ, SMHI, BSC). For more
+information about drs, you can visit the ESMValTool
documentation on [Data Reference Syntax (DRS)](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#cmor-drs).
+
+
+
+
+
+
Set the correct drs
+
+
In this lesson, we will work with data from CMIP5 and CMIP6. How can we
+set the correct drs?
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to set the
+drs of the data in the config-user.yml file
+as:
+
+
YAML
+
+
drs:
  CMIP5: default
  CMIP6: default
+
+
Are you asking ESMValTool to download the data for use with your
+diagnostics? You need to set the drs of the data in the
+config-user.yml file as:
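One possible setting is sketched below; it assumes the data were downloaded by ESMValTool into the download_dir, which is organised following the ESGF directory structure:

YAML

drs:
  CMIP5: ESGF
  CMIP6: ESGF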
Are you working on a computer cluster like Jasmin or DKRZ?
+Site-specific drs of the data are already listed at the end
+of the config-user.yml file. You need to uncomment the
+related lines. For example, on Jasmin:
+
+
YAML
+
+
# Site-specific entries: Jasmin
# Uncomment the lines below to locate data on JASMIN
drs:
  CMIP6: BADC
  CMIP5: BADC
  OBS: default
  OBS6: default
  obs4mips: default
  ana4mips: default
+
+
+
+
+
+
+
+
+
+
+
Explain the default drs (if working on local machine)
+
+
In the previous exercise, we set the drs of CMIP5 data
+to default. Can you explain why?
+
Have a look at the directory structure of the OBS data.
+There is a folder called Tier1. What does it mean?
+
+
+
+
+
+
+
+
+
drs: default is one way to retrieve data from a ROOT
+directory that has no DRS-like structure. default indicates
+that all the files are in a folder without any structure.
+
Observational data are organized in Tiers depending on their
level of public availability. Therefore the default directory must be
structured accordingly, with sub-directories TierX (e.g. Tier1, Tier2 or
Tier3), even when drs: default. More details can be found in the
[documentation](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#observational-data).
+
+
+
+
+
Other settings
+
+
+
+
+
+
Auxiliary data directory
+
+
The auxiliary_data_dir setting is the path where any
required additional auxiliary data files are stored. This location
allows us to tell the diagnostic script where to find files that cannot
be downloaded at runtime. This option should not be used for model or
observational datasets, but for additional data files (e.g. shape files)
used in plotting, such as coastline descriptions, or any other extra
data you want to feed to your recipe.

Number of parallel tasks

This option enables you to perform parallel processing. You can choose
the number of tasks in parallel as 1/2/3/4/… or you can set it to
null. That tells ESMValTool to use the maximum number of available CPUs.
For the purpose of the tutorial, please set ESMValTool to use only
1 CPU:
+
+
YAML
+
+
max_parallel_tasks: 1
+
+
In general, if you run out of memory, try setting
+max_parallel_tasks to 1. Then, check the amount of memory
+you need for that by inspecting the file
+run/resource_usage.txt in the output directory. Using the
+number there you can increase the number of parallel tasks again to a
+reasonable number for the amount of memory available in your system.
+
+
+
+
+
+
+
+
+
Make your own configuration file
+
+
It is possible to have several configuration files with different
+purposes, for example: config-user_formalised_runs.yml,
+config-user_debugging.yml. In this case, you have to pass the path of
+your own configuration file as a command-line option when running the
+ESMValTool. We will learn how to do this in the next lesson.
+
+
+
+
+
+
+
+
+
Key Points
+
+
The config-user.yml tells ESMValTool where to find
+input data.
+
+
diff --git a/04-recipe.html b/04-recipe.html
new file mode 100644
index 00000000..4fe5480d
--- /dev/null
+++ b/04-recipe.html
@@ -0,0 +1,1068 @@
+
+ESMValTool Tutorial: Running your first recipe
+
+
+
+
+
+
+
+
+
+
+
This episode describes how ESMValTool recipes work, how to run a
+recipe and how to explore the recipe output. By the end of this episode,
+you should be able to run your first recipe, look at the recipe output,
+and make small modifications.
+
Running an existing recipe
+
The recipe format has briefly been introduced in the
+[Introduction][lesson-introduction] episode. To see all the recipes that
+are shipped with ESMValTool, type
+
+
BASH
+
+
esmvaltool recipes list
+
+
We will start by running
[examples/recipe_python.yml](https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html)
+
esmvaltool run examples/recipe_python.yml
+
or if you have the user configuration file in your current directory
+then
+
esmvaltool run --config_file ./config-user.yml examples/recipe_python.yml
+
If everything is okay, you should see that ESMValTool is printing a
+lot of output to the command line. The final message should be “Run was
+successful”. The exact output varies depending on your machine, but it
+should look something like the example log output on terminal below.
+
{% include example_output.txt %}
+
+
+
+
+
+
Pro tip: ESMValTool search paths
+
+
You might wonder how ESMValTool was able to find the recipe file, even
+though it’s not in your working directory. All the recipe paths printed
+from esmvaltool recipes list are relative to ESMValTool’s
+installation location. This is where ESMValTool will look if it cannot
+find the file by following the path from your working directory.
+
+
+
+
Investigating the log messages
+
Let’s dissect what’s happening here.
+
+
+
+
+
+
Output files and directories
+
+
After the banner and general information, the output starts with some
+important locations.
+
Did ESMValTool use the right config file?
+
What is the path to the example recipe?
+
What is the main output folder generated by ESMValTool?
+
Can you guess what the different output directories are for?
+
ESMValTool creates two log files. What is the difference?
+
+
+
+
+
+
+
+
+
The config file should be the one we edited in the previous episode,
+something like
+/home/<username>/.esmvaltool/config-user.yml or
+~/esmvaltool_tutorial/config-user.yml.
+
ESMValTool found the recipe in its installation directory, something
+like
+/home/users/username/mambaforge/envs/esmvaltool/bin/esmvaltool/recipes/examples/
+or if you are using a pre-installed module on a server, something like
/apps/jasmin/community/esmvaltool/ESMValTool_<version>/esmvaltool/recipes/examples/recipe_python.yml,
+where <version> is the latest release.
+
ESMValTool creates a time-stamped output directory for every run. In
+this case, it should be something like
+recipe_python_YYYYMMDD_HHMMSS. This folder is made inside
+the output directory specified in the previous episode:
+~/esmvaltool_tutorial/esmvaltool_output.
+
There should be four output folders:
+
+plots/: this is where output figures are stored.
+
+preproc/: this is where pre-processed data are
+stored.
+
+run/: this is where esmvaltool stores general
+information about the run, such as log messages and a copy of the recipe
+file.
+
+work/: this is where output files (not figures) are
+stored.
+
The log files are:
+
+main_log.txt is a copy of the command-line output
+
+main_log_debug.txt contains more detailed information
+that may be useful for debugging.
+
+
+
+
+
+
+
+
+
+
Debugging: No ‘preproc’ directory?
+
+
If you’re missing the preproc directory, then your
+config-user.yml file has the value
+remove_preproc_dir set to true (this is used
+to save disk space). Please set this value to false and run
+the recipe again.
+
+
+
+
After the output locations, there are two main sections that can be
+distinguished in the log messages:
+
Creating tasks
+
Executing tasks
+
+
+
+
+
+
Analyse the tasks
+
+
List all the tasks that ESMValTool is executing for this recipe. Can
+you guess what this recipe does?
+
+
+
+
+
+
+
+
+
Just after all the ‘creating tasks’ and before ‘executing tasks’, we
+find the following line in the output:
+
[134535] INFO These tasks will be executed: map/tas, timeseries/tas_global,
+timeseries/script1, map/script1, timeseries/tas_amsterdam
+
So there are three tasks related to timeseries: global temperature,
+Amsterdam temperature, and a script (tas: near-surface air
+temperature). And then there are two tasks related to a map: something
+with temperature, and again a script.
+
+
+
+
+
Examining the recipe file
+
To get more insight into what is happening, we will have a look at
+the recipe file itself. Use the following command to copy the recipe to
+your working directory
+
+
BASH
+
+
esmvaltool recipes get examples/recipe_python.yml
+
+
Now you should see the recipe file in your working directory (type
+ls to verify). Use the nano editor to open
+this file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
For reference, you can also view the recipe by unfolding the box
+below.
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
# recipe_python.yml
#
# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
# for a description of this recipe.
#
# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
# for a description of the recipe format.
---
documentation:
  description: |
    Example recipe that plots a map and timeseries of temperature.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe
    - righi_mattia

  maintainer:
    - schlund_manuel

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {dataset: bcc-csm1-1, project: CMIP5, exp: historical, ensemble: r1i1p1}

preprocessors:
  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
  # for a description of the preprocessor functions.

  to_degrees_c:
    convert_units:
      units: degrees_C

  annual_mean_amsterdam:
    extract_location:
      location: Amsterdam
      scheme: linear
    annual_statistics:
      operator: mean
    multi_model_statistics:
      statistics:
        - mean
      span: overlap
    convert_units:
      units: degrees_C

  annual_mean_global:
    area_statistics:
      operator: mean
    annual_statistics:
      operator: mean
    convert_units:
      units: degrees_C

diagnostics:

  map:
    description: Global map of temperature in January 2000.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas:
        mip: Amon
        preprocessor: to_degrees_c
        timerange: 2000/P1M
        caption: |
          Global map of {long_name} in January 2000 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds

  timeseries:
    description: Annual mean temperature in Amsterdam and global mean since 1850.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas_amsterdam:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_amsterdam
        timerange: 1850/2000
        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
      tas_global:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_global
        timerange: 1850/2000
        caption: Annual global mean {long_name} according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot
+
+
+
+
+
+
Do you recognize the basic recipe structure that was introduced in
+episode 1?
+
+Documentation with relevant (citation)
+information
+
+Datasets that should be analysed
+
+Preprocessors groups of common preprocessing
+steps
+
+Diagnostics scripts performing more specific
+evaluation steps
+
+
+
+
+
+
Analyse the recipe
+
+
Try to answer the following questions:
+
Who wrote this recipe?
+
Who should be approached if there is a problem with this
+recipe?
+
How many datasets are analyzed?
+
What does the preprocessor called annual_mean_global
+do?
+
Which script is applied for the diagnostic called
+map?
+
Can you link specific lines in the recipe to the tasks that we saw
+before?
+
How is the location of the city specified?
+
How is the temporal range of the data specified?
+
+
+
+
+
+
+
+
+
The example recipe is written by Bouwe Andela and Mattia
+Righi.
+
Manuel Schlund is listed as the maintainer of this
+recipe.
+
Two datasets are analysed:
+
CMIP6 data from the model BCC-ESM1
+
CMIP5 data from the model bcc-csm1-1
+
The preprocessor annual_mean_global computes an area
+mean as well as annual means
+
The diagnostic called map executes a script referred
+to as script1. This is a python script named
+examples/diagnostic.py
+
There are two diagnostics: map and
+timeseries. Under the diagnostic map we find
+two tasks:
+
a preprocessor task called tas, applying the
+preprocessor called to_degrees_c to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(map/tas).
+
Under the diagnostic timeseries we find three tasks:
+
a preprocessor task called tas_amsterdam, applying the
+preprocessor called annual_mean_amsterdam to the variable
+tas.
+
a preprocessor task called tas_global, applying the
+preprocessor called annual_mean_global to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(timeseries/tas_global and
+timeseries/tas_amsterdam).
+
The extract_location preprocessor is used to get
+data for a specific location here. ESMValTool interpolates to the
+location based on the chosen scheme. Can you tell the scheme used here?
+For more ways to extract areas, see the [Area
+operations][preproc-area-manipulation] page.
+
The timerange tag is used to extract data from a
+specific time period here. The start time is 01/01/2000 and
+the span of time to calculate means is 1 Month given by
+P1M. For more options on how to specify time ranges, see
+the [timerange documentation][timeranges].
+
+
+
+
+
+
+
+
+
+
Pro tip: short names and variable groups
+
+
The preprocessor tasks in ESMValTool are called ‘variable groups’.
+For the diagnostic timeseries, we have two variable groups:
+tas_amsterdam and tas_global. Both of them
+operate on the variable tas (as indicated by the
+short_name), but they apply different preprocessors. For
+the diagnostic map the variable group itself is named
+tas, and you’ll notice that we do not explicitly provide
+the short_name. This is a shorthand built into
+ESMValTool.
+
+
+
+
+
+
+
+
+
Output files
+
+
Have another look at the output directory created by the ESMValTool
+run.
+
Which files/folders are created by each task?
+
+
+
+
+
+
+
+
+
+map/tas: creates /preproc/map/tas,
+which contains preprocessed data for each of the input datasets, a file
+called metadata.yml describing the contents of these
+datasets and provenance information in the form of .xml
+files.
+
+timeseries/tas_global: creates
+/preproc/timeseries/tas_global, which contains preprocessed
+data for each of the input datasets, a metadata.yml file
+and provenance information in the form of .xml files.
+
+timeseries/tas_amsterdam: creates
+/preproc/timeseries/tas_amsterdam, which contains
+preprocessed data for each of the input datasets, plus a combined
+MultiModelMean, a metadata.yml file and
+provenance files.
+
+map/script1: creates /run/map/script1
+with general information and a log of the diagnostic script run. It also
+creates /plots/map/script1/ and
+/work/map/script1, which contain output figures and output
+datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+timeseries/script1: creates
+/run/timeseries/script1 with general information and a log
+of the diagnostic script run. It also creates
+/plots/timeseries/script1 and
+/work/timeseries/script1, which contain output figures and
+output datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+
+
+
+
+
+
+
+
+
Pro tip: diagnostic logs
+
+
When you run ESMValTool, any log messages from the diagnostic script
are not printed on the terminal, but they are written to log.txt
files under the run/<diag_name>/ folder.
+
ESMValTool does print a command that can be used to re-run a
+diagnostic script. When you use this the output will be printed
+to the command line.
+
+
+
+
Modifying the example recipe
+
Let’s make a small modification to the example recipe. Notice that
+now that you have copied and edited the recipe, you can use
+
esmvaltool run recipe_python.yml
+
to refer to your local file rather than the default version shipped
+with ESMValTool.
+
+
+
+
+
+
Change your location
+
+
Modify and run the recipe to analyse the temperature for your own
+location.
+
+
+
+
+
+
+
+
+
In principle, you only have to modify the location in the
+preprocessor called annual_mean_amsterdam. However, it is
+good practice to also replace all instances of amsterdam
+with the correct name of your location. Otherwise the log messages and
+output will be confusing. You are free to modify the names of
+preprocessors or diagnostics.
+
In the diff file below you will see the changes we have
+made to the file. The top 2 lines are the filenames and the lines like
+@@ -39,9 +39,9 @@ represent the line numbers in the
+original and modified file, respectively. For more info on this format,
+see here.
+
+
DIFF
+
+
--- recipe_python.yml
++++ recipe_python_london.yml
+@@ -39,9 +39,9 @@
+ convert_units:
+ units: degrees_C
+
+- annual_mean_amsterdam:
++ annual_mean_london:
+ extract_location:
+- location: Amsterdam
++ location: London
+ scheme: linear
+ annual_statistics:
+ operator: mean
+@@ -83,7 +83,7 @@
+ cmap: Reds
+
+ timeseries:
+- description: Annual mean temperature in Amsterdam and global mean since 1850.
++ description: Annual mean temperature in London and global mean since 1850.
+ themes:
+ - phys
+ realms:
+@@ -92,9 +92,9 @@
+ tas_amsterdam:
+ short_name: tas
+ mip: Amon
+- preprocessor: annual_mean_amsterdam
++ preprocessor: annual_mean_london
+ timerange: 1850/2000
+- caption: Annual mean {long_name} in Amsterdam according to {dataset}.
++ caption: Annual mean {long_name} in London according to {dataset}.
+ tas_global:
+ short_name: tas
+ mip: Amon
+
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
ESMValTool recipes work ‘out of the box’ (if input data is
+available)
+
There are strong links between the recipe, log file, and output
+folders
+
Recipes can easily be modified to re-use existing code for your own
+use case
+
+
diff --git a/05-conclusions.html b/05-conclusions.html
new file mode 100644
index 00000000..4c157fc6
--- /dev/null
+++ b/05-conclusions.html
@@ -0,0 +1,595 @@
+
+ESMValTool Tutorial: Conclusion of the basic tutorial
+
+
+
+
+
+
+
+
+
+
+
There are lots of resources available to assist you in using
+ESMValTool.
+
The ESMValTool Discussions
+page is a good place to find information on general issues, or check
+if your question has already been addressed. If you have a GitHub
+account, you can also post your questions there.
+
If you encounter difficulties, a great starting point is to visit
the issues page
+to check whether your issues have already been reported or not. If they
+have been reported before, suggestions provided by developers can help
+you to solve the issues you encountered. Note that you will need a
+GitHub account for this.
+
Additionally, there is an ESMValTool email list. Please see information
+on how to subscribe to the user mailing list.
+
+
+
What if I find a bug?
+
If you find a bug, please report it to the ESMValTool team. This will
+help us fix issues, ensuring not only your uninterrupted workflow but
+also contributing to the overall stability of ESMValTool for all
+users.
+
To report a bug, please create a new issue using the issue
+page.
+
In your bug report, please describe the problem as clearly and as
+completely as possible. You may need to include a recipe or the output
+log as well.
+
+
diff --git a/06-preprocessor.html b/06-preprocessor.html
new file mode 100644
index 00000000..9740b645
--- /dev/null
+++ b/06-preprocessor.html
@@ -0,0 +1,1234 @@
+
+ESMValTool Tutorial: Writing your own recipe
+
+
+
+
+
+
+
+
+
+
+
Can I use different preprocessors for different variables?
+
Can I use different datasets for different variables?
+
How can I combine different preprocessor functions?
+
Can I run the same recipe for multiple ensemble members?
+
+
+
+
+
+
+
Objectives
+
Create a recipe with multiple preprocessors
+
Use different preprocessors for different variables
+
Run a recipe with variables from different datasets
+
+
+
+
+
+
Introduction
+
One of the key strengths of ESMValTool is in making complex analyses
+reusable and reproducible. But that doesn’t mean everything in
+ESMValTool needs to be complex. Sometimes, the biggest challenge is in
+keeping things simple. You probably know the ‘warming stripes’
visualization by Professor Ed Hawkins. On the site https://showyourstripes.info you can
+find the same visualization for many regions in the world.
+
Shared by Ed Hawkins under a Creative Commons 4.0 Attribution
+International licence. Source: https://showyourstripes.info
+
In this episode, we will reproduce and extend this functionality with
+ESMValTool. We have prepared a small Python script that takes a NetCDF
+file with timeseries data, and visualizes it in the form of our desired
+warming stripes figure.
+
The diagnostic script that we will use is called
+warming_stripes.py and can be downloaded here.
+
Download the file and store it in your working directory. If you
+want, you may also have a look at the contents, but it is not necessary
+to do so for this lesson.
+
We will write an ESMValTool recipe that takes some data, performs the
+necessary preprocessing, and then runs this Python script.
+
+
+
+
+
+
Drawing up a plan
+
+
Previously, we saw that running ESMValTool executes a number of
+tasks. What tasks do you think we will need to execute and what should
+each of these tasks do to generate the warming stripes?
+
+
+
+
+
+
+
+
+
In this episode, we will need to do the following two tasks:
+
A preprocessing task that converts the gridded temperature data to a
+timeseries of global temperature anomalies
+
A diagnostic tasks that calls our Python script, taking our
+preprocessed timeseries data as input.
+
+
+
+
+
Building a recipe from scratch
+
The easiest way to make a new recipe is to start from an existing
+one, and modify it until it does exactly what you need. However, in this
+episode we will start from scratch. This forces us to think about all
+the steps involved in processing the data. We will also deal with
+commonly occurring errors through the development of the recipe.
+
Remember the basic structure of a recipe, and notice that each
component is extensively described in the documentation under the
section [“Overview”][recipe-overview]. This is the first place to look
for help if you get stuck.
+
Open a new file called recipe_warming_stripes.yml:
+
+
BASH
+
+
nano recipe_warming_stripes.yml
+
+
Let’s add the standard header comments (these do not do anything),
+and a first description.
+
+
YAML
+
+
# ESMValTool
# recipe_warming_stripes.yml
---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
+
+
Notice that yaml always requires two spaces
+indentation between the different levels. Pressing ctrl+o
+will save the file. Verify the filename at the bottom and press enter.
+Then use ctrl+x to exit the editor.
+
We will try to run the recipe after every modification we make, to
+see if it (still) works!
+
+
BASH
+
+
esmvaltool run recipe_warming_stripes.yml
+
+
In this case, it gives an error. Below you see the last few lines of
+the error message.
+
+
ERROR
+
+
...
+yamale.yamale_error.YamaleError:
+Error validating data '/home/users/username/esmvaltool_tutorial/recipe_warming_stripes.yml'
+with schema
+'/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/
+site-packages/esmvalcore/_recipe/recipe_schema.yml'
+ documentation.authors: Required field missing
+2024-05-27 13:21:23,805 UTC [41924] INFO
+If you have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions
+If you suspect this is a bug, please open an issue on
+https://github.com/ESMValGroup/ESMValTool/issues
+To make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
+
We can use the log message above to understand why ESMValTool
+failed. Here, this is because we missed a required field with author
+names. The text
+documentation.authors: Required field missing tells us
+that. We see that ESMValTool always tries to validate the recipe at an
+early stage. Note also the suggestion to open a GitHub issue if you need
+help debugging the error message. This is something most users do when
+they cannot understand the error or are not able to fix it on their
+own.
+
Let’s add some additional information to the recipe. Open the recipe
+file again, and add an authors section below the description. ESMValTool
+expects the authors as a list, like so:
+
+
YAML
+
+
authors:
  - lastname_firstname
+
+
To bypass a number of similar error messages, add a minimal
+diagnostics section below the documentation. The file should now look
+like:
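A minimal sketch of what the file could then look like; the diagnostic name is arbitrary, the author tag is a placeholder, and scripts: null means no diagnostic script is attached yet:

YAML

# ESMValTool
# recipe_warming_stripes.yml
---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
  authors:
    - doe_john          # placeholder author tag

diagnostics:
  dummy_diagnostic_1:   # arbitrary name for a minimal diagnostic
    scripts: null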
This is the minimal recipe layout that is required by ESMValTool. If
+we now run the recipe again, you will probably see the following
+error:
+
+
ERROR
+
+
ValueError: Tag 'doe_john' does not exist in section
+'authors' of /apps/jasmin/community/esmvaltool/ESMValTool_2.10.0/esmvaltool/config-references.yml
+
+
+
+
+
+
+
+
Pro tip: config-references.yml
+
+
The error message above points to a file named
[config-references.yml][config-references]. This is where ESMValTool
stores all its citation information. To add yourself as an author, add
your name in the form lastname_firstname in alphabetical
order following the existing entries, under the
# Development team section. See the [List of
authors][list-of-authors] section in the ESMValTool
documentation for more information.
+
+
+
+
For now, let’s just use one of the existing references. Change the
+author field to righi_mattia, who cannot receive enough
+credit for all the effort he put into ESMValTool. If you now run the
+recipe again, you should see the final message
+
+
OUTPUT
+
+
ERROR No tasks to run!
+
+
Although there is no actual error in the recipe, ESMValTool assumes
+you mistakenly left out a variable name to process and alerts you with
+this error message.
+
Adding a dataset entry
+
Let’s add a datasets section.
+
+
+
+
+
+
Filling in the dataset keys
+
+
Use the paths specified in the configuration file to explore the data
+directory, and look at the explanation of the dataset entry in the
[ESMValTool documentation][recipe-section-datasets].
+For both the datasets, write down the following properties:
Note that the grid key is only required for CMIP6 data, and that the
+extent of the historical period has changed between CMIP5 and CMIP6.
+
+
+
+
+
Let us start with the BCC-ESM1 dataset and add a datasets section to
+the recipe, listing this single dataset, as shown below. Note that key
+fields such as mip or start_year are included
+in the datasets section here but are part of the
+diagnostic section in the recipe example seen in Running your first recipe.
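A sketch of such a datasets section is shown below; the start and end years are assumptions for the CMIP6 historical period and can be adjusted:

YAML

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical,
     ensemble: r1i1p1f1, grid: gn, start_year: 1850, end_year: 2014}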
The recipe should run but produce the same message as in the previous
+case since we still have not included a variable to actually process. We
+have not included the short name of the variable in this dataset section
+because this allows us to reuse this dataset entry with different
+variable names later on. This is not really necessary for our simple use
+case, but it is common practice in ESMValTool.
+
+
+
+
+
+
Pro-tip: Automatically populating a recipe with all available datasets
+
+
You can select all available models for processing using
+glob patterns or wildcards. An example
+datasets section that uses all available CMIP6 models and
+ensemble members for the historical experiment is available
[here][include-all-datasets]. Note that you will have
to set the search_esgf option in the
configuration file to always so that you can download
+data from ESGF nodes as needed.
+
+
+
+
Adding the preprocessor section
+
Above, we already described the preprocessing task that needs to
+convert the standard, gridded temperature data to a timeseries of
+temperature anomalies.
+
+
+
+
+
+
Defining the preprocessor
+
+
Have a look at the available preprocessors in the
[documentation][preprocessor]. Write down
+
Which preprocessor functions do you think we should use?
+
What are the parameters that we can pass to these functions?
+
What do you think should be the order of the preprocessors?
+
A suitable name for the overall preprocessor
+
+
+
+
+
+
+
+
+
We need to calculate anomalies and global means. There is an
+anomalies preprocessor which takes in as arguments, a time
+period, a reference period, and whether or not to standardize the data.
+The global means can be calculated with the area_statistics
+preprocessor, which takes an operator as argument (in our case we want
+to compute the mean).
+
The default order in which these preprocessors are applied can be
seen [here][preprocessor-functions]:
area_statistics comes before anomalies. If you
want to change this, you can use the custom_order
preprocessor as described
[here][recipe-section-preprocessors]. For this
example, we will keep the default order.
+
Let’s name our preprocessor global_anomalies.
+
+
+
+
+
Add the following block to your recipe file between the
+datasets and diagnostics block:
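A sketch of what this block could look like; the reference period (1981-2010) and the monthly anomaly period are assumptions you can adapt:

YAML

preprocessors:
  global_anomalies:
    area_statistics:
      operator: mean
    anomalies:
      period: month          # compute anomalies per calendar month
      reference:             # assumed reference period
        start_year: 1981
        start_month: 1
        start_day: 1
        end_year: 2010
        end_month: 12
        end_day: 31
      standardize: false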
We are now ready to finish our diagnostics section. Remember that we
+want to create two tasks: a preprocessor task, and a diagnostic task. To
+illustrate that we can also pass settings to the diagnostic script, we
+add the option to specify a custom colormap.
+
+
+
+
+
+
Fill in the blanks
+
+
Extend the diagnostics section in your recipe by filling in the
+blanks in the following template:
+
+
YAML
+
+
diagnostics:
  <... (suitable name for our diagnostic)>:
    description: <...>
    variables:
      <... (suitable name for the preprocessed variable)>:
        short_name: <...>
        preprocessor: <...>
    scripts:
      <... (suitable name for our python script)>:
        script: <full path to python script>
        colormap: <... choose from matplotlib colormaps>
+
+
+
+
+
+
+
+
+
+
+
YAML
+
+
diagnostics:
  diagnostic_warming_stripes:
    description: visualize global temperature anomalies as warming stripes
    variables:
      global_temperature_anomalies_global:
        short_name: tas
        preprocessor: global_anomalies
    scripts:
      warming_stripes_script:
        script: ~/esmvaltool_tutorial/warming_stripes.py
        colormap: 'bwr'
+
+
+
+
+
+
You should now be able to run the recipe to get your own warming
+stripes.
+
Note: for the purpose of simplicity in this episode, we have not
+added logging or provenance tracking in the diagnostic script. Once you
+start to develop your own diagnostic scripts and want to add them to the
+ESMValTool repositories, this will be required. Writing your own
+diagnostic script is discussed in a later
+episode.
+
Bonus exercises
+
Below are a few exercises to practice modifying an ESMValTool recipe.
+For your reference, here’s a copy of the recipe at this point. This
+will be the point of departure for each of the modifications we’ll make
+below.
+
+
+
+
+
+
Specific location selection
+
+
On showyourstripes.org, you can download stripes for specific
+locations. Here we show how this can be done with ESMValTool. Instead of
+the global mean, we can pick a location to plot the stripes for. Can you
+find a suitable preprocessor to do this?
+
+
+
+
+
+
+
+
+
You can use extract_point or extract_region
+to select a location. We used extract_point. Here’s a copy
+of the recipe at this
+point and this is the difference from the previous recipe:
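A minimal sketch of the idea: the point extraction replaces the global area mean in the preprocessor. The Amsterdam coordinates here are approximate and only an example:

YAML

preprocessors:
  anomalies_amsterdam:
    extract_point:
      latitude: 52.37     # approximate latitude of Amsterdam (assumption)
      longitude: 4.90     # approximate longitude of Amsterdam (assumption)
      scheme: linear
    anomalies:
      period: month
      standardize: false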
Split the diagnostic in two with two different time periods for the
+same variable. You can choose the time periods yourself. In the example
+below, we have chosen the recent past and the 20th century and have used
+variable grouping.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at this point
+and this is the difference with the previous recipe:
Now that you have different variable groups, we can also use
+different preprocessors. Add a second preprocessor to add another
+location of your choosing.
+
+
Solution
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
If you want to avoid retyping the arguments used in your
+preprocessor, you can use YAML anchors as seen in the
+anomalies preprocessor specifications in the recipe
+above.
+
+
+
+
+
+
+
+
+
Additional datasets
+
+
So far we have defined the datasets in the datasets section of the
+recipe. However, it’s also possible to add specific datasets only for
+specific variables or variable groups. Take a look at the documentation
+to learn about the additional_datasets keyword
[here][additional-datasets], and add a second dataset
+only for one of the variable groups.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
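For illustration, a sketch of how additional_datasets can be attached to a single variable group; the extra dataset chosen here is an assumption, not the one used in the linked recipe:

YAML

    variables:
      tas_global:
        short_name: tas
        preprocessor: global_anomalies
        additional_datasets:
          - {dataset: CanESM2, project: CMIP5, mip: Amon, exp: historical,
             ensemble: r1i1p1, start_year: 1850, end_year: 2005}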
You can choose data from multiple ensemble members for a model in a
+single line.
+
+
+
+
+
+
+
+
+
The dataset section allows you to choose more than one
ensemble member. Here’s a copy of the changed recipe
to do that. Changes made are shown in the diff output below:
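A sketch of what such a single dataset line could look like, using the ensemble range notation (assuming the first three ensemble members exist for this model):

YAML

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical,
     ensemble: "r(1:3)i1p1f1", grid: gn, start_year: 1850, end_year: 2014}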
Check out the section on a different way to use multiple ensemble
members or even multiple experiments at [Concatenating data
corresponding to multiple facets][concatenating-datasets].
+
+
+
+
+
+
+
+
+
Key Points
+
+
A recipe can work with different preprocessors at the same
+time.
+
The setting additional_datasets can be used to add a
+different dataset.
+
Variable groups are useful for defining different settings for
+different variables.
+
Multiple ensemble members and experiments can be analysed in a
+single recipe through concatenation.
+
+
diff --git a/07-development-setup.html b/07-development-setup.html
new file mode 100644
index 00000000..b43b3732
--- /dev/null
+++ b/07-development-setup.html
@@ -0,0 +1,1070 @@
+
+ESMValTool Tutorial: Development and contribution
+
+
+
+
+
+
+
+
+
+
+
How can I incorporate my contributions into ESMValTool?
+
+
+
+
+
+
+
Objectives
+
Execute a successful ESMValTool installation from the source
+code.
+
Contribute to ESMValTool development.
+
+
+
+
+
+
We now know how ESMValTool works, but how do we develop it?
ESMValTool is an open-source project in ESMValGroup. We can contribute
to its development in several ways, for example by helping with the
review process of pull requests; see the ESMValTool documentation on
Review of pull requests.
+
+
In this lesson, we first show how to set up a development
+installation of ESMValTool so you can make changes or additions. We then
+explain how you can contribute these changes to the community.
+
+
+
+
+
+
Git knowledge
+
+
For this episode, you need some knowledge of Git. You can refresh your knowledge in
+the corresponding Git carpentries
+course.
+
+
+
+
Development installation
+
We’ll explore how ESMValTool can be installed in
develop mode. Even if you aren’t collaborating with the
community, this installation is needed to run your new code with
ESMValTool. Let’s get started.

1 Obtain the source code
Download the code from the repository. A ZIP file called
+ESMValTool-main.zip is downloaded. To continue the
+installation, unzip the file, move to the ESMValTool-main
+directory and then follow the sequence of steps starting from the
+section on ESMValTool dependencies below.
+
Clone the repository if you want to contribute to the ESMValTool
+development:
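The clone command itself is not shown above; with HTTPS it would typically be:

BASH

git clone https://github.com/ESMValGroup/ESMValTool.git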
This command will ask for your GitHub username and a personal access
token as password. Please follow the GitHub instructions on token
authentication requirements to create a personal access token.
Alternatively, you could generate a new SSH key and add it to your
GitHub account. After the authentication, the output might look like:
Now, a folder called ESMValTool has been created in your
+working directory. This folder contains the source code of the tool. To
+continue the installation, we move into the ESMValTool
+directory:
+
+
BASH
+
+
cd ESMValTool
+
+
Note that the main branch is checked out by default. We
+can see this if we run:
+
+
BASH
+
+
git status
+
+
+
OUTPUT
+
+
On branch main
+Your branch is up to date with 'origin/main'.
+
+nothing to commit, working tree clean
+
+
+
+
2 ESMValTool dependencies
+
Please don't forget: if an esmvaltool environment was already created
following the lesson Installation, we should choose another name for
the new environment in this lesson.
+
ESMValTool now uses mamba instead of conda
+for the recommended installation. For a minimal mamba installation, see
+section Install Mamba in lesson Installation.
+
It is good practice to update the version of mamba and conda on your
+machine before setting up ESMValTool. This can be done as follows:
+
+
BASH
+
+
mamba update --name base mamba conda
+
+
To simplify the installation process, an environment file
+environment.yml is provided in the ESMValTool directory. We
+create an environment by running:
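The command is not shown above; assuming the default environment name, it would typically be:

BASH

mamba env create --name esmvaltool --file environment.yml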
The environment is called esmvaltool by default. If an
+esmvaltool environment is already created following the
+lesson Installation, we should choose
+another name for the new environment in this lesson by:
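For example (using a_new_name as a placeholder for whatever name you prefer):

BASH

mamba env create --name a_new_name --file environment.yml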
This will create a new conda environment and install ESMValTool (with
+all dependencies that are needed for development purposes) into it with
+a single command.
+
For more information see [conda managing
+environments][manage-environments].
+
Now, we should activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
where esmvaltool is the name of the environment (replace
+by a_new_name in case another environment name was
+used).
+
+
+
3 ESMValTool installation
+
ESMValTool can be installed in a develop mode by
+running:
+
+
BASH
+
+
pip install --editable '.[develop]'
+
+
This will add the esmvaltool directory to the Python
+path in editable mode and install the development dependencies. We
+should check if the installation works properly. To do this, run the
+tool with:
+
+
BASH
+
+
esmvaltool --help
+
+
If the installation is successful, ESMValTool prints a help message
+to the console.
+
+
+
+
+
+
Checking the development installation
+
+
We can use the command mamba list to list the installed
packages in the esmvaltool environment. Use this command to
check that ESMValTool is installed in develop mode.
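One way to do this (assuming the environment is activated) is to filter the package list for esmvaltool:

BASH

mamba list esmvaltool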
# Name Version Build Channel
+esmvaltool 2.10.0.dev3+g2dbc2cfcc pypi_0 pypi
+
+
+
+
+
+
+
+
4 Updating ESMValTool
+
The main branch has the latest features of ESMValTool.
+Please make sure that the source code on your machine is up-to-date. If
+you obtain the source code using git clone as explained in step
+1 Source code, you can run git pull to
update the source code. The ESMValTool installation will then be updated
with changes from the main branch.
+
+
Contribution
+
We have seen how to install ESMValTool in a develop
+mode. Now, we try to contribute to its development. Let’s see how this
+can be achieved. We first discuss our ideas in an issue
in the ESMValTool repository. This can avoid disappointment at a later
+stage, for example, if more people are doing the same thing. It also
+gives other people an early opportunity to provide input and
+suggestions, which results in more valuable contributions.
+
Then, we create a new branch locally and start
+developing new codes. To create a new branch:
+
+
BASH
+
+
git checkout -b your_branch_name
+
+
If needed, a link to a git tutorial can be found in Setup.
+
Once our development is finished, we can initiate a
+pull request. To this end, we encourage you to join the
+ESMValTool development team.
+
For more extensive documentation on contributing code, including a
section on the GitHub workflow, please see the Contributing code and
documentation section in the ESMValTool documentation.
+
+
Review process
+
The pull request will be tested, discussed and merged as part of the
+“review process”. The process will take some effort and
+time to learn. However, a few (command line) tools can get you a long
+way, and we’ll cover those essentials in the next sections.
+
Tip: we encourage you to keep the pull requests
+small. Reviewing small incremental changes is more efficient.
+
+
+
Working example
+
We saw the 'warming stripes' diagnostic in the lesson Writing your own
recipe. Imagine the following task: you want to contribute the warming
stripes recipe and diagnostic to ESMValTool. You have to add the
diagnostic script warming_stripes.py and the recipe
recipe_warming_stripes.yml to their locations in the ESMValTool
directory. After these changes, you should also check that everything
works fine. This is where we take advantage of the tools that are
introduced later.
+
Let’s get started. Note that since this is an exercise to get
+familiar with the development and contribution process, we will not
+create a GitHub issue at this time but proceed as though it has been
+done.
+
+
+
Check code quality
+
We aim to adhere to best practices and coding standards. There are
+several tools that check our code against those standards like:
+
+flake8 for
+checking against the PEP8 style guide
+
+yapf to ensure
+consistent formatting for the whole project
+
+isort to consistently
+sort the import statements
+
+yamllint
+to ensure there are no syntax errors in our recipes and config
+files
The good news is that pre-commit was already installed when we chose the
development installation. pre-commit is a command-line tool that runs
all of those checks and also fixes some of the errors it finds. To
explore other tools, have a look at the ESMValTool documentation on Code
style.
+
+
+
+
+
+
Using pre-commit
+
+
Let's check out our local branch and add the script warming_stripes.py to the
+esmvaltool/diag_scripts directory.
+
+
BASH
+
+
cd ESMValTool
+git checkout your_branch_name
+cp path_of_warming_stripes.py esmvaltool/diag_scripts/
+
+
By default, pre-commit only runs on the files that have
+been staged in git:
+
+
BASH
+
+
git status
+git add esmvaltool/diag_scripts/warming_stripes.py
+pre-commit run --files esmvaltool/diag_scripts/warming_stripes.py
+
+
Inspect the output of pre-commit and fix the remaining
+errors.
+
+
+
+
+
+
+
+
+
The tail of the output of pre-commit:
+
+
BASH
+
+
Check for added large files..............................................Passed
+Check python ast.........................................................Passed
+Check for case conflicts.................................................Passed
+Check for merge conflicts................................................Passed
+Debug Statements (Python)................................................Passed
+Fix End of Files.........................................................Passed
+Trim Trailing Whitespace.................................................Passed
+yamllint.............................................(no files to check)Skipped
+nclcodestyle.........................................(no files to check)Skipped
+style-files..........................................(no files to check)Skipped
+lintr................................................(no files to check)Skipped
+codespell................................................................Passed
+isort....................................................................Passed
+yapf.....................................................................Passed
+docformatter.............................................................Failed
+- hook id: docformatter
+- files were modified by this hook
+flake8...................................................................Failed
+- hook id: flake8
+- exit code: 1
+
+esmvaltool/diag_scripts/warming_stripes.py:20:5:
+F841 local variable 'nx' is assigned to but never used
+
+
As can be seen above, there are two failed checks:
+
+docformatter: it is mentioned that “files were modified
+by this hook”. We run git diff to see the modifications.
+The output includes the following:
+
+
BASH
+
+
+in the form of the popular warming stripes figure by Ed Hawkins."""
+
+
The closing """ at the end of the docstring has been moved by one
line. Shifting it to the next line should fix this error.
+
flake8: the error message is about an unused local
variable nx. We should check how nx is used in our code. For now,
let's assume that it was added by mistake and remove it. Note that you
have to run git add again to re-stage the file. Then rerun pre-commit
and check that it passes.
+
+
+
+
+
+
+
Run unit tests
+
The previous section introduced some tools to check code style and
quality. However, they provide no mechanism to determine whether our
code is getting the right answer. To achieve that, we need to write and
run tests for widely used functions. ESMValTool comes with a lot of
tests that are in the folder tests.
+
To run tests, first we make sure that the working directory is
+ESMValTool and our local branch is checked out. Then, we
+can run tests using pytest locally:
+
+
BASH
+
+
pytest
+
+
Tests will also be run automatically by CircleCI, when you submit a pull
+request.
+
+
+
+
+
+
Running tests
+
+
Make sure our local branch is checked out and add the recipe recipe_warming_stripes.yml
+to the esmvaltool/recipes directory:
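A sketch of the copy step, with path_of_recipe_warming_stripes.yml standing in for wherever you saved the recipe:

BASH

cd ESMValTool
git checkout your_branch_name
cp path_of_recipe_warming_stripes.yml esmvaltool/recipes/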
Run pytest and inspect the results; this might take a
few minutes. If a test fails, try to fix it.
+
+
+
+
+
+
+
+
+
Run:
+
+
BASH
+
+
pytest
+
+
When the pytest run is complete, you can inspect the test
reports that are printed in the console. Have a look at the second
section of the report, FAILURES:
The test message shows that the recipe
+recipe_warming_stripes.yml is not a valid recipe. Look for
+a line that starts with an E in the rest of the
+message:
+
+
BASH
+
+
+E esmvalcore._task.DiagnosticError: Cannot execute script
+'~/esmvaltool_tutorial/warming_stripes.py'(~/esmvaltool_tutorial/warming_stripes.py):
+file does not exist.
+
+
To fix the recipe, we need to edit the path of the diagnostic script
+as warming_stripes.py:
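A sketch of the relevant part of the recipe after the fix (the script entry name is illustrative); the script key should point to the copy inside esmvaltool/diag_scripts rather than to the old absolute path:

YAML

scripts:
  warming_stripes_script:
    script: warming_stripes.py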
When we add or update code, we also update the corresponding
documentation. The ESMValTool documentation is available on
docs.esmvaltool.org. The source files are located in
ESMValTool/doc/sphinx/source/.
+
To build documentation locally, first we make sure that the working
+directory is ESMValTool and our local branch is checked
+out. Then, we run:
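The exact build command is not reproduced here; one way to build the documentation with Sphinx directly (assuming the development dependencies installed above) is:

BASH

sphinx-build -b html doc/sphinx/source doc/sphinx/build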
Similar to code, documentation should be well written and adhere to
+standards. If the documentation is built properly, the previous command
+prints a message to the console:
+
+
OUTPUT
+
+
build succeeded.
+
+The HTML pages are in doc/sphinx/build.
+
+
The main page of the documentation has been built into
+index.html in doc/sphinx/build/ directory. To
+preview this page locally, we open the file in a web browser:
+
+
BASH
+
+
xdg-open doc/sphinx/build/index.html
+
+
+
+
+
+
+
Creating a documentation
+
+
In previous exercises, we added the recipe recipe_warming_stripes.yml
+to ESMValTool. Now, we create a documentation file
+recipe_warming_stripes.rst for this recipe:
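For example, open a new file with nano and add a minimal placeholder page (the title and wording below are illustrative only; a real documentation page would describe the recipe, its variables and example plots):

BASH

nano doc/sphinx/source/recipes/recipe_warming_stripes.rst

A minimal stub could be:

Warming stripes
===============

The recipe recipe_warming_stripes.yml plots the temperature evolution
in the form of the popular warming stripes figure.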
Save and close the file. We can think of this file as one page of a
+book. Examples of documentation pages can be found in the folder
+ESMValTool/doc/sphinx/source/recipes. Then, we need to
+decide where this page should be located inside the book. The table of
+content is defined by index.rst. Let’s have a look at the
+content:
+
+
BASH
+
+
nano doc/sphinx/source/recipes/index.rst
+
+
Add the recipe name i.e. recipe_warming_stripes to the
+section Other in this file and preview the recipe
+documentation page locally.
+
+
+
+
+
+
+
+
+
First, we add the recipe name recipe_warming_stripes to
+the section Other:
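The exact contents of index.rst are not reproduced here; the change amounts to adding one entry under the Other heading of the toctree, for example (existing entries abbreviated):

.. toctree::
   :maxdepth: 1

   (existing recipe entries)
   recipe_warming_stripes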
+
+
+
+ESMValTool Tutorial: Writing your own diagnostic script
+
+
+
+
+
+
+
+
+
+
+
How do I use the preprocessor output in a Python diagnostic?
+
+
+
+
+
+
+
Objectives
+
Write a new Python diagnostic script.
+
Explain how a diagnostic script reads the preprocessor output.
+
+
+
+
+
+
Introduction
+
The diagnostic script is an important component of ESMValTool and it
+is where the scientific analysis or performance metric is implemented.
+With ESMValTool, you can adapt an existing diagnostic or write a new
+script from scratch. Diagnostics can be written in a number of open
+source languages such as Python, R, Julia and NCL but we will focus on
+understanding and writing Python diagnostics in this lesson.
+
In this lesson, we will explain how to find an existing diagnostic
and run it using ESMValTool installed in editable/development mode. For
a development installation, see the instructions in the lesson
Development and contribution. Also, we will work with the recipe
recipe_python.yml and the diagnostic script diagnostic.py called by
this recipe, which we have seen in the lesson Running your first
recipe.
+
Let’s get started!
+
Understanding an existing Python diagnostic
+
If you clone the ESMValTool repository, a folder called
+ESMValTool is created in your home/working directory, see
+the instructions in the lesson Development and contribution .
+
The folder ESMValTool contains the source code of the
+tool. We can find the recipe recipe_python.yml and the
+python script diagnostic.py in these directories:
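In a default development installation these live under the examples folders (paths relative to the cloned repository):

ESMValTool/esmvaltool/recipes/examples/recipe_python.yml
ESMValTool/esmvaltool/diag_scripts/examples/diagnostic.py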
Let's have a look at the code in diagnostic.py. For
reference, we show the diagnostic code in the dropdown box below. There
are four main sections in the script:
+
A description, i.e. the docstring (line 1).

Import statements (lines 2-16).

Functions that implement our analysis (lines 21-102).

A typical Python top-level script,
i.e. if __name__ == '__main__' (lines 105-108).
+
+
+
+
+
+
+
PYTHON
+
+
1: """Python example diagnostic."""
2: import logging
3: from pathlib import Path
4: from pprint import pformat
5:
6: import iris
7:
8: from esmvaltool.diag_scripts.shared import (
9:     group_metadata,
10:     run_diagnostic,
11:     save_data,
12:     save_figure,
13:     select_metadata,
14:     sorted_metadata,
15: )
16: from esmvaltool.diag_scripts.shared.plot import quickplot
17:
18: logger = logging.getLogger(Path(__file__).stem)
19:
20:
21: def get_provenance_record(attributes, ancestor_files):
22:     """Create a provenance record describing the diagnostic data and plot."""
23:     caption = caption = attributes['caption'].format(**attributes)
24:
25:     record = {
26:         'caption': caption,
27:         'statistics': ['mean'],
28:         'domains': ['global'],
29:         'plot_types': ['zonal'],
30:         'authors': [
31:             'andela_bouwe',
32:             'righi_mattia',
33:         ],
34:         'references': [
35:             'acknow_project',
36:         ],
37:         'ancestors': ancestor_files,
38:     }
39:     return record
40:
41:
42: def compute_diagnostic(filename):
43:     """Compute an example diagnostic."""
44:     logger.debug("Loading %s", filename)
45:     cube = iris.load_cube(filename)
46:
47:     logger.debug("Running example computation")
48:     cube = iris.util.squeeze(cube)
49:     return cube
50:
51:
52: def plot_diagnostic(cube, basename, provenance_record, cfg):
53:     """Create diagnostic data and plot it."""
54:
55:     # Save the data used for the plot
56:     save_data(basename, provenance_record, cfg, cube)
57:
58:     if cfg.get('quickplot'):
59:         # Create the plot
60:         quickplot(cube, **cfg['quickplot'])
61:         # And save the plot
62:         save_figure(basename, provenance_record, cfg)
63:
64:
65: def main(cfg):
66:     """Compute the time average for each input dataset."""
67:     # Get a description of the preprocessed data that we will use as input.
68:     input_data = cfg['input_data'].values()
69:
70:     # Demonstrate use of metadata access convenience functions.
71:     selection = select_metadata(input_data, short_name='tas', project='CMIP5')
72:     logger.info("Example of how to select only CMIP5 temperature data:\n%s",
73:                 pformat(selection))
74:
75:     selection = sorted_metadata(selection, sort='dataset')
76:     logger.info("Example of how to sort this selection by dataset:\n%s",
77:                 pformat(selection))
78:
79:     grouped_input_data = group_metadata(input_data,
80:                                         'variable_group',
81:                                         sort='dataset')
82:     logger.info(
83:         "Example of how to group and sort input data by variable groups from "
84:         "the recipe:\n%s", pformat(grouped_input_data))
85:
86:     # Example of how to loop over variables/datasets in alphabetical order
87:     groups = group_metadata(input_data, 'variable_group', sort='dataset')
88:     for group_name in groups:
89:         logger.info("Processing variable %s", group_name)
90:         for attributes in groups[group_name]:
91:             logger.info("Processing dataset %s", attributes['dataset'])
92:             input_file = attributes['filename']
93:             cube = compute_diagnostic(input_file)
94:
95:             output_basename = Path(input_file).stem
96:             if group_name != attributes['short_name']:
97:                 output_basename = group_name + '_' + output_basename
98:             if "caption" not in attributes:
99:                 attributes['caption'] = input_file
100:             provenance_record = get_provenance_record(
101:                 attributes, ancestor_files=[input_file])
102:             plot_diagnostic(cube, output_basename, provenance_record, cfg)
103:
104:
105: if __name__ == '__main__':
106:
107:     with run_diagnostic() as config:
108:         main(config)
+
+
+
+
+
+
+
+
+
+
+
What is the starting point of a diagnostic?
+
+
Can you spot a function called main in the code
+above?
+
What are its input arguments?
+
How many times is this function mentioned?
+
+
+
+
+
+
+
+
+
The main function is defined in line 65 as
+main(cfg).
+
The input argument to this function is the variable
+cfg, a Python dictionary that holds all the necessary
+information needed to run the diagnostic script such as the location of
+input data and various settings. We will next parse this
+cfg variable in the main function and extract
+information as needed to do our analyses (e.g. in line 68).
+
The main function is called near the very end, on line
108. So, it is mentioned twice in our code: once where it is called by
the top-level Python script and once where it is defined.
+
+
+
+
+
+
+
+
+
+
The function run_diagnostic
+
+
The function run_diagnostic (line 107) is called a
+context manager provided with ESMValTool and is the main entry point for
+most Python diagnostics.
+
+
+
+
Preprocessor-diagnostic interface
+
In the previous exercise, we have seen that the variable
+cfg is the input argument of the main
+function. The first argument passed to the diagnostic via the
+cfg dictionary is a path to a file called
settings.yml. The ESMValTool documentation provides an
overview of what is in this file; see the section on Diagnostic script
interfaces.
+
+
+
+
+
+
What information do I need when writing a diagnostic script?
+
+
From the lesson Configuration, we
+saw how to change the configuration settings before running a recipe.
+First we set the option remove_preproc_dir to
+false in the configuration file, then run the recipe
+recipe_python.yml:
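In the configuration file this is a single top-level option, for example:

YAML

remove_preproc_dir: false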
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
Can you find an example of the file settings.yml in the
run directory?
+
Open the file settings.yml and look at the
+input_files list. It contains paths to some files
+metadata.yml. What information do you think is saved in
+those files?
+
+
+
+
+
+
+
+
+
One example of settings.yml can be found in the
+directory:
+path_to_recipe_output/run/map/script1/settings.yml
+
+
The metadata.yml files hold information about the
+preprocessed data. There is one file for each variable having detailed
+information on your data including project (e.g., CMIP6, CMIP5), dataset
+names (e.g., BCC-ESM1, CanESM2), variable attributes (e.g.,
+standard_name, units), preprocessor applied and time range of the data.
+You can use all of this information in your own diagnostic.
+
+
+
+
+
Diagnostic shared functions
+
Looking at the code in diagnostic.py, we see that
+input_data is read from the cfg dictionary
+(line 68). Now we can group the input_data according to
+some criteria such as the model or experiment. To do so, ESMValTool
+provides many functions such as select_metadata (line 71),
+sorted_metadata (line 75), and group_metadata
(line 79). As you can see in line 8, these functions are imported from
esmvaltool.diag_scripts.shared, which means they are shared
across several diagnostic scripts. A list of available functions and
their descriptions can be found in The ESMValTool Diagnostic API
reference.
+
+
Extracting
+information needed for analyses
+
We have seen the functions used for selecting, sorting and grouping
+data in the script. What do these functions do?
+
+
Answer
+
There is a statement after use of select_metadata,
+sorted_metadata and group_metadata that starts
+with logger.info (lines 72, 76 and 82). These lines print
+output to the log files. In the previous exercise, we ran the recipe
+recipe_python.yml. If you look at the log file
+recipe_python_#_#/run/map/script1/log.txt in
+esmvaltool_output directory, you can see the output from
+each of these functions, for example:
+
2023-06-28 12:47:14,038 [2548510] INFO diagnostic,106 Example of how to
+group and sort input data by variable groups from the recipe:
+{'tas': [{'alias': 'CMIP5',
+ 'caption': 'Global map of {long_name} in January 2000 according to '
+ '{dataset}.\n',
+ 'dataset': 'bcc-csm1-1',
+ 'diagnostic': 'map',
+ 'end_year': 2000,
+ 'ensemble': 'r1i1p1',
+ 'exp': 'historical',
+ 'filename': '~/recipe_python_20230628_124639/preproc/map/tas/
+This is how we can access preprocessed data within our diagnostic.
+
+
+
+
+
Diagnostic computation
+
After grouping and selecting data, we can read individual attributes
+(such as filename) of each item. Here, we have grouped the input data by
+variables, so we loop over the variables (line 88).
+Following this is a call to the function compute_diagnostic
+(line 93). Let’s look at the definition of this function in line 42,
+where the actual analysis of the data is done.
+
Note that output from the ESMValCore preprocessor is in the form of
+NetCDF files. Here, compute_diagnostic uses Iris
+to read data from a netCDF file and performs an operation
+squeeze to remove any dimensions of length one. We can
adapt this function to add our own analysis. As an example, here we
calculate a bias by subtracting the mean of the data, using Iris cubes.
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    cube = iris.load_cube(filename)

    logger.debug("Running example computation")
    cube = iris.util.squeeze(cube)

    # Calculate a bias using the average of the data
    cube.data = cube.core_data() - cube.core_data().mean()
    return cube
+
+
+
+
+
+
+
iris cubes
+
+
Iris reads data from NetCDF files into data structures called cubes.
+The data in these cubes can be modified, combined with other cubes’ data
+or plotted.
+
+
+
+
+
+
+
+
+
Reading data using xarray
+
+
Alternatively, you can use xarray to read the data
instead of Iris.
+
+
+
+
+
+
+
+
+
First, import xarray package at the top of the script
+as:
+
+
PYTHON
+
+
import xarray as xr
+
+
Then, change the compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    dataset = xr.open_dataset(filename)

    # do your analyses on the data here

    return dataset
+
+
Caution: if you read data using xarray, keep in mind that you also need
to adapt the other functions in the diagnostic accordingly, since they
currently work with Iris cubes.
+
+
+
+
+
+
+
+
+
+
Reading data using the netCDF4 package
+
+
Yet another option to read the NetCDF file data is to use the
+[netCDF-4 Python interface][netCDF] to the netCDF C library.
+
+
+
+
+
+
+
+
+
First, import the netCDF4 package at the top of the
+script as:
+
+
PYTHON
+
+
import netCDF4
+
+
Then, change compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    nc_data = netCDF4.Dataset(filename, 'r')

    # do your analyses on the data here

    return nc_data
+
+
Caution: if you read data using netCDF4, keep in mind that you also need
to adapt the other functions in the diagnostic accordingly, since they
currently work with Iris cubes.
+
+
+
+
+
Diagnostic output
+
+
Plotting the output
+
Often, the end product of a diagnostic script is a plot or figure.
+The Iris cube returned from the compute_diagnostic function
+(line 93) is passed to the plot_diagnostic function (line
+102). Let’s have a look at the definition of this function in line 52.
+This is where we would plug in our plotting routine in the diagnostic
+script.
+
More specifically, the quickplot function (line 60) can
+be replaced with the function of our choice. As can be seen, this
+function uses **cfg['quickplot'] as an input argument. If
+you look at the diagnostic section in the recipe
+recipe_python.yml, you see quickplot is a key
+there:
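The relevant part of the diagnostic section looks roughly like this (reproduced from memory of the example recipe, so treat the exact layout as indicative rather than literal):

YAML

scripts:
  script1:
    script: examples/diagnostic.py
    quickplot:
      plot_type: pcolormesh
      cmap: Reds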
This way, we can pass arguments such as the type of plot
pcolormesh and the colormap cmap: Reds from the
recipe to the quickplot function in the diagnostic.
+
+
+
+
+
+
Passing arguments from the recipe to the diagnostic
+
+
Change the type of the plot and its colormap and inspect the output
+figure.
+
+
+
+
+
+
+
+
+
In the recipe recipe_python.yml, you could change
+plot_type and cmap. As an example, we choose
+plot_type: pcolor and cmap: BuGn:
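The corresponding change in the recipe's quickplot section would look like:

YAML

quickplot:
  plot_type: pcolor
  cmap: BuGn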
The plot can be found at
+path_to_recipe_output/plots/map/script1/png.
+
+
+
+
+
+
+
+
+
+
ESMValTool gallery
+
+
ESMValTool makes it possible to produce a wide array of plots and
+figures as seen in the gallery.
+
+
+
+
+
+
Saving the output
+
In our example, the function save_data in line 56 is
+used to save the Iris cube. The saved files can be found under the
+work directory in a .nc format. There is also
+the function save_figure in line 62 to save the plots under
+the plot directory in a .png format (or
+preferred format specified in your configuration settings). Again, you
+may choose your own method of saving the output.
+
+
+
Recording the provenance
+
When developing a diagnostic script, it is good practice to record
+provenance. To do so, we use the function
+get_provenance_record (line 100). Let us have a look at the
+definition of this function in line 21 where we describe the diagnostic
+data and plot. Using the dictionary record, it is possible
+to add custom provenance to our diagnostics output. Provenance is stored
+in the W3C PROV
+XML format and also in an SVG file under the
work and plot directories. For more information,
see the ESMValTool documentation on recording provenance.
+
+
Congratulations!
+
You now know the basic diagnostic script structure and some available
+tools for putting together your own diagnostics. Have a look at existing
+recipes and diagnostics in the repository for more examples of functions
+you can use in your diagnostics!
+
+
+
+
+
+
Key Points
+
+
ESMValTool provides helper functions to interface a Python
+diagnostic script with preprocessor output.
+
Existing diagnostics can be used as templates and modified to write
+new diagnostics.
+
Helper functions can be imported from
+esmvaltool.diag_scripts.shared and used in your own
+diagnostic script.
+
+
+
+ESMValTool Tutorial: CMORization: adding new datasets to ESMValTool
+
+
+
+
+
+
+
+
+
+
+
How to use the existing CMORizer scripts shipped with
+ESMValTool?
+
How to add support for new (observational) datasets?
+
+
+
+
+
+
+
Objectives
+
Understand what CMORization is and why it is necessary.
+
Use existing scripts to CMORize your data.
+
Write a new CMORizer script to support additional data.
+
+
+
+
+
+
Introduction
+
This episode deals with “CMORization”. ESMValTool is designed to work
+with data that follow the CMOR standards. Unfortunately, not all
+datasets follow these standards. In order to use such datasets in
+ESMValTool we first need to reformat the data. This process is called
+“CMORization”.
+
+
+
+
+
+
What are the CMOR standards?
+
+
The name “CMOR” originates from a tool: the Climate Model Output Rewriter.
+This tool is used to create “CF-Compliant netCDF files for use in the
+CMIP projects”. So CMOR extends the CF-standard with additional
+requirements for the Coupled Model Intercomparison Projects (see e.g. here).
+
Concretely, the CMOR standards dictate e.g. the variable names and
+units, coordinate information, how the data should be structured (e.g. 1
+variable per file), additional metadata requirements, and file naming
+conventions a.k.a. the data reference syntax (DRS).
+All this information is stored in so-called CMOR tables. For example,
+the CMOR tables for the CMIP6 project can be found here.
+
+
+
+
ESMValTool offers two ways to CMORize data:
+
A reformatting script can be used to create a CMOR-compliant copy.
+CMORizer scripts for several popular datasets are included in
+ESMValTool, and ESMValTool also provides a convenient way to execute
+them.
+
ESMValCore can execute CMOR fixes 'on the fly' (see
https://docs.esmvaltool.org/projects/esmvalcore/en/latest/develop/fixing_data.html#fixing-data).
The advantage is that you don't need to
+store an additional, reformatted copy of the data. The disadvantage is
+that these fixes should be implemented inside ESMValCore, which is
+beyond the scope of this tutorial.
+
In this lesson, we will re-implement a CMORizer script for the
+FLUXCOM dataset that contains observations of the Gross Primary
+Production (GPP), a variable that is important for calculating
+components of the global carbon cycle. See the next section on how to
+obtain data.
The data for this episode is available via the FluxCom Data
+Portal. First you’ll need to register. After registration, in the
+dropdown boxes, select FLUXCOM as the data choice and click download.
+Three files will be displayed. Click the download button on the “FLUXCOM
+(RS+METEO) Global Land Carbon Fluxes using CRUNCEP climate data”. You’ll
+receive an email with the FTP address to access the server. Connect to
+the server, follow the path in your email, and look for the file
+raw/monthly/GPP.ANN.CRUNCEPv6.monthly.2000.nc. Download
+that file and save it in a folder called
+~/data/RAWOBS/Tier3/FLUXCOM.
+
Note: you’ll need a user-friendly ftp client. On Linux,
+ncftp works okay.
+
+
+
+
+
+
What is the deal with those “tiers”?
+
+
Many datasets come with access restrictions. In this way the data
+providers can keep track of how their data is used. In many cases
+“restricted access” just means that one has to register with an email
+address and accept the terms of use, which typically ask that you
+acknowledge the data providers.
+
There are also datasets available that do not need a registration.
+The “obs4MIPs” or “ana4MIPs” datasets, for example, are specifically
+produced to facilitate comparisons with model simulations.
+
To reflect these different levels of access restriction, the
+ESMValTool team has created a tier-system. The definition of the
+different tiers are as follows:
+
+Tier1: obs4MIPs and ana4MIPS datasets (can be used
+directly with the ESMValTool)
+
+Tier2: other freely available datasets (most of
+them will need some kind of cmorization)
+
+Tier3: datasets with access restrictions (most of
+these datasets will also need some kind of cmorization)
+
These access restrictions are also why the ESMValTool developers
+cannot distribute copies or automate downloading of all observations and
+reanalysis data used in the recipes. As a compromise, we provide the
+CMORization scripts so that each user can CMORize their own copy of the
+access restricted datasets if needed.
+
+
+
+
Run the existing CMORizer script
+
Before we develop our own CMORizer script, let’s first see what
+happens when we run the existing one. There is a specific command
+available in the ESMValTool to run the CMORizer scripts:
+
+
BASH
+
+
esmvaltool data format --config_file <path to config-user.yml> <dataset-name>
+
+
The config-user.yml is the file in which we define the
+different data paths, see the episode on Configuration. In the
+rootpath of your config-user.yml, make sure to
+add the right directory for “RAWOBS” data in which you downloaded the
+FLUXCOM dataset:
+
+
YAML
+
+
rootpath:
+RAWOBS: ~/data/RAWOBS
+
+
This enables ESMValTool to find the raw observational datasets stored
+in the “RAWOBS” folder. The dataset-name needs to be
+identical to the folder name that was created to store the raw
+observation data files, i.e. RAWOBS/TierX/dataset-name. In
+our case this would be “FLUXCOM”.
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
...
+... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC
+... ----------------------------------------------------------------------
+... input_dir = /home/peter/data/RAWOBS
+... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216
+... ----------------------------------------------------------------------
+... Running the CMORization scripts.
+... Processing datasets ['FLUXCOM']
+... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM
+... Output will be written to: /home/peter/esmvaltool_output/
+ data_formatting_20220726_140216/Tier3/FLUXCOM
+... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/
+ site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom
+... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/
+ esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/
+ datasets/fluxcom.py
+... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc'
+... CMORizing variable 'gpp'
+... Lmon
+... Var is gpp
+... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1'
+
+... Fixing time...
+... Fixing latitude...
+... Fixing longitude...
+... Flipping dimensional coordinate latitude...
+... Saving file
+... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/
+ FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc
+... Cube has lazy data [lazy is preferred]
+... CMORization of dataset FLUXCOM finished!
+... Formatting successful for dataset FLUXCOM
+
+
So you can see that several fixes are applied, and the CMORized file
+is written to the ESMValTool output directory, i.e.
+~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/TierX/dataset-name/filename.nc
+In order to use it, we’ll have to copy it from the output directory to a
+folder called ~/data/OBS/Tier3/FLUXCOM and make sure the
+path to OBS is set correctly in our config-user file:
+
+
YAML
+
+
rootpath:
+OBS: ~/data/OBS
+
+
You can also see the path where ESMValTool stores the reformatting
script:
~/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py.
+You may have a look at this file if you want. The script also uses a
+configuration file:
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml.
+
Make a test recipe
+
To verify that the data is correctly CMORized, we will make a simple
+test recipe. As illustrated in the figure at the top of this episode,
+one of the steps that ESMValTool executes is a CMOR-check. If the data
+is not correctly CMORized, ESMValTool will give a warning or error.
+
+
+
+
+
+
Create a test recipe
+
+
Create a simple recipe called recipe_check_fluxcom.yml that
+loads the FLUXCOM data. It should include a datasets section with a
+single entry for the “FLUXCOM” dataset with the correct dataset keys,
and a diagnostics section with a single variable: gpp. We don't need any
+preprocessors or scripts (set scripts: null), but we have
+to add a documentation section with a description, authors and
+maintainer, otherwise the recipe will fail.
+
Use the following dataset keys:
+
project: OBS
+
dataset: FLUXCOM
+
type: reanaly
+
version: ANN-v1
+
mip: Lmon
+
start_year: 2000
+
end_year: 2000
+
tier: 3
+
Some of these dataset keys are further explained in the callout boxes
+in this episode.
+
+
+
+
+
+
+
+
+
Here’s an example recipe
+
+
YAML
+
+
documentation:

  description: Test recipe for FLUXCOM data
  title: This is a test recipe for the FLUXCOM data.

  authors:
    - kalverla_peter

  maintainer:
    - kalverla_peter

datasets:
  - {project: OBS, dataset: FLUXCOM, mip: Lmon, tier: 3, start_year: 2000,
     end_year: 2000, type: reanaly, version: ANN-v1}

diagnostics:
  check_fluxcom:
    description: Check that ESMValTool can load the cmorized fluxnet data without errors.
    variables:
      gpp:
    scripts: null
esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
+
+
If everything is okay, the recipe should run without problems.
+
Starting from scratch
+
Now that you’ve seen how to use an existing CMORizer script, let’s
+think about adding a new one. We will remove the existing CMORizer
+script, and re-implement it from scratch. This exercise allows us to
+point out all the details of what’s going on. We’ll also remove the
+CMORized data that we’ve just created, so our test recipe will not be
+able to use it anymore.
"""ESMValTool CMORizer for FLUXCOM GPP data.

<We will add some useful info here later>
"""
import logging

from esmvaltool.cmorizers.data import utilities as utils

logger = logging.getLogger(__name__)


def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
    """Cmorize the dataset."""

    # This is where you'll add the cmorization code
    # 1. find the input data
    # 2. apply the necessary fixes
    # 3. store the data with the correct filename
+
+
Here, in_dir corresponds to the input directory of the
+raw files, out_dir to the output directory of final
+reformatted data set and cfg to a configuration dictionary
+given by a configuration file that we will get to shortly. The last
+three arguments will not be considered in this script but can be used in
+other cases. cfg_user corresponds to the user configuration
+file, start_date to the start of the period to format, and
+end_date to the end of the period to format. When you type
+the command esmvaltool data format in the terminal,
+ESMValTool will call this function with the settings found in your
+configuration files.
+
The ESMValTool CMORizer also needs a dataset configuration file.
+Create a file called
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml
+and fill it with the following boilerplate:
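The boilerplate itself is not reproduced verbatim here; a sketch consistent with the fields referenced later in this episode could look like the following, where the ??? entries and commented-out lines are placeholders you will fill in step by step:

YAML

---
# filename: ???

attributes:
  project_id: OBS6
  # dataset_id: ???
  # version: ???
  # tier: ???
  # modeling_realm: ???
  source: ''
  reference: ''
  comment: ''

variables:
  ???:
    # mip: ???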
Note: the name of this file must be
+identical to dataset-name.
+
As you can see, the configuration file contains information about the
+original filename of the dataset, and some additional metadata that you
+might recognize from the CMOR filename structure. It also contains a
+list of variables that’s available for this dataset. We’ll add this
+information step by step in the following sections.
+
+
+
+
+
+
RAWOBS, OBS, OBS6!?
+
+
In the configuration above we’ve already filled in the
+project_id. ESMValTool uses these project IDs to find the
+data on your hard drive, and also to find more information about the
+data. The RAWOBS and OBS projects refer to
+external data before and after CMORization, respectively.
+Historically, most external data were observations, hence the
+naming.
+
In going from CMIP5 to CMIP6, the CMOR standards changed a bit. For
+example, some variables were renamed, which posed a dilemma: should
+CMORization reformat to the CMIP5 or CMIP6 definition? To solve this,
+the OBS6 project was created. So OBS6 data
+follow the CMIP6 standards, and that’s what we’ll use for the new
+CMORizer.
+
+
+
+
You can try running the CMORizer at this point, and it should work
+without errors. However, it doesn’t produce any output yet:
+
+
BASH
+
+
esmvaltool data format --config_file <path to config-user.yml> FLUXCOM
+
+
+
1. Find the input data
+
First we’ll get the CMORizer script to locate our FLUXCOM data. We
+can use the information from the in_dir and
+cfg variables. Add the following snippet to your CMORizer
+script:
+
+
PYTHON
+
+
# 1. find the input data
+logger.info("in_dir: '%s'", in_dir)
+logger.info("cfg: '%s'", cfg)
+
+
If you run the CMORizer again, it will print out the content of these
+variables and the output should contain something like this:
Try to locate the input data inside the CMORizer script and load it
+(we’ll use iris because ESMValTool includes helper
+utilities for iris cubes). Confirm that you’ve loaded the data by
+logging the correct path and (part of the) file content.
+
+
+
+
+
+
+
+
+
There are many ways to do it. In any case, you should have added the
+original filename to the configuration file (and un-commented this
+line): filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'. Note the
+*: this is a useful shorthand to find multiple files for
+different years. In a similar way we can also look for multiple
+variables, etc.
+
Here’s an example solution (inserted directly under the original
+comment):
+
+
PYTHON
+
+
# 1. find the input data
filename_pattern = cfg['filename']
matches = Path(in_dir).glob(filename_pattern)

for match in matches:
    input_file = str(match)
    logger.info("found: %s", input_file)
    cube = iris.load_cube(input_file)
    logger.info("content: %s", cube)
+
+
To make this work we’ve added import iris and
+from pathlib import Path at the top of the file. Note that
+we’ve started a loop, since we may find multiple files if there’s more
+than one year of data available.
+
+
+
+
+
+
+
2. Save the data with the correct filename
+
Before we start adding fixes, we’ll first make sure that our CMORizer
+can also write output files with the correct name. This will enable us
+to use the test recipe for the CMOR compatibility check.
+
We can use the save function from the utils
+that we imported at the top. The call signature looks like this:
utils.save_variable(cube, var, outdir, attrs, **kwargs).
+
We already have the cube and the outdir.
+The variable short name (var) and attributes
+(attrs) are set through the configuration file. So we need
+to find out what the correct short name and attributes are.
+
The standard attributes for CMIP variables are defined in the CMIP
+tables. These tables are differentiated according to the “MIP” they
+belong to. The tables are a copy of the PCMDI guidelines.
+
+
+
+
+
+
Find the variable “gpp” in a CMOR table
+
+
Check the available CMOR tables to find the variable “gpp” with the
+following characteristics:
The variable “gpp” belongs to the land variables. The temporal
+resolution that we are looking for is “monthly”. This information points
+to the “Lmon” CMIP table. And indeed, the variable “gpp” can be found in
the file
https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/cmor/tables/cmip6/Tables/CMIP6_Lmon.json.
+
+
+
+
+
If the variable you are interested in is not available in the
+standard CMOR tables, you could write a custom CMOR table entry for the
+variable. This, however, is beyond the scope of this tutorial.
+
+
+
+
+
+
Fill the configuration file
+
+
Uncomment the following entries in your configuration file and fill
+them with appropriate values:
+
dataset_id
+
version
+
tier
+
modeling_realm
+
short_name (the ??? immediately under variables)
+
mip
+
+
+
+
+
+
+
+
+
The configuration file now looks something like this:
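A sketch of the filled-in file, using the values from the test recipe earlier in this episode (source, reference and comment are left empty here; fill them with the real metadata when you have it):

YAML

---
filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'

attributes:
  project_id: OBS6
  dataset_id: FLUXCOM
  version: 'ANN-v1'
  tier: 3
  modeling_realm: reanaly
  source: ''
  reference: ''
  comment: ''

variables:
  gpp:
    mip: Lmon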
Now that we have set this information correctly in the config file,
+we can call the save function. Add the following python code to your
+CMORizer script:
+
+
PYTHON
+
+
# 3. store the data with the correct filename
attributes = cfg['attributes']
variables = cfg['variables']

for short_name, variable_info in variables.items():
    all_attributes = {**attributes, **variable_info}  # add the mip to the other attributes
    utils.save_variable(cube=cube, var=short_name, outdir=out_dir, attrs=all_attributes)
+
+
Since we only have one variable (gpp), the loop is not strictly
+necessary. However, this makes it possible to add more variables later
+on.
+
+
+
+
+
+
Was the CMORization successful so far?
+
+
If you run the CMORizer again, you should see that it creates an
+output file named
OBS6_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_xxxx01-yyyy12.nc
+stored in your ESMValTool output directory
+~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/Tier3/FLUXCOM/.
+The “xxxx” and “yyyy” represent the start and end year of the data.
+
+
+
+
Great! So we have produced a NetCDF file with the CMORizer that
+follows the naming convention for ESMValTool datasets. Let’s have a look
+at the NetCDF file as it was written with the very basic CMORizer from
+above.
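For example, with the netCDF command-line tools installed you could dump the file header (adjust the path to your own output file):

BASH

ncdump -h <path to the CMORized output file>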
The file contains a variable named “GPP” that contains three
+dimensions: “time”, “lat”, “lon”. Notice the strange time units, and the
invalid_units in the global attributes section. Also, it
seems that there is no information available about the lat and lon
+coordinates. These are just some of the things we’ll address in the next
+section.
+
+
+
3. Implementing additional fixes
+
Copy the output of the CMORizer to your folder
+~/data/OBS6/Tier3/FLUXCOM/ and change the test recipe to
+look for OBS6 data instead of OBS (note: we’re upgrading the CMORizer to
+newer standards here!). Make sure the path to OBS6 is set
+correctly in our config-user file:
+
+
YAML
+
+
rootpath:
+OBS6: ~/data/OBS6
+
+
If we now run the test recipe on our newly ‘CMORized’ data,
+
+
BASH
+
+
esmvaltool run recipe_check_fluxcom.yml --config_file<path to config-user.yml> --log_level debug
+
+
it should be able to find the correct file, but it does not succeed
+yet. The first thing that the ESMValTool CMOR checker brings up is:
+
+
ERROR
+
+
iris.exceptions.UnitConversionError: Cannot convert from unknown units. The
+"units" attribute may be set directly.
+
+
If you look closely at the error messages, you can see that this
+error concerns the units of the coordinates. ESMValTool tries to fix
+them automatically, but since no units are defined on the coordinates,
+this fails.
+
The cmorizer utilities also include a function called
+fix_coords, but before we can use it, we’ll also need to
+make sure the coordinates have the correct standard name. Add the
+following code to your cmorizer:
+
+
PYTHON
+
+
# 2. Apply the necessary fixes
# 2a. Fix/add coordinate information and metadata
cube.coord('lat').standard_name = 'latitude'
cube.coord('lon').standard_name = 'longitude'
utils.fix_coords(cube)
+
+
With some additional refactoring, our cmorization function might then
+look something like this:
+
+
PYTHON
+
+
def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
    """Cmorize the dataset."""

    # Get general information from the config file
    attributes = cfg['attributes']
    variables = cfg['variables']

    for short_name, variable_info in variables.items():
        logger.info("CMORizing variable: %s", short_name)

        # 1a. Find the input data (one file for each year)
        filename_pattern = cfg['filename']
        matches = Path(in_dir).glob(filename_pattern)

        for match in matches:
            # 1b. Load the input data
            input_file = str(match)
            logger.info("found: %s", input_file)
            cube = iris.load_cube(input_file)

            # 2. Apply the necessary fixes
            # 2a. Fix/add coordinate information and metadata
            cube.coord('lat').standard_name = 'latitude'
            cube.coord('lon').standard_name = 'longitude'
            utils.fix_coords(cube)

            # 3. Save the CMORized data
            all_attributes = {**attributes, **variable_info}
            utils.save_variable(cube=cube, var=short_name, outdir=out_dir, attrs=all_attributes)
+
+
Run the CMORizer script once more. Have a look at the netCDF file,
+and confirm that the coordinates now have much more metadata added to
+them. Then, run the test recipe again with the latest CMORizer output.
+The next error is:
+
+
ERROR
+
+
esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
+Variable GPP units unknown can not be converted to kg m-2 s-1 in cube:
+
+
Okay, so let’s fix the units of the “GPP” variable in the CMORizer.
+Remember that you can find the correct units in the CMOR table. Add the
+following three lines to our CMORizer:
+
+
PYTHON
+
+
# 2b. Fix gpp units
logger.info("Changing units for gpp from gC/m2/day to kg/m2/s")
cube.data = cube.core_data() / (1000 * 86400)
cube.units = 'kg m-2 s-1'
+
+
If everything is okay, the test recipe should now pass. We’re getting
+there. Looking through the output though, there’s still a warning.
+
+
OUTPUT
+
+
WARNING There were warnings in variable GPP:
+Standard name for GPP changed from None to gross_primary_productivity_of_biomass_expressed_as_carbon
+Long name for GPP changed from GPP to Carbon Mass Flux out of Atmosphere Due to
+ Gross Primary Production on Land [kgC m-2 s-1]
+
+
ESMValTool is able to apply automatic fixes here, but if we are
+running a CMORizer script anyway, we might as well fix it
+immediately.
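The exact lines are not shown here; in many ESMValTool CMORizers this step looks roughly like the sketch below, using the CMOR table object that ESMValTool passes in via cfg and the variable_info and short_name from the loop in the cmorization function above. Treat it as an illustration rather than the tutorial's literal code:

PYTHON

# 2c. Fix variable metadata using its entry in the CMOR table (sketch)
cmor_info = cfg['cmor_table'].get_variable(variable_info['mip'], short_name)
utils.fix_var_metadata(cube, cmor_info)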
You can see that we're using the CMOR table here. This was passed on
by ESMValTool as part of the cfg input variable. So here
we're making sure that we're updating the cube's metadata to conform to
the CMOR table.
+
Finally, the test recipe should run without errors or warnings.
+
+
+
4. Finalizing the CMORizer
+
Once everything works as expected, there are a couple of things that we
can still do.
+
+Add download instructions. The header of the
+CMORizer contains information about where to obtain the data, when it
+was accessed the last time, which ESMValTool “tier” it is associated
+with, and more detailed information about the necessary downloading and
+processing steps.
+
+
+
+
+
+
Fill out the header for the “FLUXCOM” dataset
+
+
Fill out the header of the new CMORizer. The different parts that
+need to be present in the header are the following:
+
Caption: the first line of the docstring should summarize what the
+script does.
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
The header for the “FLUXCOM” dataset could look something like
+this:
+
+
PYTHON
+
+
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+Tier
+ Tier 3: restricted dataset.
+
+Source
+ http://www.bgc-jena.mpg.de/geodb/BGI/Home
+
+Last access
+ 20190727
+
+Download and processing instructions
+ From the website, select FLUXCOM as the data choice and click download.
+ Two files will be displayed. One for Land Carbon Fluxes and one for
+ Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+ CRUNCEP data file has several data files for different variables.
+ The data for GPP generated using the
+ Artificial Neural Network Method will be in files with name:
+ GPP.ANN.CRUNCEPv6.monthly.\*.nc
+ A registration is required for downloading the data.
+ Users in the UK with a CEDA-JASMIN account may request access to the jules
+ workspace and access the data.
+ Note : This data may require rechunking of the netcdf files.
+ This constraint will not exist once iris is updated to
+ version 2.3.0 Aug 2019
+"""
+
+
+
+
+
+
+Fill the dataset information list. The file
datasets.yml
(https://github.com/ESMValGroup/ESMValTool/blob/main/esmvaltool/cmorizers/data/datasets.yml)
contains the ESMValTool “tier”, the data
+source, the last access time and download instructions for all supported
+datasets in ESMValTool. You can simply reuse the information written in
+the header of the CMORizer.
+
+
+
+
+
+
Fill out the FLUXCOM entry in
+datasets.yml
+
+
+
Fill out the FLUXCOM entry in datasets.yml. The
+different parts that need to be present in the entry are the
+following:
+
Dataset-name
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
The entry for the “FLUXCOM” dataset should look like:
+
+
YAML
+
+
FLUXCOM:
  tier: 3
  source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
  last_access: 2019-07-27
  info: |
    From the website, select FLUXCOM as the data choice and click download.
    Two files will be displayed. One for Land Carbon Fluxes and one for
    Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
    CRUNCEP data file has several data files for different variables.
    The data for GPP generated using the
    Artificial Neural Network Method will be in files with name:
    GPP.ANN.CRUNCEPv6.monthly.*.nc
    A registration is required for downloading the data.
    Users in the UK with a CEDA-JASMIN account may request access to the jules
    workspace and access the data.
    Note : This data may require rechunking of the netcdf files.
    This constraint will not exist once iris is updated to
    version 2.3.0 Aug 2019
+
+
+
+
+
+
Once the datasets.yml file is filled, you can check that
+ESMValTool can display information about the added dataset with:
+
+
BASH
+
+
esmvaltool data info FLUXCOM
+
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
$ esmvaltool data info FLUXCOM
+FLUXCOM
+
+Tier: 3
+Source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
+Automatic download: No
+
+From the website, select FLUXCOM as the data choice and click download.
+Two files will be displayed. One for Land Carbon Fluxes and one for
+Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+CRUNCEP data file has several data files for different variables.
+The data for GPP generated using the
+Artificial Neural Network Method will be in files with name:
+GPP.ANN.CRUNCEPv6.monthly.*.nc
+A registration is required for downloading the data.
+Users in the UK with a CEDA-JASMIN account may request access to the jules
+workspace and access the data.
+Note : This data may require rechunking of the netcdf files.
+This constraint will not exist once iris is updated to
+version 2.3.0 Aug 2019
+
+
Note that Automatic download: No means that no automatic
+downloading script is available in ESMValTool for this dataset. The
+implementation of such a script is beyond the scope of this tutorial. To
+find out which datasets come with an automatic download script, you can
+run: esmvaltool data list to list all datasets supported in
+ESMValTool. More information about the usage of automatic downloading
+scripts can be found in the User
+Guide.
+
+Complete the metadata in the config file. We have
+left a few fields empty in the configuration file, such as ‘source’. By
+filling out these fields we can make sure the relevant metadata is
+passed on as attributes in the CMORized data. To make this work, add the
+following line to the CMORizer script:
+
+
PYTHON
+
+
# 2d. Update the cubes metadata with all info from the config file
+utils.set_global_atts(cube, attributes)
+
+
Add a reference. Make sure that there is a
reference file available for the dataset; see the instructions at
https://docs.esmvaltool.org/en/latest/community/diagnostic.html#adding-references.
+
Make a pull request. Since you have gone through
+all the trouble to reformat the dataset so that the ESMValTool can work
+with it, it would be great if you could provide the CMORizer, and
+ultimately with that the dataset, to the rest of the community. For more
+information, see the episode on Development and
+contribution.
+
Add documentation. Make sure that you have added
+the info of your dataset to the User Guide so that people know it is
+available for the ESMValTool Obtaining
+input data.
+
+
Some final comments
+
Congratulations! You have just added support for a new dataset to
ESMValTool! Adding a new CMORizer is already an advanced task
when working with ESMValTool. You need to have a basic understanding
of how ESMValTool works and what its internal structure looks like.
+In addition, you need to have a basic understanding of NetCDF files and
+a programming language. In our example we used python for the CMORizing
+script since we advocate for focusing the code development on only a few
+different programming languages. This helps to maintain the code and to
+ensure the compatibility of the code with possible fundamental changes
+to the structure of the ESMValTool and ESMValCore.
+
More information about adding observations to the ESMValTool can be
+found in the documentation.
+
+
+
+
+
+
Key Points
+
+
CMORizers are dataset-specific scripts that can be run once to
+generate CMOR-compliant data.
+
ESMValTool comes with a set of CMORizers readily available, but you
+can also add your own.
+
+
ESMValTool Tutorial: Debugging
Every user encounters errors. Once you know why you get certain types
+of errors, they become much easier to fix. The good news is that
+ESMValTool creates a record of the output messages and stores them in
+log files. They can be used for debugging or monitoring the process.
+This lesson helps you understand the different types of errors and when
+you are likely to encounter them.
+
Log files
+
Each time we run ESMValTool, it produces a new output directory containing a run folder that is generated automatically by ESMValTool. To examine this, we run the recipe_python.yml recipe introduced in the lesson Running your first recipe. Check the lesson Configuration for how to set the paths.
+
In a new terminal, run the recipe:
+
+
BASH
+
+
cd esmvaltool_tutorial
+esmvaltool run examples/recipe_python.yml
+
+
+
ERROR
+
+
esmvaltool: command not found
+
+
ESMValTool encounters this error because the conda environment
+esmvaltool has not been activated. To fix the error, before
+running the recipe, activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
+
+
+
+
+
conda environment
+
+
More information about the conda environment can be found at Installation.
In the main_log_debug.txt and main_log.txt, ESMValTool writes the output messages, warnings and possible errors that may occur during preprocessing. To inspect them, we can look inside the files. For example:
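A minimal way to do this from the shell (the time-stamped directory name below is illustrative; use the run folder inside your own output directory):

BASH

less esmvaltool_output/recipe_python_YYYYMMDD_HHMMSS/run/main_log.txt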
In the log.txt, ESMValTool writes the output messages,
+warnings and possible errors that are related to the diagnostic
+script.
+
If you encounter an error and don’t know what it means, it is
+important to read the log information. Sometimes knowing where the error
+occurred is enough to fix it, even if you don’t entirely understand the
+message. However, note that you may not always be able to find the error
or fix it. In that case, the ESMValTool community can help you figure out what went wrong.
+
+
+
+
+
+
Different log files
+
+
In the run directory, there are two log files
+main_log_debug.txt and main_log.txt. What are
+their differences?
+
+
+
+
+
+
+
+
+
The main_log_debug.txt contains detailed output messages from the preprocessor, whereas main_log.txt shows the general errors and warnings that occur while running the recipe and the diagnostic scripts.
+
+
+
+
+
Let’s change some settings in the recipe to run a regional
+pre-processor. We use a text editor called nano to open the
+recipe file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
# recipe_python.yml
#
# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
# for a description of this recipe.
#
# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
# for a description of the recipe format.
---
documentation:
  description: |
    Example recipe that plots a map and timeseries of temperature.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe
    - righi_mattia

  maintainer:
    - schlund_manuel

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {dataset: bcc-csm1-1, version: v1, project: CMIP5, exp: historical, ensemble: r1i1p1}

preprocessors:
  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
  # for a description of the preprocessor functions.

  to_degrees_c:
    convert_units:
      units: degrees_C

  annual_mean_amsterdam:
    extract_location:
      location: Amsterdam
      scheme: linear
    annual_statistics:
      operator: mean
    multi_model_statistics:
      statistics:
        - mean
      span: overlap
    convert_units:
      units: degrees_C

  annual_mean_global:
    area_statistics:
      operator: mean
    annual_statistics:
      operator: mean
    convert_units:
      units: degrees_C

diagnostics:

  map:
    description: Global map of temperature in January 2000.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas:
        mip: Amon
        preprocessor: to_degrees_c
        timerange: 2000/P1M
        caption: |
          Global map of {long_name} in January 2000 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds

  timeseries:
    description: Annual mean temperature in Amsterdam and global mean since 1850.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas_amsterdam:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_amsterdam
        timerange: 1850/2000
        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
      tas_global:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_global
        timerange: 1850/2000
        caption: Annual global mean {long_name} according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot
+
+
+
+
+
Keys and values in recipe settings
+
The [ESMValTool pre-processors](https://docs.esmvaltool.org/projects/ESMValCore/en/latest/
+recipe/preprocessor.html?highlight=preprocessor) cover a broad range of
+operations on the input data, like time manipulation, area manipulation,
+land-sea masking, variable derivation, etc. Let’s add the preprocessor
+extract_region to a new section
+annual_mean_regional:
+
+
YAML
+
+
preprocessors:
  annual_mean_regional:
    annual_statistics:
      operator: mean
    extract_region:
      start_longitude: -10
      end_longitude: 40
      start_latitude: 27
      end_latitude: 70
+
+
Also, we change the projects value esmval
+to tutorial:
+
+
YAML
+
+
projects:
+- tutorial
+- c3s-magic
+
+
Then, we save the file and run the recipe:
+
+
BASH
+
+
esmvaltool run recipe_python.yml
+
+
+
ERROR
+
+
ValueError: Tag 'tutorial' does not exist in section 'projects' of
+esmvaltool/config-references.yml 2020-06-29 18:09:56,641 UTC [46055] INFO If you
+have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The values for the keys authors, maintainer, projects and references in the recipe must be known to ESMValTool: author, maintainer and project tags are defined in esmvaltool/config-references.yml (as the error message above indicates), and ESMValTool references in BibTeX format can be found in the ESMValTool/esmvaltool/references directory (https://github.com/ESMValGroup/ESMValTool/tree/master/esmvaltool/references).
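If you did want to introduce a new tag such as tutorial, it would have to be defined in that file. A rough sketch of what an entry in the projects section of config-references.yml could look like (the layout of existing entries in your installation may differ, and the description text here is purely illustrative):

YAML

projects:
  # hypothetical new tag added for this exercise
  tutorial: ESMValTool tutorial practice project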
+
+
ESMValTool can’t locate the
+data
+
You are assisting a colleague with ESMValTool. The colleague replaces CanESM2 in the entry dataset: CanESM2, project: CMIP5 with ACCESS1-3 and runs the recipe. However, ESMValTool encounters an error like:
+
+
ERROR
+
+
ERROR No input files found for variable {'short_name': 'tas', 'mip': 'Amon',
+'preprocessor':'annual_mean_amsterdam', 'variable_group': 'tas_amsterdam',
+'diagnostic':'timeseries', 'dataset': 'ACCESS1-3', 'project': 'CMIP5', 'exp':
+'historical','ensemble': 'r1i1p1', 'recipe_dataset_index': 1, 'institute':
+['CSIRO-BOM'],'product': ['output1', 'output2'], 'timerange': '1850/2000',
+'alias':'CMIP5', 'original_short_name': 'tas', 'standard_name':
+'air_temperature','long_name': 'Near-Surface Air Temperature', 'units': 'K',
+'modeling_realm':['atmos'], 'frequency': 'mon', 'start_year': 1850,
+'end_year': 2000}
+ERROR Looked for files matching
+['tas_Amon_ACCESS1-3_historical_r1i1p1*.nc'], but did not find any existing
+input directory
+ERROR Set 'log_level' to 'debug' to get more information
+
+
+
What suggestions would you give the researcher for fixing the
+error?
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
Check config-user.yml to see if the correct directory for input data is specified (rootpath)
+
Check the available data, regarding exp, mip, ensemble, start_year,
+and end_year
+
Check the variable names in the diagnostics section in the
+recipe
+
+
+
+
+
Check pre-processed data
+
The setting save_intermediary_cubes in the configuration
+file can be used to save the pre-processed data. More information about
+this setting can be found at Configuration.
+
+
+
+
+
+
save_intermediary_cubes
+
+
Note that this setting should only be used for debugging, as it
+significantly slows down the recipe and increases disk usage because a
+lot of output files need to be stored.
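For reference, a minimal sketch of the debugging-related options in the configuration file (the values shown here are intended for debugging only):

YAML

remove_preproc_dir: false
save_intermediary_cubes: true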
+
+
+
+
Check diagnostic script path
+
The result of the pre-processor is passed to the examples/diagnostic.py script, which is introduced in the recipe as:
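This is the relevant snippet from the recipe shown earlier:

YAML

scripts:
  script1:
    script: examples/diagnostic.py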
The diagnostic scripts are located in the folder
+diag_scripts in the ESMValTool installation directory
+<path_to_esmvaltool>. To find where ESMValTool is
located on your system, see Installation.
+
Let's see what happens if we change the script path, for example to:
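A sketch of the modification, pointing script1 at a path that starts with diag_scripts (this is what triggers the error shown below):

YAML

scripts:
  script1:
    script: diag_scripts/ocean/diagnostic_timeseries.py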
esmvalcore._task.DiagnosticError: Cannot execute script
+'diag_scripts/ocean/diagnostic_timeseries.py'
+(~/mambaforge/envs/esmvaltool2.6/lib/python3.10/site-packages/esmvaltool/
+diag_scripts/diag_scripts/ocean/diagnostic_timeseries.py):
+file does not exist. 2022-10-18 11:42:34,136 UTC [39323] INFO If you have a
+question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The script path should be relative to the diag_scripts directory. This means that the script
+diagnostic_timeseries.py is located in
+<path_to_esmvaltool>/diag_scripts/ocean/.
+Alternatively, the script path can be an absolute path. To examine this,
+we can download the script from the ESMValTool
+repository:
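One way to do this could be the following (the branch name and URL are assumptions; adjust them to the ESMValTool version you are using), after which the script entry in the recipe can be set to the absolute path of the downloaded file:

BASH

wget https://raw.githubusercontent.com/ESMValGroup/ESMValTool/main/esmvaltool/diag_scripts/ocean/diagnostic_timeseries.py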
Then, run the recipe again and examine the output to see
+Run was successful!
+
+
+
+
+
+
Available recipe and diagnostic scripts
+
+
ESMValTool provides a broad suite of recipes
+and diagnostic scripts for different disciplines like atmosphere,
+climate metrics, future projections, IPCC, land, ocean, ….
+
+
+
+
+
Re-running a diagnostic
+
Look at the main_log.txt file and answer the following question: how can you re-run the diagnostic script?
+
+
Solution
+
The main_log.txt file contains information on how to
+re-run the diagnostic script without re-running the pre-processors:
+
+
+
+
BASH
+
+
2020-06-29 20:36:32,844 UTC [52810] INFO To re-run this diagnostic script, run:
+
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
If you run the command in a terminal, you will be able to re-run the
+diagnostic.
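The exact command is printed in your own main_log.txt and depends on your installation; as a sketch, it has the general form of calling the diagnostic script directly with the settings file that ESMValTool generated for that task (both paths below are illustrative):

BASH

python ~/mambaforge/envs/esmvaltool/lib/python3.10/site-packages/esmvaltool/diag_scripts/examples/diagnostic.py \
  ~/esmvaltool_output/recipe_python_YYYYMMDD_HHMMSS/run/timeseries/script1/settings.yml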
+
+
+
+
+
+
+
+
+
+
Memory issues
+
+
If you run out of memory, try setting max_parallel_tasks to 1 in the configuration file. Then, check the amount of memory you need by inspecting the file run/resource_usage.txt in the output directory. Using the number there, you can increase the number of parallel tasks again to a value that is reasonable for the amount of memory available on your system.
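In the configuration file this looks like:

YAML

max_parallel_tasks: 1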
+
+
+
+
+
+
+
+
+
Key Points
+
+
There are three different kinds of log files: main_log.txt, main_log_debug.txt and log.txt.
+
+
ESMValTool Tutorial: Licenses

Instructional Material
You are free:

to Share—copy and redistribute the material in any medium or format
+
to Adapt—remix, transform, and build upon the
+material
+
for any purpose, even commercially.
+
The licensor cannot revoke these freedoms as long as you follow the
+license terms.
+
Under the following terms:
+
Attribution—You must give appropriate credit
+(mentioning that your work is derived from work that is Copyright (c)
+The Carpentries and, where practical, linking to https://carpentries.org/), provide a link to the
+license, and indicate if changes were made. You may do so in any
+reasonable manner, but not in any way that suggests the licensor
+endorses you or your use.
+
No additional restrictions—You may not apply
+legal terms or technological measures that legally restrict others from
+doing anything the license permits. With the understanding
+that:
+
Notices:
+
You do not have to comply with the license for elements of the
+material in the public domain or where your use is permitted by an
+applicable exception or limitation.
+
No warranties are given. The license may not give you all of the
+permissions necessary for your intended use. For example, other rights
+such as publicity, privacy, or moral rights may limit how you use the
+material.
+
Software
+
Except where otherwise noted, the example programs and other software
+provided by The Carpentries are made available under the OSI-approved MIT
+license.
+
Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+“Software”), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
Trademark
+
“The Carpentries”, “Software Carpentry”, “Data Carpentry”, and
+“Library Carpentry” and their respective logos are registered trademarks
+of Community Initiatives.
+
+
ESMValTool Tutorial: All in One View
How do I load and check the ESMValTool environment?
+
How do I configure ESMValTool?
+
How do I run a recipe?
+
+
+
+
+
+
+
+
Objectives
+
+
Understand the purpose of the quickstart guide
+
Load and check the ESMValTool environment
+
Configure ESMValTool
+
Run a recipe
+
+
+
+
+
+
+
+
+
+
+
+
What is the purpose of the quickstart guide?
+
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible by making the bare
+minimum number of changes.
+
+
+
+
+
+
+
+
+
+
How do I load and check the ESMValTool environment?
+
+
+
For this quickstart guide, we assume that ESMValTool has already been installed at the site where ESMValTool will be run. If this is not the case, see the Installation episode in this tutorial.

Load the ESMValTool environment by following the instructions in the documentation section ESMValTool: Pre-installed versions on HPC clusters / other servers.
+
+
Check the ESMValTool environment by accessing the help for
+ESMValTool:
+
+
BASH
+
+
esmvaltool --help
+
+
+
+
+
+
+
+
+
+
+
+
How do I configure ESMValTool?
+
+
+
+
Create the ESMValTool user configuration file (the file is
+written by default to ~/.esmvaltool/config-user.yml):
+
+
BASH
+
+
esmvaltool config get_config_user
+
+
+
Edit the ESMValTool user configuration file using your favourite
+text editor to uncomment the lines relating to the site where ESMValTool
+will be run.
+
For more details about the ESMValTool user configuration file, see the Configuration episode in this tutorial.
+
+
+
+
+
+
+
+
+
+
How do I run a recipe?
+
+
+
+
Run the example Python recipe:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
+
+
Wait for the recipe to complete. If the recipe completes
+successfully, the last line printed to screen at the end of the log will
+look something like:
+
+
OUTPUT
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Run was successful
+
+
+
+
View the output of the recipe by opening the HTML file produced
+by ESMValTool (the location of this file is printed to screen near the
+end of the log):
+
+
OUTPUT
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Wrote recipe output to:
+file:///$HOME/esmvaltool_output/recipe_python_<date>_<time>/index.html
+
+
+
For more details about running recipes, see the Running your first recipe episode in this tutorial.
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible without having to go
+through the whole tutorial
+
Use the module load command to load the ESMValTool environment (see the Installation episode for more details) and use esmvaltool --help to check the ESMValTool environment
+
Use esmvaltool config get_config_user to create the
+ESMValTool user configuration file
What are the prerequisites for installing ESMValTool?
+
How do I confirm that the installation was successful?
+
+
+
+
+
+
+
+
Objectives
+
+
Install ESMValTool
+
Demonstrate that the installation was successful
+
+
+
+
+
+
+
Overview
+
+
+
The instructions help with the installation of ESMValTool on
+operating systems like Linux/MacOSX/Windows. We use the Mamba
+package manager to install the ESMValTool. Other installation methods
+are also available; they can be found in the documentation.
+We will first install Mamba, and then ESMValTool. We end this chapter by
+testing that the installation was successful.
+
Before we begin, here are all the possible ways in which you can use
+ESMValTool depending on your level of expertise or involvement with
+ESMValTool and associated software such as GitHub and Mamba.
+
+
If you have access to a server where ESMValTool is already installed as a module, e.g. the CEDA JASMIN server (https://help.jasmin.ac.uk/article/4955-community-software-esmvaltool), you can simply load the module with the following command:
+
+
+
BASH
+
+
module load esmvaltool
+
+
After loading esmvaltool, we can start using ESMValTool right away. Please see the next lesson.

2. If you would like to install ESMValTool as a mamba package, then this lesson will tell you how!

3. If you would like to start experimenting with existing diagnostics or contributing to ESMValTool, please see the instructions for source installation in the lesson Development and contribution and in the documentation.
+
+
+
+
+
+
Install ESMValTool on Windows
+
+
ESMValTool does not directly support Windows, but successful usage
+has been reported through the Windows Subsystem
for Linux (WSL), available in Windows 10. To install the WSL please
+follow the instructions on the
+Windows Documentation page. After installing the WSL, installation
+can be done using the same instructions for Linux/MacOSX.
+
+
+
+
Install ESMValTool on Linux/MacOSX
+
+
+
+
Install Mamba
+
+
ESMValTool is distributed using Mamba. To
+install mamba on Linux or MacOSX, follow the
+instructions below:
+
+
Please download the installation file for the latest Mamba
+version here.
+
Next, run the installer from the place where you downloaded
+it:
+
+
On Linux:
+
+
BASH
+
+
bash Mambaforge-Linux-x86_64.sh
+
+
On MacOSX:
+
+
BASH
+
+
bash Mambaforge-MacOSX-x86_64.sh
+
+
+
Follow the instructions in the installer. The defaults should
+normally suffice.
+
You will need to restart your terminal for the changes to have
+effect.
+
We recommend updating mamba before the esmvaltool installation.
+To do so, run:
+
+
+
BASH
+
+
mamba update --name base mamba
+
+
+
Verify you have a working mamba installation by:
+
+
+
BASH
+
+
which mamba
+
+
This should show the path to your mamba executable,
+e.g. ~/mambaforge/bin/mamba.
+
For more information about installing mamba, see the mamba installation documentation: https://docs.esmvaltool.org/en/latest/quickstart/installation.html#mamba-installation.
+
+
+
Install the ESMValTool package
+
+
The ESMValTool package contains diagnostic scripts in four
+languages: R, Python, Julia and NCL. This introduces a lot of
+dependencies, and therefore the installation can take quite long. It is,
+however, possible to install ‘subpackages’ for each of the languages.
+The following (sub)packages are available:
+
+
esmvaltool-python
+
esmvaltool-ncl
+
esmvaltool-r
+
esmvaltool → the complete package, i.e. the combination of the above.
+
+
For the tutorial, we will install the complete package. Thus, to
+install the ESMValTool package, run
+
+
BASH
+
+
mamba create --name esmvaltool esmvaltool
+
+
On MacOSX, ESMValTool functionalities in Julia, NCL, and R are not supported. To install a Mamba environment on MacOSX, please refer to the specific information in the installation documentation.
Some ESMValTool diagnostics are written in the Julia programming
+language. If you want a full installation of ESMValTool including Julia
+diagnostics, you need to make sure Julia is installed before installing
+ESMValTool.
+
In this tutorial, we will not use Julia, but for reference, we have
+listed the steps to install Julia below. Complete instructions for
+installing Julia can be found on the Julia
+installation page.
+
+
+
+
+
+
First, open a bash terminal and activate the newly created
+esmvaltool environment.
+
+
BASH
+
+
conda activate esmvaltool
+
+
Next, to install Julia via mamba, you can use the
+following command:
+
+
BASH
+
+
mamba install julia
+
+
To check that the Julia executable can be found, run
+
+
BASH
+
+
which julia
+
+
to display the path to the Julia executable, it should be
+
+
OUTPUT
+
+
~/mambaforge/envs/esmvaltool/bin/julia
+
+
To test that Julia is installed correctly, run
+
+
BASH
+
+
julia
+
+
to start the interactive Julia interpreter. Press Ctrl+D
+to exit.
+
+
+
+
+
+
+
Test that the installation was successful
+
+
To test that the installation was successful, run
+
+
BASH
+
+
conda activate esmvaltool
+
+
to activate the conda environment called esmvaltool. In
+the shell prompt the active conda environment should have been changed
+from (base) to (esmvaltool).
+
Next, run
+
+
BASH
+
+
esmvaltool --help
+
+
to display the command line help.
+
+
+
+
+
+
Version of ESMValTool
+
+
Can you figure out which version of ESMValTool has been
+installed?
+
+
+
+
+
+
+
+
+
The esmvaltool --help command lists version
+as a command to get the version
+
When you run
+
+
BASH
+
+
esmvaltool version
+
+
The version of ESMValTool installed should be displayed on the screen
+as:
+
+
OUTPUT
+
+
ESMValCore: 2.11.0
+ESMValTool: 2.11.0
+
+
Note that on HPC servers such as JASMIN, sometimes a more recent
+development version may be displayed for ESMValTool, for e.g.
+ESMValTool: 2.11.0.dev71+g2c60b4d97
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
All the required packages can be installed using mamba.
+
You can find more information about installation in the documentation.
What is the user configuration file and how should I use it?
+
+
+
+
+
+
+
+
Objectives
+
+
Understand the contents of the user-config.yml file
+
Prepare a personalized user-config.yml file
+
Configure ESMValTool to use some settings
+
+
+
+
+
+
+
The configuration file
+
+
+
For the purposes of this tutorial, we will create a directory in our
+home directory called esmvaltool_tutorial and use that as
+our working directory. The following steps should do that:
+
+
BASH
+
+
mkdir esmvaltool_tutorial
+cd esmvaltool_tutorial
+
+
The config-user.yml configuration file contains all the
+global level information needed by ESMValTool to run. This is a YAML file.
+
You can get the default configuration file by running:
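For example (the --path argument is optional; the target directory shown is illustrative):

BASH

esmvaltool config get_config_user --path=~/esmvaltool_tutorial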
The default configuration file will be copied to the directory specified with the --path option; for instance, you can provide the path to your working directory as the target directory. If this option is not used, the file will be saved to the default location: ~/.esmvaltool/config-user.yml, where ~ is the path to your home directory. Note that files and directories starting with a period are “hidden”; to see the .esmvaltool directory in the terminal, use ls -la ~. If a configuration file with that name already exists in the default location, the get_config_user command will not overwrite it; you will have to move the existing file first if you want an updated copy of the user configuration file.
+
We run a text editor called nano to have a look inside
+the configuration file and then modify it if needed:
+
+
BASH
+
+
nano ~/.esmvaltool/config-user.yml
+
+
Any other editor can be used, e.g. vim.
+
This file contains the information for:
+
+
Output settings
+
Destination directory
+
Auxiliary data directory
+
Number of tasks that can be run in parallel
+
Rootpath to input data
+
Directory structure for the data from different projects
+
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
Output settings
+
+
+
The configuration file starts with output settings that inform
+ESMValTool about your preference for output. You can turn on or off the
+setting by true or false values. Most of these
+settings are fairly self-explanatory.
+
+
+
+
+
+
Saving preprocessed data
+
+
Later in this tutorial, we will want to look at the contents of the
+preproc folder. This folder contains preprocessed data and
+is removed by default when ESMValTool is run. In the configuration file,
+which settings can be modified to prevent this from happening?
+
+
+
+
+
+
+
+
+
If the option remove_preproc_dir is set to
+false, then the preproc/ directory contains
+all the pre-processed data and the metadata interface files. If the
+option save_intermediary_cubes is set to true
+then data will also be saved after each preprocessor step in the folder
+preproc. Note that saving all intermediate results to file
+will result in a considerable slowdown, and can quickly fill your
+disk.
+
+
+
+
+
Destination directory
+
+
+
The destination directory is the rootpath where ESMValTool will store
+its output folders containing e.g. figures, data, logs, etc. With every
+run, ESMValTool automatically generates a new output folder determined
+by recipe name, and date and time using the format: YYYYMMDD_HHMMSS.
+
+
+
+
+
+
Set the destination directory
+
+
Let’s name our destination directory esmvaltool_output
+in the working directory. ESMValTool should write the output to this
+path, so make sure you have the disk space to write output to this
+directory. How do we set this in the config-user.yml?
+
+
+
+
+
+
+
+
+
We use output_dir entry in the
+config-user.yml file as:
+
+
YAML
+
+
output_dir: ./esmvaltool_output
+
+
If the esmvaltool_output does not exist, ESMValTool will
+generate it for you.
+
+
+
+
+
Rootpath to input data
+
+
+
ESMValTool uses several categories (in ESMValTool, this is referred
+to as projects) for input data based on their source. The current
+categories in the configuration file are mentioned below. For example,
+CMIP is used for a dataset from the Climate Model Intercomparison
+Project whereas OBS may be used for an observational dataset. More
information about the projects used in ESMValTool is available in the documentation (https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html). When using ESMValTool on your own machine,
+you can create a directory to download climate model data or observation
+data sets and let the tool use data from there. It is also possible to
+ask ESMValTool to download climate model data as needed. This can be
+done by specifying a download directory and by setting the option to
+download data as shown below.
+
+
YAML
+
+
# Directory for storing downloaded climate data
+download_dir: ~/climate_data
+search_esgf: always
+
+
If you are working offline or do not want to download the data then
+set the option above to never. If you want to download data
+only when the necessary files are missing at the usual location, you can
+set the option to when_missing.
+
The rootpath specifies the directories where ESMValTool
+will look for input data. For each category, you can define either one
+path or several paths as a list. For example:
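A sketch of such an entry (the paths are illustrative):

YAML

rootpath:
  CMIP5: ~/climate_data
  CMIP6: [~/climate_data, ~/extra_data/CMIP6]
  OBS: ~/observations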
Entries like these are typically available in the default configuration file you downloaded, so simply removing the machine-specific lines should be sufficient to access input data.
+
+
+
+
+
+
Set the correct rootpath
+
+
In this tutorial, we will work with data from CMIP5 and CMIP6. How can we
+modify the rootpath to make sure the data path is set
+correctly for both CMIP5 and CMIP6? Note: to get the data, check the
+instructions in Setup.
+
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to add the root
+path of the folder where the data is available to the
+config-user.yml file as:
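A sketch (replace the path with the folder that actually holds your CMIP5 and CMIP6 data):

YAML

rootpath:
  CMIP5: ~/esmvaltool_tutorial/data
  CMIP6: ~/esmvaltool_tutorial/data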
Are you working on your local machine and have downloaded data using
+ESMValTool? You need to add the root path of the folder where the data
+has been downloaded to as specified in the
+download_dir.
Are you working on a computer cluster like JASMIN or DKRZ? Site-specific paths to the data for JASMIN/DKRZ/ETH/IPSL are already listed at the end of the config-user.yml file. You need to uncomment the related lines. For example, on JASMIN:
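An indicative sketch of the uncommented block (the exact paths are provided in the site-specific section of the default configuration file and may differ from what is shown here):

YAML

rootpath:
  CMIP6: /badc/cmip6/data/CMIP6
  CMIP5: /badc/cmip5/data/cmip5/output1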
Directory structure for the data from different projects
+
+
+
Input data can be from various models, observations and reanalysis
+data that adhere to the CF/CMOR
+standard. The drs setting describes the file
+structure.
+
The drs setting describes the file structure for several
+projects (e.g. CMIP6, CMIP5, obs4mips, OBS6, OBS) on several key
+machines (e.g. BADC, CP4CDS, DKRZ, ETHZ, SMHI, BSC). For more
information about drs, you can visit the ESMValTool documentation on Data Reference Syntax (DRS): https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#cmor-drs.
+
+
+
+
+
+
Set the correct drs
+
+
In this lesson, we will work with data from CMIP5 and CMIP6. How can we
+set the correct drs?
+
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to set the
+drs of the data in the config-user.yml file
+as:
+
+
+
YAML
+
+
drs:
  CMIP5: default
  CMIP6: default
+
+
+
Are you asking ESMValTool to download the data for use with your
+diagnostics? You need to set the drs of the data in the
+config-user.yml file as:
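A sketch, assuming the data was downloaded by ESMValTool into the ESGF-like directory structure under download_dir:

YAML

drs:
  CMIP5: ESGF
  CMIP6: ESGF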
Are you working on a computer cluster like Jasmin or DKRZ?
+Site-specific drs of the data are already listed at the end
+of the config-user.yml file. You need to uncomment the
+related lines. For example, on Jasmin:
+
+
+
YAML
+
+
# Site-specific entries: JASMIN
# Uncomment the lines below to locate data on JASMIN
drs:
  CMIP6: BADC
  CMIP5: BADC
  OBS: default
  OBS6: default
  obs4mips: default
  ana4mips: default
+
+
+
+
+
+
+
+
+
+
+
Explain the default drs (if working on local machine)
+
+
+
In the previous exercise, we set the drs of CMIP5 data
+to default. Can you explain why?
+
Have a look at the directory structure of the OBS data.
+There is a folder called Tier1. What does it mean?
+
+
+
+
+
+
+
+
+
+
+
drs: default is one way to retrieve data from a ROOT
+directory that has no DRS-like structure. default indicates
+that all the files are in a folder without any structure.
+
Observational data are organized in Tiers depending on their
+level of public availability. Therefore the default directory must be
+structured accordingly with sub-directories TierX
e.g. Tier1, Tier2 or Tier3, even when drs: default. More details can be found in the documentation: https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#observational-data.
+
+
+
+
+
+
Other settings
+
+
+
+
+
+
+
+
Auxiliary data directory
+
+
The auxiliary_data_dir setting is the path where any required additional auxiliary data files are stored. This location allows us to tell the diagnostic script where to find the files if they cannot be downloaded at runtime. This option should not be used for model or observational datasets, but for extra data files used in plotting, such as shape files with coastline descriptions, or any other additional data you want to feed to your recipe.
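For example, a minimal sketch of this entry in config-user.yml (the path is illustrative):

YAML

auxiliary_data_dir: ~/auxiliary_data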
Number of tasks that can be run in parallel

This option enables you to perform parallel processing. You can choose the number of tasks to run in parallel as 1/2/3/4/… or you can set it to null, which tells ESMValTool to use the maximum number of available CPUs. For the purpose of the tutorial, please set ESMValTool to use only 1 CPU:
+
+
YAML
+
+
max_parallel_tasks: 1
+
+
In general, if you run out of memory, try setting
+max_parallel_tasks to 1. Then, check the amount of memory
+you need for that by inspecting the file
+run/resource_usage.txt in the output directory. Using the
+number there you can increase the number of parallel tasks again to a
+reasonable number for the amount of memory available in your system.
+
+
+
+
+
+
+
+
+
Make your own configuration file
+
+
It is possible to have several configuration files with different
+purposes, for example: config-user_formalised_runs.yml,
+config-user_debugging.yml. In this case, you have to pass the path of
+your own configuration file as a command-line option when running the
+ESMValTool. We will learn how to do this in the next lesson.
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
The config-user.yml tells ESMValTool where to find
+input data.
This episode describes how ESMValTool recipes work, how to run a
+recipe and how to explore the recipe output. By the end of this episode,
+you should be able to run your first recipe, look at the recipe output,
+and make small modifications.
+
Running an existing recipe
+
+
+
The recipe format has briefly been introduced in the Introduction episode. To see all the recipes that are shipped with ESMValTool, type
+
+
BASH
+
+
esmvaltool recipes list
+
+
We will start by running examples/recipe_python.yml, which is described at https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html:

BASH

esmvaltool run examples/recipe_python.yml

or, if you have the user configuration file in your current directory, then:

BASH

esmvaltool run --config_file ./config-user.yml examples/recipe_python.yml
+
If everything is okay, you should see that ESMValTool is printing a lot of output to the command line. The final message should be “Run was successful”. The exact output varies depending on your machine.
+
+
+
+
+
+
Pro tip: ESMValTool search paths
+
+
You might wonder how ESMValTool was able find the recipe file, even
+though it’s not in your working directory. All the recipe paths printed
+from esmvaltool recipes list are relative to ESMValTool’s
+installation location. This is where ESMValTool will look if it cannot
+find the file by following the path from your working directory.
+
+
+
+
Investigating the log messages
+
+
+
Let’s dissect what’s happening here.
+
+
+
+
+
+
Output files and directories
+
+
After the banner and general information, the output starts with some
+important locations.
+
+
Did ESMValTool use the right config file?
+
What is the path to the example recipe?
+
What is the main output folder generated by ESMValTool?
+
Can you guess what the different output directories are for?
+
ESMValTool creates two log files. What is the difference?
+
+
+
+
+
+
+
+
+
+
+
The config file should be the one we edited in the previous episode,
+something like
+/home/<username>/.esmvaltool/config-user.yml or
+~/esmvaltool_tutorial/config-user.yml.
+
ESMValTool found the recipe in its installation directory, something
+like
+/home/users/username/mambaforge/envs/esmvaltool/bin/esmvaltool/recipes/examples/
+or if you are using a pre-installed module on a server, something like
+/apps/jasmin/community/esmvaltool/ESMValTool_<version> /esmvaltool/recipes/examples/recipe_python.yml,
+where <version> is the latest release.
+
ESMValTool creates a time-stamped output directory for every run. In
+this case, it should be something like
+recipe_python_YYYYMMDD_HHMMSS. This folder is made inside
+the output directory specified in the previous episode:
+~/esmvaltool_tutorial/esmvaltool_output.
+
There should be four output folders:
+
+
+
+plots/: this is where output figures are stored.
+
+preproc/: this is where pre-processed data are
+stored.
+
+run/: this is where esmvaltool stores general
+information about the run, such as log messages and a copy of the recipe
+file.
+
+work/: this is where output files (not figures) are
+stored.
+
+
+
The log files are:
+
+
+
+main_log.txt is a copy of the command-line output
+
+main_log_debug.txt contains more detailed information
+that may be useful for debugging.
+
+
+
+
+
+
+
+
+
+
+
Debugging: No ‘preproc’ directory?
+
+
If you’re missing the preproc directory, then your
+config-user.yml file has the value
+remove_preproc_dir set to true (this is used
+to save disk space). Please set this value to false and run
+the recipe again.
+
+
+
+
After the output locations, there are two main sections that can be
+distinguished in the log messages:
+
+
Creating tasks
+
Executing tasks
+
+
+
+
+
+
+
Analyse the tasks
+
+
List all the tasks that ESMValTool is executing for this recipe. Can
+you guess what this recipe does?
+
+
+
+
+
+
+
+
+
Just after all the ‘creating tasks’ and before ‘executing tasks’, we
+find the following line in the output:
+
[134535] INFO These tasks will be executed: map/tas, timeseries/tas_global,
+timeseries/script1, map/script1, timeseries/tas_amsterdam
+
So there are three tasks related to timeseries: global temperature,
+Amsterdam temperature, and a script (tas: near-surface air
+temperature). And then there are two tasks related to a map: something
+with temperature, and again a script.
+
+
+
+
+
Examining the recipe file
+
+
+
To get more insight into what is happening, we will have a look at
+the recipe file itself. Use the following command to copy the recipe to
+your working directory
+
+
BASH
+
+
esmvaltool recipes get examples/recipe_python.yml
+
+
Now you should see the recipe file in your working directory (type
+ls to verify). Use the nano editor to open
+this file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
For reference, you can also view the recipe by unfolding the box
+below.
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
# recipe_python.yml
#
# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
# for a description of this recipe.
#
# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
# for a description of the recipe format.
---
documentation:
  description: |
    Example recipe that plots a map and timeseries of temperature.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe
    - righi_mattia

  maintainer:
    - schlund_manuel

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {dataset: bcc-csm1-1, project: CMIP5, exp: historical, ensemble: r1i1p1}

preprocessors:
  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
  # for a description of the preprocessor functions.

  to_degrees_c:
    convert_units:
      units: degrees_C

  annual_mean_amsterdam:
    extract_location:
      location: Amsterdam
      scheme: linear
    annual_statistics:
      operator: mean
    multi_model_statistics:
      statistics:
        - mean
      span: overlap
    convert_units:
      units: degrees_C

  annual_mean_global:
    area_statistics:
      operator: mean
    annual_statistics:
      operator: mean
    convert_units:
      units: degrees_C

diagnostics:

  map:
    description: Global map of temperature in January 2000.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas:
        mip: Amon
        preprocessor: to_degrees_c
        timerange: 2000/P1M
        caption: |
          Global map of {long_name} in January 2000 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds

  timeseries:
    description: Annual mean temperature in Amsterdam and global mean since 1850.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas_amsterdam:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_amsterdam
        timerange: 1850/2000
        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
      tas_global:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_global
        timerange: 1850/2000
        caption: Annual global mean {long_name} according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot
+
+
+
+
+
+
Do you recognize the basic recipe structure that was introduced in
+episode 1?
+
+
+Documentation with relevant (citation)
+information
+
+Datasets that should be analysed
+
+Preprocessors groups of common preprocessing
+steps
+
+Diagnostics scripts performing more specific
+evaluation steps
+
+
+
+
+
+
+
Analyse the recipe
+
+
Try to answer the following questions:
+
+
Who wrote this recipe?
+
Who should be approached if there is a problem with this
+recipe?
+
How many datasets are analyzed?
+
What does the preprocessor called annual_mean_global
+do?
+
Which script is applied for the diagnostic called
+map?
+
Can you link specific lines in the recipe to the tasks that we saw
+before?
+
How is the location of the city specified?
+
How is the temporal range of the data specified?
+
+
+
+
+
+
+
+
+
+
+
The example recipe is written by Bouwe Andela and Mattia
+Righi.
+
Manuel Schlund is listed as the maintainer of this
+recipe.
+
Two datasets are analysed:
+
+
+
CMIP6 data from the model BCC-ESM1
+
CMIP5 data from the model bcc-csm1-1
+
+
+
The preprocessor annual_mean_global computes an area
+mean as well as annual means
+
The diagnostic called map executes a script referred
+to as script1. This is a python script named
+examples/diagnostic.py
+
There are two diagnostics: map and
+timeseries. Under the diagnostic map we find
+two tasks:
+
+
+
a preprocessor task called tas, applying the
+preprocessor called to_degrees_c to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(map/tas).
+
+
Under the diagnostic timeseries we find three tasks:
+
+
a preprocessor task called tas_amsterdam, applying the
+preprocessor called annual_mean_amsterdam to the variable
+tas.
+
a preprocessor task called tas_global, applying the
+preprocessor called annual_mean_global to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(timeseries/tas_global and
+timeseries/tas_amsterdam).
+
+
+
The extract_location preprocessor is used to get
+data for a specific location here. ESMValTool interpolates to the
+location based on the chosen scheme. Can you tell the scheme used here?
For more ways to extract areas, see the Area operations page of the preprocessor documentation.
+
The timerange tag is used to extract data from a
+specific time period here. The start time is 01/01/2000 and
+the span of time to calculate means is 1 Month given by
P1M. For more options on how to specify time ranges, see the timerange documentation.
+
+
+
+
+
+
+
+
+
+
+
Pro tip: short names and variable groups
+
+
The preprocessor tasks in ESMValTool are called ‘variable groups’.
+For the diagnostic timeseries, we have two variable groups:
+tas_amsterdam and tas_global. Both of them
+operate on the variable tas (as indicated by the
+short_name), but they apply different preprocessors. For
+the diagnostic map the variable group itself is named
+tas, and you’ll notice that we do not explicitly provide
+the short_name. This is a shorthand built into
+ESMValTool.
+
+
+
+
+
+
+
+
+
Output files
+
+
Have another look at the output directory created by the ESMValTool
+run.
+
Which files/folders are created by each task?
+
+
+
+
+
+
+
+
+
+
+map/tas: creates /preproc/map/tas,
+which contains preprocessed data for each of the input datasets, a file
+called metadata.yml describing the contents of these
+datasets and provenance information in the form of .xml
+files.
+
+timeseries/tas_global: creates
+/preproc/timeseries/tas_global, which contains preprocessed
+data for each of the input datasets, a metadata.yml file
+and provenance information in the form of .xml files.
+
+timeseries/tas_amsterdam: creates
+/preproc/timeseries/tas_amsterdam, which contains
+preprocessed data for each of the input datasets, plus a combined
+MultiModelMean, a metadata.yml file and
+provenance files.
+
+map/script1: creates /run/map/script1
+with general information and a log of the diagnostic script run. It also
+creates /plots/map/script1/ and
+/work/map/script1, which contain output figures and output
+datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+timeseries/script1: creates
+/run/timeseries/script1 with general information and a log
+of the diagnostic script run. It also creates
+/plots/timeseries/script1 and
+/work/timeseries/script1, which contain output figures and
+output datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+
+
+
+
+
+
+
+
+
+
Pro tip: diagnostic logs
+
+
When you run ESMValTool, any log messages from the diagnostic script
+are not printed on the terminal. But they are written to the
+log.txt files in the folder
+/run/<diag_name>/log.txt.
+
ESMValTool does print a command that can be used to re-run a
+diagnostic script. When you use this the output will be printed
+to the command line.
+
+
+
+
Modifying the example recipe
+
+
+
Let’s make a small modification to the example recipe. Notice that
+now that you have copied and edited the recipe, you can use
+
esmvaltool run recipe_python.yml
+
to refer to your local file rather than the default version shipped
+with ESMValTool.
+
+
+
+
+
+
Change your location
+
+
Modify and run the recipe to analyse the temperature for your own
+location.
+
+
+
+
+
+
+
+
+
In principle, you only have to modify the location in the
+preprocessor called annual_mean_amsterdam. However, it is
+good practice to also replace all instances of amsterdam
+with the correct name of your location. Otherwise the log messages and
+output will be confusing. You are free to modify the names of
+preprocessors or diagnostics.
+
In the diff file below you will see the changes we have
+made to the file. The top 2 lines are the filenames and the lines like
+@@ -39,9 +39,9 @@ represent the line numbers in the
+original and modified file, respectively. For more info on this format,
+see here.
+
+
DIFF
+
+
--- recipe_python.yml
++++ recipe_python_london.yml
+@@ -39,9 +39,9 @@
+ convert_units:
+ units: degrees_C
+
+- annual_mean_amsterdam:
++ annual_mean_london:
+ extract_location:
+- location: Amsterdam
++ location: London
+ scheme: linear
+ annual_statistics:
+ operator: mean
+@@ -83,7 +83,7 @@
+ cmap: Reds
+
+ timeseries:
+- description: Annual mean temperature in Amsterdam and global mean since 1850.
++ description: Annual mean temperature in London and global mean since 1850.
+ themes:
+ - phys
+ realms:
+@@ -92,9 +92,9 @@
+ tas_amsterdam:
+ short_name: tas
+ mip: Amon
+- preprocessor: annual_mean_amsterdam
++ preprocessor: annual_mean_london
+ timerange: 1850/2000
+- caption: Annual mean {long_name} in Amsterdam according to {dataset}.
++ caption: Annual mean {long_name} in London according to {dataset}.
+ tas_global:
+ short_name: tas
+ mip: Amon
+
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
ESMValTool recipes work ‘out of the box’ (if input data is
+available)
+
There are strong links between the recipe, log file, and output
+folders
+
Recipes can easily be modified to re-use existing code for your own
+use case
There are lots of resources available to assist you in using
+ESMValTool.
+
The ESMValTool Discussions
+page is a good place to find information on general issues, or check
+if your question has already been addressed. If you have a GitHub
+account, you can also post your questions there.
+
If you encounter difficulties, a great starting point is to visit the issues page to check whether your issue has already been reported. If it has been reported before, suggestions provided by developers can help you to solve the issue you encountered. Note that you will need a GitHub account for this.
+
Additionally, there is an ESMValTool email list. Please see information
+on how to subscribe to the user mailing list.
+
+
+
What if I find a bug?
+
+
If you find a bug, please report it to the ESMValTool team. This will
+help us fix issues, ensuring not only your uninterrupted workflow but
+also contributing to the overall stability of ESMValTool for all
+users.
+
To report a bug, please create a new issue using the issue
+page.
+
In your bug report, please describe the problem as clearly and as
+completely as possible. You may need to include a recipe or the output
+log as well.
Can I use different preprocessors for different variables?
+
Can I use different datasets for different variables?
+
How can I combine different preprocessor functions?
+
Can I run the same recipe for multiple ensemble members?
+
+
+
+
+
+
+
+
Objectives
+
+
Create a recipe with multiple preprocessors
+
Use different preprocessors for different variables
+
Run a recipe with variables from different datasets
+
+
+
+
+
+
+
Introduction
+
+
+
One of the key strengths of ESMValTool is in making complex analyses
+reusable and reproducible. But that doesn’t mean everything in
+ESMValTool needs to be complex. Sometimes, the biggest challenge is in
+keeping things simple. You probably know the ‘warming stripes’
visualization by Professor Ed Hawkins. On the site https://showyourstripes.info you can find the same visualization for many regions in the world.
+
Shared by Ed Hawkins under a Creative Commons 4.0 Attribution
+International licence. Source: https://showyourstripes.info
+
In this episode, we will reproduce and extend this functionality with
+ESMValTool. We have prepared a small Python script that takes a NetCDF
+file with timeseries data, and visualizes it in the form of our desired
+warming stripes figure.
+
The diagnostic script that we will use is called
+warming_stripes.py and can be downloaded here.
+
Download the file and store it in your working directory. If you
+want, you may also have a look at the contents, but it is not necessary
+to do so for this lesson.
+
We will write an ESMValTool recipe that takes some data, performs the
+necessary preprocessing, and then runs this Python script.
+
+
+
+
+
+
Drawing up a plan
+
+
Previously, we saw that running ESMValTool executes a number of
+tasks. What tasks do you think we will need to execute and what should
+each of these tasks do to generate the warming stripes?
+
+
+
+
+
+
+
+
+
In this episode, we will need to do the following two tasks:
+
+
A preprocessing task that converts the gridded temperature data to a
+timeseries of global temperature anomalies
+
A diagnostic task that calls our Python script, taking our
+preprocessed timeseries data as input.
+
+
+
+
+
+
Building a recipe from scratch
+
+
+
The easiest way to make a new recipe is to start from an existing
+one, and modify it until it does exactly what you need. However, in this
+episode we will start from scratch. This forces us to think about all
+the steps involved in processing the data. We will also deal with
+commonly occurring errors through the development of the recipe.
+
Remember the basic structure of a recipe, and notice that each
+component is extensively described in the “Overview” section of the
+documentation.
This is the first place to look for help if you get stuck.
+
Open a new file called recipe_warming_stripes.yml:
+
+
BASH
+
+
nano recipe_warming_stripes.yml
+
+
Let’s add the standard header comments (these do not do anything),
+and a first description.
+
+
YAML
+
+
+# ESMValTool
+# recipe_warming_stripes.yml
+---
+documentation:
+  description: Reproducing Ed Hawkins' warming stripes visualization
+  title: Reproducing Ed Hawkins' warming stripes visualization.
+
+
Notice that YAML always requires two spaces of
+indentation between the different levels. Pressing ctrl+o
+will save the file. Verify the filename at the bottom and press enter.
+Then use ctrl+x to exit the editor.
+
We will try to run the recipe after every modification we make, to
+see if it (still) works!
+
+
BASH
+
+
esmvaltool run recipe_warming_stripes.yml
+
+
In this case, it gives an error. Below you see the last few lines of
+the error message.
+
+
ERROR
+
+
...
+yamale.yamale_error.YamaleError:
+Error validating data '/home/users/username/esmvaltool_tutorial/recipe_warming_stripes.yml'
+with schema
+'/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/
+site-packages/esmvalcore/_recipe/recipe_schema.yml'
+ documentation.authors: Required field missing
+2024-05-27 13:21:23,805 UTC [41924] INFO
+If you have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions
+If you suspect this is a bug, please open an issue on
+https://github.com/ESMValGroup/ESMValTool/issues
+To make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
+
We can use the log message above to understand why ESMValTool
+failed. Here, this is because we missed a required field with author
+names. The text
+documentation.authors: Required field missing tells us
+that. We see that ESMValTool always tries to validate the recipe at an
+early stage. Note also the suggestion to open a GitHub issue if you need
+help debugging the error message. This is something most users do when
+they cannot understand the error or are not able to fix it on their
+own.
+
Let’s add some additional information to the recipe. Open the recipe
+file again, and add an authors section below the description. ESMValTool
+expects the authors as a list, like so:
+
+
YAML
+
+
authors:
+  - lastname_firstname
+
+
To bypass a number of similar error messages, add a minimal
+diagnostics section below the documentation. The file should now look
+like:
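A minimal version could look like this (the diagnostic name
dummy_diagnostic_1 and the author tag doe_john are placeholders):


YAML


# ESMValTool
# recipe_warming_stripes.yml
---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
  authors:
    - doe_john

diagnostics:
  dummy_diagnostic_1:
    scripts: null
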
This is the minimal recipe layout that is required by ESMValTool. If
+we now run the recipe again, you will probably see the following
+error:
+
+
ERROR
+
+
ValueError: Tag 'doe_john' does not exist in section
+'authors' of /apps/jasmin/community/esmvaltool/ESMValTool_2.10.0/esmvaltool/config-references.yml
+
+
+
+
+
+
+
+
Pro tip: config-references.yml
+
+
The error message above points to a file named
+config-references.yml. This is where ESMValTool
+stores all its citation information. To add yourself as an author, add
+your name in the form lastname_firstname in alphabetical
+order following the existing entries, under the
+# Development team section. See the List of
+authors section in the ESMValTool
+documentation for more information.
+
+
+
+
For now, let’s just use one of the existing references. Change the
+author field to righi_mattia, who cannot receive enough
+credit for all the effort he put into ESMValTool. If you now run the
+recipe again, you should see the final message
+
+
OUTPUT
+
+
ERROR No tasks to run!
+
+
Although there is no actual error in the recipe, ESMValTool assumes
+you mistakenly left out a variable name to process and alerts you with
+this error message.
+
Adding a dataset entry
+
+
+
Let’s add a datasets section.
+
+
+
+
+
+
Filling in the dataset keys
+
+
Use the paths specified in the configuration file to explore the data
+directory, and look at the explanation of the dataset entry in the
+ESMValTool documentation.
+For both of the datasets, write down the dataset keys (such as project,
+experiment, mip, ensemble member, grid, and start and end year).
Note that the grid key is only required for CMIP6 data, and that the
+extent of the historical period has changed between CMIP5 and CMIP6.
+
+
+
+
+
Let us start with the BCC-ESM1 dataset and add a datasets section to
+the recipe, listing this single dataset, as shown below. Note that key
+fields such as mip or start_year are included
+in the datasets section here but are part of the
+diagnostic section in the recipe example seen in Running your first recipe.
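A datasets section along these lines should work (the ensemble member,
grid and time range below are an assumption; adjust them to the data
available on your machine):


YAML


datasets:
  - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical,
     ensemble: r1i1p1f1, grid: gn, start_year: 1850, end_year: 2000}
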
The recipe should run but produce the same message as in the previous
+case since we still have not included a variable to actually process. We
+have not included the short name of the variable in this dataset section
+because this allows us to reuse this dataset entry with different
+variable names later on. This is not really necessary for our simple use
+case, but it is common practice in ESMValTool.
+
+
+
+
+
+
Pro-tip: Automatically populating a recipe with all available datasets
+
+
You can select all available models for processing using
+glob patterns or wildcards. An example
+datasets section that uses all available CMIP6 models and
+ensemble members for the historical experiment is available
+in the ESMValTool documentation. Note that you will have
+to set the search_esgf option in the
+config_file to always so that you can download
+data from ESGF nodes as needed.
+
+
+
+
Adding the preprocessor section
+
+
+
Above, we already described the preprocessing task that needs to
+convert the standard, gridded temperature data to a timeseries of
+temperature anomalies.
+
+
+
+
+
+
Defining the preprocessor
+
+
Have a look at the available preprocessors in the
+documentation. Write down:
+
+
Which preprocessor functions do you think we should use?
+
What are the parameters that we can pass to these functions?
+
What do you think should be the order of the preprocessors?
+
A suitable name for the overall preprocessor
+
+
+
+
+
+
+
+
+
+
We need to calculate anomalies and global means. There is an
+anomalies preprocessor which takes in as arguments, a time
+period, a reference period, and whether or not to standardize the data.
+The global means can be calculated with the area_statistics
+preprocessor, which takes an operator as argument (in our case we want
+to compute the mean).
+
The default order in which these preprocessors are applied can be
+seen in the preprocessor functions documentation:
+area_statistics comes before anomalies. If you
+want to change this, you can use the custom_order
+preprocessor as described
+in the documentation on recipe preprocessors. For this
+example, we will keep the default order.
+
Let’s name our preprocessor global_anomalies.
+
+
+
+
+
Add the following block to your recipe file between the
+datasets and diagnostics block:
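A sketch of such a preprocessor is shown below; the reference period
used for the anomalies is an assumption and can be adjusted to your
needs:


YAML


preprocessors:
  global_anomalies:
    area_statistics:
      operator: mean
    anomalies:
      period: month
      reference:
        start_year: 1981
        start_month: 1
        start_day: 1
        end_year: 2010
        end_month: 12
        end_day: 31
      standardize: false
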
We are now ready to finish our diagnostics section. Remember that we
+want to create two tasks: a preprocessor task, and a diagnostic task. To
+illustrate that we can also pass settings to the diagnostic script, we
+add the option to specify a custom colormap.
+
+
+
+
+
+
Fill in the blanks
+
+
Extend the diagnostics section in your recipe by filling in the
+blanks in the following template:
+
+
YAML
+
+
diagnostics:
+<... (suitable name for our diagnostic)>:
+ description: <...>
+variables:
+<... (suitable name for the preprocessed variable)>:
+ short_name: <...>
+ preprocessor: <...>
+scripts:
+<... (suitable name for our python script)>:
+script: <full path to python script>
+colormap: <... choose from matplotlib colormaps>
+
+
+
+
+
+
+
+
+
+
+
YAML
+
+
diagnostics:
+  diagnostic_warming_stripes:
+    description: visualize global temperature anomalies as warming stripes
+    variables:
+      global_temperature_anomalies_global:
+        short_name: tas
+        preprocessor: global_anomalies
+    scripts:
+      warming_stripes_script:
+        script: ~/esmvaltool_tutorial/warming_stripes.py
+        colormap: 'bwr'
+
+
+
+
+
+
You should now be able to run the recipe to get your own warming
+stripes.
+
Note: for the purpose of simplicity in this episode, we have not
+added logging or provenance tracking in the diagnostic script. Once you
+start to develop your own diagnostic scripts and want to add them to the
+ESMValTool repositories, this will be required. Writing your own
+diagnostic script is discussed in a later
+episode.
+
Bonus exercises
+
+
+
Below are a few exercises to practice modifying an ESMValTool recipe.
+For your reference, here’s a copy of the recipe at this point. This
+will be the point of departure for each of the modifications we’ll make
+below.
+
+
+
+
+
+
Specific location selection
+
+
On showyourstripes.org, you can download stripes for specific
+locations. Here we show how this can be done with ESMValTool. Instead of
+the global mean, we can pick a location to plot the stripes for. Can you
+find a suitable preprocessor to do this?
+
+
+
+
+
+
+
+
+
You can use extract_point or extract_region
+to select a location. We used extract_point. Here’s a copy
+of the recipe at this
+point and this is the difference from the previous recipe:
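In essence, the preprocessor for the Amsterdam time series gains an
extract_point step. A sketch (the coordinates are illustrative values
for Amsterdam):


YAML


preprocessors:
  annual_mean_amsterdam:
    extract_point:
      latitude: 52.37
      longitude: 4.90
      scheme: linear
    annual_statistics:
      operator: mean
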
Split the diagnostic in two with two different time periods for the
+same variable. You can choose the time periods yourself. In the example
+below, we have chosen the recent past and the 20th century and have used
+variable grouping.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at this point
+and this is the difference with the previous recipe:
Now that you have different variable groups, we can also use
+different preprocessors. Add a second preprocessor to add another
+location of your choosing.
+
+
Solution
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
If you want to avoid retyping the arguments used in your
+preprocessor, you can use YAML anchors as seen in the
+anomalies preprocessor specifications in the recipe
+above.
+
+
+
+
+
+
+
+
+
Additional datasets
+
+
So far we have defined the datasets in the datasets section of the
+recipe. However, it’s also possible to add specific datasets only for
+specific variables or variable groups. Take a look at the documentation
+to learn about the additional_datasets keyword
+, and add a second dataset
+only for one of the variable groups.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
You can choose data from multiple ensemble members for a model in a
+single line.
+
+
+
+
+
+
+
+
+
The dataset section allows you to choose more than one
+ensemble member. Here’s a copy of the changed recipe
+to do that. Changes made are shown in the diff output below:
Check out the documentation section “Concatenating data corresponding
+to multiple facets” for a different way to use multiple ensemble
+members or even multiple experiments.
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
A recipe can work with different preprocessors at the same
+time.
+
The setting additional_datasets can be used to add a
+different dataset.
+
Variable groups are useful for defining different settings for
+different variables.
+
Multiple ensemble members and experiments can be analysed in a
+single recipe through concatenation.
How can I incorporate my contributions into ESMValTool?
+
+
+
+
+
+
+
+
Objectives
+
+
Execute a successful ESMValTool installation from the source
+code.
+
Contribute to ESMValTool development.
+
+
+
+
+
+
+
We now know how ESMValTool works, but how do we develop it?
+ESMValTool is an open-source project in ESMValGroup. We can contribute
+to its development by:
helping with the review process of pull requests; see the ESMValTool
+documentation on Review
+of pull requests
+
+
+
In this lesson, we first show how to set up a development
+installation of ESMValTool so you can make changes or additions. We then
+explain how you can contribute these changes to the community.
+
+
+
+
+
+
Git knowledge
+
+
For this episode, you need some knowledge of Git. You can refresh your knowledge in
+the corresponding Git carpentries
+course.
+
+
+
+
Development installation
+
+
+
We’ll explore how ESMValTool can be installed in
+develop mode. Even if you aren’t collaborating with the
+community, this installation is needed to run your new code with
+ESMValTool. Let’s get started.
Download the code from the repository. A ZIP file called
+ESMValTool-main.zip is downloaded. To continue the
+installation, unzip the file, move to the ESMValTool-main
+directory and then follow the sequence of steps starting from the
+section on ESMValTool dependencies below.
+
Clone the repository if you want to contribute to the ESMValTool
+development:
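For example, using HTTPS:


BASH


git clone https://github.com/ESMValGroup/ESMValTool.git
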
This command will ask for your GitHub username and a personal access
+token as the password. Please follow the GitHub instructions on
+token authentication
+requirements to create a personal
+access token. Alternatively, you could generate a new SSH
+key and add it to your GitHub account.
+After the authentication, the output might look like:
Now, a folder called ESMValTool has been created in your
+working directory. This folder contains the source code of the tool. To
+continue the installation, we move into the ESMValTool
+directory:
+
+
BASH
+
+
cd ESMValTool
+
+
Note that the main branch is checked out by default. We
+can see this if we run:
+
+
BASH
+
+
git status
+
+
+
OUTPUT
+
+
On branch main
+Your branch is up to date with 'origin/main'.
+
+nothing to commit, working tree clean
+
+
+
+
2 ESMValTool dependencies
+
+
Remember that if an esmvaltool environment was already created
+following the lesson Installation, we
+should choose another name for the new environment in this lesson.
+
ESMValTool now uses mamba instead of conda
+for the recommended installation. For a minimal mamba installation, see
+section Install Mamba in lesson Installation.
+
It is good practice to update the version of mamba and conda on your
+machine before setting up ESMValTool. This can be done as follows:
+
+
BASH
+
+
mamba update --name base mamba conda
+
+
To simplify the installation process, an environment file
+environment.yml is provided in the ESMValTool directory. We
+create an environment by running:
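For example, from the ESMValTool directory:


BASH


mamba env create --file environment.yml
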
The environment is called esmvaltool by default. If an
+esmvaltool environment is already created following the
+lesson Installation, we should choose
+another name for the new environment in this lesson by:
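For example (here a_new_name stands for the name you choose):


BASH


mamba env create --name a_new_name --file environment.yml
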
This will create a new conda environment and install ESMValTool (with
+all dependencies that are needed for development purposes) into it with
+a single command.
+
For more information see the conda documentation on managing
+environments.
+
Now, we should activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
where esmvaltool is the name of the environment (replace
+by a_new_name in case another environment name was
+used).
+
+
+
3 ESMValTool installation
+
+
ESMValTool can be installed in a develop mode by
+running:
+
+
BASH
+
+
pip install --editable '.[develop]'
+
+
This will add the esmvaltool directory to the Python
+path in editable mode and install the development dependencies. We
+should check if the installation works properly. To do this, run the
+tool with:
+
+
BASH
+
+
esmvaltool --help
+
+
If the installation is successful, ESMValTool prints a help message
+to the console.
+
+
+
+
+
+
Checking the development installation
+
+
We can use the command mamba list to list installed
+packages in the esmvaltool environment. Use this command to
+check that ESMValTool is installed in a develop mode.
The output should contain a line similar to:


OUTPUT


# Name                   Version                    Build    Channel
esmvaltool               2.10.0.dev3+g2dbc2cfcc     pypi_0   pypi
+
+
+
+
+
+
+
+
4 Updating ESMValTool
+
+
The main branch has the latest features of ESMValTool.
+Please make sure that the source code on your machine is up-to-date. If
+you obtain the source code using git clone as explained in step
+1 Source code, you can run git pull to
+update the source code. The ESMValTool installation will then be updated
+with changes from the main branch.
+
+
Contribution
+
+
+
We have seen how to install ESMValTool in a develop
+mode. Now, we try to contribute to its development. Let’s see how this
+can be achieved. We first discuss our ideas in an issue
+in the ESMValTool repository. This can avoid disappointment at a later
+stage, for example, if more people are doing the same thing. It also
+gives other people an early opportunity to provide input and
+suggestions, which results in more valuable contributions.
+
Then, we create a new branch locally and start
+developing new code. To create a new branch:
+
+
BASH
+
+
git checkout -b your_branch_name
+
+
If needed, a link to a git tutorial can be found in Setup.
+
Once our development is finished, we can initiate a
+pull request. To this end, we encourage you to join the
+ESMValTool development team.
+
For more extensive documentation on contributing code, including a
+section on the GitHub workflow,
+please see the “Contributing code and documentation”
+section in the ESMValTool documentation.
+
+
Review process
+
+
The pull request will be tested, discussed and merged as part of the
+“review process”. The process will take some effort and
+time to learn. However, a few (command line) tools can get you a long
+way, and we’ll cover those essentials in the next sections.
+
Tip: we encourage you to keep the pull requests
+small. Reviewing small incremental changes is more efficient.
+
+
+
Working example
+
+
We saw the ‘warming stripes’ diagnostic in lesson Writing your own recipe. Imagine the
+following task: you want to contribute the warming stripes recipe and
+diagnostic to ESMValTool. You have to add the diagnostic warming_stripes.py and the recipe recipe_warming_stripes.yml
+to their locations in the ESMValTool directory. After these changes, you
+should also check if everything works fine. This is where we take
+advantage of the tools that are introduced later.
+
Let’s get started. Note that since this is an exercise to get
+familiar with the development and contribution process, we will not
+create a GitHub issue at this time but proceed as though it has been
+done.
+
+
+
Check code quality
+
+
We aim to adhere to best practices and coding standards. There are
+several tools that check our code against those standards like:
+
+
+flake8 for
+checking against the PEP8 style guide
+
+yapf to ensure
+consistent formatting for the whole project
+
+isort to consistently
+sort the import statements
+
+yamllint
+to ensure there are no syntax errors in our recipes and config
+files
The good news is that pre-commit was already
+installed when we chose the development installation.
+pre-commit is a command line tool that runs all of those tools.
+It also fixes some of those errors. To explore other tools, have a look
+at ESMValTool documentation on Code
+style.
+
+
+
+
+
+
Using pre-commit
+
+
Let’s checkout our local branch and add the script warming_stripes.py to the
+esmvaltool/diag_scripts directory.
+
+
BASH
+
+
cd ESMValTool
+git checkout your_branch_name
+cp path_of_warming_stripes.py esmvaltool/diag_scripts/
+
+
By default, pre-commit only runs on the files that have
+been staged in git:
+
+
BASH
+
+
git status
+git add esmvaltool/diag_scripts/warming_stripes.py
+pre-commit run --files esmvaltool/diag_scripts/warming_stripes.py
+
+
Inspect the output of pre-commit and fix the remaining
+errors.
+
+
+
+
+
+
+
+
+
The tail of the output of pre-commit:
+
+
BASH
+
+
Check for added large files..............................................Passed
+Check python ast.........................................................Passed
+Check for case conflicts.................................................Passed
+Check for merge conflicts................................................Passed
+Debug Statements (Python)................................................Passed
+Fix End of Files.........................................................Passed
+Trim Trailing Whitespace.................................................Passed
+yamllint.............................................(no files to check)Skipped
+nclcodestyle.........................................(no files to check)Skipped
+style-files..........................................(no files to check)Skipped
+lintr................................................(no files to check)Skipped
+codespell................................................................Passed
+isort....................................................................Passed
+yapf.....................................................................Passed
+docformatter.............................................................Failed
+- hook id: docformatter
+- files were modified by this hook
+flake8...................................................................Failed
+- hook id: flake8
+- exit code: 1
+
+esmvaltool/diag_scripts/warming_stripes.py:20:5:
+F841 local variable 'nx' is assigned to but never used
+
+
As can be seen above, there are two failed checks:
+
+
+docformatter: it is mentioned that “files were modified
+by this hook”. We run git diff to see the modifications.
+The output includes the following:
+
+
+
BASH
+
+
+in the form of the popular warming stripes figure by Ed Hawkins."""
+
+
The syntax """ at the end of docstring is moved by one
+line. Shifting it to the next line should fix this error.
+
+
+flake8: the error message is about an unused local
+variable nx. We should check our codes regarding the usage
+of nx. For now, let’s assume that it is added by mistake
+and remove it. Note that you have to run git add again to
+re-stage the file. Then rerun pre-commit and check that it passes.
+
+
+
+
+
+
+
+
Run unit tests
+
+
The previous section introduced some tools to check code style and
+quality. However, they cannot determine whether or not our code
+is getting the right answer. To achieve that, we need to write and run
+tests for widely-used functions. ESMValTool comes with a lot of tests
+that are in the folder tests.
+
To run tests, first we make sure that the working directory is
+ESMValTool and our local branch is checked out. Then, we
+can run tests using pytest locally:
+
+
BASH
+
+
pytest
+
+
Tests will also be run automatically by CircleCI, when you submit a pull
+request.
+
+
+
+
+
+
Running tests
+
+
Make sure our local branch is checked out and add the recipe recipe_warming_stripes.yml
+to the esmvaltool/recipes directory:
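Analogous to how we copied the diagnostic script earlier:


BASH


cp path_of_recipe_warming_stripes.yml esmvaltool/recipes/
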
Run pytest and inspect the results; this might take a
+few minutes. If a test fails, try to fix it.
+
+
+
+
+
+
+
+
+
Run:
+
+
BASH
+
+
pytest
+
+
When the pytest run is complete, you can inspect the test
+reports that are printed in the console. Have a look at the second
+section of the report FAILURES:
The test message shows that the recipe
+recipe_warming_stripes.yml is not a valid recipe. Look for
+a line that starts with an E in the rest of the
+message:
+
+
BASH
+
+
+E esmvalcore._task.DiagnosticError: Cannot execute script
+'~/esmvaltool_tutorial/warming_stripes.py'(~/esmvaltool_tutorial/warming_stripes.py):
+file does not exist.
+
+
To fix the recipe, we need to edit the path of the diagnostic script
+as warming_stripes.py:
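Since the diagnostic script was copied into esmvaltool/diag_scripts/
earlier, the scripts section of the recipe can refer to it by a path
relative to that directory, for example:


YAML


    scripts:
      warming_stripes_script:
        script: warming_stripes.py
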
When we add or update code, we also update its corresponding
+documentation. The ESMValTool documentation is available on docs.esmvaltool.org.
+The source files are located in
+ESMValTool/doc/sphinx/source/.
+
To build documentation locally, first we make sure that the working
+directory is ESMValTool and our local branch is checked
+out. Then, we run:
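One way to do this, assuming the standard Sphinx layout that ESMValTool
uses under doc/sphinx, is:


BASH


sphinx-build -Ea doc/sphinx/source/ doc/sphinx/build/
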
Similar to code, documentation should be well written and adhere to
+standards. If the documentation is built properly, the previous command
+prints a message to the console:
+
+
OUTPUT
+
+
build succeeded.
+
+The HTML pages are in doc/sphinx/build.
+
+
The main page of the documentation has been built into
+index.html in doc/sphinx/build/ directory. To
+preview this page locally, we open the file in a web browser:
+
+
BASH
+
+
xdg-open doc/sphinx/build/index.html
+
+
+
+
+
+
+
Creating a documentation
+
+
In previous exercises, we added the recipe recipe_warming_stripes.yml
+to ESMValTool. Now, we create a documentation file
+recipe_warming_stripes.rst for this recipe:
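For example:


BASH


nano doc/sphinx/source/recipes/recipe_warming_stripes.rst


A single title with an underline is enough to start with, for instance:

Reproducing Ed Hawkins' warming stripes visualization
======================================================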
Save and close the file. We can think of this file as one page of a
+book. Examples of documentation pages can be found in the folder
+ESMValTool/doc/sphinx/source/recipes. Then, we need to
+decide where this page should be located inside the book. The table of
+content is defined by index.rst. Let’s have a look at the
+content:
+
+
BASH
+
+
nano doc/sphinx/source/recipes/index.rst
+
+
Add the recipe name i.e. recipe_warming_stripes to the
+section Other in this file and preview the recipe
+documentation page locally.
+
+
+
+
+
+
+
+
+
First, we add the recipe name recipe_warming_stripes to
+the section Other:
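The Other section of index.rst then includes the new entry (surrounding
entries are omitted here):

   recipe_warming_stripes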
How do I use the preprocessor output in a Python diagnostic?
+
+
+
+
+
+
+
+
Objectives
+
+
Write a new Python diagnostic script.
+
Explain how a diagnostic script reads the preprocessor output.
+
+
+
+
+
+
+
Introduction
+
+
+
The diagnostic script is an important component of ESMValTool and it
+is where the scientific analysis or performance metric is implemented.
+With ESMValTool, you can adapt an existing diagnostic or write a new
+script from scratch. Diagnostics can be written in a number of open
+source languages such as Python, R, Julia and NCL but we will focus on
+understanding and writing Python diagnostics in this lesson.
+
In this lesson, we will explain how to find an existing diagnostic
+and run it using ESMValTool installed in editable/development mode. For
+a development installation, see the instructions in the lesson Development and contribution. Also,
+we will work with the recipe recipe_python.yml and the
+diagnostic script diagnostic.py called by this recipe that
+we have seen in the lesson Running your first
+recipe.
+
Let’s get started!
+
Understanding an existing Python diagnostic
+
+
+
If you clone the ESMValTool repository, a folder called
+ESMValTool is created in your home/working directory, see
+the instructions in the lesson Development and contribution .
+
The folder ESMValTool contains the source code of the
+tool. We can find the recipe recipe_python.yml and the
+python script diagnostic.py in these directories:
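Relative to the ESMValTool source folder, these are (the exact layout
may differ slightly between versions):


BASH


ls esmvaltool/recipes/examples/recipe_python.yml
ls esmvaltool/diag_scripts/examples/diagnostic.py
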
Let’s have a look at the code in diagnostic.py. For
+reference, we show the diagnostic code in the dropdown box below. There
+are four main sections in the script:
+
+
A description i.e. the docstring (line 1).
+
Import statements (line 2-16).
+
Functions that implement our analysis (line 21-102).
+
A typical Python top-level script
+i.e. if __name__ == '__main__' (line 105-108).
+
+
+
+
+
+
+
+
PYTHON
+
+
1: """Python example diagnostic."""
+2: import logging
+3: from pathlib import Path
+4: from pprint import pformat
+5:
+6: import iris
+7:
+8: from esmvaltool.diag_scripts.shared import (
+9: group_metadata,
+10: run_diagnostic,
+11: save_data,
+12: save_figure,
+13: select_metadata,
+14: sorted_metadata,
+15: )
+16: from esmvaltool.diag_scripts.shared.plot import quickplot
+17:
+18: logger = logging.getLogger(Path(__file__).stem)
+19:
+20:
+21: def get_provenance_record(attributes, ancestor_files):
+22: """Create a provenance record describing the diagnostic data and plot."""
+23: caption = caption = attributes['caption'].format(**attributes)
+24:
+25: record = {
+26: 'caption': caption,
+27: 'statistics': ['mean'],
+28: 'domains': ['global'],
+29: 'plot_types': ['zonal'],
+30: 'authors': [
+31: 'andela_bouwe',
+32: 'righi_mattia',
+33: ],
+34: 'references': [
+35: 'acknow_project',
+36: ],
+37: 'ancestors': ancestor_files,
+38: }
+39: return record
+40:
+41:
+42: def compute_diagnostic(filename):
+43: """Compute an example diagnostic."""
+44: logger.debug("Loading %s", filename)
+45: cube = iris.load_cube(filename)
+46:
+47: logger.debug("Running example computation")
+48: cube = iris.util.squeeze(cube)
+49: return cube
+50:
+51:
+52: def plot_diagnostic(cube, basename, provenance_record, cfg):
+53: """Create diagnostic data and plot it."""
+54:
+55: # Save the data used for the plot
+56: save_data(basename, provenance_record, cfg, cube)
+57:
+58: if cfg.get('quickplot'):
+59: # Create the plot
+60: quickplot(cube, **cfg['quickplot'])
+61: # And save the plot
+62: save_figure(basename, provenance_record, cfg)
+63:
+64:
+65: def main(cfg):
+66: """Compute the time average for each input dataset."""
+67: # Get a description of the preprocessed data that we will use as input.
+68: input_data = cfg['input_data'].values()
+69:
+70: # Demonstrate use of metadata access convenience functions.
+71: selection = select_metadata(input_data, short_name='tas', project='CMIP5')
+72: logger.info("Example of how to select only CMIP5 temperature data:\n%s",
+73: pformat(selection))
+74:
+75: selection = sorted_metadata(selection, sort='dataset')
+76: logger.info("Example of how to sort this selection by dataset:\n%s",
+77: pformat(selection))
+78:
+79: grouped_input_data = group_metadata(input_data,
+80: 'variable_group',
+81: sort='dataset')
+82: logger.info(
+83: "Example of how to group and sort input data by variable groups from "
+84: "the recipe:\n%s", pformat(grouped_input_data))
+85:
+86: # Example of how to loop over variables/datasets in alphabetical order
+87: groups = group_metadata(input_data, 'variable_group', sort='dataset')
+88: for group_name in groups:
+89: logger.info("Processing variable %s", group_name)
+90: for attributes in groups[group_name]:
+91: logger.info("Processing dataset %s", attributes['dataset'])
+92: input_file = attributes['filename']
+93: cube = compute_diagnostic(input_file)
+94:
+95: output_basename = Path(input_file).stem
+96: if group_name != attributes['short_name']:
+97: output_basename = group_name + '_' + output_basename
+98: if "caption" not in attributes:
+99: attributes['caption'] = input_file
+100: provenance_record = get_provenance_record(
+101: attributes, ancestor_files=[input_file])
+102: plot_diagnostic(cube, output_basename, provenance_record, cfg)
+103:
+104:
+105: if __name__ == '__main__':
+106:
+107: with run_diagnostic() as config:
+108: main(config)
+
+
+
+
+
+
+
+
+
+
+
What is the starting point of a diagnostic?
+
+
+
Can you spot a function called main in the code
+above?
+
What are its input arguments?
+
How many times is this function mentioned?
+
+
+
+
+
+
+
+
+
+
+
The main function is defined in line 65 as
+main(cfg).
+
The input argument to this function is the variable
+cfg, a Python dictionary that holds all the necessary
+information needed to run the diagnostic script such as the location of
+input data and various settings. We will next parse this
+cfg variable in the main function and extract
+information as needed to do our analyses (e.g. in line 68).
+
The main function is called near the very end on line
+108. So, it is mentioned twice in our code - once where it is called by
+the top-level Python script and once where it is defined.
+
+
+
+
+
+
+
+
+
+
+
The function run_diagnostic
+
+
The function run_diagnostic (line 107) is called a
+context manager provided with ESMValTool and is the main entry point for
+most Python diagnostics.
+
+
+
+
Preprocessor-diagnostic interface
+
+
+
In the previous exercise, we have seen that the variable
+cfg is the input argument of the main
+function. ESMValTool fills this dictionary from a file called
+settings.yml, whose path is passed to the diagnostic script as its
+first argument. The ESMValTool documentation page provides an
+overview of what is in this file; see the section on Diagnostic script
+interfaces.
+
+
+
+
+
+
What information do I need when writing a diagnostic script?
+
+
From the lesson Configuration, we
+saw how to change the configuration settings before running a recipe.
+First we set the option remove_preproc_dir to
+false in the configuration file, then run the recipe
+recipe_python.yml:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
+
Can you find an example of the file settings.yml in the
+run directory?
+
Open the file settings.yml and look at the
+input_files list. It contains paths to some files
+metadata.yml. What information do you think is saved in
+those files?
+
+
+
+
+
+
+
+
+
+
+
One example of settings.yml can be found in the
+directory:
+path_to_recipe_output/run/map/script1/settings.yml
+
+
The metadata.yml files hold information about the
+preprocessed data. There is one file for each variable having detailed
+information on your data including project (e.g., CMIP6, CMIP5), dataset
+names (e.g., BCC-ESM1, CanESM2), variable attributes (e.g.,
+standard_name, units), preprocessor applied and time range of the data.
+You can use all of this information in your own diagnostic.
+
+
+
+
+
+
Diagnostic shared functions
+
+
+
Looking at the code in diagnostic.py, we see that
+input_data is read from the cfg dictionary
+(line 68). Now we can group the input_data according to
+some criteria such as the model or experiment. To do so, ESMValTool
+provides many functions such as select_metadata (line 71),
+sorted_metadata (line 75), and group_metadata
+(line 79). As you can see in line 8, these functions are imported from
+esmvaltool.diag_scripts.shared that means these are shared
+across several diagnostic scripts. A list of available functions and
+their description can be found in The ESMValTool Diagnostic API
+reference.
+
+
Extracting
+information needed for analyses
+
We have seen the functions used for selecting, sorting and grouping
+data in the script. What do these functions do?
+
+
Answer
+
There is a statement after use of select_metadata,
+sorted_metadata and group_metadata that starts
+with logger.info (lines 72, 76 and 82). These lines print
+output to the log files. In the previous exercise, we ran the recipe
+recipe_python.yml. If you look at the log file
+recipe_python_#_#/run/map/script1/log.txt in
+esmvaltool_output directory, you can see the output from
+each of these functions, for example:
+
2023-06-28 12:47:14,038 [2548510] INFO diagnostic,106 Example of how to
+group and sort input data by variable groups from the recipe:
+{'tas': [{'alias': 'CMIP5',
+ 'caption': 'Global map of {long_name} in January 2000 according to '
+ '{dataset}.\n',
+ 'dataset': 'bcc-csm1-1',
+ 'diagnostic': 'map',
+ 'end_year': 2000,
+ 'ensemble': 'r1i1p1',
+ 'exp': 'historical',
+ 'filename': '~/recipe_python_20230628_124639/preproc/map/tas/
+
This is how we can access preprocessed data within our diagnostic.
+
+
+
+
+
Diagnostic computation
+
+
+
After grouping and selecting data, we can read individual attributes
+(such as filename) of each item. Here, we have grouped the input data by
+variables, so we loop over the variables (line 88).
+Following this is a call to the function compute_diagnostic
+(line 93). Let’s look at the definition of this function in line 42,
+where the actual analysis of the data is done.
+
Note that output from the ESMValCore preprocessor is in the form of
+NetCDF files. Here, compute_diagnostic uses Iris
+to read data from a netCDF file and performs an operation
+squeeze to remove any dimensions of length one. We can
+adapt this function to add our own analysis. As an example, here we
+calculate a bias by subtracting the average of the data, using Iris cubes.
+
+
PYTHON
+
+
def compute_diagnostic(filename):
+    """Compute an example diagnostic."""
+    logger.debug("Loading %s", filename)
+    cube = iris.load_cube(filename)
+
+    logger.debug("Running example computation")
+    cube = iris.util.squeeze(cube)
+
+    # Calculate a bias using the average of the data
+    cube.data = cube.core_data() - cube.core_data().mean()
+    return cube
+
+
+
+
+
+
+
iris cubes
+
+
Iris reads data from NetCDF files into data structures called cubes.
+The data in these cubes can be modified, combined with other cubes’ data
+or plotted.
+
+
+
+
+
+
+
+
+
Reading data using xarray
+
+
Alternatively, you can use xarray to read the data
+instead of Iris.
+
+
+
+
+
+
+
+
+
First, import xarray package at the top of the script
+as:
+
+
PYTHON
+
+
import xarray as xr
+
+
Then, change the compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
+    """Compute an example diagnostic."""
+    logger.debug("Loading %s", filename)
+    dataset = xr.open_dataset(filename)
+
+    # do your analyses on the data here
+
+    return dataset
+
+
Caution: if you read data using xarray, keep in mind that you will need
+to adapt the other functions in the diagnostic accordingly, as they
+currently work with Iris cubes.
+
+
+
+
+
+
+
+
+
+
Reading data using the netCDF4 package
+
+
Yet another option to read the NetCDF file data is to use the
+netCDF4 Python interface to the netCDF C library.
+
+
+
+
+
+
+
+
+
First, import the netCDF4 package at the top of the
+script as:
+
+
PYTHON
+
+
import netCDF4
+
+
Then, change compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
+    """Compute an example diagnostic."""
+    logger.debug("Loading %s", filename)
+    nc_data = netCDF4.Dataset(filename, 'r')
+
+    # do your analyses on the data here
+
+    return nc_data
+
+
Caution: if you read data using netCDF4, keep in mind that you will need
+to adapt the other functions in the diagnostic accordingly, as they
+currently work with Iris cubes.
+
+
+
+
+
Diagnostic output
+
+
+
+
Plotting the output
+
+
Often, the end product of a diagnostic script is a plot or figure.
+The Iris cube returned from the compute_diagnostic function
+(line 93) is passed to the plot_diagnostic function (line
+102). Let’s have a look at the definition of this function in line 52.
+This is where we would plug in our plotting routine in the diagnostic
+script.
+
More specifically, the quickplot function (line 60) can
+be replaced with the function of our choice. As can be seen, this
+function uses **cfg['quickplot'] as an input argument. If
+you look at the diagnostic section in the recipe
+recipe_python.yml, you see quickplot is a key
+there:
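It looks something like this (the script path shown here is an
assumption):


YAML


    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds
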
This way, we can pass arguments such as the type of plot
+pcolormesh and the colormap cmap:Reds from the
+recipe to the quickplot function in the diagnostic.
+
+
+
+
+
+
Passing arguments from the recipe to the diagnostic
+
+
Change the type of the plot and its colormap and inspect the output
+figure.
+
+
+
+
+
+
+
+
+
In the recipe recipe_python.yml, you could change
+plot_type and cmap. As an example, we choose
+plot_type: pcolor and cmap: BuGn:
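The quickplot section then becomes:


YAML


        quickplot:
          plot_type: pcolor
          cmap: BuGn
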
The plot can be found at
+path_to_recipe_output/plots/map/script1/png.
+
+
+
+
+
+
+
+
+
+
ESMValTool gallery
+
+
ESMValTool makes it possible to produce a wide array of plots and
+figures as seen in the gallery.
+
+
+
+
+
+
Saving the output
+
+
In our example, the function save_data in line 56 is
+used to save the Iris cube. The saved files can be found under the
+work directory in a .nc format. There is also
+the function save_figure in line 62 to save the plots under
+the plot directory in a .png format (or
+preferred format specified in your configuration settings). Again, you
+may choose your own method of saving the output.
+
+
+
Recording the provenance
+
+
When developing a diagnostic script, it is good practice to record
+provenance. To do so, we use the function
+get_provenance_record (line 100). Let us have a look at the
+definition of this function in line 21 where we describe the diagnostic
+data and plot. Using the dictionary record, it is possible
+to add custom provenance to our diagnostics output. Provenance is stored
+in the W3C PROV
+XML format and also in an SVG file under the
+work and plot directory. For more information,
+see the documentation on recording provenance.
+
+
Congratulations!
+
+
+
You now know the basic diagnostic script structure and some available
+tools for putting together your own diagnostics. Have a look at existing
+recipes and diagnostics in the repository for more examples of functions
+you can use in your diagnostics!
+
+
+
+
+
+
Key Points
+
+
+
ESMValTool provides helper functions to interface a Python
+diagnostic script with preprocessor output.
+
Existing diagnostics can be used as templates and modified to write
+new diagnostics.
+
Helper functions can be imported from
+esmvaltool.diag_scripts.shared and used in your own
+diagnostic script.
How to use the existing CMORizer scripts shipped with
+ESMValTool?
+
How to add support for new (observational) datasets?
+
+
+
+
+
+
+
+
Objectives
+
+
Understand what CMORization is and why it is necessary.
+
Use existing scripts to CMORize your data.
+
Write a new CMORizer script to support additional data.
+
+
+
+
+
+
+
Introduction
+
+
+
This episode deals with “CMORization”. ESMValTool is designed to work
+with data that follow the CMOR standards. Unfortunately, not all
+datasets follow these standards. In order to use such datasets in
+ESMValTool we first need to reformat the data. This process is called
+“CMORization”.
+
+
+
+
+
+
What are the CMOR standards?
+
+
The name “CMOR” originates from a tool: the Climate Model Output Rewriter.
+This tool is used to create “CF-Compliant netCDF files for use in the
+CMIP projects”. So CMOR extends the CF-standard with additional
+requirements for the Coupled Model Intercomparison Projects (see e.g. here).
+
Concretely, the CMOR standards dictate e.g. the variable names and
+units, coordinate information, how the data should be structured (e.g. 1
+variable per file), additional metadata requirements, and file naming
+conventions a.k.a. the data reference syntax (DRS).
+All this information is stored in so-called CMOR tables. For example,
+the CMOR tables for the CMIP6 project can be found here.
+
+
+
+
ESMValTool offers two ways to CMORize data:
+
+
A reformatting script can be used to create a CMOR-compliant copy.
+CMORizer scripts for several popular datasets are included in
+ESMValTool, and ESMValTool also provides a convenient way to execute
+them.
+
ESMValCore can execute CMOR fixes ‘on the fly’
+(https://docs.esmvaltool.org/projects/esmvalcore/en/latest/develop/fixing_data.html#fixing-data). The advantage is that you don’t need to
+store an additional, reformatted copy of the data. The disadvantage is
+that these fixes should be implemented inside ESMValCore, which is
+beyond the scope of this tutorial.
+
+
In this lesson, we will re-implement a CMORizer script for the
+FLUXCOM dataset that contains observations of the Gross Primary
+Production (GPP), a variable that is important for calculating
+components of the global carbon cycle. See the next section on how to
+obtain data.
The data for this episode is available via the FluxCom Data
+Portal. First you’ll need to register. After registration, in the
+dropdown boxes, select FLUXCOM as the data choice and click download.
+Three files will be displayed. Click the download button on the “FLUXCOM
+(RS+METEO) Global Land Carbon Fluxes using CRUNCEP climate data”. You’ll
+receive an email with the FTP address to access the server. Connect to
+the server, follow the path in your email, and look for the file
+raw/monthly/GPP.ANN.CRUNCEPv6.monthly.2000.nc. Download
+that file and save it in a folder called
+~/data/RAWOBS/Tier3/FLUXCOM.
+
Note: you’ll need a user-friendly ftp client. On Linux,
+ncftp works okay.
+
+
+
+
+
+
What is the deal with those “tiers”?
+
+
Many datasets come with access restrictions. In this way the data
+providers can keep track of how their data is used. In many cases
+“restricted access” just means that one has to register with an email
+address and accept the terms of use, which typically ask that you
+acknowledge the data providers.
+
There are also datasets available that do not need a registration.
+The “obs4MIPs” or “ana4MIPs” datasets, for example, are specifically
+produced to facilitate comparisons with model simulations.
+
To reflect these different levels of access restriction, the
+ESMValTool team has created a tier-system. The definition of the
+different tiers are as follows:
+
+
+Tier1: obs4MIPs and ana4MIPS datasets (can be used
+directly with the ESMValTool)
+
+Tier2: other freely available datasets (most of
+them will need some kind of cmorization)
+
+Tier3: datasets with access restrictions (most of
+these datasets will also need some kind of cmorization)
+
+
These access restrictions are also why the ESMValTool developers
+cannot distribute copies or automate downloading of all observations and
+reanalysis data used in the recipes. As a compromise, we provide the
+CMORization scripts so that each user can CMORize their own copy of the
+access restricted datasets if needed.
+
+
+
+
Run the existing CMORizer script
+
+
+
Before we develop our own CMORizer script, let’s first see what
+happens when we run the existing one. There is a specific command
+available in the ESMValTool to run the CMORizer scripts:
+
+
BASH
+
+
esmvaltool data format --config_file <path to config-user.yml> <dataset-name>
+
+
The config-user.yml is the file in which we define the
+different data paths, see the episode on Configuration. In the
+rootpath of your config-user.yml, make sure to
+add the right directory for “RAWOBS” data in which you downloaded the
+FLUXCOM dataset:
+
+
YAML
+
+
rootpath:
+  RAWOBS: ~/data/RAWOBS
+
+
This enables ESMValTool to find the raw observational datasets stored
+in the “RAWOBS” folder. The dataset-name needs to be
+identical to the folder name that was created to store the raw
+observation data files, i.e. RAWOBS/TierX/dataset-name. In
+our case this would be “FLUXCOM”.
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
...
+... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC
+... ----------------------------------------------------------------------
+... input_dir = /home/peter/data/RAWOBS
+... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216
+... ----------------------------------------------------------------------
+... Running the CMORization scripts.
+... Processing datasets ['FLUXCOM']
+... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM
+... Output will be written to: /home/peter/esmvaltool_output/
+ data_formatting_20220726_140216/Tier3/FLUXCOM
+... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/
+ site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom
+... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/
+ esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/
+ datasets/fluxcom.py
+... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc'
+... CMORizing variable 'gpp'
+... Lmon
+... Var is gpp
+... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1'
+
+... Fixing time...
+... Fixing latitude...
+... Fixing longitude...
+... Flipping dimensional coordinate latitude...
+... Saving file
+... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/
+ FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc
+... Cube has lazy data [lazy is preferred]
+... CMORization of dataset FLUXCOM finished!
+... Formatting successful for dataset FLUXCOM
+
+
So you can see that several fixes are applied, and the CMORized file
+is written to the ESMValTool output directory, i.e.
+~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/TierX/dataset-name/filename.nc
+In order to use it, we’ll have to copy it from the output directory to a
+folder called ~/data/OBS/Tier3/FLUXCOM and make sure the
+path to OBS is set correctly in our config-user file:
+
+
YAML
+
+
rootpath:
+  OBS: ~/data/OBS
+
+
You can also see the path where ESMValTool stores the reformatting
+script:
+~/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py.
+You may have a look at this file if you want. The script also uses a
+configuration file:
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml.
+
Make a test recipe
+
+
+
To verify that the data is correctly CMORized, we will make a simple
+test recipe. As illustrated in the figure at the top of this episode,
+one of the steps that ESMValTool executes is a CMOR-check. If the data
+is not correctly CMORized, ESMValTool will give a warning or error.
+
+
+
+
+
+
Create a test recipe
+
+
Create a simple recipe called recipe_check_fluxcom.yml that
+loads the FLUXCOM data. It should include a datasets section with a
+single entry for the “FLUXCOM” dataset with the correct dataset keys,
+and a diagnostics section with a single variable: gpp. We don’t need any
+preprocessors or scripts (set scripts: null), but we have
+to add a documentation section with a description, authors and
+maintainer, otherwise the recipe will fail.
+
Use the following dataset keys:
+
+
project: OBS
+
dataset: FLUXCOM
+
type: reanaly
+
version: ANN-v1
+
mip: Lmon
+
start_year: 2000
+
end_year: 2000
+
tier: 3
+
+
Some of these dataset keys are further explained in the callout boxes
+in this episode.
+
+
+
+
+
+
+
+
+
Here’s an example recipe
+
+
YAML
+
+
documentation:
+
+  description: Test recipe for FLUXCOM data
+  title: This is a test recipe for the FLUXCOM data.
+
+  authors:
+    - kalverla_peter
+
+  maintainer:
+    - kalverla_peter
+
+datasets:
+  - {project: OBS, dataset: FLUXCOM, mip: Lmon, tier: 3, start_year: 2000,
+     end_year: 2000, type: reanaly, version: ANN-v1}
+
+diagnostics:
+  check_fluxcom:
+    description: Check that ESMValTool can load the cmorized fluxcom data without errors.
+    variables:
+      gpp:
+    scripts: null
Try to run it with:


BASH


esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
+
+
If everything is okay, the recipe should run without problems.
+
Starting from scratch
+
+
+
Now that you’ve seen how to use an existing CMORizer script, let’s
+think about adding a new one. We will remove the existing CMORizer
+script, and re-implement it from scratch. This exercise allows us to
+point out all the details of what’s going on. We’ll also remove the
+CMORized data that we’ve just created, so our test recipe will not be
+able to use it anymore.
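To get started, create a new formatter script for our dataset (based on
the path reported by the existing CMORizer above, this would be
esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py) and give it
the following skeleton:


PYTHON
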
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+<We will add some useful info here later>
+"""
+import logging
+from esmvaltool.cmorizers.data import utilities as utils
+
+logger = logging.getLogger(__name__)
+
+def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
+"""Cmorize the dataset."""
+
+# This is where you'll add the cmorization code
+# 1. find the input data
+# 2. apply the necessary fixes
+# 3. store the data with the correct filename
+
+
Here, in_dir corresponds to the input directory of the
+raw files, out_dir to the output directory of final
+reformatted data set and cfg to a configuration dictionary
+given by a configuration file that we will get to shortly. The last
+three arguments will not be considered in this script but can be used in
+other cases. cfg_user corresponds to the user configuration
+file, start_date to the start of the period to format, and
+end_date to the end of the period to format. When you type
+the command esmvaltool data format in the terminal,
+ESMValTool will call this function with the settings found in your
+configuration files.
+
The ESMValTool CMORizer also needs a dataset configuration file.
+Create a file called
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml
+and fill it with the following boilerplate:
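A sketch of such a boilerplate, with the entries that we will fill in
later left commented out, could be:


YAML


---
# filename: ???

attributes:
  project_id: OBS6
  # dataset_id: ???
  # version: ???
  # tier: ???
  # modeling_realm: ???

variables:
  # ???:
  #   mip: ???
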
Note: the name of this file must be
+identical to dataset-name.
+
As you can see, the configuration file contains information about the
+original filename of the dataset, and some additional metadata that you
+might recognize from the CMOR filename structure. It also contains a
+list of variables that’s available for this dataset. We’ll add this
+information step by step in the following sections.
+
+
+
+
+
+
RAWOBS, OBS, OBS6!?
+
+
In the configuration above we’ve already filled in the
+project_id. ESMValTool uses these project IDs to find the
+data on your hard drive, and also to find more information about the
+data. The RAWOBS and OBS projects refer to
+external data before and after CMORization, respectively.
+Historically, most external data were observations, hence the
+naming.
+
In going from CMIP5 to CMIP6, the CMOR standards changed a bit. For
+example, some variables were renamed, which posed a dilemma: should
+CMORization reformat to the CMIP5 or CMIP6 definition? To solve this,
+the OBS6 project was created. So OBS6 data
+follow the CMIP6 standards, and that’s what we’ll use for the new
+CMORizer.
+
+
+
+
You can try running the CMORizer at this point, and it should work
+without errors. However, it doesn’t produce any output yet:
+
+
BASH
+
+
esmvaltool data format --config_file <path to config-user.yml> FLUXCOM
+
+
+
1. Find the input data
+
+
First we’ll get the CMORizer script to locate our FLUXCOM data. We
+can use the information from the in_dir and
+cfg variables. Add the following snippet to your CMORizer
+script:
+
+
PYTHON
+
+
# 1. find the input data
+logger.info("in_dir: '%s'", in_dir)
+logger.info("cfg: '%s'", cfg)
+
+
If you run the CMORizer again, it will print out the content of these
+variables and the output should contain something like this:
Try to locate the input data inside the CMORizer script and load it
+(we’ll use iris because ESMValTool includes helper
+utilities for iris cubes). Confirm that you’ve loaded the data by
+logging the correct path and (part of the) file content.
+
+
+
+
+
+
+
+
+
There are many ways to do it. In any case, you should have added the
+original filename to the configuration file (and un-commented this
+line): filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'. Note the
+*: this is a useful shorthand to find multiple files for
+different years. In a similar way we can also look for multiple
+variables, etc.
+
Here’s an example solution (inserted directly under the original
+comment):
+
+
PYTHON
+
+
# 1. find the input data
+filename_pattern = cfg['filename']
+matches = Path(in_dir).glob(filename_pattern)
+
+for match in matches:
+    input_file = str(match)
+    logger.info("found: %s", input_file)
+    cube = iris.load_cube(input_file)
+    logger.info("content: %s", cube)
+
+
To make this work we’ve added import iris and
+from pathlib import Path at the top of the file. Note that
+we’ve started a loop, since we may find multiple files if there’s more
+than one year of data available.
+
+
+
+
+
+
+
2. Save the data with the correct filename
+
+
Before we start adding fixes, we’ll first make sure that our CMORizer
+can also write output files with the correct name. This will enable us
+to use the test recipe for the CMOR compatibility check.
+
We can use the save function from the utils
+that we imported at the top. The call signature looks like this:
+utils.save_variable(cube, var, outdir, attrs, **kwargs).
+
We already have the cube and the outdir.
+The variable short name (var) and attributes
+(attrs) are set through the configuration file. So we need
+to find out what the correct short name and attributes are.
+
The standard attributes for CMIP variables are defined in the CMIP
+tables. These tables are differentiated according to the “MIP” they
+belong to. The tables are a copy of the PCMDI guidelines.
+
+
+
+
+
+
Find the variable “gpp” in a CMOR table
+
+
Check the available CMOR tables to find the variable “gpp” with the
+following characteristics: a monthly frequency and the land modeling
+realm.
The variable “gpp” belongs to the land variables. The temporal
+resolution that we are looking for is “monthly”. This information points
+to the “Lmon” CMIP table. And indeed, the variable “gpp” can be found in
+the file [here](https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/
+cmor/tables/cmip6/Tables/CMIP6_Lmon.json).
+
+
+
+
+
If the variable you are interested in is not available in the
+standard CMOR tables, you could write a custom CMOR table entry for the
+variable. This, however, is beyond the scope of this tutorial.
+
+
+
+
+
+
Fill the configuration file
+
+
Uncomment the following entries in your configuration file and fill
+them with appropriate values:
+
+
dataset_id
+
version
+
tier
+
modeling_realm
+
short_name (the ??? immediately under variables)
+
mip
+
+
+
+
+
+
+
+
+
+
The configuration file now looks something like this:
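+
+
(An illustrative sketch only; the exact layout depends on the
+configuration template you started from, and the values below are
+inferred from the test recipe and the output filename used later in
+this episode.)
+
+
YAML
+
+
filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'
+
+attributes:
+  dataset_id: FLUXCOM
+  version: 'ANN-v1'
+  tier: 3
+  modeling_realm: reanaly
+
+variables:
+  gpp:
+    mip: Lmon
+
+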
Now that we have set this information correctly in the config file,
+we can call the save function. Add the following python code to your
+CMORizer script:
+
+
PYTHON
+
+
# 3. store the data with the correct filename
+attributes = cfg['attributes']
+variables = cfg['variables']
+
+for short_name, variable_info in variables.items():
+    all_attributes = {**attributes, **variable_info}  # add the mip to the other attributes
+    utils.save_variable(cube=cube, var=short_name, outdir=out_dir, attrs=all_attributes)
+
+
Since we only have one variable (gpp), the loop is not strictly
+necessary. However, this makes it possible to add more variables later
+on.
+
+
+
+
+
+
Was the CMORization successful so far?
+
+
If you run the CMORizer again, you should see that it creates an
+output file named
+OBS6_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_xxxx01-xxxx12.nc
+stored in your ESMValTool output directory
+~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/Tier3/FLUXCOM/.
+The “xxxx” and “yyyy” represent the start and end year of the data.
+
+
+
+
Great! So we have produced a NetCDF file with the CMORizer that
+follows the naming convention for ESMValTool datasets. Let’s have a look
+at the NetCDF file as it was written with the very basic CMORizer from
+above.
The file contains a variable named "GPP" with three
+dimensions: "time", "lat", "lon". Notice the strange time units and the
+invalid_units entry in the global attributes section. It also
+seems that there is no information available about the lat and lon
+coordinates. These are just some of the things we'll address in the next
+section.
+
+
+
3. Implementing additional fixes
+
+
Copy the output of the CMORizer to your folder
+~/data/OBS6/Tier3/FLUXCOM/ and change the test recipe to
+look for OBS6 data instead of OBS (note: we’re upgrading the CMORizer to
+newer standards here!). Make sure the path to OBS6 is set
+correctly in our config-user file:
+
+
YAML
+
+
rootpath:
+OBS6: ~/data/OBS6
+
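The copy step itself could look like this (a minimal sketch; the
+time-stamped output directory name will differ for your run, so adjust
+the path accordingly):
+
+
BASH
+
+
mkdir -p ~/data/OBS6/Tier3/FLUXCOM
+cp ~/esmvaltool_output/data_formatting_*/Tier3/FLUXCOM/OBS6_FLUXCOM_*.nc ~/data/OBS6/Tier3/FLUXCOM/
+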
+
If we now run the test recipe on our newly ‘CMORized’ data,
+
+
BASH
+
+
esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
+
+
it should be able to find the correct file, but it does not succeed
+yet. The first thing that the ESMValTool CMOR checker brings up is:
+
+
ERROR
+
+
iris.exceptions.UnitConversionError: Cannot convert from unknown units. The
+"units" attribute may be set directly.
+
+
If you look closely at the error messages, you can see that this
+error concerns the units of the coordinates. ESMValTool tries to fix
+them automatically, but since no units are defined on the coordinates,
+this fails.
+
The CMORizer utilities also include a function called
+fix_coords, but before we can use it, we'll also need to
+make sure the coordinates have the correct standard name. Add the
+following code to your CMORizer:
+
+
PYTHON
+
+
# 2. Apply the necessary fixes
+# 2a. Fix/add coordinate information and metadata
cube.coord('lat').standard_name = 'latitude'
+cube.coord('lon').standard_name = 'longitude'
+utils.fix_coords(cube)
+
+
With some additional refactoring, our cmorization function might then
+look something like this:
+
+
PYTHON
+
+
def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
+    """Cmorize the dataset."""
+    # Get general information from the config file
+    attributes = cfg['attributes']
+    variables = cfg['variables']
+
+    for short_name, variable_info in variables.items():
+        logger.info("CMORizing variable: %s", short_name)
+
+        # 1a. Find the input data (one file for each year)
+        filename_pattern = cfg['filename']
+        matches = Path(in_dir).glob(filename_pattern)
+
+        for match in matches:
+            # 1b. Load the input data
+            input_file = str(match)
+            logger.info("found: %s", input_file)
+            cube = iris.load_cube(input_file)
+
+            # 2. Apply the necessary fixes
+            # 2a. Fix/add coordinate information and metadata
+            cube.coord('lat').standard_name = 'latitude'
+            cube.coord('lon').standard_name = 'longitude'
+            utils.fix_coords(cube)
+
+            # 3. Save the CMORized data
+            all_attributes = {**attributes, **variable_info}
+            utils.save_variable(cube=cube, var=short_name,
+                                outdir=out_dir, attrs=all_attributes)
+
+
Run the CMORizer script once more. Have a look at the netCDF file,
+and confirm that the coordinates now have much more metadata added to
+them. Then, run the test recipe again with the latest CMORizer output.
+The next error is:
+
+
ERROR
+
+
esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
+Variable GPP units unknown can not be converted to kg m-2 s-1 in cube:
+
+
Okay, so let’s fix the units of the “GPP” variable in the CMORizer.
+Remember that you can find the correct units in the CMOR table. Add the
+following three lines to our CMORizer:
+
+
PYTHON
+
+
# 2b. Fix gpp units
+logger.info("Changing units for gpp from gc/m2/day to kg/m2/s")
+cube.data = cube.core_data() / (1000*86400)
+cube.units ='kg m-2 s-1'
+
+
If everything is okay, the test recipe should now pass. We’re getting
+there. Looking through the output though, there’s still a warning.
+
+
OUTPUT
+
+
WARNING There were warnings in variable GPP:
+Standard name for GPP changed from None to gross_primary_productivity_of_biomass_expressed_as_carbon
+Long name for GPP changed from GPP to Carbon Mass Flux out of Atmosphere Due to
+ Gross Primary Production on Land [kgC m-2 s-1]
+
+
ESMValTool is able to apply automatic fixes here, but if we are
+running a CMORizer script anyway, we might as well fix it
+immediately.
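+
+
A minimal sketch of such a fix (this assumes the CMOR table is made
+available to the script as cfg['cmor_table'] and that the CMORizer
+utilities provide a fix_var_metadata helper; check the utilities in
+your ESMValTool version if these names differ):
+
+
PYTHON
+
+
# 2c. Fix metadata using the information from the CMOR table
+cmor_table = cfg['cmor_table']
+cmor_info = cmor_table.get_variable(variable_info['mip'], short_name)
+utils.fix_var_metadata(cube, cmor_info)
+
+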
You can see that we’re using the CMOR table here. This was passed on
+by ESMValTool as part of the CFG input variable. So here
+we’re making sure that we’re updating the cubes metadata to conform to
+the CMOR table.
+
Finally, the test recipe should run without errors or warnings.
+
+
+
4. Finalizing the CMORizer
+
+
Once everything works as expected, there are a couple of things that we
+can still do.
+
+
+Add download instructions. The header of the
+CMORizer contains information about where to obtain the data, when it
+was accessed the last time, which ESMValTool “tier” it is associated
+with, and more detailed information about the necessary downloading and
+processing steps.
+
+
+
+
+
+
+
Fill out the header for the “FLUXCOM” dataset
+
+
Fill out the header of the new CMORizer. The different parts that
+need to be present in the header are the following:
+
+
Caption: the first line of the docstring should summarize what the
+script does.
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
+
The header for the “FLUXCOM” dataset could look something like
+this:
+
+
PYTHON
+
+
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+Tier
+ Tier 3: restricted dataset.
+
+Source
+ http://www.bgc-jena.mpg.de/geodb/BGI/Home
+
+Last access
+ 20190727
+
+Download and processing instructions
+ From the website, select FLUXCOM as the data choice and click download.
+ Two files will be displayed. One for Land Carbon Fluxes and one for
+ Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+ CRUNCEP data file has several data files for different variables.
+ The data for GPP generated using the
+ Artificial Neural Network Method will be in files with name:
+ GPP.ANN.CRUNCEPv6.monthly.*.nc
+ A registration is required for downloading the data.
+ Users in the UK with a CEDA-JASMIN account may request access to the jules
+ workspace and access the data.
+ Note : This data may require rechunking of the netcdf files.
+ This constraint will not exist once iris is updated to
+ version 2.3.0 Aug 2019
+"""
+
+
+
+
+
+
+
+Fill the dataset information list. The file
+[datasets.yml](https://github.com/ESMValGroup/ESMValTool/blob/main/esmvaltool/cmorizers/data/datasets.yml)
+contains the ESMValTool "tier", the data
+source, the last access time and download instructions for all supported
+datasets in ESMValTool. You can simply reuse the information written in
+the header of the CMORizer.
+
+
+
+
+
+
+
Fill out the FLUXCOM entry in
+datasets.yml
+
+
+
Fill out the FLUXCOM entry in datasets.yml. The
+different parts that need to be present in the entry are the
+following:
+
+
Dataset-name
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
+
The entry for the “FLUXCOM” dataset should look like:
+
+
YAML
+
+
FLUXCOM:
+  tier: 3
+  source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
+  last_access: 2019-07-27
+  info: |
+    From the website, select FLUXCOM as the data choice and click download.
+    Two files will be displayed. One for Land Carbon Fluxes and one for
+    Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+    CRUNCEP data file has several data files for different variables.
+    The data for GPP generated using the
+    Artificial Neural Network Method will be in files with name:
+    GPP.ANN.CRUNCEPv6.monthly.*.nc
+    A registration is required for downloading the data.
+    Users in the UK with a CEDA-JASMIN account may request access to the jules
+    workspace and access the data.
+    Note : This data may require rechunking of the netcdf files.
+    This constraint will not exist once iris is updated to
+    version 2.3.0 Aug 2019
+
+
+
+
+
+
Once the datasets.yml file is filled, you can check that
+ESMValTool can display information about the added dataset with:
+
+
BASH
+
+
esmvaltool data info FLUXCOM
+
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
$ esmvaltool data info FLUXCOM
+FLUXCOM
+
+Tier: 3
+Source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
+Automatic download: No
+
+From the website, select FLUXCOM as the data choice and click download.
+Two files will be displayed. One for Land Carbon Fluxes and one for
+Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+CRUNCEP data file has several data files for different variables.
+The data for GPP generated using the
+Artificial Neural Network Method will be in files with name:
+GPP.ANN.CRUNCEPv6.monthly.*.nc
+A registration is required for downloading the data.
+Users in the UK with a CEDA-JASMIN account may request access to the jules
+workspace and access the data.
+Note : This data may require rechunking of the netcdf files.
+This constraint will not exist once iris is updated to
+version 2.3.0 Aug 2019
+
+
Note that Automatic download: No means that no automatic
+downloading script is available in ESMValTool for this dataset. The
+implementation of such a script is beyond the scope of this tutorial. To
+find out which datasets come with an automatic download script, you can
+run: esmvaltool data list to list all datasets supported in
+ESMValTool. More information about the usage of automatic downloading
+scripts can be found in the User
+Guide.
+
+
+Complete the metadata in the config file. We have
+left a few fields empty in the configuration file, such as ‘source’. By
+filling out these fields we can make sure the relevant metadata is
+passed on as attributes in the CMORized data. To make this work, add the
+following line to the CMORizer script:
+
+
+
PYTHON
+
+
# 2d. Update the cube's metadata with all info from the config file
+utils.set_global_atts(cube, attributes)
+
+
+
Add a reference. Make sure that there is a
+reference file available for the dataset; see the instructions
+[here](https://docs.esmvaltool.org/en/latest/community/diagnostic.html#adding-references).
+
Make a pull request. Since you have gone through
+all the trouble of reformatting the dataset so that ESMValTool can work
+with it, it would be great if you could provide the CMORizer, and
+with it the dataset, to the rest of the community. For more
+information, see the episode on Development and
+contribution.
+
Add documentation. Make sure that you have added
+the information about your dataset to the User Guide, so that people know it is
+available for ESMValTool (see Obtaining
+input data).
+
+
+
Some final comments
+
+
+
Congratulations! You have just added support for a new dataset to
+ESMValTool! Adding a new CMORizer is already a fairly advanced task
+when working with ESMValTool: you need a basic understanding
+of how ESMValTool works and what its internal structure looks like.
+In addition, you need to have a basic understanding of NetCDF files and
+a programming language. In our example we used Python for the CMORizing
+script, since we advocate focusing code development on only a few
+different programming languages. This helps to maintain the code and to
+ensure its compatibility with possible fundamental changes
+to the structure of ESMValTool and ESMValCore.
+
More information about adding observations to the ESMValTool can be
+found in the documentation.
+
+
+
+
+
+
Key Points
+
+
+
CMORizers are dataset-specific scripts that can be run once to
+generate CMOR-compliant data.
+
ESMValTool comes with a set of CMORizers readily available, but you
+can also add your own.
+
+
+
Every user encounters errors. Once you know why you get certain types
+of errors, they become much easier to fix. The good news is that
+ESMValTool creates a record of the output messages and stores them in
+log files. They can be used for debugging or monitoring the process.
+This lesson helps you understand the different types of errors and when
+you are likely to encounter them.
+
Log files
+
+
+
Each time we run ESMValTool, it will produce a new output directory.
+This directory should contain the run folder that is
+automatically generated by ESMValTool. To examine this, we run the
+recipe_python.yml that was introduced in the lesson Running your first recipe. Check the lesson Configuration for how to set the paths.
+
In a new terminal, run the recipe:
+
+
BASH
+
+
cd esmvaltool_tutorial
+esmvaltool run examples/recipe_python.yml
+
+
+
ERROR
+
+
esmvaltool: command not found
+
+
This error occurs because the conda environment
+esmvaltool has not been activated. To fix the error, before
+running the recipe, activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
+
+
+
+
+
conda environment
+
+
More information about the conda environment can be found at Installation.
In the main_log_debug.txt and main_log.txt,
+ESMValTool writes the output messages, warnings and possible errors that
+might happen during pre-processing. To inspect them, we can look inside
+the files. For example:
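+
(The time-stamped output directory name will differ for your run; the
+path below is only an illustration.)
+
+
BASH
+
+
cat ~/esmvaltool_output/recipe_python_YYYYMMDD_HHMMSS/run/main_log.txt
+
+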
In the log.txt, ESMValTool writes the output messages,
+warnings and possible errors that are related to the diagnostic
+script.
+
If you encounter an error and don’t know what it means, it is
+important to read the log information. Sometimes knowing where the error
+occurred is enough to fix it, even if you don’t entirely understand the
+message. However, note that you may not always be able to find the error
+or fix it. In that case, the ESMValTool community can help you figure out what
+went wrong.
+
+
+
+
+
+
Different log files
+
+
In the run directory, there are two log files
+main_log_debug.txt and main_log.txt. What are
+their differences?
+
+
+
+
+
+
+
+
+
The main_log_debug.txt contains the output messages from
+the pre-processor whereas the main_log.txt shows general
+errors and warnings that might happen in running the recipe and
+diagnostics script.
+
+
+
+
+
Let’s change some settings in the recipe to run a regional
+pre-processor. We use a text editor called nano to open the
+recipe file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
+# recipe_python.yml
+---
+# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
+# for a description of this recipe.
+
+# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
+# for a description of the recipe format.
+documentation:
+  description: |
+    Example recipe that plots a map and timeseries of temperature.
+
+  title: Recipe that runs an example diagnostic written in Python.
+
+  authors:
+    - andela_bouwe
+    - righi_mattia
+
+  maintainer:
+    - schlund_manuel
+
+  references:
+    - acknow_project
+
+  projects:
+    - esmval
+    - c3s-magic
+
+datasets:
+  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
+  - {dataset: bcc-csm1-1, version: v1, project: CMIP5, exp: historical, ensemble: r1i1p1}
+
+preprocessors:
+  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
+  # for a description of the preprocessor functions.
+
+  to_degrees_c:
+    convert_units:
+      units: degrees_C
+
+  annual_mean_amsterdam:
+    extract_location:
+      location: Amsterdam
+      scheme: linear
+    annual_statistics:
+      operator: mean
+    multi_model_statistics:
+      statistics:
+        - mean
+      span: overlap
+    convert_units:
+      units: degrees_C
+
+  annual_mean_global:
+    area_statistics:
+      operator: mean
+    annual_statistics:
+      operator: mean
+    convert_units:
+      units: degrees_C
+
+diagnostics:
+
+  map:
+    description: Global map of temperature in January 2000.
+    themes:
+      - phys
+    realms:
+      - atmos
+    variables:
+      tas:
+        mip: Amon
+        preprocessor: to_degrees_c
+        timerange: 2000/P1M
+        caption: |
+          Global map of {long_name} in January 2000 according to {dataset}.
+    scripts:
+      script1:
+        script: examples/diagnostic.py
+        quickplot:
+          plot_type: pcolormesh
+          cmap: Reds
+
+  timeseries:
+    description: Annual mean temperature in Amsterdam and global mean since 1850.
+    themes:
+      - phys
+    realms:
+      - atmos
+    variables:
+      tas_amsterdam:
+        short_name: tas
+        mip: Amon
+        preprocessor: annual_mean_amsterdam
+        timerange: 1850/2000
+        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
+      tas_global:
+        short_name: tas
+        mip: Amon
+        preprocessor: annual_mean_global
+        timerange: 1850/2000
+        caption: Annual global mean {long_name} according to {dataset}.
+    scripts:
+      script1:
+        script: examples/diagnostic.py
+        quickplot:
+          plot_type: plot
+
+
+
+
+
+
Keys and values in recipe settings
+
+
+
The [ESMValTool pre-processors](https://docs.esmvaltool.org/projects/ESMValCore/en/latest/recipe/preprocessor.html?highlight=preprocessor)
+cover a broad range of
+operations on the input data, like time manipulation, area manipulation,
+land-sea masking, variable derivation, etc. Let’s add the preprocessor
+extract_region to a new section
+annual_mean_regional:
+
+
YAML
+
+
preprocessors:
+  annual_mean_regional:
+    annual_statistics:
+      operator: mean
+    extract_region:
+      start_longitude: -10
+      end_longitude: 40
+      start_latitude: 27
+      end_latitude: 70
+
+
Also, we change the projects value esmval
+to tutorial:
+
+
YAML
+
+
projects:
+- tutorial
+- c3s-magic
+
+
Then, we save the file and run the recipe:
+
+
BASH
+
+
esmvaltool run recipe_python.yml
+
+
+
ERROR
+
+
ValueError: Tag 'tutorial' does not exist in section 'projects' of
+esmvaltool/config-references.yml 2020-06-29 18:09:56,641 UTC [46055] INFO If you
+have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The values for the keys authors, maintainer,
+projects and references in the recipe must
+be known to ESMValTool: they are defined in the file
+esmvaltool/config-references.yml mentioned in the error message above.
+ESMValTool references in BibTeX format can be found in
+the [ESMValTool/esmvaltool/references](https://github.com/ESMValGroup/ESMValTool/tree/master/esmvaltool/references)
+directory.
+
+
+
ESMValTool can’t locate the
+data
+
You are assisting a colleague with ESMValTool. The colleague replaces
+CanESM2 with ACCESS1-3 in the entry
+dataset: CanESM2, project: CMIP5
+and runs the recipe. However, ESMValTool encounters an error like:
+
+
ERROR
+
+
ERROR No input files found for variable {'short_name': 'tas', 'mip': 'Amon',
+'preprocessor':'annual_mean_amsterdam', 'variable_group': 'tas_amsterdam',
+'diagnostic':'timeseries', 'dataset': 'ACCESS1-3', 'project': 'CMIP5', 'exp':
+'historical','ensemble': 'r1i1p1', 'recipe_dataset_index': 1, 'institute':
+['CSIRO-BOM'],'product': ['output1', 'output2'], 'timerange': '1850/2000',
+'alias':'CMIP5', 'original_short_name': 'tas', 'standard_name':
+'air_temperature','long_name': 'Near-Surface Air Temperature', 'units': 'K',
+'modeling_realm':['atmos'], 'frequency': 'mon', 'start_year': 1850,
+'end_year': 2000}
+ERROR Looked for files matching
+['tas_Amon_ACCESS1-3_historical_r1i1p1*.nc'], but did not find any existing
+input directory
+ERROR Set 'log_level' to 'debug' to get more information
+
+
+
What suggestions would you give the researcher for fixing the
+error?
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
+
Check config-user.yml to see if the correct directory
+for input data is specified
+
Check the available data, regarding exp, mip, ensemble, start_year,
+and end_year
+
Check the variable names in the diagnostics section in the
+recipe
+
+
+
+
+
+
Check pre-processed data
+
+
+
The setting save_intermediary_cubes in the configuration
+file can be used to save the pre-processed data. More information about
+this setting can be found at Configuration.
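+
In the user configuration file this is a simple boolean flag, for
+example (a sketch; check the Configuration lesson for the exact list of
+settings in your ESMValTool version):
+
+
YAML
+
+
save_intermediary_cubes: true
+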
+
+
+
+
+
+
save_intermediary_cubes
+
+
Note that this setting should only be used for debugging, as it
+significantly slows down the recipe and increases disk usage because a
+lot of output files need to be stored.
+
+
+
+
Check diagnostic script path
+
+
+
The result of the pre-processor is passed to the
+examples/diagnostic.py script, which is specified in the
+recipe as:
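+
(Quoted from the scripts section of recipe_python.yml shown earlier in
+this episode.)
+
+
YAML
+
+
scripts:
+  script1:
+    script: examples/diagnostic.py
+
+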
The diagnostic scripts are located in the folder
+diag_scripts in the ESMValTool installation directory
+<path_to_esmvaltool>. To find where ESMValTool is
+located on your system, see Installation.
+
Let’s see what happens if we can change the script path as:
esmvalcore._task.DiagnosticError: Cannot execute script
+'diag_scripts/ocean/diagnostic_timeseries.py'
+(~/mambaforge/envs/esmvaltool2.6/lib/python3.10/site-packages/esmvaltool/
+diag_scripts/diag_scripts/ocean/diagnostic_timeseries.py):
+file does not exist. 2022-10-18 11:42:34,136 UTC [39323] INFO If you have a
+question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The script path should be relative to the diag_scripts
+directory. This means that the script
+diagnostic_timeseries.py is located in
+<path_to_esmvaltool>/diag_scripts/ocean/.
+Alternatively, the script path can be an absolute path. To examine this,
+we can download the script from the ESMValTool
+repository:
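+
For example (an illustrative sketch: it assumes the example diagnostic
+lives at esmvaltool/diag_scripts/examples/diagnostic.py in the GitHub
+repository and downloads it to the current working directory):
+
+
BASH
+
+
wget https://raw.githubusercontent.com/ESMValGroup/ESMValTool/main/esmvaltool/diag_scripts/examples/diagnostic.py
+
+
After downloading, you can set the script key in the recipe to the
+absolute path of the downloaded file.
+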
Then, run the recipe again and examine the output to see
+Run was successful!
+
+
+
+
+
+
Available recipe and diagnostic scripts
+
+
ESMValTool provides a broad suite of recipes
+and diagnostic scripts for different disciplines like atmosphere,
+climate metrics, future projections, IPCC, land, ocean, ….
+
+
+
+
+
Re-running a diagnostic
+
Look at the main_log.txt file and answer the following
+question: How to re-run the diagnostic script?
+
+
Solution
+
The main_log.txt file contains information on how to
+re-run the diagnostic script without re-running the pre-processors:
+
+
+
+
BASH
+
+
2020-06-29 20:36:32,844 UTC [52810] INFO To re-run this diagnostic script, run:
+
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
If you run the command in a terminal, you will be able to re-run the
+diagnostic.
+
+
+
+
+
+
+
+
+
+
Memory issues
+
+
If you run out of memory, try setting max_parallel_tasks
+to 1 in the configuration file. Then, check the amount of memory you
+need for that recipe by inspecting the file run/resource_usage.txt
+in the output directory. Using that number, you can then increase the
+number of parallel tasks again to a value that is reasonable for the amount of
+memory available on your system.
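+
For example, in the user configuration file (a sketch; see the
+Configuration lesson for the full list of settings):
+
+
YAML
+
+
max_parallel_tasks: 1
+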
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
There are three different kinds of log files:
+main_log.txt, main_log_debug.txt and
+log.txt.
');
+ },
+
+ createChildNavList: function($parent) {
+ var $childList = this.createNavList();
+ $parent.append($childList);
+ return $childList;
+ },
+
+ generateNavEl: function(anchor, text) {
+    var $a = $('<a></a>');
+ $a.attr('href', '#' + anchor);
+ $a.text(text);
+    var $li = $('<li></li>');
+ $li.append($a);
+ return $li;
+ },
+
+ generateNavItem: function(headingEl) {
+ var anchor = this.generateAnchor(headingEl);
+ var $heading = $(headingEl);
+ var text = $heading.data('toc-text') || $heading.text();
+ return this.generateNavEl(anchor, text);
+ },
+
+  // Find the first heading level (`<h1>`, then `<h2>`, etc.) that has more than one element. Defaults to 1 (for `<h1>`).
+ getTopLevel: function($scope) {
+ for (var i = 1; i <= 6; i++) {
+ var $headings = this.findOrFilter($scope, 'h' + i);
+ if ($headings.length > 1) {
+ return i;
+ }
+ }
+
+ return 1;
+ },
+
+ // returns the elements for the top level, and the next below it
+ getHeadings: function($scope, topLevel) {
+ var topSelector = 'h' + topLevel;
+
+ var secondaryLevel = topLevel + 1;
+ var secondarySelector = 'h' + secondaryLevel;
+
+ return this.findOrFilter($scope, topSelector + ',' + secondarySelector);
+ },
+
+ getNavLevel: function(el) {
+ return parseInt(el.tagName.charAt(1), 10);
+ },
+
+ populateNav: function($topContext, topLevel, $headings) {
+ var $context = $topContext;
+ var $prevNav;
+
+ var helpers = this;
+ $headings.each(function(i, el) {
+ var $newNav = helpers.generateNavItem(el);
+ var navLevel = helpers.getNavLevel(el);
+
+ // determine the proper $context
+ if (navLevel === topLevel) {
+ // use top level
+ $context = $topContext;
+ } else if ($prevNav && $context === $topContext) {
+ // create a new level of the tree and switch to it
+ $context = helpers.createChildNavList($prevNav);
+ } // else use the current $context
+
+ $context.append($newNav);
+
+ $prevNav = $newNav;
+ });
+ },
+
+ parseOps: function(arg) {
+ var opts;
+ if (arg.jquery) {
+ opts = {
+ $nav: arg
+ };
+ } else {
+ opts = arg;
+ }
+ opts.$scope = opts.$scope || $(document.body);
+ return opts;
+ }
+ },
+
+ // accepts a jQuery object, or an options object
+ init: function(opts) {
+ opts = this.helpers.parseOps(opts);
+
+ // ensure that the data attribute is in place for styling
+ opts.$nav.attr('data-toggle', 'toc');
+
+ var $topContext = this.helpers.createChildNavList(opts.$nav);
+ var topLevel = this.helpers.getTopLevel(opts.$scope);
+ var $headings = this.helpers.getHeadings(opts.$scope, topLevel);
+ this.helpers.populateNav($topContext, topLevel, $headings);
+ }
+ };
+
+ $(function() {
+ $('nav[data-toggle="toc"]').each(function(i, el) {
+ var $nav = $(el);
+ Toc.init($nav);
+ });
+ });
+})();
diff --git a/config.yaml b/config.yaml
new file mode 100644
index 00000000..d2638abe
--- /dev/null
+++ b/config.yaml
@@ -0,0 +1,98 @@
+#------------------------------------------------------------
+# Values for this lesson.
+#------------------------------------------------------------
+
+# Which carpentry is this (swc, dc, lc, or cp)?
+# swc: Software Carpentry
+# dc: Data Carpentry
+# lc: Library Carpentry
+# cp: Carpentries (to use for instructor training for instance)
+# incubator: The Carpentries Incubator
+# Note that you can also use a custom carpentry type. For more info,
+# see the documentation: https://carpentries.github.io/sandpaper-docs/editing.html
+carpentry: 'incubator'
+
+# Custom carpentry description
+# This will be used as the alt text for the logo
+# carpentry_description: "Custom Carpentry"
+
+# Overall title for pages.
+title: 'ESMValTool Tutorial'
+
+# Date the lesson was created (YYYY-MM-DD, this is empty by default)
+created: 2020-02-26
+
+# Comma-separated list of keywords for the lesson
+keywords: 'software, data, lesson, The Carpentries'
+
+# Life cycle stage of the lesson
+# possible values: pre-alpha, alpha, beta, stable
+life_cycle: 'beta'
+
+# License of the lesson materials (recommended CC-BY 4.0)
+license: 'CC-BY 4.0'
+
+# Link to the source repository for this lesson
+source: 'https://github.com/ESMValGroup/ESMValTool_Tutorial'
+
+# Default branch of your lesson
+branch: 'main'
+
+# Who to contact if there are any issues
+contact: 'esmvaltool_user_engagement_team@listserv.dfn.de'
+
+# Navigation ------------------------------------------------
+#
+# Use the following menu items to specify the order of
+# individual pages in each dropdown section. Leave blank to
+# include all pages in the folder.
+#
+# Example -------------
+#
+# episodes:
+# - introduction.md
+# - first-steps.md
+#
+# learners:
+# - setup.md
+#
+# instructors:
+# - instructor-notes.md
+#
+# profiles:
+# - one-learner.md
+# - another-learner.md
+
+# Order of episodes in your lesson
+episodes:
+- 00-introduction.md
+- 01-quickstart.md
+- 02-installation.md
+- 03-configuration.md
+- 04-recipe.md
+- 05-conclusions.md
+- 06-preprocessor.md
+- 07-development-setup.md
+- 08-diagnostics.md
+- 09-cmorization.md
+- 10-debugging.md
+
+# Information for Learners
+learners:
+
+# Information for Instructors
+instructors:
+
+# Learner Profiles
+profiles:
+
+# Customisation ---------------------------------------------
+#
+# This space below is where custom yaml items (e.g. pinning
+# sandpaper and varnish versions) should live
+
+
+carpentry_description: Lesson Description
+url: 'https://ESMValGroup.github.io/ESMValTool_Tutorial'
+analytics: carpentries
+lang: en
diff --git a/data/dataset.urls b/data/dataset.urls
new file mode 100644
index 00000000..966fe43d
--- /dev/null
+++ b/data/dataset.urls
@@ -0,0 +1,6 @@
+# Dataset urls required for the tutorial, described at https://tutorial.esmvaltool.org/setup.html#using-your-own-machine
+# For episodes 4 and 5
+http://esgf2.dkrz.de/thredds/fileServer/lta_dataroot/cmip5/output1/BCC/bcc-csm1-1/historical/mon/atmos/Amon/r1i1p1/v1/tas/tas_Amon_bcc-csm1-1_historical_r1i1p1_185001-201212.nc
+http://esgf2.dkrz.de/thredds/fileServer/lta_dataroot/cmip5/output1/BCC/bcc-csm1-1/historical/fx/atmos/fx/r0i0p0/v1/areacella/areacella_fx_bcc-csm1-1_historical_r0i0p0.nc
+http://esgf3.dkrz.de/thredds/fileServer/cmip6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/tas/gn/v20181214/tas_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc
+http://esgf3.dkrz.de/thredds/fileServer/cmip6/CMIP/BCC/BCC-ESM1/1pctCO2/r1i1p1f1/fx/areacella/gn/v20190613/areacella_fx_BCC-ESM1_1pctCO2_r1i1p1f1_gn.nc
diff --git a/docsearch.css b/docsearch.css
new file mode 100644
index 00000000..e5f1fe1d
--- /dev/null
+++ b/docsearch.css
@@ -0,0 +1,148 @@
+/* Docsearch -------------------------------------------------------------- */
+/*
+ Source: https://github.com/algolia/docsearch/
+ License: MIT
+*/
+
+.algolia-autocomplete {
+ display: block;
+ -webkit-box-flex: 1;
+ -ms-flex: 1;
+ flex: 1
+}
+
+.algolia-autocomplete .ds-dropdown-menu {
+ width: 100%;
+ min-width: none;
+ max-width: none;
+ padding: .75rem 0;
+ background-color: #fff;
+ background-clip: padding-box;
+ border: 1px solid rgba(0, 0, 0, .1);
+ box-shadow: 0 .5rem 1rem rgba(0, 0, 0, .175);
+}
+
+@media (min-width:768px) {
+ .algolia-autocomplete .ds-dropdown-menu {
+ width: 175%
+ }
+}
+
+.algolia-autocomplete .ds-dropdown-menu::before {
+ display: none
+}
+
+.algolia-autocomplete .ds-dropdown-menu [class^=ds-dataset-] {
+ padding: 0;
+ background-color: rgb(255,255,255);
+ border: 0;
+ max-height: 80vh;
+}
+
+.algolia-autocomplete .ds-dropdown-menu .ds-suggestions {
+ margin-top: 0
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion {
+ padding: 0;
+ overflow: visible
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--category-header {
+ padding: .125rem 1rem;
+ margin-top: 0;
+ font-size: 1.3em;
+ font-weight: 500;
+ color: #00008B;
+ border-bottom: 0
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--wrapper {
+ float: none;
+ padding-top: 0
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--subcategory-column {
+ float: none;
+ width: auto;
+ padding: 0;
+ text-align: left
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--content {
+ float: none;
+ width: auto;
+ padding: 0
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--content::before {
+ display: none
+}
+
+.algolia-autocomplete .ds-suggestion:not(:first-child) .algolia-docsearch-suggestion--category-header {
+ padding-top: .75rem;
+ margin-top: .75rem;
+ border-top: 1px solid rgba(0, 0, 0, .1)
+}
+
+.algolia-autocomplete .ds-suggestion .algolia-docsearch-suggestion--subcategory-column {
+ display: block;
+ padding: .1rem 1rem;
+ margin-bottom: 0.1;
+ font-size: 1.0em;
+ font-weight: 400
+ /* display: none */
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--title {
+ display: block;
+ padding: .25rem 1rem;
+ margin-bottom: 0;
+ font-size: 0.9em;
+ font-weight: 400
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--text {
+ padding: 0 1rem .5rem;
+ margin-top: -.25rem;
+ font-size: 0.8em;
+ font-weight: 400;
+ line-height: 1.25
+}
+
+.algolia-autocomplete .algolia-docsearch-footer {
+ width: 110px;
+ height: 20px;
+ z-index: 3;
+ margin-top: 10.66667px;
+ float: right;
+ font-size: 0;
+ line-height: 0;
+}
+
+.algolia-autocomplete .algolia-docsearch-footer--logo {
+ background-image: url("data:image/svg+xml;utf8,");
+ background-repeat: no-repeat;
+ background-position: 50%;
+ background-size: 100%;
+ overflow: hidden;
+ text-indent: -9000px;
+ width: 100%;
+ height: 100%;
+ display: block;
+ transform: translate(-8px);
+}
+
+.algolia-autocomplete .algolia-docsearch-suggestion--highlight {
+ color: #FF8C00;
+ background: rgba(232, 189, 54, 0.1)
+}
+
+
+.algolia-autocomplete .algolia-docsearch-suggestion--text .algolia-docsearch-suggestion--highlight {
+ box-shadow: inset 0 -2px 0 0 rgba(105, 105, 105, .5)
+}
+
+.algolia-autocomplete .ds-suggestion.ds-cursor .algolia-docsearch-suggestion--content {
+ background-color: rgba(192, 192, 192, .15)
+}
diff --git a/docsearch.js b/docsearch.js
new file mode 100644
index 00000000..b35504cd
--- /dev/null
+++ b/docsearch.js
@@ -0,0 +1,85 @@
+$(function() {
+
+ // register a handler to move the focus to the search bar
+ // upon pressing shift + "/" (i.e. "?")
+ $(document).on('keydown', function(e) {
+ if (e.shiftKey && e.keyCode == 191) {
+ e.preventDefault();
+ $("#search-input").focus();
+ }
+ });
+
+ $(document).ready(function() {
+ // do keyword highlighting
+ /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */
+ var mark = function() {
+
+ var referrer = document.URL ;
+ var paramKey = "q" ;
+
+ if (referrer.indexOf("?") !== -1) {
+ var qs = referrer.substr(referrer.indexOf('?') + 1);
+ var qs_noanchor = qs.split('#')[0];
+ var qsa = qs_noanchor.split('&');
+ var keyword = "";
+
+ for (var i = 0; i < qsa.length; i++) {
+ var currentParam = qsa[i].split('=');
+
+ if (currentParam.length !== 2) {
+ continue;
+ }
+
+ if (currentParam[0] == paramKey) {
+ keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20"));
+ }
+ }
+
+ if (keyword !== "") {
+ $(".contents").unmark({
+ done: function() {
+ $(".contents").mark(keyword);
+ }
+ });
+ }
+ }
+ };
+
+ mark();
+ });
+});
+
+/* Search term highlighting ------------------------------*/
+
+function matchedWords(hit) {
+ var words = [];
+
+ var hierarchy = hit._highlightResult.hierarchy;
+ // loop to fetch from lvl0, lvl1, etc.
+ for (var idx in hierarchy) {
+ words = words.concat(hierarchy[idx].matchedWords);
+ }
+
+ var content = hit._highlightResult.content;
+ if (content) {
+ words = words.concat(content.matchedWords);
+ }
+
+ // return unique words
+ var words_uniq = [...new Set(words)];
+ return words_uniq;
+}
+
+function updateHitURL(hit) {
+
+ var words = matchedWords(hit);
+ var url = "";
+
+ if (hit.anchor) {
+ url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor;
+ } else {
+ url = hit.url + '?q=' + escape(words.join(" "));
+ }
+
+ return url;
+}
diff --git a/favicon-16x16.png b/favicon-16x16.png
new file mode 100644
index 00000000..d44f8acb
Binary files /dev/null and b/favicon-16x16.png differ
diff --git a/favicon-32x32.png b/favicon-32x32.png
new file mode 100644
index 00000000..63441d4c
Binary files /dev/null and b/favicon-32x32.png differ
diff --git a/favicons/cp/apple-touch-icon-114x114.png b/favicons/cp/apple-touch-icon-114x114.png
new file mode 100644
index 00000000..a60b7581
Binary files /dev/null and b/favicons/cp/apple-touch-icon-114x114.png differ
diff --git a/favicons/cp/apple-touch-icon-120x120.png b/favicons/cp/apple-touch-icon-120x120.png
new file mode 100644
index 00000000..8f20a8f1
Binary files /dev/null and b/favicons/cp/apple-touch-icon-120x120.png differ
diff --git a/favicons/cp/apple-touch-icon-144x144.png b/favicons/cp/apple-touch-icon-144x144.png
new file mode 100644
index 00000000..4be151b1
Binary files /dev/null and b/favicons/cp/apple-touch-icon-144x144.png differ
diff --git a/favicons/cp/apple-touch-icon-152x152.png b/favicons/cp/apple-touch-icon-152x152.png
new file mode 100644
index 00000000..7d1d9439
Binary files /dev/null and b/favicons/cp/apple-touch-icon-152x152.png differ
diff --git a/favicons/cp/apple-touch-icon-57x57.png b/favicons/cp/apple-touch-icon-57x57.png
new file mode 100644
index 00000000..92309cef
Binary files /dev/null and b/favicons/cp/apple-touch-icon-57x57.png differ
diff --git a/favicons/cp/apple-touch-icon-60x60.png b/favicons/cp/apple-touch-icon-60x60.png
new file mode 100644
index 00000000..de8148e5
Binary files /dev/null and b/favicons/cp/apple-touch-icon-60x60.png differ
diff --git a/favicons/cp/apple-touch-icon-72x72.png b/favicons/cp/apple-touch-icon-72x72.png
new file mode 100644
index 00000000..81d7e3d8
Binary files /dev/null and b/favicons/cp/apple-touch-icon-72x72.png differ
diff --git a/favicons/cp/apple-touch-icon-76x76.png b/favicons/cp/apple-touch-icon-76x76.png
new file mode 100644
index 00000000..15bca5c7
Binary files /dev/null and b/favicons/cp/apple-touch-icon-76x76.png differ
diff --git a/favicons/cp/favicon-128.png b/favicons/cp/favicon-128.png
new file mode 100644
index 00000000..e612cdc1
Binary files /dev/null and b/favicons/cp/favicon-128.png differ
diff --git a/favicons/cp/favicon-16x16.png b/favicons/cp/favicon-16x16.png
new file mode 100644
index 00000000..65b33111
Binary files /dev/null and b/favicons/cp/favicon-16x16.png differ
diff --git a/favicons/cp/favicon-196x196.png b/favicons/cp/favicon-196x196.png
new file mode 100644
index 00000000..0da938b2
Binary files /dev/null and b/favicons/cp/favicon-196x196.png differ
diff --git a/favicons/cp/favicon-32x32.png b/favicons/cp/favicon-32x32.png
new file mode 100644
index 00000000..0c1442e3
Binary files /dev/null and b/favicons/cp/favicon-32x32.png differ
diff --git a/favicons/cp/favicon-96x96.png b/favicons/cp/favicon-96x96.png
new file mode 100644
index 00000000..bed74ec8
Binary files /dev/null and b/favicons/cp/favicon-96x96.png differ
diff --git a/favicons/cp/favicon.ico b/favicons/cp/favicon.ico
new file mode 100644
index 00000000..4f2f2f11
Binary files /dev/null and b/favicons/cp/favicon.ico differ
diff --git a/favicons/cp/mstile-144x144.png b/favicons/cp/mstile-144x144.png
new file mode 100644
index 00000000..4be151b1
Binary files /dev/null and b/favicons/cp/mstile-144x144.png differ
diff --git a/favicons/cp/mstile-150x150.png b/favicons/cp/mstile-150x150.png
new file mode 100644
index 00000000..bf7ad5e7
Binary files /dev/null and b/favicons/cp/mstile-150x150.png differ
diff --git a/favicons/cp/mstile-310x150.png b/favicons/cp/mstile-310x150.png
new file mode 100644
index 00000000..6ac80484
Binary files /dev/null and b/favicons/cp/mstile-310x150.png differ
diff --git a/favicons/cp/mstile-310x310.png b/favicons/cp/mstile-310x310.png
new file mode 100644
index 00000000..b7781475
Binary files /dev/null and b/favicons/cp/mstile-310x310.png differ
diff --git a/favicons/cp/mstile-70x70.png b/favicons/cp/mstile-70x70.png
new file mode 100644
index 00000000..e612cdc1
Binary files /dev/null and b/favicons/cp/mstile-70x70.png differ
diff --git a/favicons/dc/apple-touch-icon-114x114.png b/favicons/dc/apple-touch-icon-114x114.png
new file mode 100644
index 00000000..edafbda1
Binary files /dev/null and b/favicons/dc/apple-touch-icon-114x114.png differ
diff --git a/favicons/dc/apple-touch-icon-120x120.png b/favicons/dc/apple-touch-icon-120x120.png
new file mode 100644
index 00000000..ee145ec5
Binary files /dev/null and b/favicons/dc/apple-touch-icon-120x120.png differ
diff --git a/favicons/dc/apple-touch-icon-144x144.png b/favicons/dc/apple-touch-icon-144x144.png
new file mode 100644
index 00000000..bf507014
Binary files /dev/null and b/favicons/dc/apple-touch-icon-144x144.png differ
diff --git a/favicons/dc/apple-touch-icon-152x152.png b/favicons/dc/apple-touch-icon-152x152.png
new file mode 100644
index 00000000..bd596c81
Binary files /dev/null and b/favicons/dc/apple-touch-icon-152x152.png differ
diff --git a/favicons/dc/apple-touch-icon-57x57.png b/favicons/dc/apple-touch-icon-57x57.png
new file mode 100644
index 00000000..61c15273
Binary files /dev/null and b/favicons/dc/apple-touch-icon-57x57.png differ
diff --git a/favicons/dc/apple-touch-icon-60x60.png b/favicons/dc/apple-touch-icon-60x60.png
new file mode 100644
index 00000000..9daad363
Binary files /dev/null and b/favicons/dc/apple-touch-icon-60x60.png differ
diff --git a/favicons/dc/apple-touch-icon-72x72.png b/favicons/dc/apple-touch-icon-72x72.png
new file mode 100644
index 00000000..2069520f
Binary files /dev/null and b/favicons/dc/apple-touch-icon-72x72.png differ
diff --git a/favicons/dc/apple-touch-icon-76x76.png b/favicons/dc/apple-touch-icon-76x76.png
new file mode 100644
index 00000000..3db01ca7
Binary files /dev/null and b/favicons/dc/apple-touch-icon-76x76.png differ
diff --git a/favicons/dc/favicon-128.png b/favicons/dc/favicon-128.png
new file mode 100644
index 00000000..9e3de2a4
Binary files /dev/null and b/favicons/dc/favicon-128.png differ
diff --git a/favicons/dc/favicon-16x16.png b/favicons/dc/favicon-16x16.png
new file mode 100644
index 00000000..4c9f9b8c
Binary files /dev/null and b/favicons/dc/favicon-16x16.png differ
diff --git a/favicons/dc/favicon-196x196.png b/favicons/dc/favicon-196x196.png
new file mode 100644
index 00000000..588afc21
Binary files /dev/null and b/favicons/dc/favicon-196x196.png differ
diff --git a/favicons/dc/favicon-32x32.png b/favicons/dc/favicon-32x32.png
new file mode 100644
index 00000000..9c2ecbfb
Binary files /dev/null and b/favicons/dc/favicon-32x32.png differ
diff --git a/favicons/dc/favicon-96x96.png b/favicons/dc/favicon-96x96.png
new file mode 100644
index 00000000..ff13fc06
Binary files /dev/null and b/favicons/dc/favicon-96x96.png differ
diff --git a/favicons/dc/favicon.ico b/favicons/dc/favicon.ico
new file mode 100644
index 00000000..e4715f32
Binary files /dev/null and b/favicons/dc/favicon.ico differ
diff --git a/favicons/dc/mstile-144x144.png b/favicons/dc/mstile-144x144.png
new file mode 100644
index 00000000..bf507014
Binary files /dev/null and b/favicons/dc/mstile-144x144.png differ
diff --git a/favicons/dc/mstile-150x150.png b/favicons/dc/mstile-150x150.png
new file mode 100644
index 00000000..c5844cca
Binary files /dev/null and b/favicons/dc/mstile-150x150.png differ
diff --git a/favicons/dc/mstile-310x150.png b/favicons/dc/mstile-310x150.png
new file mode 100644
index 00000000..786813af
Binary files /dev/null and b/favicons/dc/mstile-310x150.png differ
diff --git a/favicons/dc/mstile-310x310.png b/favicons/dc/mstile-310x310.png
new file mode 100644
index 00000000..9580653c
Binary files /dev/null and b/favicons/dc/mstile-310x310.png differ
diff --git a/favicons/dc/mstile-70x70.png b/favicons/dc/mstile-70x70.png
new file mode 100644
index 00000000..9e3de2a4
Binary files /dev/null and b/favicons/dc/mstile-70x70.png differ
diff --git a/favicons/lc/apple-touch-icon-114x114.png b/favicons/lc/apple-touch-icon-114x114.png
new file mode 100644
index 00000000..6c83127c
Binary files /dev/null and b/favicons/lc/apple-touch-icon-114x114.png differ
diff --git a/favicons/lc/apple-touch-icon-120x120.png b/favicons/lc/apple-touch-icon-120x120.png
new file mode 100644
index 00000000..8334648f
Binary files /dev/null and b/favicons/lc/apple-touch-icon-120x120.png differ
diff --git a/favicons/lc/apple-touch-icon-144x144.png b/favicons/lc/apple-touch-icon-144x144.png
new file mode 100644
index 00000000..5f32151e
Binary files /dev/null and b/favicons/lc/apple-touch-icon-144x144.png differ
diff --git a/favicons/lc/apple-touch-icon-152x152.png b/favicons/lc/apple-touch-icon-152x152.png
new file mode 100644
index 00000000..4e5c177c
Binary files /dev/null and b/favicons/lc/apple-touch-icon-152x152.png differ
diff --git a/favicons/lc/apple-touch-icon-57x57.png b/favicons/lc/apple-touch-icon-57x57.png
new file mode 100644
index 00000000..61f9c9c7
Binary files /dev/null and b/favicons/lc/apple-touch-icon-57x57.png differ
diff --git a/favicons/lc/apple-touch-icon-60x60.png b/favicons/lc/apple-touch-icon-60x60.png
new file mode 100644
index 00000000..ccb5ada1
Binary files /dev/null and b/favicons/lc/apple-touch-icon-60x60.png differ
diff --git a/favicons/lc/apple-touch-icon-72x72.png b/favicons/lc/apple-touch-icon-72x72.png
new file mode 100644
index 00000000..517d459a
Binary files /dev/null and b/favicons/lc/apple-touch-icon-72x72.png differ
diff --git a/favicons/lc/apple-touch-icon-76x76.png b/favicons/lc/apple-touch-icon-76x76.png
new file mode 100644
index 00000000..17454b31
Binary files /dev/null and b/favicons/lc/apple-touch-icon-76x76.png differ
diff --git a/favicons/lc/favicon-128.png b/favicons/lc/favicon-128.png
new file mode 100644
index 00000000..9d781c90
Binary files /dev/null and b/favicons/lc/favicon-128.png differ
diff --git a/favicons/lc/favicon-16x16.png b/favicons/lc/favicon-16x16.png
new file mode 100644
index 00000000..3c20abcc
Binary files /dev/null and b/favicons/lc/favicon-16x16.png differ
diff --git a/favicons/lc/favicon-196x196.png b/favicons/lc/favicon-196x196.png
new file mode 100644
index 00000000..46baaf8f
Binary files /dev/null and b/favicons/lc/favicon-196x196.png differ
diff --git a/favicons/lc/favicon-32x32.png b/favicons/lc/favicon-32x32.png
new file mode 100644
index 00000000..ed6701ea
Binary files /dev/null and b/favicons/lc/favicon-32x32.png differ
diff --git a/favicons/lc/favicon-96x96.png b/favicons/lc/favicon-96x96.png
new file mode 100644
index 00000000..bc468c73
Binary files /dev/null and b/favicons/lc/favicon-96x96.png differ
diff --git a/favicons/lc/favicon.ico b/favicons/lc/favicon.ico
new file mode 100644
index 00000000..5c14e809
Binary files /dev/null and b/favicons/lc/favicon.ico differ
diff --git a/favicons/lc/mstile-144x144.png b/favicons/lc/mstile-144x144.png
new file mode 100644
index 00000000..5f32151e
Binary files /dev/null and b/favicons/lc/mstile-144x144.png differ
diff --git a/favicons/lc/mstile-150x150.png b/favicons/lc/mstile-150x150.png
new file mode 100644
index 00000000..924953a8
Binary files /dev/null and b/favicons/lc/mstile-150x150.png differ
diff --git a/favicons/lc/mstile-310x150.png b/favicons/lc/mstile-310x150.png
new file mode 100644
index 00000000..e4dcda44
Binary files /dev/null and b/favicons/lc/mstile-310x150.png differ
diff --git a/favicons/lc/mstile-310x310.png b/favicons/lc/mstile-310x310.png
new file mode 100644
index 00000000..a12c8763
Binary files /dev/null and b/favicons/lc/mstile-310x310.png differ
diff --git a/favicons/lc/mstile-70x70.png b/favicons/lc/mstile-70x70.png
new file mode 100644
index 00000000..9d781c90
Binary files /dev/null and b/favicons/lc/mstile-70x70.png differ
diff --git a/favicons/swc/apple-touch-icon-114x114.png b/favicons/swc/apple-touch-icon-114x114.png
new file mode 100644
index 00000000..e5125f8c
Binary files /dev/null and b/favicons/swc/apple-touch-icon-114x114.png differ
diff --git a/favicons/swc/apple-touch-icon-120x120.png b/favicons/swc/apple-touch-icon-120x120.png
new file mode 100644
index 00000000..0f97a0ae
Binary files /dev/null and b/favicons/swc/apple-touch-icon-120x120.png differ
diff --git a/favicons/swc/apple-touch-icon-144x144.png b/favicons/swc/apple-touch-icon-144x144.png
new file mode 100644
index 00000000..7441446c
Binary files /dev/null and b/favicons/swc/apple-touch-icon-144x144.png differ
diff --git a/favicons/swc/apple-touch-icon-152x152.png b/favicons/swc/apple-touch-icon-152x152.png
new file mode 100644
index 00000000..45cc338e
Binary files /dev/null and b/favicons/swc/apple-touch-icon-152x152.png differ
diff --git a/favicons/swc/apple-touch-icon-57x57.png b/favicons/swc/apple-touch-icon-57x57.png
new file mode 100644
index 00000000..e180a4a3
Binary files /dev/null and b/favicons/swc/apple-touch-icon-57x57.png differ
diff --git a/favicons/swc/apple-touch-icon-60x60.png b/favicons/swc/apple-touch-icon-60x60.png
new file mode 100644
index 00000000..c96fd6ce
Binary files /dev/null and b/favicons/swc/apple-touch-icon-60x60.png differ
diff --git a/favicons/swc/apple-touch-icon-72x72.png b/favicons/swc/apple-touch-icon-72x72.png
new file mode 100644
index 00000000..aae014aa
Binary files /dev/null and b/favicons/swc/apple-touch-icon-72x72.png differ
diff --git a/favicons/swc/apple-touch-icon-76x76.png b/favicons/swc/apple-touch-icon-76x76.png
new file mode 100644
index 00000000..2167f94a
Binary files /dev/null and b/favicons/swc/apple-touch-icon-76x76.png differ
diff --git a/favicons/swc/favicon-128.png b/favicons/swc/favicon-128.png
new file mode 100644
index 00000000..f61df620
Binary files /dev/null and b/favicons/swc/favicon-128.png differ
diff --git a/favicons/swc/favicon-16x16.png b/favicons/swc/favicon-16x16.png
new file mode 100644
index 00000000..2d20a406
Binary files /dev/null and b/favicons/swc/favicon-16x16.png differ
diff --git a/favicons/swc/favicon-196x196.png b/favicons/swc/favicon-196x196.png
new file mode 100644
index 00000000..2a20d3a6
Binary files /dev/null and b/favicons/swc/favicon-196x196.png differ
diff --git a/favicons/swc/favicon-32x32.png b/favicons/swc/favicon-32x32.png
new file mode 100644
index 00000000..f622b73a
Binary files /dev/null and b/favicons/swc/favicon-32x32.png differ
diff --git a/favicons/swc/favicon-96x96.png b/favicons/swc/favicon-96x96.png
new file mode 100644
index 00000000..5e57f66a
Binary files /dev/null and b/favicons/swc/favicon-96x96.png differ
diff --git a/favicons/swc/favicon.ico b/favicons/swc/favicon.ico
new file mode 100644
index 00000000..f771790f
Binary files /dev/null and b/favicons/swc/favicon.ico differ
diff --git a/favicons/swc/mstile-144x144.png b/favicons/swc/mstile-144x144.png
new file mode 100644
index 00000000..7441446c
Binary files /dev/null and b/favicons/swc/mstile-144x144.png differ
diff --git a/favicons/swc/mstile-150x150.png b/favicons/swc/mstile-150x150.png
new file mode 100644
index 00000000..d1594bcb
Binary files /dev/null and b/favicons/swc/mstile-150x150.png differ
diff --git a/favicons/swc/mstile-310x150.png b/favicons/swc/mstile-310x150.png
new file mode 100644
index 00000000..f7d58b2b
Binary files /dev/null and b/favicons/swc/mstile-310x150.png differ
diff --git a/favicons/swc/mstile-310x310.png b/favicons/swc/mstile-310x310.png
new file mode 100644
index 00000000..b632b421
Binary files /dev/null and b/favicons/swc/mstile-310x310.png differ
diff --git a/favicons/swc/mstile-70x70.png b/favicons/swc/mstile-70x70.png
new file mode 100644
index 00000000..f61df620
Binary files /dev/null and b/favicons/swc/mstile-70x70.png differ
diff --git a/fig/MultipleModels__thetaoga_prep_timeseries_diag_timeseries_temperature_1900_2000_timeseries_.png b/fig/MultipleModels__thetaoga_prep_timeseries_diag_timeseries_temperature_1900_2000_timeseries_.png
new file mode 100644
index 00000000..0554abf0
Binary files /dev/null and b/fig/MultipleModels__thetaoga_prep_timeseries_diag_timeseries_temperature_1900_2000_timeseries_.png differ
diff --git a/fig/data_flow.png b/fig/data_flow.png
new file mode 100644
index 00000000..21445d06
Binary files /dev/null and b/fig/data_flow.png differ
diff --git a/fig/data_flow.pptx b/fig/data_flow.pptx
new file mode 100644
index 00000000..11edf0e2
Binary files /dev/null and b/fig/data_flow.pptx differ
diff --git a/fig/diag_CMIP5_HadGEM2-ES_Omon_historical_r1i1p1_thetaoga_prep_timeseries_diag_timeseries_temperature_1900_2000_timeseries_0.png b/fig/diag_CMIP5_HadGEM2-ES_Omon_historical_r1i1p1_thetaoga_prep_timeseries_diag_timeseries_temperature_1900_2000_timeseries_0.png
new file mode 100644
index 00000000..3ad21872
Binary files /dev/null and b/fig/diag_CMIP5_HadGEM2-ES_Omon_historical_r1i1p1_thetaoga_prep_timeseries_diag_timeseries_temperature_1900_2000_timeseries_0.png differ
diff --git a/fig/esmvaltool_architecture.png b/fig/esmvaltool_architecture.png
new file mode 100644
index 00000000..43167b33
Binary files /dev/null and b/fig/esmvaltool_architecture.png differ
diff --git a/fig/warming_stripes.png b/fig/warming_stripes.png
new file mode 100644
index 00000000..f8b5551b
Binary files /dev/null and b/fig/warming_stripes.png differ
diff --git a/files/esgf-pyclient.yml b/files/esgf-pyclient.yml
new file mode 100644
index 00000000..688d69e8
--- /dev/null
+++ b/files/esgf-pyclient.yml
@@ -0,0 +1,15 @@
+# If you're using ESMValTool 2.6.0 or 2.7.0, you can copy this file to
+# ~/.esmvaltool/esgf-pyclient.yml to use the current URL for the CEDA ESGF
+# index node. See
+# https://docs.esmvaltool.org/projects/ESMValCore/en/v2.7.0/quickstart/configure.html#configuration-file
+# for more information on the use of this file.
+search_connection:
+ urls:
+ - 'https://esgf.ceda.ac.uk/esg-search'
+ - 'https://esgf-node.llnl.gov/esg-search'
+ - 'https://esgf-data.dkrz.de/esg-search'
+ - 'https://esgf-node.ipsl.upmc.fr/esg-search'
+ - 'https://esg-dn1.nsc.liu.se/esg-search'
+ - 'https://esgf.nci.org.au/esg-search'
+ - 'https://esgf.nccs.nasa.gov/esg-search'
+ - 'https://esgdata.gfdl.noaa.gov/esg-search'
diff --git a/files/recipe_check_fluxcom.yml b/files/recipe_check_fluxcom.yml
new file mode 100644
index 00000000..1253196b
--- /dev/null
+++ b/files/recipe_check_fluxcom.yml
@@ -0,0 +1,21 @@
+documentation:
+
+ description: Test recipe for FLUXCOM data
+
+ title: This is a test recipe for the FLUXCOM data.
+
+ authors:
+ - kalverla_peter
+
+ maintainer:
+ - kalverla_peter
+
+datasets:
+ - {project: OBS6, dataset: FLUXCOM, mip: Lmon, tier: 3, start_year: 2000, end_year: 2000, type: reanaly, version: ANN-v1}
+
+diagnostics:
+ check_fluxcom:
+ description: Check that ESMValTool can load the cmorized fluxnet data without errors.
+ variables:
+ gpp:
+ scripts: null
diff --git a/files/recipe_warming_stripes.yml b/files/recipe_warming_stripes.yml
new file mode 100644
index 00000000..6eb48e6c
--- /dev/null
+++ b/files/recipe_warming_stripes.yml
@@ -0,0 +1,40 @@
+# ESMValTool
+# recipe_warming_stripes.yml
+---
+documentation:
+ description: Reproducing Ed Hawkins' warming stripes visualization
+ title: Reproducing Ed Hawkins' warming stripes visualization.
+ authors:
+ - righi_mattia
+ maintainer:
+ - righi_mattia
+
+datasets:
+ - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: r1i1p1f1, grid: gn, start_year: 1850, end_year: 2014}
+
+preprocessors:
+ global_anomalies:
+ area_statistics:
+ operator: mean
+ anomalies:
+ period: month
+ reference:
+ start_year: 1981
+ start_month: 1
+ start_day: 1
+ end_year: 2010
+ end_month: 12
+ end_day: 31
+ standardize: false
+
+diagnostics:
+ diagnostic_warming_stripes:
+ description: visualize global temperature anomalies as warming stripes
+ variables:
+ global_temperature_anomalies:
+ short_name: tas
+ preprocessor: global_anomalies
+ scripts:
+ warming_stripes_script:
+ script: ~/esmvaltool_tutorial/warming_stripes.py
+ colormap: 'bwr'
diff --git a/files/recipe_warming_stripes_additional_datasets.yml b/files/recipe_warming_stripes_additional_datasets.yml
new file mode 100644
index 00000000..1bad5f96
--- /dev/null
+++ b/files/recipe_warming_stripes_additional_datasets.yml
@@ -0,0 +1,54 @@
+# ESMValTool
+# recipe_warming_stripes_additional_datasets.yml
+---
+documentation:
+ description: Reproducing Ed Hawkins' warming stripes visualization
+ title: Reproducing Ed Hawkins' warming stripes visualization.
+ authors:
+ - righi_mattia
+
+datasets:
+ - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: r1i1p1f1, grid: gn}
+
+preprocessors:
+ anomalies_amsterdam:
+ extract_point:
+ latitude: 52.379189
+ longitude: 4.899431
+ scheme: linear
+ anomalies: &anomalies
+ period: month
+ reference:
+ start_year: 1981
+ start_month: 1
+ start_day: 1
+ end_year: 2010
+ end_month: 12
+ end_day: 31
+ standardize: false
+ anomalies_london:
+ extract_point:
+ latitude: 51.5074
+ longitude: 0.1278
+ scheme: linear
+ anomalies: *anomalies
+
+diagnostics:
+ diagnostic_warming_stripes:
+ variables:
+ temperature_anomalies_recent_amsterdam:
+ short_name: tas
+ preprocessor: anomalies_amsterdam
+ start_year: 1950
+ end_year: 2014
+ temperature_anomalies_20th_century_london:
+ short_name: tas
+ preprocessor: anomalies_london
+ start_year: 1900
+ end_year: 1999
+ additional_datasets:
+ - {dataset: CanESM2, project: CMIP5, mip: Amon, exp: historical, ensemble: r1i1p1}
+ scripts:
+ warming_stripes_script:
+ script: ~/esmvaltool_tutorial/warming_stripes.py
+ colormap: 'bwr'
diff --git a/files/recipe_warming_stripes_local.yml b/files/recipe_warming_stripes_local.yml
new file mode 100644
index 00000000..cd19bcbc
--- /dev/null
+++ b/files/recipe_warming_stripes_local.yml
@@ -0,0 +1,39 @@
+# ESMValTool
+# recipe_warming_stripes_local.yml
+---
+documentation:
+ description: Reproducing Ed Hawkins' warming stripes visualization
+ title: Reproducing Ed Hawkins' warming stripes visualization.
+ authors:
+ - righi_mattia
+
+datasets:
+ - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: r1i1p1f1, grid: gn, start_year: 1850, end_year: 2014}
+
+preprocessors:
+ anomalies_amsterdam:
+ extract_point:
+ latitude: 52.379189
+ longitude: 4.899431
+ scheme: linear
+ anomalies:
+ period: month
+ reference:
+ start_year: 1981
+ start_month: 1
+ start_day: 1
+ end_year: 2010
+ end_month: 12
+ end_day: 31
+ standardize: false
+
+diagnostics:
+ diagnostic_warming_stripes:
+ variables:
+ temperature_anomalies_amsterdam:
+ short_name: tas
+ preprocessor: anomalies_amsterdam
+ scripts:
+ warming_stripes_script:
+ script: ~/esmvaltool_tutorial/warming_stripes.py
+ colormap: 'bwr'
diff --git a/files/recipe_warming_stripes_multiple_ensemble_members.yml b/files/recipe_warming_stripes_multiple_ensemble_members.yml
new file mode 100644
index 00000000..a2fce156
--- /dev/null
+++ b/files/recipe_warming_stripes_multiple_ensemble_members.yml
@@ -0,0 +1,37 @@
+# ESMValTool
+# recipe_warming_stripes.yml
+---
+documentation:
+ description: Reproducing Ed Hawkins' warming stripes visualization
+ title: Reproducing Ed Hawkins' warming stripes visualization.
+ authors:
+ - righi_mattia
+
+datasets:
+ - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: "r(1:2)i1p1f1", grid: gn, start_year: 1850, end_year: 2014}
+
+preprocessors:
+ global_anomalies:
+ area_statistics:
+ operator: mean
+ anomalies:
+ period: month
+ reference:
+ start_year: 1981
+ start_month: 1
+ start_day: 1
+ end_year: 2010
+ end_month: 12
+ end_day: 31
+ standardize: false
+
+diagnostics:
+ diagnostic_warming_stripes:
+ variables:
+ global_temperature_anomalies:
+ short_name: tas
+ preprocessor: global_anomalies
+ scripts:
+ warming_stripes_script:
+ script: ~/esmvaltool_tutorial/warming_stripes.py
+ colormap: 'bwr'
diff --git a/files/recipe_warming_stripes_multiple_locations.yml b/files/recipe_warming_stripes_multiple_locations.yml
new file mode 100644
index 00000000..40a68959
--- /dev/null
+++ b/files/recipe_warming_stripes_multiple_locations.yml
@@ -0,0 +1,52 @@
+# ESMValTool
+# recipe_warming_stripes_multiple_locations.yml
+---
+documentation:
+ description: Reproducing Ed Hawkins' warming stripes visualization
+ title: Reproducing Ed Hawkins' warming stripes visualization.
+ authors:
+ - righi_mattia
+
+datasets:
+ - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: r1i1p1f1, grid: gn}
+
+preprocessors:
+ anomalies_amsterdam:
+ extract_point:
+ latitude: 52.379189
+ longitude: 4.899431
+ scheme: linear
+ anomalies: &anomalies
+ period: month
+ reference:
+ start_year: 1981
+ start_month: 1
+ start_day: 1
+ end_year: 2010
+ end_month: 12
+ end_day: 31
+ standardize: false
+ anomalies_london:
+ extract_point:
+ latitude: 51.5074
+ longitude: 0.1278
+ scheme: linear
+ anomalies: *anomalies
+
+diagnostics:
+ diagnostic_warming_stripes:
+ variables:
+ temperature_anomalies_recent_amsterdam:
+ short_name: tas
+ preprocessor: anomalies_amsterdam
+ start_year: 1950
+ end_year: 2014
+ temperature_anomalies_20th_century_london:
+ short_name: tas
+ preprocessor: anomalies_london
+ start_year: 1900
+ end_year: 1999
+ scripts:
+ warming_stripes_script:
+ script: ~/esmvaltool_tutorial/warming_stripes.py
+ colormap: 'bwr'
diff --git a/files/recipe_warming_stripes_periods.yml b/files/recipe_warming_stripes_periods.yml
new file mode 100644
index 00000000..4754489f
--- /dev/null
+++ b/files/recipe_warming_stripes_periods.yml
@@ -0,0 +1,46 @@
+# ESMValTool
+# recipe_warming_stripes_periods.yml
+---
+documentation:
+ description: Reproducing Ed Hawkins' warming stripes visualization
+ title: Reproducing Ed Hawkins' warming stripes visualization.
+ authors:
+ - righi_mattia
+
+datasets:
+ - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: r1i1p1f1, grid: gn}
+
+preprocessors:
+ anomalies_amsterdam:
+ extract_point:
+ latitude: 52.379189
+ longitude: 4.899431
+ scheme: linear
+ anomalies:
+ period: month
+ reference:
+ start_year: 1981
+ start_month: 1
+ start_day: 1
+ end_year: 2010
+ end_month: 12
+ end_day: 31
+ standardize: false
+
+diagnostics:
+ diagnostic_warming_stripes:
+ variables:
+ temperature_anomalies_recent:
+ short_name: tas
+ preprocessor: anomalies_amsterdam
+ start_year: 1950
+ end_year: 2014
+ temperature_anomalies_20th_century:
+ short_name: tas
+ preprocessor: anomalies_amsterdam
+ start_year: 1900
+ end_year: 1999
+ scripts:
+ warming_stripes_script:
+ script: ~/esmvaltool_tutorial/warming_stripes.py
+ colormap: 'bwr'
diff --git a/files/warming_stripes.py b/files/warming_stripes.py
new file mode 100644
index 00000000..27afd00e
--- /dev/null
+++ b/files/warming_stripes.py
@@ -0,0 +1,58 @@
+"""Python script that visualizes a timeseries of global temperature anomalies
+in the form of the popular warming stripes figure by Ed Hawkins.
+"""
+
+from pathlib import Path
+
+import iris
+import matplotlib.pyplot as plt
+import numpy as np
+
+from esmvaltool.diag_scripts.shared import run_diagnostic
+from esmvaltool.diag_scripts.shared._base import get_plot_filename
+
+
+def plot_warming_stripes(cube, name, cmap):
+ """Plot timeseries of global temperature as warming stripes.
+
+ cube: iris cube of temperature anomalies with only time dimension.
+ """
+ temperature = cube.data.data
+ nx = len(temperature)
+ x = np.arange(len(temperature))
+ y = np.array([0, 1])
+ temperature = np.vstack([temperature, temperature])
+
+ figure, axes = plt.subplots()
+ axes.pcolormesh(x, y, temperature, cmap=cmap, shading='auto')
+ axes.set_axis_off()
+ axes.set_title(f'Warming stripes for {name}')
+ return figure
+
+
+def main(cfg):
+ """Compute the time average for each input dataset."""
+ # Get a description of the preprocessed data that we will use as input.
+ input_data = cfg['input_data'].values()
+
+ # Example of how to loop over variables/datasets in alphabetical order
+ for dataset in input_data:
+ # Load the data
+ input_file = dataset['filename']
+ name = dataset['variable_group']
+ cube = iris.load_cube(input_file)
+
+ # Plot as warming stripes
+ cmap = cfg['colormap']
+ fig = plot_warming_stripes(cube, name, cmap)
+
+ # Save output
+ output_file = Path(input_file).stem.replace('tas', name)
+ output_path = get_plot_filename(output_file, cfg)
+ fig.savefig(output_path)
+
+
+if __name__ == '__main__':
+
+ with run_diagnostic() as config:
+ main(config)
diff --git a/glossary.html b/glossary.html
new file mode 100644
index 00000000..e3c29bac
--- /dev/null
+++ b/glossary.html
@@ -0,0 +1,537 @@
+
+ESMValTool Tutorial: Glossary
CMIP: The Coupled Model Intercomparison Project
+(CMIP) aims at better understanding past, present and future climate
+changes arising from natural, unforced variability or in response to
+changes in radiative forcing in a multi-model context. For more
+information, check the WCR-Climate webpage.
+
CMOR: CMOR stands for Climate Model
+Output Rewriter library. It comprises a set of C-based
+functions, with bindings to both Python and FORTRAN 90, that can be used
+to produce CF-compliant NetCDF files that fulfill the requirements of
+many of the climate community’s standard model experiments.
+
dataset: In a recipe, dataset refers to the name
+of the model or observation data, e.g. HadGEM2-ES.
+
diagnostic: In a recipe, the diagnostic
+section(s) define the tasks that will be executed when running the
+recipe. It can also include a diagnostic script that converts the
+preprocessed input data to the desired output like plots or NetCDF
+files.
+
ensemble: An ensemble is a collection of
+experiments based on standardised inputs. This ensures that ensemble
+calculations use the same initial states, initialisation methods, and
+physics details. Ensemble members are named in the rip-nomenclature,
+r for realization, i for initialisation and p
+for physics, followed by an integer, e.g. r1i1p1.
+
experiment: An experiment is defined as an
+activity aimed at addressing a specific scientific problem. In CMIP,
+experiments are numerical experiments with climate models.
+
grid: In the dataset section of a recipe, a grid
+is the array of coordinates on which a variable has been
+sampled.
+
mask: Masks are used to isolate regions. For
+example, if you apply an ocean mask, then coordinates in the ocean will
+not be taken into account for the preprocessors/diagnostics. Masks can
+also be used to work with regions that have missing or invalid
+entries.
+
mip: Typically used to refer to a table of
+variables in ESMValTool. These are standardized per project, e.g. the MIP
+tables for CMIP6.
+
Omon: Monthly ocean data.
+
pr: CMIP short-name for
+‘precipitation’.
+
preprocessor: A preprocessor is an operation
+that is applied to a dataset.
+
project: In the dataset section of a recipe,
+“project” typically refers to a standard experimental protocol for
+global models, e.g. CMIP.
+
RCP: Representative Concentration
+Pathways are scenarios that represent the full range of
+possible future emission trajectories. Depending on population growth
+and the development of energy production, food production and land use,
+various emission trajectories are possible.
+
recipe: A recipe contains the instructions to be
+carried out by the ESMValTool. This includes four main sections:
+documentation, datasets, preprocessors and diagnostics.
+
ta: CMIP short-name for ‘air
+temperature’.
+
tas: CMIP short-name for ‘near-surface air
+temperature’.
+
thetao: Sea water potential temperature variable
+(assume reference height is sea level).
+
thetaoga: Global average sea water potential
+temperature.
+
tos: CMIP short-name for ‘sea surface
+temperature’.
+
ts: CMIP short-name for ‘surface
+temperature’.
+
yaml: YAML is a human-readable
+data-serialization language. It is commonly used for configuration files
+and in applications where data is being stored or exchanged.
+
+
diff --git a/images.html b/images.html
new file mode 100644
index 00000000..451d0b31
--- /dev/null
+++ b/images.html
@@ -0,0 +1,521 @@
+
+
+
+
+
+ESMValTool Tutorial: All Images
diff --git a/index.html b/index.html
new file mode 100644
index 00000000..e125afd9
--- /dev/null
+++ b/index.html
@@ -0,0 +1,774 @@
+
+ESMValTool Tutorial: Summary and Setup
Summary and Setup
+
+
+
This tutorial helps you to use ESMValTool.
+
The Earth System Model Evaluation Tool (ESMValTool) is a community
+developed software toolkit that aims to facilitate the diagnosis and
+evaluation of the causes and effects of model biases and inter-model
+spread within the CMIP model ensemble.
+
This tutorial is structured into basic and
+advanced topics such that episodes starting from the
+[Introduction][lesson-introduction] up to the episode on Conclusion of the basic tutorial all
+cover basic topics and can be done in one sitting.
+
The remaining episodes cover the advanced topics and each episode is
+a mini-tutorial covering an advanced aspect of working with ESMValTool.
+These mini-tutorials can be appended to the main tutorial or worked
+through independently.
+
+
+
+
+
+
What will you learn in this course
+
+
What is ESMValTool
+
How to install ESMValTool
+
How to configure ESMValTool for your local system
+
How to run ESMValTool
+
How to work with ESMValTool’s suite of preprocessors
+
How to debug your recipes
+
How to access and deploy recipes from the ESMValTools gallery
+(Advanced)
+
How to develop your own diagnostics and recipes (Advanced)
+
How to contribute your recipes and diagnostics back into ESMValTool
+(Advanced)
+
How to include new observational datasets (Advanced)
Main things you need to know before starting this course
+
This tutorial can be taken online independently or taught by one
+of our instructors.
+
Don’t be alarmed if you can’t work through the entire tutorial in
+one sitting. It may take some time to get used to working with
+ESMValTool.
+
If you get stuck, help is always available from the tutors, from
+ESMValTool developers via the github issues
+page or via the ESMValTool email list. Please see information
+on how to subscribe to user mailing list.
+
This tutorial includes several advanced lessons after the
+conclusion. These advanced lessons should be treated like
+“mini-tutorials”, and include aspects like “developing your own
+diagnostic” or “how to include observations”.
This page includes some information on how to prepare for
+participating in this tutorial.
+
+
+
+
+
+
Prerequisites
+
+
Minimal requirements:
+
Basic understanding of your preferred command line interface (i.e. a
+bash terminal)
+
Access to CMIP data
+
Optional, but useful:
+
Basic understanding of git
+
Access to a suitable computing system (e.g. CEDA-Jasmin,
+DKRZ-Mistral)
+
GitHub account
+
+
+
+
Command line & git tutorials
+
We typically use the command line to interact with ESMValTool. While
+most of us are likely to have experience with the command line, novices
+may want to work through this software carpentry unix shell course.
Git is a distributed version-control system for tracking changes in
+source code during software development. It’s how we distribute, share,
+and manage the ESMValTool code.
Access to CMIP and Observational data and a suitable compute
+cluster
+
To complete this tutorial and use ESMValTool, you will need access to
+data in a reasonable format. Some data will be provided, but there is
+simply too much data available for your tutors to make it all available
+directly.
+
ESMValTool may be run on multiple platforms, from your local machine
+to large computing clusters. The best option is to use a computing
+cluster with an Earth System
+Grid Federation (ESGF) node. The benefit of using a compute cluster
+with an ESGF node is that the Coupled
+Model Intercomparison Project (CMIP) is locally stored on disk and
+accessible directly by the tool. Similarly, observational data would
+also be available at these sites. The ESGF also hosts observations for
+Model Intercomparison Projects (obs4MIPs) and reanalyses data
+(ana4MIPs).
+
Here are a few options for compute clusters with ESGF nodes:
Also note that if you are working from home, JASMIN may not be
+directly accessible from your home. You may need to use ssh to connect
+to a machine in your institute and then on to JASMIN. Please test your
+connection before the tutorial.
+
+
Jasmin-login
+
Note that you have only created an account for the web-interface. To
+log into the jasmin machine and do work, you’ll need to create a login
+account too using this
+page.
Once you have access to the data archive on CEDA, make sure to link
+your CEDA and JASMIN accounts. This can be done by checking the link to
+CEDA box on your JASMIN
+profile page.
+
The linking may take a few hours to take effect and is necessary for
+you to access the BADC archives via JASMIN. Some CMIP5 data sets such as
+MIROC are not accessible by default and special permission has to be
+requested to access them via the
+CEDA catalogue page.
+
+
+
Test your Setup
+
Log into jasmin-login:
+
+
BASH
+
+
ssh -X JASMIN-USERNAME@login1.jasmin.ac.uk
+
+
Then log into the sci1 machine:
+
+
BASH
+
+
ssh -X jasmin-sci1
+
+
Can you see the following locations:
+
+
BASH
+
+
ls /gws/nopw/j04/esmeval/obsdata-v2/
+ls /badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES
+ls /badc/cmip6/data/CMIP6/CMIP/*/*/historical/r1i1p1f?/Omon/[ts]os/gn/latest/*.nc
+
+
Note that JASMIN is only open to certain locations (mostly
+universities and research centres). You may need a VPN if you wish to
+connect from your home network.
Once you have access to the data archive on CEDA, make sure to link
+your CEDA and JASMIN accounts. This can be done by checking the link to
+CEDA box on your JASMIN
+profile page.
+
The linking may take a few hours to take effect and is necessary for
+you to access the BADC archives via JASMIN. Some CMIP5 data sets such as
+MIROC are not accessible by default and special permission has to be
+requested to access them via the
+CEDA catalogue page.
Please skip this section if you are not going to use DKRZ and go here.
+
If you do not already have an account at the DKRZ, then register as soon as
+possible. You can find a short introduction on how to get started at
+DKRZ here.
+
There is also a user manual for
+Levante which is DKRZ’s current supercomputer.
+
+
Join a project
+
To use the resources on DKRZ you have to join a project. One option
+is to join an existing project by logging into https://luv.dkrz.de/ with
+your account and select ‘Join existing project’. Once you are accepted
+by the manager of your chosen project, your web account will be turned
+into a full LDAP account which will allow you to log into and use the
+DKRZ’s resources. If you do not have access to an existing project,
+another option for you would be to apply for resources at DKRZ. Here are
+some instructions on how
+to apply for resources.
+
+
+
Access to data on DKRZ
+
CMIP5 and CMIP6 data are available in these directories:
Temporary data: scratch directory on /scratch/*/username is
+automatically deleted after 14 days (15TiB) (Please use this directory
+for all your testing! Do not use the work directory for tests.) (see
+also this)
+
Running batch jobs: Info and examples on SLURM job scheduling system
+at DKRZ can be found here.
Please skip this section if you are not going to use ESMValTool on
+your local machine and go here.
+
If you are planning on running ESMValTool on your own machine, please
+make sure that you are able to download CMIP data and that you have a
+few GB of space available to install conda and ESMValTool, but also
+enough to make a copy of some data (~125MB) needed for this
+tutorial.
+
You can use ESMValTool to automatically download data needed for test
+recipes. Please see the [Configuration][lesson-configuration] episode or
+the [configuration file documentation][config-file] for more
+information. This is the recommended option as it has the advantage that
+data is stored in subdirectories, and features such as wildcards and
+recording the version of the data will work automatically.
+
Alternatively, you can run the following command using wget:
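+A sketch of the general shape of that command; the actual download URL for
+the tutorial data is provided separately and is deliberately not filled in
+here:
+
+
BASH
+
+
wget <URL-of-the-tutorial-data-archive>
+
+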
You don’t need a GitHub account to participate in the tutorial.
+However, if you want to raise an issue, contribute to the discussions,
+or share your code, please create a GitHub
+account.
You may hear a few of the following phrases during the tutorial.
+Don’t be alarmed, they will make sense eventually.
+
+
GitHub issues
+
Issues are GitHub’s ticketing system. They allow users and developers
+to discuss problems, identify bugs, or make suggestions. Each issue
+is assigned a number and will have its own page on GitHub.
Raising an issue is the act of creating a new issue. If you are asked
+to raise an issue, please follow any instructions that you are given,
+and also make sure that you read the default issue text.
+
+
+
GitHub pull requests
+
A GitHub pull request is the act of requesting that a branch is
+merged with another branch.
+
This is an advanced feature of GitHub, and will generally be
+performed by the ESMValTool development team.
+
+
+
+
Instructor Notes
+
+
+
This page includes some tips, reminders and advice for giving this
+tutorial.
+
Tutorial Content
+
+
+
Well ahead of the planned tutorial date, please go over the available
+tutorial material in your own time at least once to note specifics such
+as areas where your cohort may need extra time, any changes you want to
+make or additional examples you would like to include.
+
If you want to add to or adapt the material, you can create your own
+copy to modify by forking the repository. See here for a guide on how
+to fork a repository.
+
Please use our guide for Contributing
+when creating your own material just in case you would like to add to
+the current repository in the future.
+
Once you are satisfied with the material you would like to use and
+have an estimate of the time required, please proceed with meeting
+logistics.
+
Meeting Logistics
+
+
+
Decide if the tutorial will be held on site or online. If on site,
+make sure of venue availability before finalizing the date. If online,
+set up connection details.
+
Once you have agreed to a date, and attendees have signed up, send
+them an email with the following information:
a contact email address so they can post any questions that arise
+during setup
+
venue details (if on site) or meeting connection details (if
+online).
+
+
Also allow for sufficient time to get access to any accounts etc.
+which could take more than a few days in some cases (a good rule of thumb
+is to have the time to send a reminder email a few days after the first
+email, so that participants will still have time after the reminder to
+complete setup).
+
Prepare an estimate for how long the tutorial will last (this could
+vary based on your choice of modules and any changes you have made to
+the tutorial). Do include comfort breaks as needed.
+
If more than one person will be conducting the tutorial, discuss
+ahead and decide who will handle each component for a smooth flow of the
+material.
+
Remind people a few days before the meeting about the pre-meeting
+exercises.
+
If conducting the tutorial on site, confirm room bookings and check
+for internet connectivity or firewall issues at the venue. Coffee and
+biscuits also usually go down well with participants!
+
If conducting the tutorial online, have a backup plan for failed or
+poor internet connections. This could involve sending participants some
+of the material or previously recorded meeting videos ahead of time.
+Establish rules for questions, microphone and video usage before
+starting the meeting. Send meeting connection details (Zoom/Teams etc.)
+ahead of time and the day before the meeting.
+
Have a feedback form ready if you would like attendees to send you
+feedback. Request that they do so at the end of the tutorial and send it
+out as soon after the tutorial as possible.
+
Tips
+
+
+
+
Test all recipes, diagnostics and instructions in
+advance.
+
Make sure you don’t have issues with accessing the compute node
+via VPN or Wi-Fi.
+
Check with your computing service for any planned downtime or
+potential interruptions.
+
Download a local copy of the data, just in case.
+
Decide on one or two advanced mini-tutorials as a stretch goal
+but don’t expect to do all of them.
+
+
+
+
diff --git a/instructor/00-introduction.html b/instructor/00-introduction.html
new file mode 100644
index 00000000..4f9f78fd
--- /dev/null
+++ b/instructor/00-introduction.html
@@ -0,0 +1,676 @@
+
+ESMValTool Tutorial: Introduction
This tutorial is a first introduction to ESMValTool. Before diving
+into the technical steps, let’s talk about what ESMValTool is all
+about.
+
+
+
+
+
+
What is ESMValTool?
+
+
What do you already know about or expect from ESMValTool?
+
+
+
+
+
+
+
+
+
ESMValTool is many things, but in this tutorial we will focus on the
+following traits:
+
✓ A tool to analyse climate data
+
✓ A collection of diagnostics for reproducible climate
+science
+
✓ A community effort
+
+
+
+
+
A tool to analyse climate data
+
ESMValTool takes care of finding, opening, checking, fixing,
+concatenating, and preprocessing CMIP data and several other supported
+datasets.
+
The central component of ESMValTool that we will see in this tutorial
+is the recipe. Any ESMValTool recipe is basically a set
+of instructions to reproduce a certain result. The basic structure of a
+recipe is as follows:
+
+Documentation with relevant (citation)
+information
+
+Datasets that should be analysed
+
+Preprocessor steps that must be applied
+
+Diagnostic scripts performing more specific
+evaluation steps
+
An example recipe could look like this:
+
+
YAML
+
+
documentation:
+  title: This is an example recipe.
+  description: Example recipe
+  authors:
+    - lastname_firstname
+
+datasets:
+  - {dataset: HadGEM2-ES, project: CMIP5, exp: historical, mip: Amon,
+     ensemble: r1i1p1, start_year: 1960, end_year: 2005}
+
+preprocessors:
+  global_mean:
+    area_statistics:
+      operator: mean
+
+diagnostics:
+  hockeystick_plot:
+    description: plot of global mean temperature change
+    variables:
+      temperature:
+        short_name: tas
+        preprocessor: global_mean
+    scripts: hockeystick.py
+
+
+
+
+
+
+
Understanding the different sections of the recipe
+
+
Try to figure out the meaning of the different dataset keys. Hint:
+they can be found in the documentation of ESMValTool.
A collection of diagnostics for reproducible climate science
+
More than a tool, ESMValTool is a collection of publicly available
+recipes and diagnostic scripts. This makes it possible to easily
+reproduce important results.
ESMValTool is built and maintained by an active community of
+scientists and software engineers. It is an open source project to which
+anyone can contribute. Many of the interactions take place on GitHub.
+Here, we briefly introduce you to some of the most important pages.
+
+
+
+
+
+
Meet the ESMValGroup
+
+
Go to github.com/ESMValGroup. This
+is the GitHub page of our ‘organization’. Have a look around. How many
+collaborators are there? Do you know any of them?
+
Near the top of the page there are 2 pinned repositories: ESMValTool
+and ESMValCore. Visit each of the repositories. How many people have
+contributed to each of them? Can you also find out how many people have
+contributed to this tutorial?
+
+
+
+
+
+
+
+
+
Issues and pull requests
+
+
Go back to the repository pages of ESMValTool or ESMValCore. There
+are tabs for ‘issues’ and ‘pull requests’. You can use the labels to
+navigate them a bit more. How many open issues are about enhancements of
+ESMValTool? And how many bugs have been fixed in ESMValCore? There is
+also an ‘insights’ tab, where you can see a summary of recent activity.
+How many issues have been opened and closed in the past month?
+
+
+
+
Conclusion
+
This concludes the introduction of the tutorial. You now have a basic
+knowledge of ESMValTool and its community. The following episodes will
+walk you through the installation, configuration and running your first
+recipes.
+
+
+
+
+
+
Key Points
+
+
ESMValTool provides a reliable interface to analyse and evaluate
+climate data
+
A large collection of recipes and diagnostic scripts is already
+available
+
ESMValTool is built and maintained by an active community of
+scientists and developers
+
+
diff --git a/instructor/01-quickstart.html b/instructor/01-quickstart.html
new file mode 100644
index 00000000..f726b7fd
--- /dev/null
+++ b/instructor/01-quickstart.html
@@ -0,0 +1,618 @@
+
+ESMValTool Tutorial: Quickstart guide
How do I load and check the ESMValTool environment?
+
How do I configure ESMValTool?
+
How do I run a recipe?
+
+
+
+
+
+
+
Objectives
+
Understand the purpose of the quickstart guide
+
Load and check the ESMValTool environment
+
Configure ESMValTool
+
Run a recipe
+
+
+
+
+
+
+
+
+
+
+
What is the purpose of the quickstart guide?
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible by making the bare
+minimum number of changes.
+
+
+
+
+
+
+
+
+
How do I load and check the ESMValTool environment?
+
+
For this quickstart guide, an assumption is made that ESMValTool
+has already been installed at the site where ESMValTool will be run. If
+this is not the case, see the [Installation][lesson-installation]
+episode in this tutorial.
+
Load the ESMValTool environment by following the instructions at
+[ESMValTool: Pre-installed versions on HPC clusters / other
+servers][activate-environment].
+
+
Check the ESMValTool environment by accessing the help for
+ESMValTool:
+
+
BASH
+
+
esmvaltool --help
+
+
+
+
+
+
+
+
+
+
+
How do I configure ESMValTool?
+
+
+
Create the ESMValTool user configuration file (the file is
+written by default to ~/.esmvaltool/config-user.yml):
+
+
BASH
+
+
esmvaltool config get_config_user
+
+
+
Edit the ESMValTool user configuration file using your favourite
+text editor to uncomment the lines relating to the site where ESMValTool
+will be run.
+
For more details about the ESMValTool user configuration file see
+the [Configuration][lesson-configuration] episode in this
+tutorial.
+
+
+
+
+
+
+
+
+
How do I run a recipe?
+
+
+
Run the example Python recipe:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
+
+
Wait for the recipe to complete. If the recipe completes
+successfully, the last line printed to screen at the end of the log will
+look something like:
+
+
BASH
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Run was successful
+
+
+
+
View the output of the recipe by opening the HTML file produced
+by ESMValTool (the location of this file is printed to screen near the
+end of the log):
+
+
BASH
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Wrote recipe output to:
+file:///$HOME/esmvaltool_output/recipe_python_<date>_<time>/index.html
+
+
+
For more details about running recipes see the [Running your
+first recipe][lesson-recipe] episode in this tutorial.
+
+
+
+
+
+
+
+
+
Key Points
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible without having to go
+through the whole tutorial
+
Use the module load command to load the ESMValTool
+environment, see the [Installation][lesson-installation] episode for more
+details and use esmvaltool --help to check the ESMValTool
+environment
+
Use esmvaltool config get_config_user to create the
+ESMValTool user configuration file
+
+
diff --git a/instructor/02-installation.html b/instructor/02-installation.html
new file mode 100644
index 00000000..96a72eb4
--- /dev/null
+++ b/instructor/02-installation.html
@@ -0,0 +1,750 @@
+
+ESMValTool Tutorial: Installation
What are the prerequisites for installing ESMValTool?
+
How do I confirm that the installation was successful?
+
+
+
+
+
+
+
Objectives
+
Install ESMValTool
+
Demonstrate that the installation was successful
+
+
+
+
+
+
Overview
+
The instructions help with the installation of ESMValTool on
+operating systems like Linux/MacOSX/Windows. We use the Mamba
+package manager to install the ESMValTool. Other installation methods
+are also available; they can be found in the documentation.
+We will first install Mamba, and then ESMValTool. We end this chapter by
+testing that the installation was successful.
+
Before we begin, here are all the possible ways in which you can use
+ESMValTool depending on your level of expertise or involvement with
+ESMValTool and associated software such as GitHub and Mamba.
+
If you have access to a server where ESMValTool is already installed
+as a module, e.g. the [CEDA JASMIN](https://help.jasmin.ac.uk/article/4955-community-software-esmvaltool)
+server, you can simply load the
+module with the following command:
+
+
BASH
+
+
module load esmvaltool
+
+
After loading esmvaltool, we can start using ESMValTool
+right away. Please see the next lesson.
+2. If you would like to install ESMValTool as a mamba
+package, then this lesson will tell you how!
+3. If you would like to start experimenting with existing diagnostics
+or contributing to ESMValTool, please see the instructions for source
+installation in the lesson Development and contribution and in the
+documentation.
+
+
+
+
+
+
Install ESMValTool on Windows
+
+
ESMValTool does not directly support Windows, but successful usage
+has been reported through the Windows Subsystem
+for Linux (WSL), available in Windows 10. To install the WSL please
+follow the instructions on the
+Windows Documentation page. After installing the WSL, installation
+can be done using the same instructions for Linux/MacOSX.
+
+
+
+
Install ESMValTool on Linux/MacOSX
+
+
Install Mamba
+
ESMValTool is distributed using Mamba. To
+install mamba on Linux or MacOSX, follow the
+instructions below:
+
Please download the installation file for the latest Mamba
+version here.
+
Next, run the installer from the place where you downloaded
+it:
+
On Linux:
+
+
BASH
+
+
bash Mambaforge-Linux-x86_64.sh
+
+
On MacOSX:
+
+
BASH
+
+
bash Mambaforge-MacOSX-x86_64.sh
+
+
Follow the instructions in the installer. The defaults should
+normally suffice.
+
You will need to restart your terminal for the changes to have
+effect.
+
We recommend updating mamba before the esmvaltool installation.
+To do so, run:
+
+
BASH
+
+
mamba update --name base mamba
+
+
Verify you have a working mamba installation by:
+
+
BASH
+
+
which mamba
+
+
This should show the path to your mamba executable,
+e.g. ~/mambaforge/bin/mamba.
+
For more information about installing mamba, see [the mamba
+installation documentation](https://docs.esmvaltool.org/en/latest/quickstart/installation.html#mamba-installation).
+
+
+
Install the ESMValTool package
+
The ESMValTool package contains diagnostics scripts in four
+languages: R, Python, Julia and NCL. This introduces a lot of
+dependencies, and therefore the installation can take quite long. It is,
+however, possible to install ‘subpackages’ for each of the languages.
+The following (sub)packages are available:
+
esmvaltool-python
+
esmvaltool-ncl
+
esmvaltool-r
+
+esmvaltool --> the complete package, i.e. the
+combination of the above.
+
For the tutorial, we will install the complete package. Thus, to
+install the ESMValTool package, run
+
+
BASH
+
+
mamba create --name esmvaltool esmvaltool
+
+
On MacOSX ESMValTool functionalities in Julia, NCL, and R are not
+supported. To install a Mamba environment on MacOSX, please refer to
+specific information.
Some ESMValTool diagnostics are written in the Julia programming
+language. If you want a full installation of ESMValTool including Julia
+diagnostics, you need to make sure Julia is installed before installing
+ESMValTool.
+
In this tutorial, we will not use Julia, but for reference, we have
+listed the steps to install Julia below. Complete instructions for
+installing Julia can be found on the Julia
+installation page.
+
+
+
+
+
+
First, open a bash terminal and activate the newly created
+esmvaltool environment.
+
+
BASH
+
+
conda activate esmvaltool
+
+
Next, to install Julia via mamba, you can use the
+following command:
+
+
BASH
+
+
mamba install julia
+
+
To check that the Julia executable can be found, run
+
+
BASH
+
+
which julia
+
+
to display the path to the Julia executable, it should be
+
+
OUTPUT
+
+
~/mambaforge/envs/esmvaltool/bin/julia
+
+
To test that Julia is installed correctly, run
+
+
BASH
+
+
julia
+
+
to start the interactive Julia interpreter. Press Ctrl+D
+to exit.
+
+
+
+
+
+
+
Test that the installation was successful
+
To test that the installation was successful, run
+
+
BASH
+
+
conda activate esmvaltool
+
+
to activate the conda environment called esmvaltool. In
+the shell prompt the active conda environment should have been changed
+from (base) to (esmvaltool).
+
Next, run
+
+
BASH
+
+
esmvaltool --help
+
+
to display the command line help.
+
+
+
+
+
+
Version of ESMValTool
+
+
Can you figure out which version of ESMValTool has been
+installed?
+
+
+
+
+
+
+
+
+
The esmvaltool --help command lists version
+as a command to get the version
+
When you run
+
+
BASH
+
+
esmvaltool version
+
+
The version of ESMValTool installed should be displayed on the screen
+as:
+
+
OUTPUT
+
+
ESMValCore: 2.11.0
+ESMValTool: 2.11.0
+
+
Note that on HPC servers such as JASMIN, sometimes a more recent
+development version may be displayed for ESMValTool, for e.g.
+ESMValTool: 2.11.0.dev71+g2c60b4d97
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
All the required packages can be installed using mamba.
+
You can find more information about installation in the documentation.
+
+
diff --git a/instructor/03-configuration.html b/instructor/03-configuration.html
new file mode 100644
index 00000000..b7080d9c
--- /dev/null
+++ b/instructor/03-configuration.html
@@ -0,0 +1,936 @@
+
+ESMValTool Tutorial: Configuration
What is the user configuration file and how should I use it?
+
+
+
+
+
+
+
Objectives
+
Understand the contents of the config-user.yml file
+
Prepare a personalized user-config.yml file
+
Configure ESMValTool to use some settings
+
+
+
+
+
+
The configuration file
+
For the purposes of this tutorial, we will create a directory in our
+home directory called esmvaltool_tutorial and use that as
+our working directory. The following steps should do that:
+
+
BASH
+
+
mkdir esmvaltool_tutorial
+cd esmvaltool_tutorial
+
+
The config-user.yml configuration file contains all the
+global level information needed by ESMValTool to run. This is a YAML file.
+
You can get the default configuration file by running:
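+This is the same command used in the quickstart guide; the optional
+--path flag described below can be added to choose where the file is
+written:
+
+
BASH
+
+
esmvaltool config get_config_user
+
+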
The default configuration file will be downloaded to the directory
+specified with the --path variable. For instance, you can
+provide the path to your working directory as the
+target_dir. If this option is not used, the file will be
+saved to the default location:
+~/.esmvaltool/config-user.yml, where ~ is the
+path to your home directory. Note that files and directories starting
+with a period are “hidden”, to see the .esmvaltool
+directory in the terminal use ls -la ~. Note that if a
+configuration file by that name already exists in the default location,
+the get_config_user command will not update the file as
+ESMValTool will not overwrite the file. You will have to move the file
+first if you want an updated copy of the user configuration file.
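+A minimal sketch of one way to do that, assuming the file is in the
+default location:
+
+
BASH
+
+
mv ~/.esmvaltool/config-user.yml ~/.esmvaltool/config-user.yml.bak
+esmvaltool config get_config_user
+
+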
+
We run a text editor called nano to have a look inside
+the configuration file and then modify it if needed:
+
+
BASH
+
+
nano ~/.esmvaltool/config-user.yml
+
+
Any other editor can be used, e.g. vim.
+
This file contains the information for:
+
Output settings
+
Destination directory
+
Auxiliary data directory
+
Number of tasks that can be run in parallel
+
Rootpath to input data
+
Directory structure for the data from different projects
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
Output settings
+
The configuration file starts with output settings that inform
+ESMValTool about your preference for output. You can turn on or off the
+setting by true or false values. Most of these
+settings are fairly self-explanatory.
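+As an illustration, the two settings used later in this episode look like
+this (your copy of config-user.yml may list more options):
+
+
YAML
+
+
# Keep the preprocessed data and metadata interface files
+remove_preproc_dir: false
+# Also keep the data written after each individual preprocessor step
+save_intermediary_cubes: false
+
+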
+
+
+
+
+
+
Saving preprocessed data
+
+
Later in this tutorial, we will want to look at the contents of the
+preproc folder. This folder contains preprocessed data and
+is removed by default when ESMValTool is run. In the configuration file,
+which settings can be modified to prevent this from happening?
+
+
+
+
+
+
+
+
+
If the option remove_preproc_dir is set to
+false, then the preproc/ directory contains
+all the pre-processed data and the metadata interface files. If the
+option save_intermediary_cubes is set to true
+then data will also be saved after each preprocessor step in the folder
+preproc. Note that saving all intermediate results to file
+will result in a considerable slowdown, and can quickly fill your
+disk.
+
+
+
+
+
Destination directory
+
The destination directory is the rootpath where ESMValTool will store
+its output folders containing e.g. figures, data, logs, etc. With every
+run, ESMValTool automatically generates a new output folder determined
+by the recipe name and the date and time, using the format YYYYMMDD_HHMMSS.
+
+
+
+
+
+
Set the destination directory
+
+
Let’s name our destination directory esmvaltool_output
+in the working directory. ESMValTool should write the output to this
+path, so make sure you have the disk space to write output to this
+directory. How do we set this in the config-user.yml?
+
+
+
+
+
+
+
+
+
We use output_dir entry in the
+config-user.yml file as:
+
+
YAML
+
+
output_dir: ./esmvaltool_output
+
+
If the esmvaltool_output does not exist, ESMValTool will
+generate it for you.
+
+
+
+
+
Rootpath to input data
+
ESMValTool uses several categories (in ESMValTool, this is referred
+to as projects) for input data based on their source. The current
+categories in the configuration file are mentioned below. For example,
+CMIP is used for a dataset from the Climate Model Intercomparison
+Project whereas OBS may be used for an observational dataset. More
+information about the projects used in ESMValTool is available in the
+[documentation](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html).
+When using ESMValTool on your own machine,
+you can create a directory to download climate model data or observation
+data sets and let the tool use data from there. It is also possible to
+ask ESMValTool to download climate model data as needed. This can be
+done by specifying a download directory and by setting the option to
+download data as shown below.
+
+
YAML
+
+
# Directory for storing downloaded climate data
+download_dir: ~/climate_data
+search_esgf: always
+
+
If you are working offline or do not want to download the data then
+set the option above to never. If you want to download data
+only when the necessary files are missing at the usual location, you can
+set the option to when_missing.
+
The rootpath specifies the directories where ESMValTool
+will look for input data. For each category, you can define either one
+path or several paths as a list. For example:
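+A sketch of what such an entry can look like; the paths here are only an
+illustration of one path versus a list of paths:
+
+
YAML
+
+
rootpath:
+  CMIP6: ~/climate_data
+  CMIP5: [~/climate_data, ~/more_cmip5_data]
+
+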
These are typically available in the default configuration file you
+downloaded, so simply removing the machine specific lines should be
+sufficient to access input data.
+
+
+
+
+
+
Set the correct rootpath
+
+
In this tutorial, we will work with data from CMIP5 and CMIP6. How can we
+modify the rootpath to make sure the data path is set
+correctly for both CMIP5 and CMIP6? Note: to get the data, check the
+instructions in Setup.
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to add the root
+path of the folder where the data is available to the
+config-user.yml file as:
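+For example (the exact paths depend on where you stored the data and are
+only an illustration):
+
+
YAML
+
+
rootpath:
+  CMIP5: ~/cmip5_data
+  CMIP6: ~/cmip6_data
+
+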
Are you working on your local machine and have downloaded data using
+ESMValTool? You need to add the root path of the folder where the data
+has been downloaded to as specified in the
+download_dir.
Are you working on a computer cluster like Jasmin or DKRZ?
+Site-specific path to the data for JASMIN/DKRZ/ETH/IPSL are already
+listed at the end of the config-user.yml file. You need to
+uncomment the related lines. For example, on JASMIN:
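+A sketch based on the JASMIN archive paths used elsewhere in this tutorial;
+check the commented block in your own config-user.yml for the authoritative
+values:
+
+
YAML
+
+
# Site-specific entries: JASMIN
+rootpath:
+  CMIP6: /badc/cmip6/data/CMIP6
+  CMIP5: /badc/cmip5/data/cmip5/output1
+  OBS: /gws/nopw/j04/esmeval/obsdata-v2
+
+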
Directory structure for the data from different projects
+
Input data can be from various models, observations and reanalysis
+data that adhere to the CF/CMOR
+standard. The drs setting describes the file
+structure.
+
The drs setting describes the file structure for several
+projects (e.g. CMIP6, CMIP5, obs4mips, OBS6, OBS) on several key
+machines (e.g. BADC, CP4CDS, DKRZ, ETHZ, SMHI, BSC). For more
+information about drs, you can visit the ESMValTool
+documentation on [Data Reference Syntax (DRS)](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#cmor-drs).
+
+
+
+
+
+
Set the correct drs
+
+
In this lesson, we will work with data from CMIP5 and CMIP6. How can we
+set the correct drs?
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to set the
+drs of the data in the config-user.yml file
+as:
+
+
YAML
+
+
drs:
+  CMIP5: default
+  CMIP6: default
+
+
Are you asking ESMValTool to download the data for use with your
+diagnostics? You need to set the drs of the data in the
+config-user.yml file as:
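+For example, a sketch assuming the ESGF-style directory layout that
+ESMValTool uses for data it downloads itself (check the documentation if
+in doubt):
+
+
YAML
+
+
drs:
+  CMIP6: ESGF
+  CMIP5: ESGF
+
+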
Are you working on a computer cluster like Jasmin or DKRZ?
+Site-specific drs of the data are already listed at the end
+of the config-user.yml file. You need to uncomment the
+related lines. For example, on Jasmin:
+
+
YAML
+
+
# Site-specific entries: Jasmin
+ # Uncomment the lines below to locate data on JASMIN
+drs:
+  CMIP6: BADC
+  CMIP5: BADC
+  OBS: default
+  OBS6: default
+  obs4mips: default
+  ana4mips: default
+
+
+
+
+
+
+
+
+
+
+
Explain the default drs (if working on local machine)
+
+
In the previous exercise, we set the drs of CMIP5 data
+to default. Can you explain why?
+
Have a look at the directory structure of the OBS data.
+There is a folder called Tier1. What does it mean?
+
+
+
+
+
+
+
+
+
drs: default is one way to retrieve data from a ROOT
+directory that has no DRS-like structure. default indicates
+that all the files are in a folder without any structure.
+
Observational data are organized in Tiers depending on their
+level of public availability. Therefore the default directory must be
+structured accordingly with sub-directories TierX
+e.g. Tier1, Tier2 or Tier3, even when drs: default. More
+details can be found in the [documentation](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#observational-data).
+
+
+
+
+
Other settings
+
+
+
+
+
+
Auxiliary data directory
+
+
The auxiliary_data_dir setting is the path where any
+required additional auxiliary data files are stored. This location
+allows us to tell the diagnostic script where to find the files if they
+cannot be downloaded at runtime. This option should not be used for
+model or observational datasets, but for auxiliary files used by
+diagnostics, such as shape files or coastline descriptions used in
+plotting, or any other additional data you want to feed to your recipe.
+
+
Number of parallel tasks
+
+
This option enables you to perform parallel processing. You can
This option enables you to perform parallel processing. You can
+choose the number of tasks in parallel as 1/2/3/4/… or you can set it to
+null. That tells ESMValTool to use the maximum number of
+available CPUs. For the purpose of the tutorial, please set ESMValTool
+use only 1 cpu:
+
+
YAML
+
+
max_parallel_tasks: 1
+
+
In general, if you run out of memory, try setting
+max_parallel_tasks to 1. Then, check the amount of memory
+you need for that by inspecting the file
+run/resource_usage.txt in the output directory. Using the
+number there you can increase the number of parallel tasks again to a
+reasonable number for the amount of memory available in your system.
+
+
+
+
+
+
+
+
+
Make your own configuration file
+
+
It is possible to have several configuration files with different
+purposes, for example: config-user_formalised_runs.yml,
+config-user_debugging.yml. In this case, you have to pass the path of
+your own configuration file as a command-line option when running the
+ESMValTool. We will learn how to do this in the next lesson.
+
+
+
+
+
+
+
+
+
Key Points
+
+
The config-user.yml tells ESMValTool where to find
+input data.
+
+
diff --git a/instructor/04-recipe.html b/instructor/04-recipe.html
new file mode 100644
index 00000000..ab5055d1
--- /dev/null
+++ b/instructor/04-recipe.html
@@ -0,0 +1,1070 @@
+
+ESMValTool Tutorial: Running your first recipe
This episode describes how ESMValTool recipes work, how to run a
+recipe and how to explore the recipe output. By the end of this episode,
+you should be able to run your first recipe, look at the recipe output,
+and make small modifications.
+
Running an existing recipe
+
The recipe format has briefly been introduced in the
+[Introduction][lesson-introduction] episode. To see all the recipes that
+are shipped with ESMValTool, type
+
+
BASH
+
+
esmvaltool recipes list
+
+
We will start by running [examples/recipe_python.yml](https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html)
+
esmvaltool run examples/recipe_python.yml
+
or if you have the user configuration file in your current directory
+then
+
esmvaltool run --config_file ./config-user.yml examples/recipe_python.yml
+
If everything is okay, you should see that ESMValTool is printing a
+lot of output to the command line. The final message should be “Run was
+successful”. The exact output varies depending on your machine, but it
+should look something like the example log output on terminal below.
+
{% include example_output.txt %}
+
+
+
+
+
+
Pro tip: ESMValTool search paths
+
+
You might wonder how ESMValTool was able find the recipe file, even
+though it’s not in your working directory. All the recipe paths printed
+from esmvaltool recipes list are relative to ESMValTool’s
+installation location. This is where ESMValTool will look if it cannot
+find the file by following the path from your working directory.
+
+
+
+
Investigating the log messages
+
Let’s dissect what’s happening here.
+
+
+
+
+
+
Output files and directories
+
+
After the banner and general information, the output starts with some
+important locations.
+
Did ESMValTool use the right config file?
+
What is the path to the example recipe?
+
What is the main output folder generated by ESMValTool?
+
Can you guess what the different output directories are for?
+
ESMValTool creates two log files. What is the difference?
+
+
+
+
+
+
+
+
+
The config file should be the one we edited in the previous episode,
+something like
+/home/<username>/.esmvaltool/config-user.yml or
+~/esmvaltool_tutorial/config-user.yml.
+
ESMValTool found the recipe in its installation directory, something
+like
+/home/users/username/mambaforge/envs/esmvaltool/bin/esmvaltool/recipes/examples/
+or if you are using a pre-installed module on a server, something like
+/apps/jasmin/community/esmvaltool/ESMValTool_<version>/esmvaltool/recipes/examples/recipe_python.yml,
+where <version> is the latest release.
+
ESMValTool creates a time-stamped output directory for every run. In
+this case, it should be something like
+recipe_python_YYYYMMDD_HHMMSS. This folder is made inside
+the output directory specified in the previous episode:
+~/esmvaltool_tutorial/esmvaltool_output.
+
There should be four output folders:
+
+plots/: this is where output figures are stored.
+
+preproc/: this is where pre-processed data are
+stored.
+
+run/: this is where esmvaltool stores general
+information about the run, such as log messages and a copy of the recipe
+file.
+
+work/: this is where output files (not figures) are
+stored.
+
The log files are:
+
+main_log.txt is a copy of the command-line output
+
+main_log_debug.txt contains more detailed information
+that may be useful for debugging.
+
+
+
+
+
+
+
+
+
+
Debugging: No ‘preproc’ directory?
+
+
If you’re missing the preproc directory, then your
+config-user.yml file has the value
+remove_preproc_dir set to true (this is used
+to save disk space). Please set this value to false and run
+the recipe again.
+
+
+
+
After the output locations, there are two main sections that can be
+distinguished in the log messages:
+
Creating tasks
+
Executing tasks
+
+
+
+
+
+
Analyse the tasks
+
+
List all the tasks that ESMValTool is executing for this recipe. Can
+you guess what this recipe does?
+
+
+
+
+
+
+
+
+
Just after all the ‘creating tasks’ and before ‘executing tasks’, we
+find the following line in the output:
+
[134535] INFO These tasks will be executed: map/tas, timeseries/tas_global,
+timeseries/script1, map/script1, timeseries/tas_amsterdam
+
So there are three tasks related to timeseries: global temperature,
+Amsterdam temperature, and a script (tas: near-surface air
+temperature). And then there are two tasks related to a map: something
+with temperature, and again a script.
+
+
+
+
+
Examining the recipe file
+
To get more insight into what is happening, we will have a look at
+the recipe file itself. Use the following command to copy the recipe to
+your working directory
+
+
BASH
+
+
esmvaltool recipes get examples/recipe_python.yml
+
+
Now you should see the recipe file in your working directory (type
+ls to verify). Use the nano editor to open
+this file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
For reference, you can also view the recipe by unfolding the box
+below.
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
+# recipe_python.yml
+#
+# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
+# for a description of this recipe.
+#
+# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
+# for a description of the recipe format.
+---
+documentation:
+  description: |
+    Example recipe that plots a map and timeseries of temperature.
+
+  title: Recipe that runs an example diagnostic written in Python.
+
+  authors:
+    - andela_bouwe
+    - righi_mattia
+
+  maintainer:
+    - schlund_manuel
+
+  references:
+    - acknow_project
+
+  projects:
+    - esmval
+    - c3s-magic
+
+datasets:
+  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
+  - {dataset: bcc-csm1-1, project: CMIP5, exp: historical, ensemble: r1i1p1}
+
+preprocessors:
+  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
+  # for a description of the preprocessor functions.
+
+  to_degrees_c:
+    convert_units:
+      units: degrees_C
+
+  annual_mean_amsterdam:
+    extract_location:
+      location: Amsterdam
+      scheme: linear
+    annual_statistics:
+      operator: mean
+    multi_model_statistics:
+      statistics:
+        - mean
+      span: overlap
+    convert_units:
+      units: degrees_C
+
+  annual_mean_global:
+    area_statistics:
+      operator: mean
+    annual_statistics:
+      operator: mean
+    convert_units:
+      units: degrees_C
+
+diagnostics:
+
+  map:
+    description: Global map of temperature in January 2000.
+    themes:
+      - phys
+    realms:
+      - atmos
+    variables:
+      tas:
+        mip: Amon
+        preprocessor: to_degrees_c
+        timerange: 2000/P1M
+        caption: |
+          Global map of {long_name} in January 2000 according to {dataset}.
+    scripts:
+      script1:
+        script: examples/diagnostic.py
+        quickplot:
+          plot_type: pcolormesh
+          cmap: Reds
+
+  timeseries:
+    description: Annual mean temperature in Amsterdam and global mean since 1850.
+    themes:
+      - phys
+    realms:
+      - atmos
+    variables:
+      tas_amsterdam:
+        short_name: tas
+        mip: Amon
+        preprocessor: annual_mean_amsterdam
+        timerange: 1850/2000
+        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
+      tas_global:
+        short_name: tas
+        mip: Amon
+        preprocessor: annual_mean_global
+        timerange: 1850/2000
+        caption: Annual global mean {long_name} according to {dataset}.
+    scripts:
+      script1:
+        script: examples/diagnostic.py
+        quickplot:
+          plot_type: plot
+
+
+
+
+
Do you recognize the basic recipe structure that was introduced in
+episode 1?
+
+Documentation with relevant (citation)
+information
+
+Datasets that should be analysed
+
+Preprocessors groups of common preprocessing
+steps
+
+Diagnostics scripts performing more specific
+evaluation steps
+
+
+
+
+
+
Analyse the recipe
+
+
Try to answer the following questions:
+
Who wrote this recipe?
+
Who should be approached if there is a problem with this
+recipe?
+
How many datasets are analyzed?
+
What does the preprocessor called annual_mean_global
+do?
+
Which script is applied for the diagnostic called
+map?
+
Can you link specific lines in the recipe to the tasks that we saw
+before?
+
How is the location of the city specified?
+
How is the temporal range of the data specified?
+
+
+
+
+
+
+
+
+
The example recipe is written by Bouwe Andela and Mattia
+Righi.
+
Manuel Schlund is listed as the maintainer of this
+recipe.
+
Two datasets are analysed:
+
CMIP6 data from the model BCC-ESM1
+
CMIP5 data from the model bcc-csm1-1
+
The preprocessor annual_mean_global computes an area
+mean as well as annual means
+
The diagnostic called map executes a script referred
+to as script1. This is a python script named
+examples/diagnostic.py
+
There are two diagnostics: map and
+timeseries. Under the diagnostic map we find
+two tasks:
+
a preprocessor task called tas, applying the
+preprocessor called to_degrees_c to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(map/tas).
+
Under the diagnostic timeseries we find three tasks:
+
a preprocessor task called tas_amsterdam, applying the
+preprocessor called annual_mean_amsterdam to the variable
+tas.
+
a preprocessor task called tas_global, applying the
+preprocessor called annual_mean_global to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(timeseries/tas_global and
+timeseries/tas_amsterdam).
+
The extract_location preprocessor is used to get
+data for a specific location here. ESMValTool interpolates to the
+location based on the chosen scheme. Can you tell the scheme used here?
+For more ways to extract areas, see the [Area
+operations][preproc-area-manipulation] page.
+
The timerange tag is used to extract data for a specific time period. Here the start date is 01/01/2000 and the period covered is one month, given by P1M (an ISO 8601 duration), as illustrated below. For more options on how to specify time ranges, see the [timerange documentation][timeranges].
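For reference, the two forms used in this recipe are:

YAML

timerange: 2000/P1M   # a start date plus a period: January 2000
timerange: 1850/2000  # a start and an end year: 1850 through 2000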
+
+
+
+
+
+
+
+
+
+
Pro tip: short names and variable groups
+
+
The preprocessor tasks in ESMValTool are called ‘variable groups’.
+For the diagnostic timeseries, we have two variable groups:
+tas_amsterdam and tas_global. Both of them
+operate on the variable tas (as indicated by the
+short_name), but they apply different preprocessors. For
+the diagnostic map the variable group itself is named
+tas, and you’ll notice that we do not explicitly provide
+the short_name. This is a shorthand built into
+ESMValTool.
+
+
+
+
+
+
+
+
+
Output files
+
+
Have another look at the output directory created by the ESMValTool
+run.
+
Which files/folders are created by each task?
+
+
+
+
+
+
+
+
+
+map/tas: creates /preproc/map/tas,
+which contains preprocessed data for each of the input datasets, a file
+called metadata.yml describing the contents of these
+datasets and provenance information in the form of .xml
+files.
+
+timeseries/tas_global: creates
+/preproc/timeseries/tas_global, which contains preprocessed
+data for each of the input datasets, a metadata.yml file
+and provenance information in the form of .xml files.
+
+timeseries/tas_amsterdam: creates
+/preproc/timeseries/tas_amsterdam, which contains
+preprocessed data for each of the input datasets, plus a combined
+MultiModelMean, a metadata.yml file and
+provenance files.
+
+map/script1: creates /run/map/script1
+with general information and a log of the diagnostic script run. It also
+creates /plots/map/script1/ and
+/work/map/script1, which contain output figures and output
+datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+timeseries/script1: creates
+/run/timeseries/script1 with general information and a log
+of the diagnostic script run. It also creates
+/plots/timeseries/script1 and
+/work/timeseries/script1, which contain output figures and
+output datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+
+
+
+
+
+
+
+
+
Pro tip: diagnostic logs
+
+
When you run ESMValTool, log messages from the diagnostic script are not printed to the terminal. Instead, they are written to log.txt files under the run directory, e.g. /run/&lt;diag_name&gt;/&lt;script_name&gt;/log.txt.
+
ESMValTool does print a command that can be used to re-run a
+diagnostic script. When you use this the output will be printed
+to the command line.
+
+
+
+
Modifying the example recipe
+
Let’s make a small modification to the example recipe. Notice that
+now that you have copied and edited the recipe, you can use
+
esmvaltool run recipe_python.yml
+
to refer to your local file rather than the default version shipped
+with ESMValTool.
+
+
+
+
+
+
Change your location
+
+
Modify and run the recipe to analyse the temperature for your own
+location.
+
+
+
+
+
+
+
+
+
In principle, you only have to modify the location in the
+preprocessor called annual_mean_amsterdam. However, it is
+good practice to also replace all instances of amsterdam
+with the correct name of your location. Otherwise the log messages and
+output will be confusing. You are free to modify the names of
+preprocessors or diagnostics.
+
In the diff file below you will see the changes we have
+made to the file. The top 2 lines are the filenames and the lines like
+@@ -39,9 +39,9 @@ represent the line numbers in the
+original and modified file, respectively. For more info on this format,
+see here.
+
+
DIFF
+
+
--- recipe_python.yml
++++ recipe_python_london.yml
+@@ -39,9 +39,9 @@
+ convert_units:
+ units: degrees_C
+
+- annual_mean_amsterdam:
++ annual_mean_london:
+ extract_location:
+- location: Amsterdam
++ location: London
+ scheme: linear
+ annual_statistics:
+ operator: mean
+@@ -83,7 +83,7 @@
+ cmap: Reds
+
+ timeseries:
+- description: Annual mean temperature in Amsterdam and global mean since 1850.
++ description: Annual mean temperature in London and global mean since 1850.
+ themes:
+ - phys
+ realms:
+@@ -92,9 +92,9 @@
+ tas_amsterdam:
+ short_name: tas
+ mip: Amon
+- preprocessor: annual_mean_amsterdam
++ preprocessor: annual_mean_london
+ timerange: 1850/2000
+- caption: Annual mean {long_name} in Amsterdam according to {dataset}.
++ caption: Annual mean {long_name} in London according to {dataset}.
+ tas_global:
+ short_name: tas
+ mip: Amon
+
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
ESMValTool recipes work ‘out of the box’ (if input data is
+available)
+
There are strong links between the recipe, log file, and output
+folders
+
Recipes can easily be modified to re-use existing code for your own
+use case
+
+
+
+ESMValTool Tutorial: Conclusion of the basic tutorial
+
+
+
+
+
+
+
+
+
+
+
There are lots of resources available to assist you in using
+ESMValTool.
+
The ESMValTool Discussions
+page is a good place to find information on general issues, or check
+if your question has already been addressed. If you have a GitHub
+account, you can also post your questions there.
+
If you encounter difficulties, a good starting point is the issues page, where you can check whether your problem has already been reported. If it has, suggestions provided by the developers may help you solve it. Note that you will need a GitHub account for this.
+
Additionally, there is an ESMValTool email list. Please see information
+on how to subscribe to the user mailing list.
+
+
+
What if I find a bug?
+
If you find a bug, please report it to the ESMValTool team. This will
+help us fix issues, ensuring not only your uninterrupted workflow but
+also contributing to the overall stability of ESMValTool for all
+users.
+
To report a bug, please create a new issue using the issue
+page.
+
In your bug report, please describe the problem as clearly and as
+completely as possible. You may need to include a recipe or the output
+log as well.
+
+
+
+ESMValTool Tutorial: Writing your own recipe
+
+
+
+
+
+
+
+
+
+
+
Can I use different preprocessors for different variables?
+
Can I use different datasets for different variables?
+
How can I combine different preprocessor functions?
+
Can I run the same recipe for multiple ensemble members?
+
+
+
+
+
+
+
Objectives
+
Create a recipe with multiple preprocessors
+
Use different preprocessors for different variables
+
Run a recipe with variables from different datasets
+
+
+
+
+
+
Introduction
+
One of the key strengths of ESMValTool is in making complex analyses
+reusable and reproducible. But that doesn’t mean everything in
+ESMValTool needs to be complex. Sometimes, the biggest challenge is in
+keeping things simple. You probably know the ‘warming stripes’
visualization by Professor Ed Hawkins. On the site https://showyourstripes.info you can
+find the same visualization for many regions in the world.
+
Shared by Ed Hawkins under a Creative Commons 4.0 Attribution
+International licence. Source: https://showyourstripes.info
+
In this episode, we will reproduce and extend this functionality with
+ESMValTool. We have prepared a small Python script that takes a NetCDF
+file with timeseries data, and visualizes it in the form of our desired
+warming stripes figure.
+
The diagnostic script that we will use is called
+warming_stripes.py and can be downloaded here.
+
Download the file and store it in your working directory. If you
+want, you may also have a look at the contents, but it is not necessary
+to do so for this lesson.
+
We will write an ESMValTool recipe that takes some data, performs the
+necessary preprocessing, and then runs this Python script.
+
+
+
+
+
+
Drawing up a plan
+
+
Previously, we saw that running ESMValTool executes a number of
+tasks. What tasks do you think we will need to execute and what should
+each of these tasks do to generate the warming stripes?
+
+
+
+
+
+
+
+
+
In this episode, we will need to do the following two tasks:
+
A preprocessing task that converts the gridded temperature data to a
+timeseries of global temperature anomalies
+
A diagnostic task that calls our Python script, taking our preprocessed timeseries data as input.
+
+
+
+
+
Building a recipe from scratch
+
The easiest way to make a new recipe is to start from an existing
+one, and modify it until it does exactly what you need. However, in this
+episode we will start from scratch. This forces us to think about all
+the steps involved in processing the data. We will also deal with
+commonly occurring errors through the development of the recipe.
+
Remember the basic structure of a recipe, and notice that each component is extensively described in the documentation under the [“Overview”][recipe-overview] section. This is the first place to look for help if you get stuck.
+
Open a new file called recipe_warming_stripes.yml:
+
+
BASH
+
+
nano recipe_warming_stripes.yml
+
+
Let’s add the standard header comments (these do not do anything),
+and a first description.
+
+
YAML
+
+
# ESMValTool
# recipe_warming_stripes.yml
---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
+
+
Notice that yaml always requires two spaces
+indentation between the different levels. Pressing ctrl+o
+will save the file. Verify the filename at the bottom and press enter.
+Then use ctrl+x to exit the editor.
+
We will try to run the recipe after every modification we make, to
+see if it (still) works!
+
+
BASH
+
+
esmvaltool run recipe_warming_stripes.yml
+
+
In this case, it gives an error. Below you see the last few lines of
+the error message.
+
+
ERROR
+
+
...
+yamale.yamale_error.YamaleError:
+Error validating data '/home/users/username/esmvaltool_tutorial/recipe_warming_stripes.yml'
+with schema
+'/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/
+site-packages/esmvalcore/_recipe/recipe_schema.yml'
+ documentation.authors: Required field missing
+2024-05-27 13:21:23,805 UTC [41924] INFO
+If you have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions
+If you suspect this is a bug, please open an issue on
+https://github.com/ESMValGroup/ESMValTool/issues
+To make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
+
We can use the log message above to understand why ESMValTool failed. Here, this is because a required field with author names is missing, as the line documentation.authors: Required field missing tells us. We see that ESMValTool always tries to validate the recipe at an early stage. Note also the suggestion to open a GitHub issue if you need help debugging the error message. This is something most users do when they cannot understand the error or are not able to fix it on their own.
+
Let’s add some additional information to the recipe. Open the recipe
+file again, and add an authors section below the description. ESMValTool
+expects the authors as a list, like so:
+
+
YAML
+
+
authors:
+- lastname_firstname
+
+
To bypass a number of similar error messages, add a minimal
+diagnostics section below the documentation. The file should now look
+like:
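A sketch of what the file could look like at this stage (the author tag matches the error shown next; the placeholder diagnostic name is only illustrative):

YAML

# ESMValTool
# recipe_warming_stripes.yml
---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
  authors:
    - doe_john

diagnostics:
  dummy_diagnostic_1:
    scripts: null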
This is the minimal recipe layout that is required by ESMValTool. If
+we now run the recipe again, you will probably see the following
+error:
+
+
ERROR
+
+
ValueError: Tag 'doe_john' does not exist in section
+'authors' of /apps/jasmin/community/esmvaltool/ESMValTool_2.10.0/esmvaltool/config-references.yml
+
+
+
+
+
+
+
+
Pro tip: config-references.yml
+
+
The error message above points to a file named [config-references.yml][config-references]. This is where ESMValTool stores all its citation information. To add yourself as an author, add your name in the form lastname_firstname in alphabetical order following the existing entries, under the # Development team section. See the [List of authors][list-of-authors] section in the ESMValTool documentation for more information.
+
+
+
+
For now, let’s just use one of the existing references. Change the
+author field to righi_mattia, who cannot receive enough
+credit for all the effort he put into ESMValTool. If you now run the
+recipe again, you should see the final message
+
+
OUTPUT
+
+
ERROR No tasks to run!
+
+
Although there is no actual error in the recipe, ESMValTool assumes
+you mistakenly left out a variable name to process and alerts you with
+this error message.
+
Adding a dataset entry
+
Let’s add a datasets section.
+
+
+
+
+
+
Filling in the dataset keys
+
+
Use the paths specified in the configuration file to explore the data
+directory, and look at the explanation of the dataset entry in the
[ESMValTool documentation][recipe-section-datasets].
For both datasets, write down the following properties:
Note that the grid key is only required for CMIP6 data, and that the
+extent of the historical period has changed between CMIP5 and CMIP6.
+
+
+
+
+
Let us start with the BCC-ESM1 dataset and add a datasets section to
+the recipe, listing this single dataset, as shown below. Note that key
+fields such as mip or start_year are included
+in the datasets section here but are part of the
+diagnostic section in the recipe example seen in Running your first recipe.
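For example (a sketch; the exact time range depends on the data available on your system):

YAML

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical,
     ensemble: r1i1p1f1, grid: gn, start_year: 1850, end_year: 2014}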
The recipe should run but produce the same message as in the previous
+case since we still have not included a variable to actually process. We
+have not included the short name of the variable in this dataset section
+because this allows us to reuse this dataset entry with different
+variable names later on. This is not really necessary for our simple use
+case, but it is common practice in ESMValTool.
+
+
+
+
+
+
Pro-tip: Automatically populating a recipe with all available datasets
+
+
You can select all available models for processing using glob patterns or wildcards. An example datasets section that uses all available CMIP6 models and ensemble members for the historical experiment is available [here][include-all-datasets]. Note that you will have to set the search_esgf option in the configuration file to always, so that data can be downloaded from ESGF nodes as needed.
+
+
+
+
Adding the preprocessor section
+
Above, we already described the preprocessing task that needs to
+convert the standard, gridded temperature data to a timeseries of
+temperature anomalies.
+
+
+
+
+
+
Defining the preprocessor
+
+
Have a look at the available preprocessors in the
[documentation][preprocessor]. Write down
+
Which preprocessor functions do you think we should use?
+
What are the parameters that we can pass to these functions?
+
What do you think should be the order of the preprocessors?
+
A suitable name for the overall preprocessor
+
+
+
+
+
+
+
+
+
We need to calculate anomalies and global means. There is an anomalies preprocessor which takes as arguments a time period, a reference period, and whether or not to standardize the data. The global means can be calculated with the area_statistics preprocessor, which takes an operator as argument (in our case we want to compute the mean).

The default order in which these preprocessors are applied can be seen [here][preprocessor-functions]: area_statistics comes before anomalies. If you want to change this, you can use the custom_order preprocessor as described [here][recipe-section-preprocessors]. For this example, we will keep the default order.

Let’s name our preprocessor global_anomalies.
+
+
+
+
+
Add the following block to your recipe file between the
+datasets and diagnostics block:
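A sketch of such a block, assuming a 1981-2010 reference period (pick any reference period you like):

YAML

preprocessors:
  global_anomalies:
    area_statistics:
      operator: mean
    anomalies:
      period: month
      reference:
        start_year: 1981
        start_month: 1
        start_day: 1
        end_year: 2010
        end_month: 12
        end_day: 31
      standardize: false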
We are now ready to finish our diagnostics section. Remember that we
+want to create two tasks: a preprocessor task, and a diagnostic task. To
+illustrate that we can also pass settings to the diagnostic script, we
+add the option to specify a custom colormap.
+
+
+
+
+
+
Fill in the blanks
+
+
Extend the diagnostics section in your recipe by filling in the
+blanks in the following template:
+
+
YAML
+
+
diagnostics:
  <... (suitable name for our diagnostic)>:
    description: <...>
    variables:
      <... (suitable name for the preprocessed variable)>:
        short_name: <...>
        preprocessor: <...>
    scripts:
      <... (suitable name for our python script)>:
        script: <full path to python script>
        colormap: <... choose from matplotlib colormaps>
+
+
+
+
+
+
+
+
+
+
+
YAML
+
+
diagnostics:
  diagnostic_warming_stripes:
    description: visualize global temperature anomalies as warming stripes
    variables:
      global_temperature_anomalies_global:
        short_name: tas
        preprocessor: global_anomalies
    scripts:
      warming_stripes_script:
        script: ~/esmvaltool_tutorial/warming_stripes.py
        colormap: 'bwr'
+
+
+
+
+
+
You should now be able to run the recipe to get your own warming
+stripes.
+
Note: for the purpose of simplicity in this episode, we have not
+added logging or provenance tracking in the diagnostic script. Once you
+start to develop your own diagnostic scripts and want to add them to the
+ESMValTool repositories, this will be required. Writing your own
+diagnostic script is discussed in a later
+episode.
+
Bonus exercises
+
Below are a few exercises to practice modifying an ESMValTool recipe.
+For your reference, here’s a copy of the recipe at this point. This
+will be the point of departure for each of the modifications we’ll make
+below.
+
+
+
+
+
+
Specific location selection
+
+
On showyourstripes.org, you can download stripes for specific
+locations. Here we show how this can be done with ESMValTool. Instead of
+the global mean, we can pick a location to plot the stripes for. Can you
+find a suitable preprocessor to do this?
+
+
+
+
+
+
+
+
+
You can use extract_point or extract_region to select a location. We used extract_point. Here’s a copy of the recipe at this point and this is the difference from the previous recipe:

Split the diagnostic in two with two different time periods for the same variable. You can choose the time periods yourself. In the example below, we have chosen the recent past and the 20th century and have used variable grouping.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at this point and this is the difference with the previous recipe:

Now that you have different variable groups, we can also use different preprocessors. Add a second preprocessor to add another location of your choosing.
+
+
Solution
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
If you want to avoid retyping the arguments used in your
+preprocessor, you can use YAML anchors as seen in the
+anomalies preprocessor specifications in the recipe
+above.
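As a sketch of that idea (the preprocessor names, location and reference period below are only illustrative), an anchor defined in one preprocessor can be reused in another:

YAML

preprocessors:
  anomalies_amsterdam:
    extract_point:
      latitude: 52.379
      longitude: 4.9
      scheme: linear
    anomalies: &anomalies  # define an anchor for these settings
      period: month
      reference:
        start_year: 1981
        start_month: 1
        start_day: 1
        end_year: 2010
        end_month: 12
        end_day: 31
      standardize: false
  anomalies_global:
    area_statistics:
      operator: mean
    anomalies: *anomalies  # reuse the anchored settings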
+
+
+
+
+
+
+
+
+
Additional datasets
+
+
So far we have defined the datasets in the datasets section of the
+recipe. However, it’s also possible to add specific datasets only for
+specific variables or variable groups. Take a look at the documentation
+to learn about the additional_datasets keyword
[here][additional-datasets], and add a second dataset
+only for one of the variable groups.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
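As an illustration (the variable group and the added dataset below are only examples), additional_datasets can be attached to a single variable group like this:

YAML

    variables:
      tas_recent:
        short_name: tas
        preprocessor: global_anomalies
        additional_datasets:
          - {dataset: CanESM2, project: CMIP5, mip: Amon, exp: historical,
             ensemble: r1i1p1, start_year: 1850, end_year: 2005}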
You can choose data from multiple ensemble members for a model in a
+single line.
+
+
+
+
+
+
+
+
+
The datasets section allows you to choose more than one ensemble member. Here’s a copy of the changed recipe to do that. Changes made are shown in the diff output below:
Check out the section on a different way to use multiple ensemble members or even multiple experiments at [Concatenating data corresponding to multiple facets][concatenating-datasets].
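For illustration, several ensemble members can be requested in a single dataset entry using the ensemble range syntax (a sketch, assuming three members are available; adjust names and years to your data):

YAML

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical,
     ensemble: "r(1:3)i1p1f1", grid: gn, start_year: 1850, end_year: 2014}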
+
+
+
+
+
+
+
+
+
Key Points
+
+
A recipe can work with different preprocessors at the same
+time.
+
The setting additional_datasets can be used to add a
+different dataset.
+
Variable groups are useful for defining different settings for
+different variables.
+
Multiple ensemble members and experiments can be analysed in a
+single recipe through concatenation.
+
+
+
+ESMValTool Tutorial: Development and contribution
+
+
+
+
+
+
+
+
+
+
+
How can I incorporate my contributions into ESMValTool?
+
+
+
+
+
+
+
Objectives
+
Execute a successful ESMValTool installation from the source
+code.
+
Contribute to ESMValTool development.
+
+
+
+
+
+
We now know how ESMValTool works, but how do we develop it? ESMValTool is an open-source project in ESMValGroup. We can contribute to its development, for example by:

helping with the review process of pull requests; see the ESMValTool documentation on Review of pull requests
+
+
In this lesson, we first show how to set up a development
+installation of ESMValTool so you can make changes or additions. We then
+explain how you can contribute these changes to the community.
+
+
+
+
+
+
Git knowledge
+
+
For this episode, you need some knowledge of Git. You can refresh your knowledge in
+the corresponding Git carpentries
+course.
+
+
+
+
Development installation
+
We’ll explore how ESMValTool can be installed in develop mode. Even if you aren’t collaborating with the community, this installation is needed to run your own code with ESMValTool. Let’s get started.

1 Source code

Download the code from the repository. A ZIP file called ESMValTool-main.zip is downloaded. To continue the installation, unzip the file, move to the ESMValTool-main directory and then follow the sequence of steps starting from the section on ESMValTool dependencies below.
+
Clone the repository if you want to contribute to the ESMValTool
+development:
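For example, using HTTPS (cloning via SSH also works if you have set up an SSH key):

BASH

git clone https://github.com/ESMValGroup/ESMValTool.git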
This command will ask for your GitHub username and a personal access token as password. Please follow the instructions on
+[GitHub token authentication
+requirements][token-authentication-requirements] to create a personal
+access token. Alternatively, you could [generate a new SSH
+key][generate-ssh-key] and [add it to your GitHub account][add-ssh-key].
+After the authentication, the output might look like:
Now, a folder called ESMValTool has been created in your
+working directory. This folder contains the source code of the tool. To
+continue the installation, we move into the ESMValTool
+directory:
+
+
BASH
+
+
cd ESMValTool
+
+
Note that the main branch is checked out by default. We
+can see this if we run:
+
+
BASH
+
+
git status
+
+
+
OUTPUT
+
+
On branch main
+Your branch is up to date with 'origin/main'.
+
+nothing to commit, working tree clean
+
+
+
+
2 ESMValTool dependencies
+
Note that if an esmvaltool environment has already been created following the Installation lesson, we should choose another name for the new environment in this lesson.
+
ESMValTool now uses mamba instead of conda
+for the recommended installation. For a minimal mamba installation, see
+section Install Mamba in lesson Installation.
+
It is good practice to update the version of mamba and conda on your
+machine before setting up ESMValTool. This can be done as follows:
+
+
BASH
+
+
mamba update --name base mamba conda
+
+
To simplify the installation process, an environment file
+environment.yml is provided in the ESMValTool directory. We
+create an environment by running:
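A sketch of that command, run from inside the ESMValTool directory:

BASH

mamba env create --file environment.yml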
The environment is called esmvaltool by default. If an
+esmvaltool environment is already created following the
+lesson Installation, we should choose
+another name for the new environment in this lesson by:
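BASH

mamba env create --name a_new_name --file environment.yml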
This will create a new conda environment and install ESMValTool (with
+all dependencies that are needed for development purposes) into it with
+a single command.
+
For more information see [conda managing
+environments][manage-environments].
+
Now, we should activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
where esmvaltool is the name of the environment (replace
+by a_new_name in case another environment name was
+used).
+
+
+
3 ESMValTool installation
+
ESMValTool can be installed in a develop mode by
+running:
+
+
BASH
+
+
pip install --editable '.[develop]'
+
+
This will add the esmvaltool directory to the Python
+path in editable mode and install the development dependencies. We
+should check if the installation works properly. To do this, run the
+tool with:
+
+
BASH
+
+
esmvaltool --help
+
+
If the installation is successful, ESMValTool prints a help message
+to the console.
+
+
+
+
+
+
Checking the development installation
+
+
We can use the command mamba list to list installed
+packages in the esmvaltool environment. Use this command to
+check that ESMValTool is installed in a develop mode.
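For example, you could filter the listing for the esmvaltool package (a sketch; the exact version string will differ on your system):

BASH

mamba list esmvaltool

OUTPUT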
# Name Version Build Channel
+esmvaltool 2.10.0.dev3+g2dbc2cfcc pypi_0 pypi
+
+
+
+
+
+
+
+
4 Updating ESMValTool
+
The main branch has the latest features of ESMValTool.
+Please make sure that the source code on your machine is up-to-date. If
+you obtain the source code using git clone as explained in step
+1 Source code, you can run git pull to
update the source code. The ESMValTool installation will then pick up the changes from the main branch.
+
+
Contribution
+
We have seen how to install ESMValTool in a develop
+mode. Now, we try to contribute to its development. Let’s see how this
+can be achieved. We first discuss our ideas in an issue
+in ESMValTool repository. This can avoid disappointment at a later
+stage, for example, if more people are doing the same thing. It also
+gives other people an early opportunity to provide input and
+suggestions, which results in more valuable contributions.
+
Then, we create a new branch locally and start
+developing new codes. To create a new branch:
+
+
BASH
+
+
git checkout -b your_branch_name
+
+
If needed, a link to a git tutorial can be found in Setup.
+
Once our development is finished, we can initiate a
+pull request. To this end, we encourage you to join the
+ESMValTool development team.
+
For more extensive documentation on contributing code, including a section on the GitHub workflow, please see the [Contributing code and documentation][code-documentation] section in the ESMValTool documentation.
+
+
Review process
+
The pull request will be tested, discussed and merged as part of the
+“review process”. The process will take some effort and
+time to learn. However, a few (command line) tools can get you a long
+way, and we’ll cover those essentials in the next sections.
+
Tip: we encourage you to keep the pull requests
+small. Reviewing small incremental changes is more efficient.
+
+
+
Working example
+
We saw the ‘warming stripes’ diagnostic in the lesson Writing your own recipe. Imagine the following task: you want to contribute the warming stripes recipe and diagnostic to ESMValTool. You have to add the diagnostic script warming_stripes.py and the recipe recipe_warming_stripes.yml to their locations in the ESMValTool directory. After these changes, you should also check that everything works fine. This is where we take advantage of the tools that are introduced later.
+
Let’s get started. Note that since this is an exercise to get
+familiar with the development and contribution process, we will not
+create a GitHub issue at this time but proceed as though it has been
+done.
+
+
+
Check code quality
+
We aim to adhere to best practices and coding standards. There are
+several tools that check our code against those standards like:
+
+flake8 for
+checking against the PEP8 style guide
+
+yapf to ensure
+consistent formatting for the whole project
+
+isort to consistently
+sort the import statements
+
+yamllint
+to ensure there are no syntax errors in our recipes and config
+files
The good news is that pre-commit has been already
+installed when we chose development installation.
+pre-commit is a command line and runs all of those tools.
+It also fixes some of those errors. To explore other tools, have a look
+at ESMValTool documentation on Code
+style.
+
+
+
+
+
+
Using pre-commit
+
+
Let’s check out our local branch and add the script warming_stripes.py to the
+esmvaltool/diag_scripts directory.
+
+
BASH
+
+
cd ESMValTool
+git checkout your_branch_name
+cp path_of_warming_stripes.py esmvaltool/diag_scripts/
+
+
By default, pre-commit only runs on the files that have
+been staged in git:
+
+
BASH
+
+
git status
+git add esmvaltool/diag_scripts/warming_stripes.py
+pre-commit run --files esmvaltool/diag_scripts/warming_stripes.py
+
+
Inspect the output of pre-commit and fix the remaining
+errors.
+
+
+
+
+
+
+
+
+
The tail of the output of pre-commit:
+
+
BASH
+
+
Check for added large files..............................................Passed
+Check python ast.........................................................Passed
+Check for case conflicts.................................................Passed
+Check for merge conflicts................................................Passed
+Debug Statements (Python)................................................Passed
+Fix End of Files.........................................................Passed
+Trim Trailing Whitespace.................................................Passed
+yamllint.............................................(no files to check)Skipped
+nclcodestyle.........................................(no files to check)Skipped
+style-files..........................................(no files to check)Skipped
+lintr................................................(no files to check)Skipped
+codespell................................................................Passed
+isort....................................................................Passed
+yapf.....................................................................Passed
+docformatter.............................................................Failed
+- hook id: docformatter
+- files were modified by this hook
+flake8...................................................................Failed
+- hook id: flake8
+- exit code: 1
+
+esmvaltool/diag_scripts/warming_stripes.py:20:5:
+F841 local variable 'nx' is assigned to but never used
+
+
As can be seen above, there are two failed checks:
+
+docformatter: it is mentioned that “files were modified
+by this hook”. We run git diff to see the modifications.
+The output includes the following:
+
+
BASH
+
+
+in the form of the popular warming stripes figure by Ed Hawkins."""
+
+
The syntax """ at the end of the docstring is moved by one line. Shifting it to the next line should fix this error.
+
flake8: the error message is about an unused local variable nx. We should check our code regarding the usage of nx. For now, let’s assume that it was added by mistake and remove it. Note that you have to run git add again to re-stage the file. Then rerun pre-commit and check that it passes.
+
+
+
+
+
+
+
Run unit tests
+
The previous section introduced some tools to check code style and quality, but they cannot determine whether or not our code produces the right answer. To achieve that, we need to write and run tests for widely-used functions. ESMValTool comes with a lot of tests that are in the folder tests.
+
To run tests, first we make sure that the working directory is
+ESMValTool and our local branch is checked out. Then, we
+can run tests using pytest locally:
+
+
BASH
+
+
pytest
+
+
Tests will also be run automatically by CircleCI, when you submit a pull
+request.
+
+
+
+
+
+
Running tests
+
+
Make sure our local branch is checked out and add the recipe recipe_warming_stripes.yml
+to the esmvaltool/recipes directory:
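Following the same pattern as before (with path_of_recipe_warming_stripes.yml standing in for wherever you saved the recipe):

BASH

cd ESMValTool
git checkout your_branch_name
cp path_of_recipe_warming_stripes.yml esmvaltool/recipes/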
Run pytest and inspect the results; this might take a few minutes. If a test fails, try to fix it.
+
+
+
+
+
+
+
+
+
Run:
+
+
BASH
+
+
pytest
+
+
When the pytest run is complete, you can inspect the test reports that are printed in the console. Have a look at the FAILURES section of the report. The test message shows that the recipe recipe_warming_stripes.yml is not a valid recipe. Look for a line that starts with an E in the rest of the message:
+
+
BASH
+
+
+E esmvalcore._task.DiagnosticError: Cannot execute script
+'~/esmvaltool_tutorial/warming_stripes.py'(~/esmvaltool_tutorial/warming_stripes.py):
+file does not exist.
+
+
To fix the recipe, we need to edit the path of the diagnostic script
+as warming_stripes.py:
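In other words, the script entry in recipe_warming_stripes.yml should point to just the script name, because warming_stripes.py now lives in esmvaltool/diag_scripts/ (a sketch of the relevant lines):

YAML

    scripts:
      warming_stripes_script:
        script: warming_stripes.py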
When we add or update code, we should also update the corresponding documentation. The ESMValTool documentation is available at docs.esmvaltool.org. The source files are located in ESMValTool/doc/sphinx/source/.
+
To build documentation locally, first we make sure that the working
+directory is ESMValTool and our local branch is checked
+out. Then, we run:
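One way to do this is to call Sphinx directly on the documentation sources (a sketch; check the ESMValTool contribution guide for the exact command it recommends):

BASH

sphinx-build -Ea doc/sphinx/source doc/sphinx/build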
Similar to code, documentation should be well written and adhere to
+standards. If the documentation is built properly, the previous command
+prints a message to the console:
+
+
OUTPUT
+
+
build succeeded.
+
+The HTML pages are in doc/sphinx/build.
+
+
The main page of the documentation has been built into
+index.html in doc/sphinx/build/ directory. To
+preview this page locally, we open the file in a web browser:
+
+
BASH
+
+
xdg-open doc/sphinx/build/index.html
+
+
+
+
+
+
+
Creating documentation
+
+
In previous exercises, we added the recipe recipe_warming_stripes.yml
+to ESMValTool. Now, we create a documentation file
+recipe_warming_stripes.rst for this recipe:
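For example, open a new file in the recipes documentation folder:

BASH

nano doc/sphinx/source/recipes/recipe_warming_stripes.rst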
Save and close the file. We can think of this file as one page of a
+book. Examples of documentation pages can be found in the folder
+ESMValTool/doc/sphinx/source/recipes. Then, we need to
decide where this page should be located inside the book. The table of contents is defined by index.rst. Let’s have a look at its content:
+
+
BASH
+
+
nano doc/sphinx/source/recipes/index.rst
+
+
Add the recipe name i.e. recipe_warming_stripes to the
+section Other in this file and preview the recipe
+documentation page locally.
+
+
+
+
+
+
+
+
+
First, we add the recipe name recipe_warming_stripes to
+the section Other:
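In essence, the recipe name (without the .rst extension) is added as a new entry to the toctree under the Other section, for example:

DIFF

+   recipe_warming_stripes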
+
+
+
+ESMValTool Tutorial: Writing your own diagnostic script
+
+
+
+
+
+
+
+
+
+
+
How do I use the preprocessor output in a Python diagnostic?
+
+
+
+
+
+
+
Objectives
+
Write a new Python diagnostic script.
+
Explain how a diagnostic script reads the preprocessor output.
+
+
+
+
+
+
Introduction
+
The diagnostic script is an important component of ESMValTool and it
+is where the scientific analysis or performance metric is implemented.
+With ESMValTool, you can adapt an existing diagnostic or write a new
+script from scratch. Diagnostics can be written in a number of open
+source languages such as Python, R, Julia and NCL but we will focus on
+understanding and writing Python diagnostics in this lesson.
+
In this lesson, we will explain how to find an existing diagnostic
+and run it using ESMValTool installed in editable/development mode. For
+a development installation, see the instructions in the lesson Development and contribution. Also,
+we will work with the recipe [recipe_python.yml][recipe] and the
+diagnostic script [diagnostic.py][diagnostic] called by this recipe that
+we have seen in the lesson Running your first
+recipe .
+
Let’s get started!
+
Understanding an existing Python diagnostic
+
If you clone the ESMValTool repository, a folder called
+ESMValTool is created in your home/working directory, see
+the instructions in the lesson Development and contribution .
+
The folder ESMValTool contains the source code of the
+tool. We can find the recipe recipe_python.yml and the
+python script diagnostic.py in these directories:
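At the time of writing, these are:

ESMValTool/esmvaltool/recipes/examples/recipe_python.yml
ESMValTool/esmvaltool/diag_scripts/examples/diagnostic.py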
Let’s have a look at the code in diagnostic.py. For reference, the diagnostic code is shown in the dropdown box below. There are four main sections in the script:
+
A description i.e. the docstring (line 1).
+
Import statements (line 2-16).
+
Functions that implement our analysis (line 21-102).
+
A typical Python top-level script
+i.e. if __name__ == '__main__' (line 105-108).
+
+
+
+
+
+
+
PYTHON
+
+
1: """Python example diagnostic."""
2: import logging
3: from pathlib import Path
4: from pprint import pformat
5:
6: import iris
7:
8: from esmvaltool.diag_scripts.shared import (
9:     group_metadata,
10:     run_diagnostic,
11:     save_data,
12:     save_figure,
13:     select_metadata,
14:     sorted_metadata,
15: )
16: from esmvaltool.diag_scripts.shared.plot import quickplot
17:
18: logger = logging.getLogger(Path(__file__).stem)
19:
20:
21: def get_provenance_record(attributes, ancestor_files):
22:     """Create a provenance record describing the diagnostic data and plot."""
23:     caption = caption = attributes['caption'].format(**attributes)
24:
25:     record = {
26:         'caption': caption,
27:         'statistics': ['mean'],
28:         'domains': ['global'],
29:         'plot_types': ['zonal'],
30:         'authors': [
31:             'andela_bouwe',
32:             'righi_mattia',
33:         ],
34:         'references': [
35:             'acknow_project',
36:         ],
37:         'ancestors': ancestor_files,
38:     }
39:     return record
40:
41:
42: def compute_diagnostic(filename):
43:     """Compute an example diagnostic."""
44:     logger.debug("Loading %s", filename)
45:     cube = iris.load_cube(filename)
46:
47:     logger.debug("Running example computation")
48:     cube = iris.util.squeeze(cube)
49:     return cube
50:
51:
52: def plot_diagnostic(cube, basename, provenance_record, cfg):
53:     """Create diagnostic data and plot it."""
54:
55:     # Save the data used for the plot
56:     save_data(basename, provenance_record, cfg, cube)
57:
58:     if cfg.get('quickplot'):
59:         # Create the plot
60:         quickplot(cube, **cfg['quickplot'])
61:         # And save the plot
62:         save_figure(basename, provenance_record, cfg)
63:
64:
65: def main(cfg):
66:     """Compute the time average for each input dataset."""
67:     # Get a description of the preprocessed data that we will use as input.
68:     input_data = cfg['input_data'].values()
69:
70:     # Demonstrate use of metadata access convenience functions.
71:     selection = select_metadata(input_data, short_name='tas', project='CMIP5')
72:     logger.info("Example of how to select only CMIP5 temperature data:\n%s",
73:                 pformat(selection))
74:
75:     selection = sorted_metadata(selection, sort='dataset')
76:     logger.info("Example of how to sort this selection by dataset:\n%s",
77:                 pformat(selection))
78:
79:     grouped_input_data = group_metadata(input_data,
80:                                         'variable_group',
81:                                         sort='dataset')
82:     logger.info(
83:         "Example of how to group and sort input data by variable groups from "
84:         "the recipe:\n%s", pformat(grouped_input_data))
85:
86:     # Example of how to loop over variables/datasets in alphabetical order
87:     groups = group_metadata(input_data, 'variable_group', sort='dataset')
88:     for group_name in groups:
89:         logger.info("Processing variable %s", group_name)
90:         for attributes in groups[group_name]:
91:             logger.info("Processing dataset %s", attributes['dataset'])
92:             input_file = attributes['filename']
93:             cube = compute_diagnostic(input_file)
94:
95:             output_basename = Path(input_file).stem
96:             if group_name != attributes['short_name']:
97:                 output_basename = group_name + '_' + output_basename
98:             if "caption" not in attributes:
99:                 attributes['caption'] = input_file
100:             provenance_record = get_provenance_record(
101:                 attributes, ancestor_files=[input_file])
102:             plot_diagnostic(cube, output_basename, provenance_record, cfg)
103:
104:
105: if __name__ == '__main__':
106:
107:     with run_diagnostic() as config:
108:         main(config)
+
+
+
+
+
+
+
+
+
+
+
What is the starting point of a diagnostic?
+
+
Can you spot a function called main in the code
+above?
+
What are its input arguments?
+
How many times is this function mentioned?
+
+
+
+
+
+
+
+
+
The main function is defined in line 65 as
+main(cfg).
+
The input argument to this function is the variable
+cfg, a Python dictionary that holds all the necessary
+information needed to run the diagnostic script such as the location of
+input data and various settings. We will next parse this
+cfg variable in the main function and extract
+information as needed to do our analyses (e.g. in line 68).
+
The main function is called near the very end on line
+108. So, it is mentioned twice in our code - once where it is called by
+the top-level Python script and second where it is defined.
+
+
+
+
+
+
+
+
+
+
The function run_diagnostic
+
+
The function run_diagnostic (line 107) is a context manager provided with ESMValTool and is the main entry point for most Python diagnostics.
+
+
+
+
Preprocessor-diagnostic interface
+
In the previous exercise, we have seen that the variable
+cfg is the input argument of the main
+function. The first argument passed to the diagnostic via the
+cfg dictionary is a path to a file called
+settings.yml. The ESMValTool documentation page provides an
+overview of what is in this file, see [Diagnostic script
+interfaces][interface].
+
+
+
+
+
+
What information do I need when writing a diagnostic script?
+
+
From the lesson Configuration, we
+saw how to change the configuration settings before running a recipe.
+First we set the option remove_preproc_dir to
+false in the configuration file, then run the recipe
+recipe_python.yml:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
Can you find an example of the file settings.yml in the run directory?
+
Open the file settings.yml and look at the
+input_files list. It contains paths to some files
+metadata.yml. What information do you think is saved in
+those files?
+
+
+
+
+
+
+
+
+
One example of settings.yml can be found in the
+directory:
+path_to_recipe_output/run/map/script1/settings.yml
+
+
The metadata.yml files hold information about the
+preprocessed data. There is one file for each variable having detailed
+information on your data including project (e.g., CMIP6, CMIP5), dataset
+names (e.g., BCC-ESM1, CanESM2), variable attributes (e.g.,
+standard_name, units), preprocessor applied and time range of the data.
+You can use all of this information in your own diagnostic.
+
+
+
+
+
Diagnostic shared functions
+
Looking at the code in diagnostic.py, we see that
+input_data is read from the cfg dictionary
+(line 68). Now we can group the input_data according to
+some criteria such as the model or experiment. To do so, ESMValTool
+provides many functions such as select_metadata (line 71),
+sorted_metadata (line 75), and group_metadata
+(line 79). As you can see in line 8, these functions are imported from
+esmvaltool.diag_scripts.shared that means these are shared
+across several diagnostics scripts. A list of available functions and
+their description can be found in [The ESMValTool Diagnostic API
+reference][shared].
+
+
Extracting
+information needed for analyses
+
We have seen the functions used for selecting, sorting and grouping
+data in the script. What do these functions do?
+
+
Answer
+
There is a statement after use of select_metadata,
+sorted_metadata and group_metadata that starts
+with logger.info (lines 72, 76 and 82). These lines print
+output to the log files. In the previous exercise, we ran the recipe
+recipe_python.yml. If you look at the log file
+recipe_python_#_#/run/map/script1/log.txt in
+esmvaltool_output directory, you can see the output from
+each of these functions, for example:
+
2023-06-28 12:47:14,038 [2548510] INFO diagnostic,106 Example of how to
+group and sort input data by variable groups from the recipe:
+{'tas': [{'alias': 'CMIP5',
+ 'caption': 'Global map of {long_name} in January 2000 according to '
+ '{dataset}.\n',
+ 'dataset': 'bcc-csm1-1',
+ 'diagnostic': 'map',
+ 'end_year': 2000,
+ 'ensemble': 'r1i1p1',
+ 'exp': 'historical',
+ 'filename': '~/recipe_python_20230628_124639/preproc/map/tas/
+This is how we can access preprocessed data within our diagnostic.
+
+
+
+
+
Diagnostic computation
+
After grouping and selecting data, we can read individual attributes
+(such as filename) of each item. Here, we have grouped the input data by
+variables, so we loop over the variables (line 88).
+Following this is a call to the function compute_diagnostic
+(line 93). Let’s look at the definition of this function in line 42,
+where the actual analysis of the data is done.
+
Note that output from the ESMValCore preprocessor is in the form of
+NetCDF files. Here, compute_diagnostic uses Iris
+to read data from a netCDF file and performs an operation
+squeeze to remove any dimensions of length one. We can
+adapt this function to add our own analysis. As an example, here we
calculate a bias by subtracting the average of the data, using Iris cubes.
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    cube = iris.load_cube(filename)

    logger.debug("Running example computation")
    cube = iris.util.squeeze(cube)

    # Calculate a bias using the average of the data
    cube.data = cube.core_data() - cube.core_data().mean()
    return cube
+
+
+
+
+
+
+
iris cubes
+
+
Iris reads data from NetCDF files into data structures called cubes.
+The data in these cubes can be modified, combined with other cubes’ data
+or plotted.
+
+
+
+
+
+
+
+
+
Reading data using xarray
+
+
Alternatively, you can use xarray to read the data instead of Iris.
+
+
+
+
+
+
+
+
+
First, import xarray package at the top of the script
+as:
+
+
PYTHON
+
+
import xarray as xr
+
+
Then, change the compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    dataset = xr.open_dataset(filename)

    # do your analyses on the data here

    return dataset
+
+
Caution: if you read data using xarray, keep in mind that you will also need to adapt the other functions in the diagnostic, which currently expect Iris cubes.
+
+
+
+
+
+
+
+
+
+
Reading data using the netCDF4 package
+
+
Yet another option to read the NetCDF file data is to use the
+[netCDF-4 Python interface][netCDF] to the netCDF C library.
+
+
+
+
+
+
+
+
+
First, import the netCDF4 package at the top of the
+script as:
+
+
PYTHON
+
+
import netCDF4
+
+
Then, change compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    nc_data = netCDF4.Dataset(filename, 'r')

    # do your analyses on the data here

    return nc_data
+
+
Caution: if you read data using netCDF4, keep in mind that you will also need to adapt the other functions in the diagnostic, which currently expect Iris cubes.
+
+
+
+
+
Diagnostic output
+
+
Plotting the output
+
Often, the end product of a diagnostic script is a plot or figure.
+The Iris cube returned from the compute_diagnostic function
+(line 93) is passed to the plot_diagnostic function (line
+102). Let’s have a look at the definition of this function in line 52.
+This is where we would plug in our plotting routine in the diagnostic
+script.
+
More specifically, the quickplot function (line 60) can
+be replaced with the function of our choice. As can be seen, this
+function uses **cfg['quickplot'] as an input argument. If
+you look at the diagnostic section in the recipe
+recipe_python.yml, you see quickplot is a key
+there:
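For the map diagnostic, the relevant part of the recipe reads:

YAML

    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds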
This way, we can pass arguments such as the plot type pcolormesh and the colormap cmap: Reds from the recipe to the quickplot function in the diagnostic.
+
+
+
+
+
+
Passing arguments from the recipe to the diagnostic
+
+
Change the type of the plot and its colormap and inspect the output
+figure.
+
+
+
+
+
+
+
+
+
In the recipe recipe_python.yml, you could change
+plot_type and cmap. As an example, we choose
+plot_type: pcolor and cmap: BuGn:
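The corresponding change in the recipe looks like this:

YAML

    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolor
          cmap: BuGn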
The plot can be found at
+path_to_recipe_output/plots/map/script1/png.
+
+
+
+
+
+
+
+
+
+
ESMValTool gallery
+
+
ESMValTool makes it possible to produce a wide array of plots and
+figures as seen in the gallery.
+
+
+
+
+
+
Saving the output
+
In our example, the function save_data in line 56 is
+used to save the Iris cube. The saved files can be found under the
+work directory in a .nc format. There is also
+the function save_figure in line 62 to save the plots under
+the plot directory in a .png format (or
+preferred format specified in your configuration settings). Again, you
+may choose your own method of saving the output.
+
+
+
Recording the provenance
+
When developing a diagnostic script, it is good practice to record
+provenance. To do so, we use the function
+get_provenance_record (line 100). Let us have a look at the
+definition of this function in line 21 where we describe the diagnostic
+data and plot. Using the dictionary record, it is possible
+to add custom provenance to our diagnostics output. Provenance is stored
+in the W3C PROV
+XML format and also in an SVG file under the
+work and plot directory. For more information,
+see [recording provenance][provenance].
+
+
Congratulations!
+
You now know the basic diagnostic script structure and some available
+tools for putting together your own diagnostics. Have a look at existing
+recipes and diagnostics in the repository for more examples of functions
+you can use in your diagnostics!
+
+
+
+
+
+
Key Points
+
+
ESMValTool provides helper functions to interface a Python
+diagnostic script with preprocessor output.
+
Existing diagnostics can be used as templates and modified to write
+new diagnostics.
+
Helper functions can be imported from
+esmvaltool.diag_scripts.shared and used in your own
+diagnostic script.
+
+
+
+ESMValTool Tutorial: CMORization: adding new datasets to ESMValTool
+
+
+
+
+
+
+
+
+
+
+
How to use the existing CMORizer scripts shipped with
+ESMValTool?
+
How to add support for new (observational) datasets?
+
+
+
+
+
+
+
Objectives
+
Understand what CMORization is and why it is necessary.
+
Use existing scripts to CMORize your data.
+
Write a new CMORizer script to support additional data.
+
+
+
+
+
+
Introduction
+
This episode deals with “CMORization”. ESMValTool is designed to work
+with data that follow the CMOR standards. Unfortunately, not all
+datasets follow these standards. In order to use such datasets in
+ESMValTool we first need to reformat the data. This process is called
+“CMORization”.
+
+
+
+
+
+
What are the CMOR standards?
+
+
The name “CMOR” originates from a tool: the Climate Model Output Rewriter.
+This tool is used to create “CF-Compliant netCDF files for use in the
+CMIP projects”. So CMOR extends the CF-standard with additional
+requirements for the Coupled Model Intercomparison Projects (see e.g. here).
+
Concretely, the CMOR standards dictate e.g. the variable names and
+units, coordinate information, how the data should be structured (e.g. 1
+variable per file), additional metadata requirements, and file naming
+conventions a.k.a. the data reference syntax (DRS).
+All this information is stored in so-called CMOR tables. For example,
+the CMOR tables for the CMIP6 project can be found here.
+
+
+
+
ESMValTool offers two ways to CMORize data:
+
A reformatting script can be used to create a CMOR-compliant copy.
+CMORizer scripts for several popular datasets are included in
+ESMValTool, and ESMValTool also provides a convenient way to execute
+them.
+
ESMValCore can execute CMOR fixes ‘on the fly’ (see https://docs.esmvaltool.org/projects/esmvalcore/en/latest/develop/fixing_data.html#fixing-data). The advantage is that you don’t need to store an additional, reformatted copy of the data. The disadvantage is that these fixes should be implemented inside ESMValCore, which is beyond the scope of this tutorial.
+
In this lesson, we will re-implement a CMORizer script for the FLUXCOM dataset, which contains observations of the Gross Primary Production (GPP), a variable that is important for calculating components of the global carbon cycle. See the next section on how to obtain the data.
The data for this episode is available via the FluxCom Data
+Portal. First you’ll need to register. After registration, in the
+dropdown boxes, select FLUXCOM as the data choice and click download.
+Three files will be displayed. Click the download button on the “FLUXCOM
+(RS+METEO) Global Land Carbon Fluxes using CRUNCEP climate data”. You’ll
+receive an email with the FTP address to access the server. Connect to
+the server, follow the path in your email, and look for the file
+raw/monthly/GPP.ANN.CRUNCEPv6.monthly.2000.nc. Download
+that file and save it in a folder called
+~/data/RAWOBS/Tier3/FLUXCOM.
+
Note: you’ll need a user-friendly ftp client. On Linux,
+ncftp works okay.
+
+
+
+
+
+
What is the deal with those “tiers”?
+
+
Many datasets come with access restrictions. In this way the data
+providers can keep track of how their data is used. In many cases
+“restricted access” just means that one has to register with an email
+address and accept the terms of use, which typically ask that you
+acknowledge the data providers.
+
There are also datasets available that do not need a registration.
+The “obs4MIPs” or “ana4MIPs” datasets, for example, are specifically
+produced to facilitate comparisons with model simulations.
+
To reflect these different levels of access restriction, the
+ESMValTool team has created a tier-system. The definition of the
+different tiers are as follows:
+
Tier1: obs4MIPs and ana4MIPs datasets (can be used
directly with the ESMValTool)

Tier2: other freely available datasets (most of
them will need some kind of CMORization)

Tier3: datasets with access restrictions (most of
these datasets will also need some kind of CMORization)
+
These access restrictions are also why the ESMValTool developers
+cannot distribute copies or automate downloading of all observations and
+reanalysis data used in the recipes. As a compromise, we provide the
+CMORization scripts so that each user can CMORize their own copy of the
+access restricted datasets if needed.
+
+
+
+
Run the existing CMORizer script
+
Before we develop our own CMORizer script, let’s first see what
+happens when we run the existing one. There is a specific command
+available in the ESMValTool to run the CMORizer scripts:
+
+
BASH
+
+
esmvaltool data format --config_file <path to config-user.yml> <dataset-name>
+
+
The config-user.yml is the file in which we define the
+different data paths, see the episode on Configuration. In the
+rootpath of your config-user.yml, make sure to
+add the right directory for “RAWOBS” data in which you downloaded the
+FLUXCOM dataset:
+
+
YAML
+
+
rootpath:
  RAWOBS: ~/data/RAWOBS
+
+
This enables ESMValTool to find the raw observational datasets stored
+in the “RAWOBS” folder. The dataset-name needs to be
+identical to the folder name that was created to store the raw
+observation data files, i.e. RAWOBS/TierX/dataset-name. In
+our case this would be “FLUXCOM”.
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
...
+... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC
+... ----------------------------------------------------------------------
+... input_dir = /home/peter/data/RAWOBS
+... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216
+... ----------------------------------------------------------------------
+... Running the CMORization scripts.
+... Processing datasets ['FLUXCOM']
+... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM
+... Output will be written to: /home/peter/esmvaltool_output/
+ data_formatting_20220726_140216/Tier3/FLUXCOM
+... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/
+ site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom
+... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/
+ esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/
+ datasets/fluxcom.py
+... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc'
+... CMORizing variable 'gpp'
+... Lmon
+... Var is gpp
+... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1'
+
+... Fixing time...
+... Fixing latitude...
+... Fixing longitude...
+... Flipping dimensional coordinate latitude...
+... Saving file
+... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/
+ FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc
+... Cube has lazy data [lazy is preferred]
+... CMORization of dataset FLUXCOM finished!
+... Formatting successful for dataset FLUXCOM
+
+
So you can see that several fixes are applied, and the CMORized file
+is written to the ESMValTool output directory, i.e.
~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/TierX/dataset-name/filename.nc.
In order to use it, we'll have to copy it from the output directory to a
+folder called ~/data/OBS/Tier3/FLUXCOM and make sure the
+path to OBS is set correctly in our config-user file:
+
+
YAML
+
+
rootpath:
  OBS: ~/data/OBS
+
+
You can also see the path where ESMValTool stores the reformatting
+script:
~/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py.
+You may have a look at this file if you want. The script also uses a
+configuration file:
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml.
+
Make a test recipe
+
To verify that the data is correctly CMORized, we will make a simple
+test recipe. As illustrated in the figure at the top of this episode,
+one of the steps that ESMValTool executes is a CMOR-check. If the data
+is not correctly CMORized, ESMValTool will give a warning or error.
+
+
+
+
+
+
Create a test recipe
+
+
Create a simple recipe called recipe_check_fluxcom.yml that
+loads the FLUXCOM data. It should include a datasets section with a
+single entry for the “FLUXCOM” dataset with the correct dataset keys,
and a diagnostics section with a single variable: gpp. We don't need any
+preprocessors or scripts (set scripts: null), but we have
+to add a documentation section with a description, authors and
+maintainer, otherwise the recipe will fail.
+
Use the following dataset keys:
+
project: OBS
+
dataset: FLUXCOM
+
type: reanaly
+
version: ANN-v1
+
mip: Lmon
+
start_year: 2000
+
end_year: 2000
+
tier: 3
+
Some of these dataset keys are further explained in the callout boxes
+in this episode.
+
+
+
+
+
+
+
+
+
Here’s an example recipe
+
+
YAML
+
+
documentation:
  description: Test recipe for FLUXCOM data
  title: This is a test recipe for the FLUXCOM data.

  authors:
    - kalverla_peter

  maintainer:
    - kalverla_peter

datasets:
  - {project: OBS, dataset: FLUXCOM, mip: Lmon, tier: 3, start_year: 2000,
     end_year: 2000, type: reanaly, version: ANN-v1}

diagnostics:
  check_fluxcom:
    description: Check that ESMValTool can load the cmorized fluxnet data without errors.
    variables:
      gpp:
    scripts: null
To check the recipe, run it with:

BASH

esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
+
+
If everything is okay, the recipe should run without problems.
+
Starting from scratch
+
Now that you’ve seen how to use an existing CMORizer script, let’s
+think about adding a new one. We will remove the existing CMORizer
+script, and re-implement it from scratch. This exercise allows us to
+point out all the details of what’s going on. We’ll also remove the
CMORized data that we've just created, so our test recipe will not be
able to use it anymore. Start by re-creating the formatter script
~/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py
with only the following skeleton in it:

PYTHON
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+<We will add some useful info here later>
+"""
+import logging
+from esmvaltool.cmorizers.data import utilities as utils
+
+logger = logging.getLogger(__name__)
+
+def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
+"""Cmorize the dataset."""
+
+# This is where you'll add the cmorization code
+# 1. find the input data
+# 2. apply the necessary fixes
+# 3. store the data with the correct filename
+
+
Here, in_dir corresponds to the input directory of the
+raw files, out_dir to the output directory of final
+reformatted data set and cfg to a configuration dictionary
+given by a configuration file that we will get to shortly. The last
+three arguments will not be considered in this script but can be used in
+other cases. cfg_user corresponds to the user configuration
+file, start_date to the start of the period to format, and
+end_date to the end of the period to format. When you type
+the command esmvaltool data format in the terminal,
+ESMValTool will call this function with the settings found in your
+configuration files.
+
The ESMValTool CMORizer also needs a dataset configuration file.
+Create a file called
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml
+and fill it with the following boilerplate:
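A minimal sketch of what this boilerplate could look like, pieced together from the entries discussed in the rest of this episode (the file shipped with ESMValTool may contain additional fields such as source and reference):

YAML

---
# Original filename of the raw data (uncomment and fill in later)
# filename: '???'

attributes:
  project_id: OBS6
  # dataset_id: ???
  # version: '???'
  # tier: ???
  # modeling_realm: ???

variables:
  # ???:
  #   mip: ???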
Note: the name of this file must be
+identical to dataset-name.
+
As you can see, the configuration file contains information about the
+original filename of the dataset, and some additional metadata that you
+might recognize from the CMOR filename structure. It also contains a
+list of variables that’s available for this dataset. We’ll add this
+information step by step in the following sections.
+
+
+
+
+
+
RAWOBS, OBS, OBS6!?
+
+
In the configuration above we’ve already filled in the
+project_id. ESMValTool uses these project IDs to find the
+data on your hard drive, and also to find more information about the
+data. The RAWOBS and OBS projects refer to
+external data before and after CMORization, respectively.
+Historically, most external data were observations, hence the
+naming.
+
In going from CMIP5 to CMIP6, the CMOR standards changed a bit. For
+example, some variables were renamed, which posed a dilemma: should
+CMORization reformat to the CMIP5 or CMIP6 definition? To solve this,
+the OBS6 project was created. So OBS6 data
+follow the CMIP6 standards, and that’s what we’ll use for the new
+CMORizer.
+
+
+
+
You can try running the CMORizer at this point, and it should work
+without errors. However, it doesn’t produce any output yet:
+
+
BASH
+
+
esmvaltool data format --config_file <path to config-user.yml> FLUXCOM
+
+
+
1. Find the input data
+
First we’ll get the CMORizer script to locate our FLUXCOM data. We
+can use the information from the in_dir and
+cfg variables. Add the following snippet to your CMORizer
+script:
+
+
PYTHON
+
+
# 1. find the input data
+logger.info("in_dir: '%s'", in_dir)
+logger.info("cfg: '%s'", cfg)
+
+
If you run the CMORizer again, it will print out the content of these
variables: in_dir points to the raw data folder (RAWOBS/Tier3/FLUXCOM) and
cfg holds the contents of the configuration file FLUXCOM.yml.
Try to locate the input data inside the CMORizer script and load it
+(we’ll use iris because ESMValTool includes helper
+utilities for iris cubes). Confirm that you’ve loaded the data by
+logging the correct path and (part of the) file content.
+
+
+
+
+
+
+
+
+
There are many ways to do it. In any case, you should have added the
+original filename to the configuration file (and un-commented this
+line): filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'. Note the
+*: this is a useful shorthand to find multiple files for
+different years. In a similar way we can also look for multiple
+variables, etc.
+
Here’s an example solution (inserted directly under the original
+comment):
+
+
PYTHON
+
+
# 1. find the input data
filename_pattern = cfg['filename']
matches = Path(in_dir).glob(filename_pattern)

for match in matches:
    input_file = str(match)
    logger.info("found: %s", input_file)
    cube = iris.load_cube(input_file)
    logger.info("content: %s", cube)
+
+
To make this work we’ve added import iris and
+from pathlib import Path at the top of the file. Note that
+we’ve started a loop, since we may find multiple files if there’s more
+than one year of data available.
+
+
+
+
+
+
+
2. Save the data with the correct filename
+
Before we start adding fixes, we’ll first make sure that our CMORizer
+can also write output files with the correct name. This will enable us
+to use the test recipe for the CMOR compatibility check.
+
We can use the save function from the utils
+that we imported at the top. The call signature looks like this:
utils.save_variable(cube, var, outdir, attrs, **kwargs).
+
We already have the cube and the outdir.
+The variable short name (var) and attributes
+(attrs) are set through the configuration file. So we need
+to find out what the correct short name and attributes are.
+
The standard attributes for CMIP variables are defined in the CMIP
+tables. These tables are differentiated according to the “MIP” they
+belong to. The tables are a copy of the PCMDI guidelines.
+
+
+
+
+
+
Find the variable “gpp” in a CMOR table
+
+
Check the available CMOR tables to find the variable "gpp" with the
following characteristics: it is a land variable with monthly temporal
resolution. In which CMOR table can it be found?
The variable "gpp" belongs to the land variables. The temporal
resolution that we are looking for is "monthly". This information points
to the "Lmon" CMIP table. And indeed, the variable "gpp" can be found in
the file CMIP6_Lmon.json
(https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/cmor/tables/cmip6/Tables/CMIP6_Lmon.json).
+
+
+
+
+
If the variable you are interested in is not available in the
+standard CMOR tables, you could write a custom CMOR table entry for the
+variable. This, however, is beyond the scope of this tutorial.
+
+
+
+
+
+
Fill the configuration file
+
+
Uncomment the following entries in your configuration file and fill
+them with appropriate values:
+
dataset_id
+
version
+
tier
+
modeling_realm
+
short_name (the ??? immediately under variables)
+
mip
+
+
+
+
+
+
+
+
+
The configuration file should now look something like this:
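(A sketch based on the dataset keys used in the test recipe; the exact file in the ESMValTool repository may order the entries differently or include extra attributes.)

YAML

---
filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'

attributes:
  project_id: OBS6
  dataset_id: FLUXCOM
  version: 'ANN-v1'
  tier: 3
  modeling_realm: reanaly

variables:
  gpp:
    mip: Lmon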
Now that we have set this information correctly in the config file,
+we can call the save function. Add the following python code to your
+CMORizer script:
+
+
PYTHON
+
+
# 3. store the data with the correct filename
attributes = cfg['attributes']
variables = cfg['variables']

for short_name, variable_info in variables.items():
    all_attributes = {**attributes, **variable_info}  # add the mip to the other attributes
    utils.save_variable(cube=cube, var=short_name, outdir=out_dir, attrs=all_attributes)
+
+
Since we only have one variable (gpp), the loop is not strictly
+necessary. However, this makes it possible to add more variables later
+on.
+
+
+
+
+
+
Was the CMORization successful so far?
+
+
If you run the CMORizer again, you should see that it creates an
+output file named
OBS6_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_xxxx01-yyyy12.nc
+stored in your ESMValTool output directory
+~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/Tier3/FLUXCOM/.
+The “xxxx” and “yyyy” represent the start and end year of the data.
+
+
+
+
Great! So we have produced a NetCDF file with the CMORizer that
+follows the naming convention for ESMValTool datasets. Let’s have a look
+at the NetCDF file as it was written with the very basic CMORizer from
above, for example with ncdump -h.
The file contains a variable named "GPP" with three
dimensions: "time", "lat", "lon". Notice the strange time units and the
invalid_units entry in the global attributes section. Also, it
seems that there is no information available about the lat and lon
coordinates. These are just some of the things we'll address in the next
section.
+
+
+
3. Implementing additional fixes
+
Copy the output of the CMORizer to your folder
+~/data/OBS6/Tier3/FLUXCOM/ and change the test recipe to
+look for OBS6 data instead of OBS (note: we’re upgrading the CMORizer to
+newer standards here!). Make sure the path to OBS6 is set
+correctly in our config-user file:
+
+
YAML
+
+
rootpath:
  OBS6: ~/data/OBS6
+
+
If we now run the test recipe on our newly ‘CMORized’ data,
+
+
BASH
+
+
esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
+
+
it should be able to find the correct file, but it does not succeed
+yet. The first thing that the ESMValTool CMOR checker brings up is:
+
+
ERROR
+
+
iris.exceptions.UnitConversionError: Cannot convert from unknown units. The
+"units" attribute may be set directly.
+
+
If you look closely at the error messages, you can see that this
+error concerns the units of the coordinates. ESMValTool tries to fix
+them automatically, but since no units are defined on the coordinates,
+this fails.
+
The cmorizer utilities also include a function called
+fix_coords, but before we can use it, we’ll also need to
+make sure the coordinates have the correct standard name. Add the
+following code to your cmorizer:
+
+
PYTHON
+
+
# 2. Apply the necessary fixes
# 2a. Fix/add coordinate information and metadata
cube.coord('lat').standard_name = 'latitude'
cube.coord('lon').standard_name = 'longitude'
utils.fix_coords(cube)
+
+
With some additional refactoring, our cmorization function might then
+look something like this:
+
+
PYTHON
+
+
def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
    """Cmorize the dataset."""
    # Get general information from the config file
    attributes = cfg['attributes']
    variables = cfg['variables']

    for short_name, variable_info in variables.items():
        logger.info("CMORizing variable: %s", short_name)

        # 1a. Find the input data (one file for each year)
        filename_pattern = cfg['filename']
        matches = Path(in_dir).glob(filename_pattern)

        for match in matches:
            # 1b. Load the input data
            input_file = str(match)
            logger.info("found: %s", input_file)
            cube = iris.load_cube(input_file)

            # 2. Apply the necessary fixes
            # 2a. Fix/add coordinate information and metadata
            cube.coord('lat').standard_name = 'latitude'
            cube.coord('lon').standard_name = 'longitude'
            utils.fix_coords(cube)

            # 3. Save the CMORized data
            all_attributes = {**attributes, **variable_info}
            utils.save_variable(cube=cube, var=short_name,
                                outdir=out_dir, attrs=all_attributes)
+
+
Run the CMORizer script once more. Have a look at the netCDF file,
+and confirm that the coordinates now have much more metadata added to
+them. Then, run the test recipe again with the latest CMORizer output.
+The next error is:
+
+
ERROR
+
+
esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
+Variable GPP units unknown can not be converted to kg m-2 s-1 in cube:
+
+
Okay, so let's fix the units of the "GPP" variable in the CMORizer.
Remember that you can find the correct units in the CMOR table: kg m-2 s-1,
whereas the raw data comes in gC m-2 day-1 (1 g = 0.001 kg and 1 day = 86400 s). Add the
following three lines to our CMORizer:
+
+
PYTHON
+
+
# 2b. Fix gpp units
logger.info("Changing units for gpp from gC/m2/day to kg/m2/s")
cube.data = cube.core_data() / (1000 * 86400)
cube.units = 'kg m-2 s-1'
+
+
If everything is okay, the test recipe should now pass. We’re getting
+there. Looking through the output though, there’s still a warning.
+
+
OUTPUT
+
+
WARNING There were warnings in variable GPP:
+Standard name for GPP changed from None to gross_primary_productivity_of_biomass_expressed_as_carbon
+Long name for GPP changed from GPP to Carbon Mass Flux out of Atmosphere Due to
+ Gross Primary Production on Land [kgC m-2 s-1]
+
+
ESMValTool is able to apply automatic fixes here, but if we are
+running a CMORizer script anyway, we might as well fix it
+immediately.
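A sketch of such a fix, assuming the fix_var_metadata helper from esmvaltool.cmorizers.data.utilities and the cmor_table object that ESMValTool adds to the configuration dictionary (check the utilities module for the exact names):

PYTHON

# 2c. Fix metadata using the CMOR table
cmor_info = cfg['cmor_table'].get_variable(variable_info['mip'], short_name)
utils.fix_var_metadata(cube, cmor_info)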
You can see that we're using the CMOR table here. It was passed on
by ESMValTool as part of the cfg input variable. So here
we're making sure that we're updating the cube's metadata to conform to
the CMOR table.
+
Finally, the test recipe should run without errors or warnings.
+
+
+
4. Finalizing the CMORizer
+
Once everything works as expected, there are a couple of things that we
can still do.
+
+Add download instructions. The header of the
+CMORizer contains information about where to obtain the data, when it
+was accessed the last time, which ESMValTool “tier” it is associated
+with, and more detailed information about the necessary downloading and
+processing steps.
+
+
+
+
+
+
Fill out the header for the “FLUXCOM” dataset
+
+
Fill out the header of the new CMORizer. The different parts that
+need to be present in the header are the following:
+
Caption: the first line of the docstring should summarize what the
+script does.
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
The header for the “FLUXCOM” dataset could look something like
+this:
+
+
PYTHON
+
+
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+Tier
+ Tier 3: restricted dataset.
+
+Source
+ http://www.bgc-jena.mpg.de/geodb/BGI/Home
+
+Last access
+ 20190727
+
+Download and processing instructions
+ From the website, select FLUXCOM as the data choice and click download.
+ Two files will be displayed. One for Land Carbon Fluxes and one for
+ Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+ CRUNCEP data file has several data files for different variables.
+ The data for GPP generated using the
+ Artificial Neural Network Method will be in files with name:
    GPP.ANN.CRUNCEPv6.monthly.*.nc
+ A registration is required for downloading the data.
+ Users in the UK with a CEDA-JASMIN account may request access to the jules
+ workspace and access the data.
+ Note : This data may require rechunking of the netcdf files.
+ This constraint will not exist once iris is updated to
+ version 2.3.0 Aug 2019
+"""
+
+
+
+
+
+
Fill the dataset information list. The file
datasets.yml
(https://github.com/ESMValGroup/ESMValTool/blob/main/esmvaltool/cmorizers/data/datasets.yml)
contains the ESMValTool "tier", the data
source, the last access time and download instructions for all supported
datasets in ESMValTool. You can simply reuse the information written in
the header of the CMORizer.
+
+
+
+
+
+
Fill out the FLUXCOM entry in
+datasets.yml
+
+
+
Fill out the FLUXCOM entry in datasets.yml. The
+different parts that need to be present in the entry are the
+following:
+
Dataset-name
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
The entry for the “FLUXCOM” dataset should look like:
+
+
YAML
+
+
FLUXCOM:
  tier: 3
  source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
  last_access: 2019-07-27
  info: |
    From the website, select FLUXCOM as the data choice and click download.
    Two files will be displayed. One for Land Carbon Fluxes and one for
    Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
    CRUNCEP data file has several data files for different variables.
    The data for GPP generated using the
    Artificial Neural Network Method will be in files with name:
    GPP.ANN.CRUNCEPv6.monthly.*.nc
    A registration is required for downloading the data.
    Users in the UK with a CEDA-JASMIN account may request access to the jules
    workspace and access the data.
    Note : This data may require rechunking of the netcdf files.
    This constraint will not exist once iris is updated to
    version 2.3.0 Aug 2019
+
+
+
+
+
+
Once the datasets.yml file is filled, you can check that
+ESMValTool can display information about the added dataset with:
+
+
BASH
+
+
esmvaltool data info FLUXCOM
+
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
$ esmvaltool data info FLUXCOM
+FLUXCOM
+
+Tier: 3
+Source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
+Automatic download: No
+
+From the website, select FLUXCOM as the data choice and click download.
+Two files will be displayed. One for Land Carbon Fluxes and one for
+Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+CRUNCEP data file has several data files for different variables.
+The data for GPP generated using the
+Artificial Neural Network Method will be in files with name:
+GPP.ANN.CRUNCEPv6.monthly.*.nc
+A registration is required for downloading the data.
+Users in the UK with a CEDA-JASMIN account may request access to the jules
+workspace and access the data.
+Note : This data may require rechunking of the netcdf files.
+This constraint will not exist once iris is updated to
+version 2.3.0 Aug 2019
+
+
Note that Automatic download: No means that no automatic
+downloading script is available in ESMValTool for this dataset. The
+implementation of such a script is beyond the scope of this tutorial. To
+find out which datasets come with an automatic download script, you can
+run: esmvaltool data list to list all datasets supported in
+ESMValTool. More information about the usage of automatic downloading
+scripts can be found in the User
+Guide.
+
+Complete the metadata in the config file. We have
+left a few fields empty in the configuration file, such as ‘source’. By
+filling out these fields we can make sure the relevant metadata is
+passed on as attributes in the CMORized data. To make this work, add the
+following line to the CMORizer script:
+
+
PYTHON
+
+
# 2d. Update the cubes metadata with all info from the config file
+utils.set_global_atts(cube, attributes)
+
+
Add a reference. Make sure that there is a
reference file available for the dataset; see the instructions at
https://docs.esmvaltool.org/en/latest/community/diagnostic.html#adding-references.
+
Make a pull request. Since you have gone through
+all the trouble to reformat the dataset so that the ESMValTool can work
+with it, it would be great if you could provide the CMORizer, and
+ultimately with that the dataset, to the rest of the community. For more
+information, see the episode on Development and
+contribution.
+
Add documentation. Make sure that you have added
+the info of your dataset to the User Guide so that people know it is
available for the ESMValTool (see the section Obtaining
input data).
+
+
Some final comments
+
Congratulations! You have just added support for a new dataset to
+ESMValTool! Adding a new CMORizer is definitely already an advanced task
when working with the ESMValTool. You need to have a basic understanding
of how the ESMValTool works and what its internal structure looks like.
+In addition, you need to have a basic understanding of NetCDF files and
+a programming language. In our example we used python for the CMORizing
+script since we advocate for focusing the code development on only a few
+different programming languages. This helps to maintain the code and to
+ensure the compatibility of the code with possible fundamental changes
+to the structure of the ESMValTool and ESMValCore.
+
More information about adding observations to the ESMValTool can be
+found in the documentation.
+
+
+
+
+
+
Key Points
+
+
CMORizers are dataset-specific scripts that can be run once to
+generate CMOR-compliant data.
+
ESMValTool comes with a set of CMORizers readily available, but you
+can also add your own.
+
+
diff --git a/instructor/10-debugging.html b/instructor/10-debugging.html
new file mode 100644
index 00000000..d5c70752
--- /dev/null
+++ b/instructor/10-debugging.html
@@ -0,0 +1,1035 @@
+
+ESMValTool Tutorial: Debugging
+
+
+
Every user encounters errors. Once you know why you get certain types
+of errors, they become much easier to fix. The good news is that
+ESMValTool creates a record of the output messages and stores them in
+log files. They can be used for debugging or monitoring the process.
+This lesson helps you understand the different types of errors and when
+you are likely to encounter them.
+
Log files
+
Each time we run ESMValTool, it will produce a new output directory.
+This directory should contain the run folder that is
automatically generated by ESMValTool. To examine this, we run the
recipe_python.yml recipe introduced in the lesson Running your first recipe. Check the lesson Configuration on how to set paths.
+
In a new terminal, run the recipe:
+
+
BASH
+
+
cd esmvaltool_tutorial
+esmvaltool run examples/recipe_python.yml
+
+
+
ERROR
+
+
esmvaltool: command not found
+
+
ESMValTool encounters this error because the conda environment
+esmvaltool has not been activated. To fix the error, before
+running the recipe, activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
+
+
+
+
+
conda environment
+
+
More information about the conda environment can be found at Installation.
In the main_log_debug.txt and main_log.txt,
ESMValTool writes the output messages, warnings and possible errors that
might happen during preprocessing. To inspect them, we can look inside
the files.
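For example, assuming the default output location used in this tutorial (adjust the path to match your own run):

BASH

cat ~/esmvaltool_output/recipe_python_<date>_<time>/run/main_log.txt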
In the log.txt, ESMValTool writes the output messages,
+warnings and possible errors that are related to the diagnostic
+script.
+
If you encounter an error and don’t know what it means, it is
+important to read the log information. Sometimes knowing where the error
+occurred is enough to fix it, even if you don’t entirely understand the
+message. However, note that you may not always be able to find the error
or fix it. In that case, the ESMValTool community can help you figure out what
went wrong.
+
+
+
+
+
+
Different log files
+
+
In the run directory, there are two log files
+main_log_debug.txt and main_log.txt. What are
+their differences?
+
+
+
+
+
+
+
+
+
The main_log_debug.txt contains the output messages from
+the pre-processor whereas the main_log.txt shows general
+errors and warnings that might happen in running the recipe and
+diagnostics script.
+
+
+
+
+
Let’s change some settings in the recipe to run a regional
+pre-processor. We use a text editor called nano to open the
+recipe file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
# recipe_python.yml
#
# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
# for a description of this recipe.
#
# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
# for a description of the recipe format.
---
documentation:
  description: |
    Example recipe that plots a map and timeseries of temperature.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe
    - righi_mattia

  maintainer:
    - schlund_manuel

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {dataset: bcc-csm1-1, version: v1, project: CMIP5, exp: historical, ensemble: r1i1p1}

preprocessors:
  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
  # for a description of the preprocessor functions.

  to_degrees_c:
    convert_units:
      units: degrees_C

  annual_mean_amsterdam:
    extract_location:
      location: Amsterdam
      scheme: linear
    annual_statistics:
      operator: mean
    multi_model_statistics:
      statistics:
        - mean
      span: overlap
    convert_units:
      units: degrees_C

  annual_mean_global:
    area_statistics:
      operator: mean
    annual_statistics:
      operator: mean
    convert_units:
      units: degrees_C

diagnostics:

  map:
    description: Global map of temperature in January 2000.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas:
        mip: Amon
        preprocessor: to_degrees_c
        timerange: 2000/P1M
        caption: |
          Global map of {long_name} in January 2000 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds

  timeseries:
    description: Annual mean temperature in Amsterdam and global mean since 1850.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas_amsterdam:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_amsterdam
        timerange: 1850/2000
        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
      tas_global:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_global
        timerange: 1850/2000
        caption: Annual global mean {long_name} according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot
+
+
+
+
+
Keys and values in recipe settings
+
The ESMValTool preprocessors
(https://docs.esmvaltool.org/projects/ESMValCore/en/latest/recipe/preprocessor.html)
cover a broad range of
operations on the input data, like time manipulation, area manipulation,
land-sea masking, variable derivation, etc. Let's add the preprocessor
extract_region to a new section
annual_mean_regional:
+
+
YAML
+
+
preprocessors:
  annual_mean_regional:
    annual_statistics:
      operator: mean
    extract_region:
      start_longitude: -10
      end_longitude: 40
      start_latitude: 27
      end_latitude: 70
+
+
Also, we change the projects value esmval
+to tutorial:
+
+
YAML
+
+
projects:
+- tutorial
+- c3s-magic
+
+
Then, we save the file and run the recipe:
+
+
BASH
+
+
esmvaltool run recipe_python.yml
+
+
+
ERROR
+
+
ValueError: Tag 'tutorial' does not exist in section 'projects' of
+esmvaltool/config-references.yml 2020-06-29 18:09:56,641 UTC [46055] INFO If you
+have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The values for the keys authors, maintainer,
projects and references in the recipe should
be known by ESMValTool: they have to be listed in the file
esmvaltool/config-references.yml, as mentioned in the error message above.
ESMValTool references in BibTeX format can be found in
the ESMValTool/esmvaltool/references directory
(https://github.com/ESMValGroup/ESMValTool/tree/master/esmvaltool/references).
+
+
ESMValTool can’t locate the
+data
+
You are assisting a colleague with ESMValTool. The colleague replaces
the dataset CanESM2 in the entry
dataset: CanESM2, project: CMIP5 with ACCESS1-3
and runs the recipe. However, ESMValTool encounters an error like:
+
+
ERROR
+
+
ERROR No input files found for variable {'short_name': 'tas', 'mip': 'Amon',
+'preprocessor':'annual_mean_amsterdam', 'variable_group': 'tas_amsterdam',
+'diagnostic':'timeseries', 'dataset': 'ACCESS1-3', 'project': 'CMIP5', 'exp':
+'historical','ensemble': 'r1i1p1', 'recipe_dataset_index': 1, 'institute':
+['CSIRO-BOM'],'product': ['output1', 'output2'], 'timerange': '1850/2000',
+'alias':'CMIP5', 'original_short_name': 'tas', 'standard_name':
+'air_temperature','long_name': 'Near-Surface Air Temperature', 'units': 'K',
+'modeling_realm':['atmos'], 'frequency': 'mon', 'start_year': 1850,
+'end_year': 2000}
+ERROR Looked for files matching
+['tas_Amon_ACCESS1-3_historical_r1i1p1*.nc'], but did not find any existing
+input directory
+ERROR Set 'log_level' to 'debug' to get more information
+
+
+
What suggestions would you give your colleague for fixing the
error?
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
Check config-user.yml to see if the correct directory
for input data is specified
+
Check the available data, regarding exp, mip, ensemble, start_year,
+and end_year
+
Check the variable names in the diagnostics section in the
+recipe
+
+
+
+
+
Check pre-processed data
+
The setting save_intermediary_cubes in the configuration
+file can be used to save the pre-processed data. More information about
+this setting can be found at Configuration.
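For example, in your config-user.yml (a sketch; combine with remove_preproc_dir: false if you also want to keep the final preprocessed files):

YAML

save_intermediary_cubes: true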
+
+
+
+
+
+
save_intermediary_cubes
+
+
Note that this setting should only be used for debugging, as it
+significantly slows down the recipe and increases disk usage because a
+lot of output files need to be stored.
+
+
+
+
Check diagnostic script path
+
The result of the pre-processor is passed to the
examples/diagnostic.py script, which is specified in the
recipe as:
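YAML

scripts:
  script1:
    script: examples/diagnostic.py

This matches the scripts sections of the recipe_python.yml shown earlier in this episode.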
The diagnostic scripts are located in the folder
+diag_scripts in the ESMValTool installation directory
+<path_to_esmvaltool>. To find where ESMValTool is
+located on your system, see Installation .
+
Let's see what happens if we change the script path, for instance to
diag_scripts/ocean/diagnostic_timeseries.py. The recipe then fails with an error like:

ERROR
esmvalcore._task.DiagnosticError: Cannot execute script
+'diag_scripts/ocean/diagnostic_timeseries.py'
+(~/mambaforge/envs/esmvaltool2.6/lib/python3.10/site-packages/esmvaltool/
+diag_scripts/diag_scripts/ocean/diagnostic_timeseries.py):
+file does not exist. 2022-10-18 11:42:34,136 UTC [39323] INFO If you have a
+question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The script path should be relative to the diag_scripts
directory. This means that the script
diagnostic_timeseries.py is located in
<path_to_esmvaltool>/diag_scripts/ocean/.
Alternatively, the script path can be an absolute path. To try this out,
you can download the script from the ESMValTool
repository and point to it with its full path.
Then, run the recipe again and examine the output to see
+Run was successful!
+
+
+
+
+
+
Available recipe and diagnostic scripts
+
+
ESMValTool provides a broad suite of recipes
+and diagnostic scripts for different disciplines like atmosphere,
+climate metrics, future projections, IPCC, land, ocean, ….
+
+
+
+
+
Re-running a diagnostic
+
Look at the main_log.txt file and answer the following
+question: How to re-run the diagnostic script?
+
+
Solution
+
The main_log.txt file contains information on how to
+re-run the diagnostic script without re-running the pre-processors:
+
+
+
+
BASH
+
+
2020-06-29 20:36:32,844 UTC [52810] INFO To re-run this diagnostic script, run:
+
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
If you run the command in a terminal, you will be able to re-run the
+diagnostic.
+
+
+
+
+
+
+
+
+
+
Memory issues
+
+
If you run out of memory, try setting max_parallel_tasks
+to 1 in the configuration file. Then, check the amount of memory you
+need for that by inspecting the file run/resource_usage.txt
in the output directory. Using that number, you can increase the
number of parallel tasks again to a reasonable value for the amount of
memory available on your system.
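For example, in your config-user.yml:

YAML

max_parallel_tasks: 1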
+
+
+
+
+
+
+
+
+
Key Points
+
+
There are three different kinds of log files:
main_log.txt, main_log_debug.txt, and
log.txt.
+
+
diff --git a/instructor/404.html b/instructor/404.html
new file mode 100644
index 00000000..2cc43829
--- /dev/null
+++ b/instructor/404.html
@@ -0,0 +1,453 @@
+
+ESMValTool Tutorial: Page not found
+
+
diff --git a/instructor/CODE_OF_CONDUCT.html b/instructor/CODE_OF_CONDUCT.html
new file mode 100644
index 00000000..bcd219a6
--- /dev/null
+++ b/instructor/CODE_OF_CONDUCT.html
@@ -0,0 +1,467 @@
+
+ESMValTool Tutorial: Contributor Code of Conduct
+
+
+
+
+
diff --git a/instructor/LICENSE.html b/instructor/LICENSE.html
new file mode 100644
index 00000000..4131d474
--- /dev/null
+++ b/instructor/LICENSE.html
@@ -0,0 +1,515 @@
+
+ESMValTool Tutorial: Licenses
+
+
+
You are free:

to Share—copy and redistribute the material in any
+medium or format
+
to Adapt—remix, transform, and build upon the
+material
+
for any purpose, even commercially.
+
The licensor cannot revoke these freedoms as long as you follow the
+license terms.
+
Under the following terms:
+
Attribution—You must give appropriate credit
+(mentioning that your work is derived from work that is Copyright (c)
+The Carpentries and, where practical, linking to https://carpentries.org/), provide a link to the
+license, and indicate if changes were made. You may do so in any
+reasonable manner, but not in any way that suggests the licensor
+endorses you or your use.
+
No additional restrictions—You may not apply
+legal terms or technological measures that legally restrict others from
+doing anything the license permits. With the understanding
+that:
+
Notices:
+
You do not have to comply with the license for elements of the
+material in the public domain or where your use is permitted by an
+applicable exception or limitation.
+
No warranties are given. The license may not give you all of the
+permissions necessary for your intended use. For example, other rights
+such as publicity, privacy, or moral rights may limit how you use the
+material.
+
Software
+
Except where otherwise noted, the example programs and other software
+provided by The Carpentries are made available under the OSI-approved MIT
+license.
+
Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+“Software”), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
Trademark
+
“The Carpentries”, “Software Carpentry”, “Data Carpentry”, and
+“Library Carpentry” and their respective logos are registered trademarks
+of Community Initiatives.
+
+
diff --git a/instructor/aio.html b/instructor/aio.html
new file mode 100644
index 00000000..80970e5c
--- /dev/null
+++ b/instructor/aio.html
@@ -0,0 +1,6337 @@
+
+
+
+
+
+ESMValTool Tutorial: All in One View
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
How do I load and check the ESMValTool environment?
+
How do I configure ESMValTool?
+
How do I run a recipe?
+
+
+
+
+
+
+
+
Objectives
+
+
Understand the purpose of the quickstart guide
+
Load and check the ESMValTool environment
+
Configure ESMValTool
+
Run a recipe
+
+
+
+
+
+
+
+
+
+
+
+
What is the purpose of the quickstart guide?
+
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible by making the bare
+minimum number of changes.
+
+
+
+
+
+
+
+
+
+
How do I load and check the ESMValTool environment?
+
+
+
For this quickstart guide, an assumption is made that ESMValTool
has already been installed at the site where ESMValTool will be run. If
this is not the case, see the Installation
episode in this tutorial.

Load the ESMValTool environment by following the instructions in the
documentation section "ESMValTool: Pre-installed versions on HPC clusters / other
servers".
+
+
Check the ESMValTool environment by accessing the help for
+ESMValTool:
+
+
BASH
+
+
esmvaltool --help
+
+
+
+
+
+
+
+
+
+
+
+
How do I configure ESMValTool?
+
+
+
+
Create the ESMValTool user configuration file (the file is
+written by default to ~/.esmvaltool/config-user.yml):
+
+
BASH
+
+
esmvaltool config get_config_user
+
+
+
Edit the ESMValTool user configuration file using your favourite
+text editor to uncomment the lines relating to the site where ESMValTool
+will be run.
+
For more details about the ESMValTool user configuration file see
the Configuration episode in this
tutorial.
+
+
+
+
+
+
+
+
+
+
How do I run a recipe?
+
+
+
+
Run the example Python recipe:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
+
+
Wait for the recipe to complete. If the recipe completes
+successfully, the last line printed to screen at the end of the log will
+look something like:
+
+
BASH
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Run was successful
+
+
+
+
View the output of the recipe by opening the HTML file produced
+by ESMValTool (the location of this file is printed to screen near the
+end of the log):
+
+
BASH
+
+
YYYY-MM-DD HH:mm:SS, NNN UTC [NNNNN] INFO Wrote recipe output to:
+file:///$HOME/esmvaltool_output/recipe_python_<date>_<time>/index.html
+
+
+
For more details about running recipes see the Running your
first recipe episode in this tutorial.
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible without having to go
+through the whole tutorial
+
Use the module load command to load the ESMValTool
environment (see the Installation episode for more
details) and use esmvaltool --help to check the ESMValTool
environment
+
Use esmvaltool config get_config_user to create the
+ESMValTool user configuration file
What are the prerequisites for installing ESMValTool?
+
How do I confirm that the installation was successful?
+
+
+
+
+
+
+
+
Objectives
+
+
Install ESMValTool
+
Demonstrate that the installation was successful
+
+
+
+
+
+
+
Overview
+
+
+
The instructions help with the installation of ESMValTool on
+operating systems like Linux/MacOSX/Windows. We use the Mamba
+package manager to install the ESMValTool. Other installation methods
+are also available; they can be found in the documentation.
+We will first install Mamba, and then ESMValTool. We end this chapter by
+testing that the installation was successful.
+
Before we begin, here are all the possible ways in which you can use
+ESMValTool depending on your level of expertise or involvement with
+ESMValTool and associated software such as GitHub and Mamba.
+
+
1. If you have access to a server where ESMValTool is already installed
as a module, e.g. the CEDA JASMIN server
(https://help.jasmin.ac.uk/article/4955-community-software-esmvaltool), you can simply load the
module with the following command:
+
+
+
BASH
+
+
module load esmvaltool
+
+
After loading esmvaltool, we can start using ESMValTool
right away. Please see the next lesson.
2. If you would like to install ESMValTool as a mamba
package, then this lesson will tell you how!
3. If you would like to
start experimenting with existing diagnostics or contributing to
ESMValTool, please see the instructions for source installation in the
lesson Development and
contribution and in the documentation.
+
+
+
+
+
+
Install ESMValTool on Windows
+
+
ESMValTool does not directly support Windows, but successful usage
+has been reported through the Windows Subsystem
+for Linux(WSL), available in Windows 10. To install the WSL please
+follow the instructions on the
+Windows Documentation page. After installing the WSL, installation
+can be done using the same instructions for Linux/MacOSX.
+
+
+
+
Install ESMValTool on Linux/MacOSX
+
+
+
+
Install Mamba
+
+
ESMValTool is distributed using Mamba. To
+install mamba on Linux or MacOSX, follow the
+instructions below:
+
+
Please download the installation file for the latest Mamba
+version here.
+
Next, run the installer from the place where you downloaded
+it:
+
+
On Linux:
+
+
BASH
+
+
bash Mambaforge-Linux-x86_64.sh
+
+
On MacOSX:
+
+
BASH
+
+
bash Mambaforge-MacOSX-x86_64.sh
+
+
+
Follow the instructions in the installer. The defaults should
+normally suffice.
+
You will need to restart your terminal for the changes to have
+effect.
+
We recommend updating mamba before the esmvaltool installation.
+To do so, run:
+
+
+
BASH
+
+
mamba update --name base mamba
+
+
+
Verify you have a working mamba installation by:
+
+
+
BASH
+
+
which mamba
+
+
This should show the path to your mamba executable,
+e.g. ~/mambaforge/bin/mamba.
+
For more information about installing mamba, see the mamba
installation documentation at
https://docs.esmvaltool.org/en/latest/quickstart/installation.html#mamba-installation.
+
+
+
Install the ESMValTool package
+
+
The ESMValTool package contains diagnostics scripts in four
+languages: R, Python, Julia and NCL. This introduces a lot of
+dependencies, and therefore the installation can take quite long. It is,
+however, possible to install ‘subpackages’ for each of the languages.
+The following (sub)packages are available:
+
+
esmvaltool-python
+
esmvaltool-ncl
+
esmvaltool-r
+
esmvaltool: the complete package, i.e. the
combination of the above.
+
+
For the tutorial, we will install the complete package. Thus, to
+install the ESMValTool package, run
+
+
BASH
+
+
mamba create --name esmvaltool esmvaltool
+
+
On MacOSX ESMValTool functionalities in Julia, NCL, and R are not
supported. To install a Mamba environment on MacOSX, please refer to
the specific information in the documentation.
Some ESMValTool diagnostics are written in the Julia programming
+language. If you want a full installation of ESMValTool including Julia
+diagnostics, you need to make sure Julia is installed before installing
+ESMValTool.
+
In this tutorial, we will not use Julia, but for reference, we have
+listed the steps to install Julia below. Complete instructions for
+installing Julia can be found on the Julia
+installation page.
+
+
+
+
+
+
First, open a bash terminal and activate the newly created
+esmvaltool environment.
+
+
BASH
+
+
conda activate esmvaltool
+
+
Next, to install Julia via mamba, you can use the
+following command:
+
+
BASH
+
+
mamba install julia
+
+
To check that the Julia executable can be found, run
+
+
BASH
+
+
which julia
+
+
to display the path to the Julia executable, it should be
+
+
OUTPUT
+
+
~/mambaforge/envs/esmvaltool/bin/julia
+
+
To test that Julia is installed correctly, run
+
+
BASH
+
+
julia
+
+
to start the interactive Julia interpreter. Press Ctrl+D
+to exit.
+
+
+
+
+
+
+
Test that the installation was successful
+
+
To test that the installation was successful, run
+
+
BASH
+
+
conda activate esmvaltool
+
+
to activate the conda environment called esmvaltool. In
+the shell prompt the active conda environment should have been changed
+from (base) to (esmvaltool).
+
Next, run
+
+
BASH
+
+
esmvaltool --help
+
+
to display the command line help.
+
+
+
+
+
+
Version of ESMValTool
+
+
Can you figure out which version of ESMValTool has been
+installed?
+
+
+
+
+
+
+
+
+
The esmvaltool --help command lists version
+as a command to get the version
+
When you run
+
+
BASH
+
+
esmvaltool version
+
+
The version of ESMValTool installed should be displayed on the screen
+as:
+
+
OUTPUT
+
+
ESMValCore: 2.11.0
+ESMValTool: 2.11.0
+
+
Note that on HPC servers such as JASMIN, sometimes a more recent
development version may be displayed for ESMValTool, e.g.
+ESMValTool: 2.11.0.dev71+g2c60b4d97
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
All the required packages can be installed using mamba.
+
You can find more information about installation in the documentation.
What is the user configuration file and how should I use it?
+
+
+
+
+
+
+
+
Objectives
+
+
Understand the contents of the config-user.yml file

Prepare a personalized config-user.yml file
+
Configure ESMValTool to use some settings
+
+
+
+
+
+
+
The configuration file
+
+
+
For the purposes of this tutorial, we will create a directory in our
+home directory called esmvaltool_tutorial and use that as
+our working directory. The following steps should do that:
+
+
BASH
+
+
mkdir esmvaltool_tutorial
+cd esmvaltool_tutorial
+
+
The config-user.yml configuration file contains all the
+global level information needed by ESMValTool to run. This is a YAML file.
+
You can get the default configuration file by running:
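For example (the get_config_user command was introduced in the quickstart; the optional --path argument controls where the file is written):

BASH

esmvaltool config get_config_user --path=<target directory>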
The default configuration file will be downloaded to the directory
specified with the --path option. For instance, you can
provide the path to your working directory as the
target directory. If this option is not used, the file will be
saved to the default location:
+~/.esmvaltool/config-user.yml, where ~ is the
path to your home directory. Note that files and directories starting
with a period are "hidden"; to see the .esmvaltool
directory in the terminal, use ls -la ~. Note that if a
+configuration file by that name already exists in the default location,
+the get_config_user command will not update the file as
+ESMValTool will not overwrite the file. You will have to move the file
+first if you want an updated copy of the user configuration file.
+
We run a text editor called nano to have a look inside
+the configuration file and then modify it if needed:
+
+
BASH
+
+
nano ~/.esmvaltool/config-user.yml
+
+
Any other editor can be used, e.g. vim.
+
This file contains the information for:
+
+
Output settings
+
Destination directory
+
Auxiliary data directory
+
Number of tasks that can be run in parallel
+
Rootpath to input data
+
Directory structure for the data from different projects
+
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
Output settings
+
+
+
The configuration file starts with output settings that inform
+ESMValTool about your preference for output. You can turn on or off the
+setting by true or false values. Most of these
+settings are fairly self-explanatory.
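As an illustration, the output-related entries in config-user.yml look roughly like the excerpt below (option names follow the file's own comments; defaults can differ between versions):


YAML


output_file_type: png           # file format for plots
log_level: info                 # verbosity of the log messages
remove_preproc_dir: true        # delete the preproc/ directory after the run
save_intermediary_cubes: false  # also save data after each preprocessor step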
+
+
+
+
+
+
Saving preprocessed data
+
+
Later in this tutorial, we will want to look at the contents of the
+preproc folder. This folder contains preprocessed data and
+is removed by default when ESMValTool is run. In the configuration file,
+which settings can be modified to prevent this from happening?
+
+
+
+
+
+
+
+
+
If the option remove_preproc_dir is set to
+false, then the preproc/ directory contains
+all the pre-processed data and the metadata interface files. If the
+option save_intermediary_cubes is set to true
+then data will also be saved after each preprocessor step in the folder
+preproc. Note that saving all intermediate results to file
+will result in a considerable slowdown, and can quickly fill your
+disk.
+
+
+
+
+
Destination directory
+
+
+
The destination directory is the rootpath where ESMValTool will store
+its output folders containing e.g. figures, data, logs, etc. With every
+run, ESMValTool automatically generates a new output folder determined
+by recipe name, and date and time using the format: YYYYMMDD_HHMMSS.
+
+
+
+
+
+
Set the destination directory
+
+
Let’s name our destination directory esmvaltool_output
+in the working directory. ESMValTool should write the output to this
+path, so make sure you have the disk space to write output to this
+directory. How do we set this in the config-user.yml?
+
+
+
+
+
+
+
+
+
We use output_dir entry in the
+config-user.yml file as:
+
+
YAML
+
+
output_dir: ./esmvaltool_output
+
+
If the esmvaltool_output does not exist, ESMValTool will
+generate it for you.
+
+
+
+
+
Rootpath to input data
+
+
+
ESMValTool uses several categories (in ESMValTool, this is referred
+to as projects) for input data based on their source. The current
+categories in the configuration file are mentioned below. For example,
+CMIP is used for a dataset from the Climate Model Intercomparison
+Project whereas OBS may be used for an observational dataset. More
+information about the projects used in ESMValTool is available in the
[documentation](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html). When using ESMValTool on your own machine,
+you can create a directory to download climate model data or observation
+data sets and let the tool use data from there. It is also possible to
+ask ESMValTool to download climate model data as needed. This can be
+done by specifying a download directory and by setting the option to
+download data as shown below.
+
+
YAML
+
+
# Directory for storing downloaded climate data
+download_dir: ~/climate_data
+search_esgf: always
+
+
If you are working offline or do not want to download the data then
+set the option above to never. If you want to download data
+only when the necessary files are missing at the usual location, you can
+set the option to when_missing.
+
The rootpath specifies the directories where ESMValTool
+will look for input data. For each category, you can define either one
+path or several paths as a list. For example:
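The snippet below is an illustrative sketch (the paths are placeholders); note how the OBS entry gives several paths as a list:


YAML


rootpath:
  CMIP5: ~/climate_data
  CMIP6: ~/climate_data
  OBS:
    - ~/obs_data
    - /shared/obs_data
  default: ~/climate_data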
These are typically available in the default configuration file you
+downloaded, so simply removing the machine specific lines should be
+sufficient to access input data.
+
+
+
+
+
+
Set the correct rootpath
+
+
In this tutorial, we will work with data from CMIP5 and CMIP6. How can we
+modify the rootpath to make sure the data path is set
+correctly for both CMIP5 and CMIP6? Note: to get the data, check the
+instructions in Setup.
+
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to add the root
+path of the folder where the data is available to the
+config-user.yml file as:
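For example, if the data is stored under ~/climate_data (a sketch; use the actual folder on your machine):


YAML


rootpath:
  CMIP5: ~/climate_data
  CMIP6: ~/climate_data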
Are you working on your local machine and have downloaded data using
+ESMValTool? You need to add the root path of the folder where the data
+has been downloaded to as specified in the
+download_dir.
Are you working on a computer cluster like Jasmin or DKRZ?
+Site-specific path to the data for JASMIN/DKRZ/ETH/IPSL are already
+listed at the end of the config-user.yml file. You need to
+uncomment the related lines. For example, on JASMIN:
Directory structure for the data from different projects
+
+
+
Input data can come from various models, observations and reanalysis
products that adhere to the CF/CMOR
standard. The drs setting describes the directory
structure of these input files.

It is available for several
projects (e.g. CMIP6, CMIP5, obs4mips, OBS6, OBS) on several key
machines (e.g. BADC, CP4CDS, DKRZ, ETHZ, SMHI, BSC). For more
+information about drs, you can visit the ESMValTool
documentation on [Data Reference Syntax (DRS)](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#cmor-drs).
+
+
+
+
+
+
Set the correct drs
+
+
In this lesson, we will work with data from CMIP5 and CMIP6. How can we
+set the correct drs?
+
+
+
+
+
+
+
+
+
+
Are you working on your own local machine? You need to set the
+drs of the data in the config-user.yml file
+as:
+
+
+
YAML
+
+
drs:
  CMIP5: default
  CMIP6: default
+
+
+
Are you asking ESMValTool to download the data for use with your
+diagnostics? You need to set the drs of the data in the
+config-user.yml file as:
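Data downloaded by ESMValTool is stored in the ESGF directory structure under the download_dir, so a setting along the following lines should work (check the comments in your config-user.yml):


YAML


drs:
  CMIP5: ESGF
  CMIP6: ESGF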
Are you working on a computer cluster like Jasmin or DKRZ?
+Site-specific drs of the data are already listed at the end
+of the config-user.yml file. You need to uncomment the
+related lines. For example, on Jasmin:
+
+
+
YAML
+
+
# Site-specific entries: Jasmin
# Uncomment the lines below to locate data on JASMIN
drs:
  CMIP6: BADC
  CMIP5: BADC
  OBS: default
  OBS6: default
  obs4mips: default
  ana4mips: default
+
+
+
+
+
+
+
+
+
+
+
Explain the default drs (if working on local machine)
+
+
+
In the previous exercise, we set the drs of CMIP5 data
+to default. Can you explain why?
+
Have a look at the directory structure of the OBS data.
+There is a folder called Tier1. What does it mean?
+
+
+
+
+
+
+
+
+
+
+
drs: default is one way to retrieve data from a ROOT
+directory that has no DRS-like structure. default indicates
+that all the files are in a folder without any structure.
+
Observational data are organized in Tiers depending on their
+level of public availability. Therefore the default directory must be
+structured accordingly with sub-directories TierX
+e.g. Tier1, Tier2 or Tier3, even when drs: default. More
details can be found in the [documentation](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/find_data.html#observational-data).
+
+
+
+
+
+
Other settings
+
+
+
+
+
+
+
+
Auxiliary data directory
+
+
The auxiliary_data_dir setting is the path where any
required additional auxiliary data files are stored. It tells the
diagnostic script where to find files that cannot be downloaded at
runtime. This option should not be used for model or observational
datasets, but for auxiliary data used in plotting, such as shapefiles
or coastline descriptions, or any other additional data you want to
feed to your recipe.


Number of parallel tasks


This option enables parallel processing. You can set the number of
parallel tasks to 1/2/3/4/… or to
null, which tells ESMValTool to use the maximum number of
available CPUs. For the purpose of this tutorial, please set ESMValTool
to use only 1 CPU:
+
+
YAML
+
+
max_parallel_tasks: 1
+
+
In general, if you run out of memory, try setting
+max_parallel_tasks to 1. Then, check the amount of memory
+you need for that by inspecting the file
+run/resource_usage.txt in the output directory. Using the
+number there you can increase the number of parallel tasks again to a
+reasonable number for the amount of memory available in your system.
+
+
+
+
+
+
+
+
+
Make your own configuration file
+
+
It is possible to have several configuration files with different
+purposes, for example: config-user_formalised_runs.yml,
+config-user_debugging.yml. In this case, you have to pass the path of
your own configuration file as a command-line option when running
ESMValTool. We will learn how to do this in the next lesson.
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
The config-user.yml tells ESMValTool where to find
+input data.
This episode describes how ESMValTool recipes work, how to run a
+recipe and how to explore the recipe output. By the end of this episode,
+you should be able to run your first recipe, look at the recipe output,
+and make small modifications.
+
Running an existing recipe
+
+
+
The recipe format has briefly been introduced in the
+[Introduction][lesson-introduction] episode. To see all the recipes that
+are shipped with ESMValTool, type
+
+
BASH
+
+
esmvaltool recipes list
+
+
We will start by running [examples/recipe_python.yml](https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html)
+
esmvaltool run examples/recipe_python.yml
+
or if you have the user configuration file in your current directory
+then
+
esmvaltool run --config_file ./config-user.yml examples/recipe_python.yml
+
If everything is okay, you should see ESMValTool printing a lot of
output to the command line. The exact output varies depending on your
machine, but the final message should be “Run was successful”.
+
+
+
+
+
+
Pro tip: ESMValTool search paths
+
+
You might wonder how ESMValTool was able find the recipe file, even
+though it’s not in your working directory. All the recipe paths printed
+from esmvaltool recipes list are relative to ESMValTool’s
+installation location. This is where ESMValTool will look if it cannot
+find the file by following the path from your working directory.
+
+
+
+
Investigating the log messages
+
+
+
Let’s dissect what’s happening here.
+
+
+
+
+
+
Output files and directories
+
+
After the banner and general information, the output starts with some
+important locations.
+
+
Did ESMValTool use the right config file?
+
What is the path to the example recipe?
+
What is the main output folder generated by ESMValTool?
+
Can you guess what the different output directories are for?
+
ESMValTool creates two log files. What is the difference?
+
+
+
+
+
+
+
+
+
+
+
The config file should be the one we edited in the previous episode,
+something like
+/home/<username>/.esmvaltool/config-user.yml or
+~/esmvaltool_tutorial/config-user.yml.
+
ESMValTool found the recipe in its installation directory, something
+like
+/home/users/username/mambaforge/envs/esmvaltool/bin/esmvaltool/recipes/examples/
+or if you are using a pre-installed module on a server, something like
/apps/jasmin/community/esmvaltool/ESMValTool_<version>/esmvaltool/recipes/examples/recipe_python.yml,
+where <version> is the latest release.
+
ESMValTool creates a time-stamped output directory for every run. In
+this case, it should be something like
+recipe_python_YYYYMMDD_HHMMSS. This folder is made inside
+the output directory specified in the previous episode:
+~/esmvaltool_tutorial/esmvaltool_output.
+
There should be four output folders:
+
+
+
+plots/: this is where output figures are stored.
+
+preproc/: this is where pre-processed data are
+stored.
+
+run/: this is where esmvaltool stores general
+information about the run, such as log messages and a copy of the recipe
+file.
+
+work/: this is where output files (not figures) are
+stored.
+
+
+
The log files are:
+
+
+
+main_log.txt is a copy of the command-line output
+
+main_log_debug.txt contains more detailed information
+that may be useful for debugging.
+
+
+
+
+
+
+
+
+
+
+
Debugging: No ‘preproc’ directory?
+
+
If you’re missing the preproc directory, then your
+config-user.yml file has the value
+remove_preproc_dir set to true (this is used
+to save disk space). Please set this value to false and run
+the recipe again.
+
+
+
+
After the output locations, there are two main sections that can be
+distinguished in the log messages:
+
+
Creating tasks
+
Executing tasks
+
+
+
+
+
+
+
Analyse the tasks
+
+
List all the tasks that ESMValTool is executing for this recipe. Can
+you guess what this recipe does?
+
+
+
+
+
+
+
+
+
Just after all the ‘creating tasks’ and before ‘executing tasks’, we
+find the following line in the output:
+
[134535] INFO These tasks will be executed: map/tas, timeseries/tas_global,
+timeseries/script1, map/script1, timeseries/tas_amsterdam
+
So there are three tasks related to timeseries: global temperature,
+Amsterdam temperature, and a script (tas: near-surface air
+temperature). And then there are two tasks related to a map: something
+with temperature, and again a script.
+
+
+
+
+
Examining the recipe file
+
+
+
To get more insight into what is happening, we will have a look at
+the recipe file itself. Use the following command to copy the recipe to
+your working directory
+
+
BASH
+
+
esmvaltool recipes get examples/recipe_python.yml
+
+
Now you should see the recipe file in your working directory (type
+ls to verify). Use the nano editor to open
+this file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
For reference, you can also view the recipe by unfolding the box
+below.
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
+# recipe_python.yml
+#
+# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
+# for a description of this recipe.
+#
+# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
+# for a description of the recipe format.
+---
documentation:
  description: |
    Example recipe that plots a map and timeseries of temperature.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe
    - righi_mattia

  maintainer:
    - schlund_manuel

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {dataset: bcc-csm1-1, project: CMIP5, exp: historical, ensemble: r1i1p1}

preprocessors:
  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
  # for a description of the preprocessor functions.

  to_degrees_c:
    convert_units:
      units: degrees_C

  annual_mean_amsterdam:
    extract_location:
      location: Amsterdam
      scheme: linear
    annual_statistics:
      operator: mean
    multi_model_statistics:
      statistics:
        - mean
      span: overlap
    convert_units:
      units: degrees_C

  annual_mean_global:
    area_statistics:
      operator: mean
    annual_statistics:
      operator: mean
    convert_units:
      units: degrees_C

diagnostics:

  map:
    description: Global map of temperature in January 2000.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas:
        mip: Amon
        preprocessor: to_degrees_c
        timerange: 2000/P1M
        caption: |
          Global map of {long_name} in January 2000 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds

  timeseries:
    description: Annual mean temperature in Amsterdam and global mean since 1850.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas_amsterdam:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_amsterdam
        timerange: 1850/2000
        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
      tas_global:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_global
        timerange: 1850/2000
        caption: Annual global mean {long_name} according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot
+
+
+
+
+
+
Do you recognize the basic recipe structure that was introduced in
+episode 1?
+
+
+Documentation with relevant (citation)
+information
+
+Datasets that should be analysed
+
+Preprocessors groups of common preprocessing
+steps
+
+Diagnostics scripts performing more specific
+evaluation steps
+
+
+
+
+
+
+
Analyse the recipe
+
+
Try to answer the following questions:
+
+
Who wrote this recipe?
+
Who should be approached if there is a problem with this
+recipe?
+
How many datasets are analyzed?
+
What does the preprocessor called annual_mean_global
+do?
+
Which script is applied for the diagnostic called
+map?
+
Can you link specific lines in the recipe to the tasks that we saw
+before?
+
How is the location of the city specified?
+
How is the temporal range of the data specified?
+
+
+
+
+
+
+
+
+
+
+
The example recipe is written by Bouwe Andela and Mattia
+Righi.
+
Manuel Schlund is listed as the maintainer of this
+recipe.
+
Two datasets are analysed:
+
+
+
CMIP6 data from the model BCC-ESM1
+
CMIP5 data from the model bcc-csm1-1
+
+
+
The preprocessor annual_mean_global computes an area
+mean as well as annual means
+
The diagnostic called map executes a script referred
+to as script1. This is a python script named
+examples/diagnostic.py
+
There are two diagnostics: map and
+timeseries. Under the diagnostic map we find
+two tasks:
+
+
+
a preprocessor task called tas, applying the
+preprocessor called to_degrees_c to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(map/tas).
+
+
Under the diagnostic timeseries we find three tasks:
+
+
a preprocessor task called tas_amsterdam, applying the
+preprocessor called annual_mean_amsterdam to the variable
+tas.
+
a preprocessor task called tas_global, applying the
+preprocessor called annual_mean_global to the variable
+tas.
+
a diagnostic task called script1, applying the script
+examples/diagnostic.py to the preprocessed data
+(timeseries/tas_global and
+timeseries/tas_amsterdam).
+
+
+
The extract_location preprocessor is used to get
+data for a specific location here. ESMValTool interpolates to the
+location based on the chosen scheme. Can you tell the scheme used here?
+For more ways to extract areas, see the [Area
+operations][preproc-area-manipulation] page.
+
The timerange tag is used to extract data from a
specific time period. Here the start time is 01/01/2000 and the
span of time to calculate means is one month, given by
P1M. For more options on how to specify time ranges, see
+the [timerange documentation][timeranges].
+
+
+
+
+
+
+
+
+
+
+
Pro tip: short names and variable groups
+
+
The preprocessor tasks in ESMValTool are called ‘variable groups’.
+For the diagnostic timeseries, we have two variable groups:
+tas_amsterdam and tas_global. Both of them
+operate on the variable tas (as indicated by the
+short_name), but they apply different preprocessors. For
+the diagnostic map the variable group itself is named
+tas, and you’ll notice that we do not explicitly provide
+the short_name. This is a shorthand built into
+ESMValTool.
+
+
+
+
+
+
+
+
+
Output files
+
+
Have another look at the output directory created by the ESMValTool
+run.
+
Which files/folders are created by each task?
+
+
+
+
+
+
+
+
+
+
+map/tas: creates /preproc/map/tas,
+which contains preprocessed data for each of the input datasets, a file
+called metadata.yml describing the contents of these
+datasets and provenance information in the form of .xml
+files.
+
+timeseries/tas_global: creates
+/preproc/timeseries/tas_global, which contains preprocessed
+data for each of the input datasets, a metadata.yml file
+and provenance information in the form of .xml files.
+
+timeseries/tas_amsterdam: creates
+/preproc/timeseries/tas_amsterdam, which contains
+preprocessed data for each of the input datasets, plus a combined
+MultiModelMean, a metadata.yml file and
+provenance files.
+
+map/script1: creates /run/map/script1
+with general information and a log of the diagnostic script run. It also
+creates /plots/map/script1/ and
+/work/map/script1, which contain output figures and output
+datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+timeseries/script1: creates
+/run/timeseries/script1 with general information and a log
+of the diagnostic script run. It also creates
+/plots/timeseries/script1 and
+/work/timeseries/script1, which contain output figures and
+output datasets, respectively. For each output file, there is also
+corresponding provenance information in the form of .xml,
+.bibtex and .txt files.
+
+
+
+
+
+
+
+
+
+
+
Pro tip: diagnostic logs
+
+
When you run ESMValTool, log messages from the diagnostic script are
not printed to the terminal. Instead, they are written to
log.txt in the folder
run/<diag_name>/.

ESMValTool does print a command that can be used to re-run a
diagnostic script directly. When you use that command, the diagnostic
output will be printed to the command line.
+
+
+
+
Modifying the example recipe
+
+
+
Let’s make a small modification to the example recipe. Notice that
+now that you have copied and edited the recipe, you can use
+
esmvaltool run recipe_python.yml
+
to refer to your local file rather than the default version shipped
+with ESMValTool.
+
+
+
+
+
+
Change your location
+
+
Modify and run the recipe to analyse the temperature for your own
+location.
+
+
+
+
+
+
+
+
+
In principle, you only have to modify the location in the
+preprocessor called annual_mean_amsterdam. However, it is
+good practice to also replace all instances of amsterdam
+with the correct name of your location. Otherwise the log messages and
+output will be confusing. You are free to modify the names of
+preprocessors or diagnostics.
+
In the diff file below you will see the changes we have
+made to the file. The top 2 lines are the filenames and the lines like
+@@ -39,9 +39,9 @@ represent the line numbers in the
+original and modified file, respectively. For more info on this format,
+see here.
+
+
DIFF
+
+
--- recipe_python.yml
++++ recipe_python_london.yml
+@@ -39,9 +39,9 @@
+ convert_units:
+ units: degrees_C
+
+- annual_mean_amsterdam:
++ annual_mean_london:
+ extract_location:
+- location: Amsterdam
++ location: London
+ scheme: linear
+ annual_statistics:
+ operator: mean
+@@ -83,7 +83,7 @@
+ cmap: Reds
+
+ timeseries:
+- description: Annual mean temperature in Amsterdam and global mean since 1850.
++ description: Annual mean temperature in London and global mean since 1850.
+ themes:
+ - phys
+ realms:
+@@ -92,9 +92,9 @@
+ tas_amsterdam:
+ short_name: tas
+ mip: Amon
+- preprocessor: annual_mean_amsterdam
++ preprocessor: annual_mean_london
+ timerange: 1850/2000
+- caption: Annual mean {long_name} in Amsterdam according to {dataset}.
++ caption: Annual mean {long_name} in London according to {dataset}.
+ tas_global:
+ short_name: tas
+ mip: Amon
+
+
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
ESMValTool recipes work ‘out of the box’ (if input data is
+available)
+
There are strong links between the recipe, log file, and output
+folders
+
Recipes can easily be modified to re-use existing code for your own
+use case
There are lots of resources available to assist you in using
+ESMValTool.
+
The ESMValTool Discussions
+page is a good place to find information on general issues, or check
+if your question has already been addressed. If you have a GitHub
+account, you can also post your questions there.
+
If you encounter difficulties, a great starting point is to visit the
issues page
to check whether your issue has already been reported. If it has,
suggestions provided by the developers may help you solve it. Note
that you will need a GitHub account for this.
+
Additionally, there is an ESMValTool email list. Please see information
+on how to subscribe to the user mailing list.
+
+
+
What if I find a bug?
+
+
If you find a bug, please report it to the ESMValTool team. This will
+help us fix issues, ensuring not only your uninterrupted workflow but
+also contributing to the overall stability of ESMValTool for all
+users.
+
To report a bug, please create a new issue using the issue
+page.
+
In your bug report, please describe the problem as clearly and as
+completely as possible. You may need to include a recipe or the output
+log as well.
Can I use different preprocessors for different variables?
+
Can I use different datasets for different variables?
+
How can I combine different preprocessor functions?
+
Can I run the same recipe for multiple ensemble members?
+
+
+
+
+
+
+
+
Objectives
+
+
Create a recipe with multiple preprocessors
+
Use different preprocessors for different variables
+
Run a recipe with variables from different datasets
+
+
+
+
+
+
+
Introduction
+
+
+
One of the key strengths of ESMValTool is in making complex analyses
+reusable and reproducible. But that doesn’t mean everything in
+ESMValTool needs to be complex. Sometimes, the biggest challenge is in
+keeping things simple. You probably know the ‘warming stripes’
visualization by Professor Ed Hawkins. On the site https://showyourstripes.info you can
+find the same visualization for many regions in the world.
+
Shared by Ed Hawkins under a Creative Commons 4.0 Attribution
+International licence. Source: https://showyourstripes.info
+
In this episode, we will reproduce and extend this functionality with
+ESMValTool. We have prepared a small Python script that takes a NetCDF
+file with timeseries data, and visualizes it in the form of our desired
+warming stripes figure.
+
The diagnostic script that we will use is called
+warming_stripes.py and can be downloaded here.
+
Download the file and store it in your working directory. If you
+want, you may also have a look at the contents, but it is not necessary
+to do so for this lesson.
+
We will write an ESMValTool recipe that takes some data, performs the
+necessary preprocessing, and then runs this Python script.
+
+
+
+
+
+
Drawing up a plan
+
+
Previously, we saw that running ESMValTool executes a number of
+tasks. What tasks do you think we will need to execute and what should
+each of these tasks do to generate the warming stripes?
+
+
+
+
+
+
+
+
+
In this episode, we will need to do the following two tasks:
+
+
A preprocessing task that converts the gridded temperature data to a
+timeseries of global temperature anomalies
+
A diagnostic tasks that calls our Python script, taking our
+preprocessed timeseries data as input.
+
+
+
+
+
+
Building a recipe from scratch
+
+
+
The easiest way to make a new recipe is to start from an existing
+one, and modify it until it does exactly what you need. However, in this
+episode we will start from scratch. This forces us to think about all
+the steps involved in processing the data. We will also deal with
+commonly occurring errors through the development of the recipe.
+
Remember the basic structure of a recipe, and notice that each
component is extensively described in the documentation under the
[“Overview”][recipe-overview] section.
This is the first place to look for help if you get stuck.
+
Open a new file called recipe_warming_stripes.yml:
+
+
BASH
+
+
nano recipe_warming_stripes.yml
+
+
Let’s add the standard header comments (these do not do anything),
+and a first description.
+
+
YAML
+
+
# ESMValTool
+# recipe_warming_stripes.yml
+---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
+
+
Notice that yaml always requires two spaces
+indentation between the different levels. Pressing ctrl+o
+will save the file. Verify the filename at the bottom and press enter.
+Then use ctrl+x to exit the editor.
+
We will try to run the recipe after every modification we make, to
+see if it (still) works!
+
+
BASH
+
+
esmvaltool run recipe_warming_stripes.yml
+
+
In this case, it gives an error. Below you see the last few lines of
+the error message.
+
+
ERROR
+
+
...
+yamale.yamale_error.YamaleError:
+Error validating data '/home/users/username/esmvaltool_tutorial/recipe_warming_stripes.yml'
+with schema
+'/apps/jasmin/community/esmvaltool/miniconda3_py311_23.11.0-2/envs/esmvaltool/lib/python3.11/
+site-packages/esmvalcore/_recipe/recipe_schema.yml'
+ documentation.authors: Required field missing
+2024-05-27 13:21:23,805 UTC [41924] INFO
+If you have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions
+If you suspect this is a bug, please open an issue on
+https://github.com/ESMValGroup/ESMValTool/issues
+To make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
+
We can use the log message above to understand why ESMValTool
+failed. Here, this is because we missed a required field with author
+names. The text
+documentation.authors: Required field missing tells us
+that. We see that ESMValTool always tries to validate the recipe at an
+early stage. Note also the suggestion to open a GitHub issue if you need
+help debugging the error message. This is something most users do when
+they cannot understand the error or are not able to fix it on their
+own.
+
Let’s add some additional information to the recipe. Open the recipe
+file again, and add an authors section below the description. ESMValTool
+expects the authors as a list, like so:
+
+
YAML
+
+
  authors:
    - lastname_firstname
+
+
To bypass a number of similar error messages, add a minimal
+diagnostics section below the documentation. The file should now look
+like:
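A sketch of such a minimal recipe is shown below (the dummy diagnostic name is a placeholder, and the doe_john author tag is deliberately not a registered author, which triggers the error shown next):


YAML


# ESMValTool
# recipe_warming_stripes.yml
---
documentation:
  description: Reproducing Ed Hawkins' warming stripes visualization
  title: Reproducing Ed Hawkins' warming stripes visualization.
  authors:
    - doe_john

diagnostics:
  dummy_diagnostic_1:
    scripts: null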
This is the minimal recipe layout that is required by ESMValTool. If
+we now run the recipe again, you will probably see the following
+error:
+
+
ERROR
+
+
ValueError: Tag 'doe_john' does not exist in section
+'authors' of /apps/jasmin/community/esmvaltool/ESMValTool_2.10.0/esmvaltool/config-references.yml
+
+
+
+
+
+
+
+
Pro tip: config-references.yml
+
+
The error message above points to a file named
[config-references.yml][config-references]. This is where ESMValTool
stores all its citation information. To add yourself as an author, add
your name in the form lastname_firstname in alphabetical
order following the existing entries, under the
# Development team section. See the [List of
authors][list-of-authors] section in the ESMValTool
+documentation for more information.
+
+
+
+
For now, let’s just use one of the existing references. Change the
+author field to righi_mattia, who cannot receive enough
+credit for all the effort he put into ESMValTool. If you now run the
+recipe again, you should see the final message
+
+
OUTPUT
+
+
ERROR No tasks to run!
+
+
Although there is no actual error in the recipe, ESMValTool assumes
+you mistakenly left out a variable name to process and alerts you with
+this error message.
+
Adding a dataset entry
+
+
+
Let’s add a datasets section.
+
+
+
+
+
+
Filling in the dataset keys
+
+
Use the paths specified in the configuration file to explore the data
+directory, and look at the explanation of the dataset entry in the
[ESMValTool documentation][recipe-section-datasets].
+For both the datasets, write down the following properties:
Note that the grid key is only required for CMIP6 data, and that the
+extent of the historical period has changed between CMIP5 and CMIP6.
+
+
+
+
+
Let us start with the BCC-ESM1 dataset and add a datasets section to
+the recipe, listing this single dataset, as shown below. Note that key
+fields such as mip or start_year are included
+in the datasets section here but are part of the
+diagnostic section in the recipe example seen in Running your first recipe.
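A sketch of such a datasets section (the time range here is an assumption; use years that match the data you have available):


YAML


datasets:
  - {dataset: BCC-ESM1, project: CMIP6, mip: Amon, exp: historical, ensemble: r1i1p1f1, grid: gn, start_year: 1850, end_year: 2000}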
The recipe should run but produce the same message as in the previous
+case since we still have not included a variable to actually process. We
+have not included the short name of the variable in this dataset section
+because this allows us to reuse this dataset entry with different
+variable names later on. This is not really necessary for our simple use
+case, but it is common practice in ESMValTool.
+
+
+
+
+
+
Pro-tip: Automatically populating a recipe with all available datasets
+
+
You can select all available models for processing using
+glob patterns or wildcards. An example
+datasets section that uses all available CMIP6 models and
+ensemble members for the historical experiment is available
[here][include-all-datasets]. Note that you will have
+to set the search_esgf option in the
+config_file to always so that you can download
+data from ESGF nodes as needed.
+
+
+
+
Adding the preprocessor section
+
+
+
Above, we already described the preprocessing task that needs to
+convert the standard, gridded temperature data to a timeseries of
+temperature anomalies.
+
+
+
+
+
+
Defining the preprocessor
+
+
Have a look at the available preprocessors in the
[documentation][preprocessor]. Write down
+
+
Which preprocessor functions do you think we should use?
+
What are the parameters that we can pass to these functions?
+
What do you think should be the order of the preprocessors?
+
A suitable name for the overall preprocessor
+
+
+
+
+
+
+
+
+
+
We need to calculate anomalies and global means. There is an
+anomalies preprocessor which takes in as arguments, a time
+period, a reference period, and whether or not to standardize the data.
+The global means can be calculated with the area_statistics
+preprocessor, which takes an operator as argument (in our case we want
+to compute the mean).
+
The default order in which these preprocessors are applied can be
seen [here][preprocessor-functions]:
area_statistics comes before anomalies. If you
want to change this, you can use the custom_order
preprocessor as described
[here][recipe-section-preprocessors]. For this
example, we will keep the default order.
+
Let’s name our preprocessor global_anomalies.
+
+
+
+
+
Add the following block to your recipe file between the
+datasets and diagnostics block:
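A sketch of that block is given below (the reference period used for the anomalies is an assumption; adjust it to your needs):


YAML


preprocessors:
  global_anomalies:
    area_statistics:
      operator: mean
    anomalies:
      period: month
      reference:
        start_year: 1981
        start_month: 1
        start_day: 1
        end_year: 2010
        end_month: 12
        end_day: 31
      standardize: false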
We are now ready to finish our diagnostics section. Remember that we
+want to create two tasks: a preprocessor task, and a diagnostic task. To
+illustrate that we can also pass settings to the diagnostic script, we
+add the option to specify a custom colormap.
+
+
+
+
+
+
Fill in the blanks
+
+
Extend the diagnostics section in your recipe by filling in the
+blanks in the following template:
+
+
YAML
+
+
diagnostics:
  <... (suitable name for our diagnostic)>:
    description: <...>
    variables:
      <... (suitable name for the preprocessed variable)>:
        short_name: <...>
        preprocessor: <...>
    scripts:
      <... (suitable name for our python script)>:
        script: <full path to python script>
        colormap: <... choose from matplotlib colormaps>
+
+
+
+
+
+
+
+
+
+
+
YAML
+
+
diagnostics:
  diagnostic_warming_stripes:
    description: visualize global temperature anomalies as warming stripes
    variables:
      global_temperature_anomalies_global:
        short_name: tas
        preprocessor: global_anomalies
    scripts:
      warming_stripes_script:
        script: ~/esmvaltool_tutorial/warming_stripes.py
        colormap: 'bwr'
+
+
+
+
+
+
You should now be able to run the recipe to get your own warming
+stripes.
+
Note: for the purpose of simplicity in this episode, we have not
+added logging or provenance tracking in the diagnostic script. Once you
+start to develop your own diagnostic scripts and want to add them to the
+ESMValTool repositories, this will be required. Writing your own
+diagnostic script is discussed in a later
+episode.
+
Bonus exercises
+
+
+
Below are a few exercises to practice modifying an ESMValTool recipe.
+For your reference, here’s a copy of the recipe at this point. This
+will be the point of departure for each of the modifications we’ll make
+below.
+
+
+
+
+
+
Specific location selection
+
+
On showyourstripes.org, you can download stripes for specific
+locations. Here we show how this can be done with ESMValTool. Instead of
+the global mean, we can pick a location to plot the stripes for. Can you
+find a suitable preprocessor to do this?
+
+
+
+
+
+
+
+
+
You can use extract_point or extract_region
+to select a location. We used extract_point. Here’s a copy
+of the recipe at this
+point and this is the difference from the previous recipe:
Split the diagnostic in two with two different time periods for the
+same variable. You can choose the time periods yourself. In the example
+below, we have chosen the recent past and the 20th century and have used
+variable grouping.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at this point
+and this is the difference with the previous recipe:
Now that you have different variable groups, we can also use
+different preprocessors. Add a second preprocessor to add another
+location of your choosing.
+
+
Solution
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
If you want to avoid retyping the arguments used in your
+preprocessor, you can use YAML anchors as seen in the
+anomalies preprocessor specifications in the recipe
+above.
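A minimal sketch of how an anchor can share the anomalies arguments between two preprocessors (the names and the location are illustrative):


YAML


preprocessors:
  global_anomalies:
    area_statistics:
      operator: mean
    anomalies: &anomaly_settings   # define the anchor on the argument block
      period: month
      standardize: false
  local_anomalies:
    extract_location:
      location: London
      scheme: linear
    anomalies: *anomaly_settings   # reuse the same arguments here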
+
+
+
+
+
+
+
+
+
Additional datasets
+
+
So far we have defined the datasets in the datasets section of the
+recipe. However, it’s also possible to add specific datasets only for
+specific variables or variable groups. Take a look at the documentation
+to learn about the additional_datasets keyword
[here][additional-datasets], and add a second dataset
+only for one of the variable groups.
+
+
+
+
+
+
+
+
+
Here’s a copy of the recipe at
+this point and this is the difference with the previous recipe:
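The change might look roughly like the sketch below (the variable group name and the added dataset are illustrative):


YAML


    tas_20th_century:
      short_name: tas
      preprocessor: global_anomalies
      additional_datasets:
        - {dataset: CanESM2, project: CMIP5, mip: Amon, exp: historical, ensemble: r1i1p1, start_year: 1900, end_year: 1999}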
You can choose data from multiple ensemble members for a model in a
+single line.
+
+
+
+
+
+
+
+
+
The datasets section allows you to choose more than one
ensemble member. Here’s a copy of the changed recipe
to do that; the relevant change is illustrated below:
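A sketch of what the dataset entry could look like with an ensemble range (assuming the model provides members r1 to r3):


YAML


datasets:
  - {dataset: CanESM2, project: CMIP5, mip: Amon, exp: historical, ensemble: "r(1:3)i1p1", start_year: 1950, end_year: 2000}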
Check out the section on a different way to use multiple ensemble
+members or even multiple experiments at [Concatenating data
corresponding to multiple facets][concatenating-datasets].
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
A recipe can work with different preprocessors at the same
+time.
+
The setting additional_datasets can be used to add a
+different dataset.
+
Variable groups are useful for defining different settings for
+different variables.
+
Multiple ensemble members and experiments can be analysed in a
+single recipe through concatenation.
How can I incorporate my contributions into ESMValTool?
+
+
+
+
+
+
+
+
Objectives
+
+
Execute a successful ESMValTool installation from the source
+code.
+
Contribute to ESMValTool development.
+
+
+
+
+
+
+
We now know how ESMValTool works, but how do we develop it?
+ESMValTool is an open-source project in ESMValGroup. We can contribute
+to its development by:
helping with the review process of pull requests; see the ESMValTool
documentation on Review of pull requests
+
+
+
In this lesson, we first show how to set up a development
+installation of ESMValTool so you can make changes or additions. We then
+explain how you can contribute these changes to the community.
+
+
+
+
+
+
Git knowledge
+
+
For this episode, you need some knowledge of Git. You can refresh your knowledge in
+the corresponding Git carpentries
+course.
+
+
+
+
Development installation
+
+
+
We’ll explore how ESMValTool can be installed in
develop mode. Even if you aren’t collaborating with the
community, this installation is needed to run your new code with
ESMValTool. Let’s get started.


1 Source code


Download the code from the repository. A ZIP file called
+ESMValTool-main.zip is downloaded. To continue the
+installation, unzip the file, move to the ESMValTool-main
+directory and then follow the sequence of steps starting from the
+section on ESMValTool dependencies below.
+
Clone the repository if you want to contribute to the ESMValTool
+development:
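For example, using HTTPS (this is what prompts for the username and token mentioned below):


BASH


git clone https://github.com/ESMValGroup/ESMValTool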
This command will ask your GitHub username and a personal
+token as password. Please follow instructions on
+[GitHub token authentication
+requirements][token-authentication-requirements] to create a personal
+access token. Alternatively, you could [generate a new SSH
+key][generate-ssh-key] and [add it to your GitHub account][add-ssh-key].
+After the authentication, the output might look like:
Now, a folder called ESMValTool has been created in your
+working directory. This folder contains the source code of the tool. To
+continue the installation, we move into the ESMValTool
+directory:
+
+
BASH
+
+
cd ESMValTool
+
+
Note that the main branch is checked out by default. We
+can see this if we run:
+
+
BASH
+
+
git status
+
+
+
OUTPUT
+
+
On branch main
+Your branch is up to date with 'origin/main'.
+
+nothing to commit, working tree clean
+
+
+
+
2 ESMValTool dependencies
+
+
Please don’t forget if an esmvaltool environment is already created
+following the lesson Installation, we
+should choose another name for the new environment in this lesson.
+
ESMValTool now uses mamba instead of conda
+for the recommended installation. For a minimal mamba installation, see
+section Install Mamba in lesson Installation.
+
It is good practice to update the version of mamba and conda on your
+machine before setting up ESMValTool. This can be done as follows:
+
+
BASH
+
+
mamba update --name base mamba conda
+
+
To simplify the installation process, an environment file
+environment.yml is provided in the ESMValTool directory. We
+create an environment by running:
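With mamba this is typically done as follows (see the installation documentation for the exact recommended command):


BASH


mamba env create --file environment.yml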
The environment is called esmvaltool by default. If an
+esmvaltool environment is already created following the
+lesson Installation, we should choose
+another name for the new environment in this lesson by:
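For example (a_new_name is a placeholder):


BASH


mamba env create --name a_new_name --file environment.yml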
This will create a new conda environment and install ESMValTool (with
+all dependencies that are needed for development purposes) into it with
+a single command.
+
For more information see [conda managing
+environments][manage-environments].
+
Now, we should activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
where esmvaltool is the name of the environment (replace
+by a_new_name in case another environment name was
+used).
+
+
+
3 ESMValTool installation
+
+
ESMValTool can be installed in a develop mode by
+running:
+
+
BASH
+
+
pip install --editable '.[develop]'
+
+
This will add the esmvaltool directory to the Python
+path in editable mode and install the development dependencies. We
+should check if the installation works properly. To do this, run the
+tool with:
+
+
BASH
+
+
esmvaltool --help
+
+
If the installation is successful, ESMValTool prints a help message
+to the console.
+
+
+
+
+
+
Checking the development installation
+
+
We can use the command mamba list to list installed
+packages in the esmvaltool environment. Use this command to
+check that ESMValTool is installed in a develop mode.
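For example, filtering the listing by package name:


BASH


mamba list esmvaltool


OUTPUT
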
# Name Version Build Channel
+esmvaltool 2.10.0.dev3+g2dbc2cfcc pypi_0 pypi
+
+
+
+
+
+
+
+
4 Updating ESMValTool
+
+
The main branch has the latest features of ESMValTool.
+Please make sure that the source code on your machine is up-to-date. If
+you obtain the source code using git clone as explained in step
+1 Source code, you can run git pull to
+update the source code. Then ESMValTool installation will be updated
+with changes from the main branch.
+
+
Contribution
+
+
+
We have seen how to install ESMValTool in a develop
+mode. Now, we try to contribute to its development. Let’s see how this
+can be achieved. We first discuss our ideas in an issue
+in ESMValTool repository. This can avoid disappointment at a later
+stage, for example, if more people are doing the same thing. It also
+gives other people an early opportunity to provide input and
+suggestions, which results in more valuable contributions.
+
Then, we create a new branch locally and start
+developing new codes. To create a new branch:
+
+
BASH
+
+
git checkout -b your_branch_name
+
+
If needed, a link to a git tutorial can be found in Setup.
+
Once our development is finished, we can initiate a
+pull request. To this end, we encourage you to join the
+ESMValTool development team.
+
For more extensive documentation on contributing code, including a
+section on the GitHubWorkflow,
+please see the [Contributing code and documentation][code-documentation]
+section in the ESMValtool documentation.
+
+
Review process
+
+
The pull request will be tested, discussed and merged as part of the
+“review process”. The process will take some effort and
+time to learn. However, a few (command line) tools can get you a long
+way, and we’ll cover those essentials in the next sections.
+
Tip: we encourage you to keep the pull requests
+small. Reviewing small incremental changes is more efficient.
+
+
+
Working example
+
+
We saw the ‘warming stripes’ diagnostic in lesson Writing your own recipe. Imagine the
+following task: you want to contribute warming stripes recipe and
+diagnostics to ESMValTool. You have to add the diagnostics warming_stripes.py and the recipe recipe_warming_stripes.yml
+to their locations in ESMValTool directory. After these changes, you
+should also check if everything works fine. This is where we take
+advantage of the tools that are introduced later.
+
Let’s get started. Note that since this is an exercise to get
+familiar with the development and contribution process, we will not
+create a GitHub issue at this time but proceed as though it has been
+done.
+
+
+
Check code quality
+
+
We aim to adhere to best practices and coding standards. There are
+several tools that check our code against those standards like:
+
+
+flake8 for
+checking against the PEP8 style guide
+
+yapf to ensure
+consistent formatting for the whole project
+
+isort to consistently
+sort the import statements
+
+yamllint
+to ensure there are no syntax errors in our recipes and config
+files
The good news is that pre-commit was already
installed when we chose the development installation.
pre-commit is a command-line tool that runs all of those tools.
+It also fixes some of those errors. To explore other tools, have a look
+at ESMValTool documentation on Code
+style.
+
+
+
+
+
+
Using pre-commit
+
+
Let’s check out our local branch and add the script warming_stripes.py to the
+esmvaltool/diag_scripts directory.
+
+
BASH
+
+
cd ESMValTool
+git checkout your_branch_name
+cp path_of_warming_stripes.py esmvaltool/diag_scripts/
+
+
By default, pre-commit only runs on the files that have
+been staged in git:
+
+
BASH
+
+
git status
+git add esmvaltool/diag_scripts/warming_stripes.py
+pre-commit run --files esmvaltool/diag_scripts/warming_stripes.py
+
+
Inspect the output of pre-commit and fix the remaining
+errors.
+
+
+
+
+
+
+
+
+
The tail of the output of pre-commit:
+
+
BASH
+
+
Check for added large files..............................................Passed
+Check python ast.........................................................Passed
+Check for case conflicts.................................................Passed
+Check for merge conflicts................................................Passed
+Debug Statements (Python)................................................Passed
+Fix End of Files.........................................................Passed
+Trim Trailing Whitespace.................................................Passed
+yamllint.............................................(no files to check)Skipped
+nclcodestyle.........................................(no files to check)Skipped
+style-files..........................................(no files to check)Skipped
+lintr................................................(no files to check)Skipped
+codespell................................................................Passed
+isort....................................................................Passed
+yapf.....................................................................Passed
+docformatter.............................................................Failed
+- hook id: docformatter
+- files were modified by this hook
+flake8...................................................................Failed
+- hook id: flake8
+- exit code: 1
+
+esmvaltool/diag_scripts/warming_stripes.py:20:5:
+F841 local variable 'nx' is assigned to but never used
+
+
As can be seen above, there are two failed checks:
+
+
+docformatter: it is mentioned that “files were modified
+by this hook”. We run git diff to see the modifications.
+The output includes the following:
+
+
+
BASH
+
+
+in the form of the popular warming stripes figure by Ed Hawkins."""
+
+
The syntax """ at the end of docstring is moved by one
+line. Shifting it to the next line should fix this error.
+
+
+flake8: the error message is about an unused local
+variable nx. We should check our codes regarding the usage
+of nx. For now, let’s assume that it is added by mistake
+and remove it. Note that you have to run git add again to
+re-stage the file. Then rerun pre-commit and check that it passes.
+
+
+
+
+
+
+
+
Run unit tests
+
+
The previous section introduced some tools to check code style and
quality, but those tools cannot tell whether our code gives the right
answer. For that, we need to write and run tests for widely used
functions. ESMValTool comes with a lot of tests that live in the
folder tests.
+
To run tests, first we make sure that the working directory is
+ESMValTool and our local branch is checked out. Then, we
+can run tests using pytest locally:
+
+
BASH
+
+
pytest
+
+
Tests will also be run automatically by CircleCI, when you submit a pull
+request.
+
+
+
+
+
+
Running tests
+
+
Make sure our local branch is checked out and add the recipe recipe_warming_stripes.yml
+to the esmvaltool/recipes directory:
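For example, mirroring the earlier copy of the diagnostic script (the source path is a placeholder):


BASH


cd ESMValTool
git checkout your_branch_name
cp path_of_recipe_warming_stripes.yml esmvaltool/recipes/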
Run pytest and inspect the results; this might take a
few minutes. If a test fails, try to fix it.
+
+
+
+
+
+
+
+
+
Run:
+
+
BASH
+
+
pytest
+
+
When the pytest run is complete, you can inspect the test
+reports that are printed in the console. Have a look at the second
+section of the report FAILURES:
The test message shows that the recipe
+recipe_warming_stripes.yml is not a valid recipe. Look for
+a line that starts with an E in the rest of the
+message:
+
+
BASH
+
+
+E esmvalcore._task.DiagnosticError: Cannot execute script
'~/esmvaltool_tutorial/warming_stripes.py' (~/esmvaltool_tutorial/warming_stripes.py):
+file does not exist.
+
+
To fix the recipe, we need to edit the path of the diagnostic script
+as warming_stripes.py:
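Since the script now lives in esmvaltool/diag_scripts/, the script entry should point at the file name only. A sketch of the fixed lines:


YAML


    scripts:
      warming_stripes_script:
        script: warming_stripes.py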
Build documentation


When we add or update code, we also update its corresponding
documentation. The ESMValTool documentation is available on docs.esmvaltool.org.
+The source files are located in
+ESMValTool/doc/sphinx/source/.
+
To build documentation locally, first we make sure that the working
+directory is ESMValTool and our local branch is checked
+out. Then, we run:
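One way to do this is with sphinx-build (shown here as an assumption; the ESMValTool documentation describes the recommended command):


BASH


sphinx-build -b html doc/sphinx/source doc/sphinx/build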
Similar to code, documentation should be well written and adhere to
+standards. If the documentation is built properly, the previous command
+prints a message to the console:
+
+
OUTPUT
+
+
build succeeded.
+
+The HTML pages are in doc/sphinx/build.
+
+
The main page of the documentation has been built into
+index.html in doc/sphinx/build/ directory. To
+preview this page locally, we open the file in a web browser:
+
+
BASH
+
+
xdg-open doc/sphinx/build/index.html
+
+
+
+
+
+
+
Creating a documentation
+
+
In previous exercises, we added the recipe recipe_warming_stripes.yml
+to ESMValTool. Now, we create a documentation file
+recipe_warming_stripes.rst for this recipe:
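For example, create the file with nano; a short title and a one-line description are enough to start with (the exact content is up to you):


BASH


nano doc/sphinx/source/recipes/recipe_warming_stripes.rst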
Save and close the file. We can think of this file as one page of a
+book. Examples of documentation pages can be found in the folder
+ESMValTool/doc/sphinx/source/recipes. Then, we need to
+decide where this page should be located inside the book. The table of
+content is defined by index.rst. Let’s have a look at the
+content:
+
+
BASH
+
+
nano doc/sphinx/source/recipes/index.rst
+
+
Add the recipe name i.e. recipe_warming_stripes to the
+section Other in this file and preview the recipe
+documentation page locally.
+
+
+
+
+
+
+
+
+
First, we add the recipe name recipe_warming_stripes to
+the section Other:
How do I use the preprocessor output in a Python diagnostic?
+
+
+
+
+
+
+
+
Objectives
+
+
Write a new Python diagnostic script.
+
Explain how a diagnostic script reads the preprocessor output.
+
+
+
+
+
+
+
Introduction
+
+
+
The diagnostic script is an important component of ESMValTool and it
+is where the scientific analysis or performance metric is implemented.
+With ESMValTool, you can adapt an existing diagnostic or write a new
+script from scratch. Diagnostics can be written in a number of open
+source languages such as Python, R, Julia and NCL but we will focus on
+understanding and writing Python diagnostics in this lesson.
+
In this lesson, we will explain how to find an existing diagnostic
+and run it using ESMValTool installed in editable/development mode. For
+a development installation, see the instructions in the lesson Development and contribution. Also,
+we will work with the recipe [recipe_python.yml][recipe] and the
+diagnostic script [diagnostic.py][diagnostic] called by this recipe that
+we have seen in the lesson Running your first
+recipe .
+
Let’s get started!
+
Understanding an existing Python diagnostic
+
+
+
If you clone the ESMValTool repository, a folder called
+ESMValTool is created in your home/working directory, see
+the instructions in the lesson Development and contribution .
+
The folder ESMValTool contains the source code of the
+tool. We can find the recipe recipe_python.yml and the
+python script diagnostic.py in these directories:
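Relative to the ESMValTool source folder, these are:


BASH


esmvaltool/recipes/examples/recipe_python.yml
esmvaltool/diag_scripts/examples/diagnostic.py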
Let’s have a look at the code in diagnostic.py. For
+reference, we show the diagnostic code in the dropdown box below. There
+are four main sections in the script:
+
+
A description i.e. the docstring (line 1).
+
Import statements (line 2-16).
+
Functions that implement our analysis (line 21-102).
+
A typical Python top-level script
+i.e. if __name__ == '__main__' (line 105-108).
+
+
+
+
+
+
+
+
PYTHON
+
+
1: """Python example diagnostic."""
+2: import logging
+3: from pathlib import Path
+4: from pprint import pformat
+5:
+6: import iris
+7:
+8: from esmvaltool.diag_scripts.shared import (
+9: group_metadata,
+10: run_diagnostic,
+11: save_data,
+12: save_figure,
+13: select_metadata,
+14: sorted_metadata,
+15: )
+16: from esmvaltool.diag_scripts.shared.plot import quickplot
+17:
+18: logger = logging.getLogger(Path(__file__).stem)
+19:
+20:
+21: def get_provenance_record(attributes, ancestor_files):
+22: """Create a provenance record describing the diagnostic data and plot."""
+23: caption = caption = attributes['caption'].format(**attributes)
+24:
+25: record = {
+26: 'caption': caption,
+27: 'statistics': ['mean'],
+28: 'domains': ['global'],
+29: 'plot_types': ['zonal'],
+30: 'authors': [
+31: 'andela_bouwe',
+32: 'righi_mattia',
+33: ],
+34: 'references': [
+35: 'acknow_project',
+36: ],
+37: 'ancestors': ancestor_files,
+38: }
+39: return record
+40:
+41:
+42: def compute_diagnostic(filename):
+43: """Compute an example diagnostic."""
+44: logger.debug("Loading %s", filename)
+45: cube = iris.load_cube(filename)
+46:
+47: logger.debug("Running example computation")
+48: cube = iris.util.squeeze(cube)
+49: return cube
+50:
+51:
+52: def plot_diagnostic(cube, basename, provenance_record, cfg):
+53: """Create diagnostic data and plot it."""
+54:
+55: # Save the data used for the plot
+56: save_data(basename, provenance_record, cfg, cube)
+57:
+58: if cfg.get('quickplot'):
+59: # Create the plot
+60: quickplot(cube, **cfg['quickplot'])
+61: # And save the plot
+62: save_figure(basename, provenance_record, cfg)
+63:
+64:
+65: def main(cfg):
+66: """Compute the time average for each input dataset."""
+67: # Get a description of the preprocessed data that we will use as input.
+68: input_data = cfg['input_data'].values()
+69:
+70: # Demonstrate use of metadata access convenience functions.
+71: selection = select_metadata(input_data, short_name='tas', project='CMIP5')
+72: logger.info("Example of how to select only CMIP5 temperature data:\n%s",
+73: pformat(selection))
+74:
+75: selection = sorted_metadata(selection, sort='dataset')
+76: logger.info("Example of how to sort this selection by dataset:\n%s",
+77: pformat(selection))
+78:
+79: grouped_input_data = group_metadata(input_data,
+80: 'variable_group',
+81: sort='dataset')
+82: logger.info(
+83: "Example of how to group and sort input data by variable groups from "
+84: "the recipe:\n%s", pformat(grouped_input_data))
+85:
+86: # Example of how to loop over variables/datasets in alphabetical order
+87: groups = group_metadata(input_data, 'variable_group', sort='dataset')
+88: for group_name in groups:
+89: logger.info("Processing variable %s", group_name)
+90: for attributes in groups[group_name]:
+91: logger.info("Processing dataset %s", attributes['dataset'])
+92: input_file = attributes['filename']
+93: cube = compute_diagnostic(input_file)
+94:
+95: output_basename = Path(input_file).stem
+96: if group_name != attributes['short_name']:
+97: output_basename = group_name +'_'+ output_basename
+98: if"caption"notin attributes:
+99: attributes['caption'] = input_file
+100: provenance_record = get_provenance_record(
+101: attributes, ancestor_files=[input_file])
+102: plot_diagnostic(cube, output_basename, provenance_record, cfg)
+103:
+104:
+105: if__name__=='__main__':
+106:
+107: with run_diagnostic() as config:
+108: main(config)
+
+
+
+
+
+
+
+
+
+
+
What is the starting point of a diagnostic?
+
+
+
Can you spot a function called main in the code
+above?
+
What are its input arguments?
+
How many times is this function mentioned?
+
+
+
+
+
+
+
+
+
+
+
The main function is defined on line 65 as
main(cfg).

The input argument to this function is the variable
cfg, a Python dictionary that holds all the
information needed to run the diagnostic script, such as the location of
the input data and various settings. We will next parse this
cfg variable in the main function and extract
information as needed for our analyses (e.g. on line 68).

The main function is called near the very end, on line
108. So, it is mentioned twice in our code: once where it is defined and
once where it is called by the top-level Python script.
+
+
+
+
+
+
+
+
+
+
+
The function run_diagnostic
+
+
The function run_diagnostic (line 107) is a
context manager provided by ESMValTool and is the main entry point for
most Python diagnostics.
+
+
+
+
Preprocessor-diagnostic interface
+
+
+
In the previous exercise, we have seen that the variable
cfg is the input argument of the main
function. The information in this cfg dictionary is read
from a file called settings.yml, whose path is the first
argument passed to the diagnostic script. The ESMValTool documentation
provides an overview of what is in this file, see [Diagnostic script
interfaces][interface].
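
As a rough illustration (not part of the example script), a diagnostic typically pulls a few standard entries out of cfg. The key names below are commonly found in settings.yml, but they are an assumption here: check your own settings.yml to confirm which ones are available.

PYTHON

def main(cfg):
    # Metadata of all preprocessed input files, keyed by filename
    input_data = cfg['input_data'].values()

    # Output directories that ESMValTool prepared for this script
    work_dir = cfg['work_dir']   # for NetCDF (or other data) output
    plot_dir = cfg['plot_dir']   # for figures
    run_dir = cfg['run_dir']     # for logs and settings

    # Extra options set for this script in the recipe also end up in cfg,
    # e.g. the 'quickplot' section used by the example diagnostic
    quickplot_options = cfg.get('quickplot')
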
+
+
+
+
+
+
What information do I need when writing a diagnostic script?
+
+
From the lesson Configuration, we
+saw how to change the configuration settings before running a recipe.
+First we set the option remove_preproc_dir to
+false in the configuration file, then run the recipe
+recipe_python.yml:
+
+
BASH
+
+
esmvaltool run examples/recipe_python.yml
+
+
+
Find an example of the file settings.yml in the
run directory.
+
Open the file settings.yml and look at the
+input_files list. It contains paths to some files
+metadata.yml. What information do you think is saved in
+those files?
+
+
+
+
+
+
+
+
+
+
+
One example of settings.yml can be found in the
+directory:
+path_to_recipe_output/run/map/script1/settings.yml
+
+
The metadata.yml files hold information about the
preprocessed data. There is one file for each variable, containing detailed
information about your data, including the project (e.g., CMIP6, CMIP5), dataset
names (e.g., BCC-ESM1, CanESM2), variable attributes (e.g.,
standard_name, units), the preprocessor applied, and the time range of the data.
You can use all of this information in your own diagnostic.
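
A sketch of what a single entry in such a metadata.yml file might look like. The key names are taken from the example log output shown later in this episode; the file path is a placeholder, and a real file contains more keys and one entry per preprocessed file.

YAML

/path/to/preproc/map/tas/some_preprocessed_file.nc:
  alias: CMIP5
  dataset: bcc-csm1-1
  project: CMIP5
  exp: historical
  ensemble: r1i1p1
  short_name: tas
  diagnostic: map
  preprocessor: to_degrees_c
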
+
+
+
+
+
+
Diagnostic shared functions
+
+
+
Looking at the code in diagnostic.py, we see that
input_data is read from the cfg dictionary
(line 68). Now we can group the input_data according to
some criteria such as the model or experiment. To do so, ESMValTool
provides many functions such as select_metadata (line 71),
sorted_metadata (line 75), and group_metadata
(line 79). As you can see in line 8, these functions are imported from
esmvaltool.diag_scripts.shared, which means they are shared
across several diagnostic scripts. A list of available functions and
their descriptions can be found in [The ESMValTool Diagnostic API
reference][shared].
+
+
Extracting
+information needed for analyses
+
We have seen the functions used for selecting, sorting and grouping
+data in the script. What do these functions do?
+
+
Answer
+
There is a statement after use of select_metadata,
+sorted_metadata and group_metadata that starts
+with logger.info (lines 72, 76 and 82). These lines print
+output to the log files. In the previous exercise, we ran the recipe
+recipe_python.yml. If you look at the log file
+recipe_python_#_#/run/map/script1/log.txt in
+esmvaltool_output directory, you can see the output from
+each of these functions, for example:
+
2023-06-28 12:47:14,038 [2548510] INFO diagnostic,106 Example of how to
+group and sort input data by variable groups from the recipe:
+{'tas': [{'alias': 'CMIP5',
+ 'caption': 'Global map of {long_name} in January 2000 according to '
+ '{dataset}.\n',
+ 'dataset': 'bcc-csm1-1',
+ 'diagnostic': 'map',
+ 'end_year': 2000,
+ 'ensemble': 'r1i1p1',
+ 'exp': 'historical',
+ 'filename': '~/recipe_python_20230628_124639/preproc/map/tas/
+This is how we can access preprocessed data within our diagnostic.
+
+
+
+
+
Diagnostic computation
+
+
+
After grouping and selecting data, we can read individual attributes
+(such as filename) of each item. Here, we have grouped the input data by
+variables, so we loop over the variables (line 88).
+Following this is a call to the function compute_diagnostic
+(line 93). Let’s look at the definition of this function in line 42,
+where the actual analysis of the data is done.
+
Note that output from the ESMValCore preprocessor is in the form of
+NetCDF files. Here, compute_diagnostic uses Iris
+to read data from a netCDF file and performs an operation
+squeeze to remove any dimensions of length one. We can
+adapt this function to add our own analysis. As an example, here we
calculate the bias of the data with respect to its average, using Iris cubes.
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    cube = iris.load_cube(filename)

    logger.debug("Running example computation")
    cube = iris.util.squeeze(cube)

    # Calculate a bias using the average of the data
    cube.data = cube.core_data() - cube.core_data().mean()
    return cube
+
+
+
+
+
+
+
iris cubes
+
+
Iris reads data from NetCDF files into data structures called cubes.
+The data in these cubes can be modified, combined with other cubes’ data
+or plotted.
+
+
+
+
+
+
+
+
+
Reading data using xarray
+
+
Alternatively, you can use xarray to read the data
instead of Iris.
+
+
+
+
+
+
+
+
+
First, import xarray package at the top of the script
+as:
+
+
PYTHON
+
+
import xarray as xr
+
+
Then, change the compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    dataset = xr.open_dataset(filename)

    # do your analyses on the data here

    return dataset
+
+
Caution: if you read data using xarray, remember to adapt the other
functions in the diagnostic accordingly, since they currently work with
Iris cubes.
+
+
+
+
+
+
+
+
+
+
Reading data using the netCDF4 package
+
+
Yet another option to read the NetCDF file data is to use the
+[netCDF-4 Python interface][netCDF] to the netCDF C library.
+
+
+
+
+
+
+
+
+
First, import the netCDF4 package at the top of the
+script as:
+
+
PYTHON
+
+
import netCDF4
+
+
Then, change compute_diagnostic as:
+
+
PYTHON
+
+
def compute_diagnostic(filename):
    """Compute an example diagnostic."""
    logger.debug("Loading %s", filename)
    nc_data = netCDF4.Dataset(filename, 'r')

    # do your analyses on the data here

    return nc_data
+
+
Caution: if you read data using netCDF4, remember to adapt the other
functions in the diagnostic accordingly, since they currently work with
Iris cubes.
+
+
+
+
+
Diagnostic output
+
+
+
+
Plotting the output
+
+
Often, the end product of a diagnostic script is a plot or figure.
+The Iris cube returned from the compute_diagnostic function
+(line 93) is passed to the plot_diagnostic function (line
+102). Let’s have a look at the definition of this function in line 52.
+This is where we would plug in our plotting routine in the diagnostic
+script.
+
More specifically, the quickplot function (line 60) can
+be replaced with the function of our choice. As can be seen, this
+function uses **cfg['quickplot'] as an input argument. If
+you look at the diagnostic section in the recipe
+recipe_python.yml, you see quickplot is a key
+there:
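
For reference, this is the relevant snippet from the diagnostics section of recipe_python.yml (the full recipe is reproduced later in this tutorial):

YAML

scripts:
  script1:
    script: examples/diagnostic.py
    quickplot:
      plot_type: pcolormesh
      cmap: Reds
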
This way, we can pass arguments such as the type of plot
+pcolormesh and the colormap cmap:Reds from the
+recipe to the quickplot function in the diagnostic.
+
+
+
+
+
+
Passing arguments from the recipe to the diagnostic
+
+
Change the type of the plot and its colormap and inspect the output
+figure.
+
+
+
+
+
+
+
+
+
In the recipe recipe_python.yml, you could change
+plot_type and cmap. As an example, we choose
+plot_type: pcolor and cmap: BuGn:
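
A sketch of the modified quickplot section of the recipe (only plot_type and cmap are changed with respect to the original):

YAML

quickplot:
  plot_type: pcolor
  cmap: BuGn
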
The plot can be found at
+path_to_recipe_output/plots/map/script1/png.
+
+
+
+
+
+
+
+
+
+
ESMValTool gallery
+
+
ESMValTool makes it possible to produce a wide array of plots and
+figures as seen in the gallery.
+
+
+
+
+
+
Saving the output
+
+
In our example, the function save_data in line 56 is
+used to save the Iris cube. The saved files can be found under the
+work directory in a .nc format. There is also
+the function save_figure in line 62 to save the plots under
+the plot directory in a .png format (or
+preferred format specified in your configuration settings). Again, you
+may choose your own method of saving the output.
+
+
+
Recording the provenance
+
+
When developing a diagnostic script, it is good practice to record
+provenance. To do so, we use the function
+get_provenance_record (line 100). Let us have a look at the
+definition of this function in line 21 where we describe the diagnostic
+data and plot. Using the dictionary record, it is possible
+to add custom provenance to our diagnostics output. Provenance is stored
+in the W3C PROV
+XML format and also in an SVG file under the
+work and plot directory. For more information,
+see [recording provenance][provenance].
+
+
Congratulations!
+
+
+
You now know the basic diagnostic script structure and some available
+tools for putting together your own diagnostics. Have a look at existing
+recipes and diagnostics in the repository for more examples of functions
+you can use in your diagnostics!
+
+
+
+
+
+
Key Points
+
+
+
ESMValTool provides helper functions to interface a Python
+diagnostic script with preprocessor output.
+
Existing diagnostics can be used as templates and modified to write
+new diagnostics.
+
Helper functions can be imported from
+esmvaltool.diag_scripts.shared and used in your own
+diagnostic script.
How to use the existing CMORizer scripts shipped with
+ESMValTool?
+
How to add support for new (observational) datasets?
+
+
+
+
+
+
+
+
Objectives
+
+
Understand what CMORization is and why it is necessary.
+
Use existing scripts to CMORize your data.
+
Write a new CMORizer script to support additional data.
+
+
+
+
+
+
+
Introduction
+
+
+
This episode deals with “CMORization”. ESMValTool is designed to work
+with data that follow the CMOR standards. Unfortunately, not all
+datasets follow these standards. In order to use such datasets in
+ESMValTool we first need to reformat the data. This process is called
+“CMORization”.
+
+
+
+
+
+
What are the CMOR standards?
+
+
The name “CMOR” originates from a tool: the Climate Model Output Rewriter.
+This tool is used to create “CF-Compliant netCDF files for use in the
+CMIP projects”. So CMOR extends the CF-standard with additional
+requirements for the Coupled Model Intercomparison Projects (see e.g. here).
+
Concretely, the CMOR standards dictate e.g. the variable names and
+units, coordinate information, how the data should be structured (e.g. 1
+variable per file), additional metadata requirements, and file naming
+conventions a.k.a. the data reference syntax (DRS).
+All this information is stored in so-called CMOR tables. For example,
+the CMOR tables for the CMIP6 project can be found here.
+
+
+
+
ESMValTool offers two ways to CMORize data:
+
+
A reformatting script can be used to create a CMOR-compliant copy.
+CMORizer scripts for several popular datasets are included in
+ESMValTool, and ESMValTool also provides a convenient way to execute
+them.
+
ESMValCore can execute CMOR fixes ‘[on the fly](https://docs.esmvaltool.org/projects/esmvalcore/en/latest/develop/fixing_data.html#fixing-data)’.
The advantage is that you don’t need to
store an additional, reformatted copy of the data. The disadvantage is
that these fixes should be implemented inside ESMValCore, which is
beyond the scope of this tutorial.
+
+
In this lesson, we will re-implement a CMORizer script for the
+FLUXCOM dataset that contains observations of the Gross Primary
+Production (GPP), a variable that is important for calculating
+components of the global carbon cycle. See the next section on how to
+obtain data.
The data for this episode is available via the FluxCom Data
+Portal. First you’ll need to register. After registration, in the
+dropdown boxes, select FLUXCOM as the data choice and click download.
+Three files will be displayed. Click the download button on the “FLUXCOM
+(RS+METEO) Global Land Carbon Fluxes using CRUNCEP climate data”. You’ll
+receive an email with the FTP address to access the server. Connect to
+the server, follow the path in your email, and look for the file
+raw/monthly/GPP.ANN.CRUNCEPv6.monthly.2000.nc. Download
+that file and save it in a folder called
+~/data/RAWOBS/Tier3/FLUXCOM.
+
Note: you’ll need a user-friendly ftp client. On Linux,
+ncftp works okay.
+
+
+
+
+
+
What is the deal with those “tiers”?
+
+
Many datasets come with access restrictions. In this way the data
+providers can keep track of how their data is used. In many cases
+“restricted access” just means that one has to register with an email
+address and accept the terms of use, which typically ask that you
+acknowledge the data providers.
+
There are also datasets available that do not need a registration.
+The “obs4MIPs” or “ana4MIPs” datasets, for example, are specifically
+produced to facilitate comparisons with model simulations.
+
To reflect these different levels of access restriction, the
+ESMValTool team has created a tier-system. The definition of the
+different tiers are as follows:
+
+
Tier1: obs4MIPs and ana4MIPs datasets (can be used
directly with the ESMValTool)
+
+Tier2: other freely available datasets (most of
+them will need some kind of cmorization)
+
+Tier3: datasets with access restrictions (most of
+these datasets will also need some kind of cmorization)
+
+
These access restrictions are also why the ESMValTool developers
+cannot distribute copies or automate downloading of all observations and
+reanalysis data used in the recipes. As a compromise, we provide the
+CMORization scripts so that each user can CMORize their own copy of the
+access restricted datasets if needed.
+
+
+
+
Run the existing CMORizer script
+
+
+
Before we develop our own CMORizer script, let’s first see what
+happens when we run the existing one. There is a specific command
+available in the ESMValTool to run the CMORizer scripts:
+
+
BASH
+
+
esmvaltool data format --config_file<path to config-user.yml><dataset-name>
+
+
The config-user.yml is the file in which we define the
+different data paths, see the episode on Configuration. In the
+rootpath of your config-user.yml, make sure to
+add the right directory for “RAWOBS” data in which you downloaded the
+FLUXCOM dataset:
+
+
YAML
+
+
rootpath:
  RAWOBS: ~/data/RAWOBS
+
+
This enables ESMValTool to find the raw observational datasets stored
+in the “RAWOBS” folder. The dataset-name needs to be
+identical to the folder name that was created to store the raw
+observation data files, i.e. RAWOBS/TierX/dataset-name. In
+our case this would be “FLUXCOM”.
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
...
+... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC
+... ----------------------------------------------------------------------
+... input_dir = /home/peter/data/RAWOBS
+... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216
+... ----------------------------------------------------------------------
+... Running the CMORization scripts.
+... Processing datasets ['FLUXCOM']
+... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM
+... Output will be written to: /home/peter/esmvaltool_output/
+ data_formatting_20220726_140216/Tier3/FLUXCOM
+... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/
+ site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom
+... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/
+ esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/
+ datasets/fluxcom.py
+... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc'
+... CMORizing variable 'gpp'
+... Lmon
+... Var is gpp
+... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1'
+
+... Fixing time...
+... Fixing latitude...
+... Fixing longitude...
+... Flipping dimensional coordinate latitude...
+... Saving file
+... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/
+ FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc
+... Cube has lazy data [lazy is preferred]
+... CMORization of dataset FLUXCOM finished!
+... Formatting successful for dataset FLUXCOM
+
+
So you can see that several fixes are applied, and the CMORized file
+is written to the ESMValTool output directory, i.e.
+~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/TierX/dataset-name/filename.nc
+In order to use it, we’ll have to copy it from the output directory to a
+folder called ~/data/OBS/Tier3/FLUXCOM and make sure the
+path to OBS is set correctly in our config-user file:
+
+
YAML
+
+
rootpath:
  OBS: ~/data/OBS
+
+
You can also see the path where ESMValTool stores the reformatting
script:
~/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py.
You may have a look at this file if you want. The script also uses a
configuration file:
~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml.
+
Make a test recipe
+
+
+
To verify that the data is correctly CMORized, we will make a simple
+test recipe. As illustrated in the figure at the top of this episode,
+one of the steps that ESMValTool executes is a CMOR-check. If the data
+is not correctly CMORized, ESMValTool will give a warning or error.
+
+
+
+
+
+
Create a test recipe
+
+
Create a simple recipe called recipe_check_fluxcom.yml that
loads the FLUXCOM data. It should include a datasets section with a
single entry for the “FLUXCOM” dataset with the correct dataset keys,
and a diagnostics section with a single variable: gpp. We don’t need any
preprocessors or scripts (set scripts: null), but we have
to add a documentation section with a description, authors and
maintainer, otherwise the recipe will fail.
+
Use the following dataset keys:
+
+
project: OBS
+
dataset: FLUXCOM
+
type: reanaly
+
version: ANN-v1
+
mip: Lmon
+
start_year: 2000
+
end_year: 2000
+
tier: 3
+
+
Some of these dataset keys are further explained in the callout boxes
+in this episode.
+
+
+
+
+
+
+
+
+
Here’s an example recipe
+
+
YAML
+
+
documentation:
  description: Test recipe for FLUXCOM data
  title: This is a test recipe for the FLUXCOM data.
  authors:
    - kalverla_peter
  maintainer:
    - kalverla_peter

datasets:
  - {project: OBS, dataset: FLUXCOM, mip: Lmon, tier: 3, start_year: 2000,
     end_year: 2000, type: reanaly, version: ANN-v1}

diagnostics:
  check_fluxcom:
    description: Check that ESMValTool can load the cmorized fluxnet data without errors.
    variables:
      gpp:
    scripts: null
To check the recipe, run it with:

BASH

esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug
+
+
If everything is okay, the recipe should run without problems.
+
Starting from scratch
+
+
+
Now that you’ve seen how to use an existing CMORizer script, let’s
+think about adding a new one. We will remove the existing CMORizer
+script, and re-implement it from scratch. This exercise allows us to
+point out all the details of what’s going on. We’ll also remove the
+CMORized data that we’ve just created, so our test recipe will not be
+able to use it anymore.
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+<We will add some useful info here later>
+"""
+import logging
+from esmvaltool.cmorizers.data import utilities as utils
+
+logger = logging.getLogger(__name__)
+
+def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
+"""Cmorize the dataset."""
+
+# This is where you'll add the cmorization code
+# 1. find the input data
+# 2. apply the necessary fixes
+# 3. store the data with the correct filename
+
+
Here, in_dir corresponds to the input directory of the
+raw files, out_dir to the output directory of final
+reformatted data set and cfg to a configuration dictionary
+given by a configuration file that we will get to shortly. The last
+three arguments will not be considered in this script but can be used in
+other cases. cfg_user corresponds to the user configuration
+file, start_date to the start of the period to format, and
+end_date to the end of the period to format. When you type
+the command esmvaltool data format in the terminal,
+ESMValTool will call this function with the settings found in your
+configuration files.
+
The ESMValTool CMORizer also needs a dataset configuration file.
+Create a file called
+~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml
+and fill it with the following boilerplate:
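
A sketch of this boilerplate, reconstructed from the entries discussed in the rest of this episode (the ??? values are filled in later; the exact file used in the tutorial may differ slightly):

YAML

---
# Global attributes of the CMORized NetCDF file
attributes:
  project_id: OBS6
  dataset_id: ???
  version: ???
  tier: ???
  modeling_realm: ???

# Pattern of the raw input filename(s)
# filename: ???

# Variables to CMORize
variables:
  ???:
    mip: ???
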
Note: the name of this file must be
+identical to dataset-name.
+
As you can see, the configuration file contains information about the
+original filename of the dataset, and some additional metadata that you
+might recognize from the CMOR filename structure. It also contains a
+list of variables that’s available for this dataset. We’ll add this
+information step by step in the following sections.
+
+
+
+
+
+
RAWOBS, OBS, OBS6!?
+
+
In the configuration above we’ve already filled in the
+project_id. ESMValTool uses these project IDs to find the
+data on your hard drive, and also to find more information about the
+data. The RAWOBS and OBS projects refer to
+external data before and after CMORization, respectively.
+Historically, most external data were observations, hence the
+naming.
+
In going from CMIP5 to CMIP6, the CMOR standards changed a bit. For
+example, some variables were renamed, which posed a dilemma: should
+CMORization reformat to the CMIP5 or CMIP6 definition? To solve this,
+the OBS6 project was created. So OBS6 data
+follow the CMIP6 standards, and that’s what we’ll use for the new
+CMORizer.
+
+
+
+
You can try running the CMORizer at this point, and it should work
+without errors. However, it doesn’t produce any output yet:
+
+
BASH
+
+
esmvaltool data format --config_file<path to config-user.yml> FLUXCOM
+
+
+
1. Find the input data
+
+
First we’ll get the CMORizer script to locate our FLUXCOM data. We
+can use the information from the in_dir and
+cfg variables. Add the following snippet to your CMORizer
+script:
+
+
PYTHON
+
+
# 1. find the input data
+logger.info("in_dir: '%s'", in_dir)
+logger.info("cfg: '%s'", cfg)
+
+
If you run the CMORizer again, it will print out the content of these
variables in its log output.
Try to locate the input data inside the CMORizer script and load it
+(we’ll use iris because ESMValTool includes helper
+utilities for iris cubes). Confirm that you’ve loaded the data by
+logging the correct path and (part of the) file content.
+
+
+
+
+
+
+
+
+
There are many ways to do it. In any case, you should have added the
+original filename to the configuration file (and un-commented this
+line): filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'. Note the
+*: this is a useful shorthand to find multiple files for
+different years. In a similar way we can also look for multiple
+variables, etc.
+
Here’s an example solution (inserted directly under the original
+comment):
+
+
PYTHON
+
+
# 1. find the input data
filename_pattern = cfg['filename']
matches = Path(in_dir).glob(filename_pattern)

for match in matches:
    input_file = str(match)
    logger.info("found: %s", input_file)
    cube = iris.load_cube(input_file)
    logger.info("content: %s", cube)
+
+
To make this work we’ve added import iris and
+from pathlib import Path at the top of the file. Note that
+we’ve started a loop, since we may find multiple files if there’s more
+than one year of data available.
+
+
+
+
+
+
+
2. Save the data with the correct filename
+
+
Before we start adding fixes, we’ll first make sure that our CMORizer
+can also write output files with the correct name. This will enable us
+to use the test recipe for the CMOR compatibility check.
+
We can use the save_variable function from the utils
module that we imported at the top. The call signature looks like this:
utils.save_variable(cube, var, outdir, attrs, **kwargs).
+
We already have the cube and the out_dir.
+The variable short name (var) and attributes
+(attrs) are set through the configuration file. So we need
+to find out what the correct short name and attributes are.
+
The standard attributes for CMIP variables are defined in the CMIP
+tables. These tables are differentiated according to the “MIP” they
+belong to. The tables are a copy of the PCMDI guidelines.
+
+
+
+
+
+
Find the variable “gpp” in a CMOR table
+
+
Check the available CMOR tables to find the variable “gpp” and note
its characteristics, such as the MIP table it belongs to and its units.

The variable “gpp” belongs to the land variables. The temporal
resolution that we are looking for is “monthly”. This information points
to the “Lmon” CMIP table. And indeed, the variable “gpp” can be found in
the file [here](https://github.com/ESMValGroup/ESMValCore/blob/main/esmvalcore/cmor/tables/cmip6/Tables/CMIP6_Lmon.json).
+
+
+
+
+
If the variable you are interested in is not available in the
+standard CMOR tables, you could write a custom CMOR table entry for the
+variable. This, however, is beyond the scope of this tutorial.
+
+
+
+
+
+
Fill the configuration file
+
+
Uncomment the following entries in your configuration file and fill
+them with appropriate values:
+
+
dataset_id
+
version
+
tier
+
modeling_realm
+
short_name (the ??? immediately under variables)
+
mip
+
+
+
+
+
+
+
+
+
+
The configuration file should now look something like this:
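
A sketch of the filled-in configuration file, based on the dataset keys used in the test recipe earlier in this episode (your file may differ slightly):

YAML

---
# Global attributes of the CMORized NetCDF file
attributes:
  project_id: OBS6
  dataset_id: FLUXCOM
  version: 'ANN-v1'
  tier: 3
  modeling_realm: reanaly

# Pattern of the raw input filename(s)
filename: 'GPP.ANN.CRUNCEPv6.monthly.*.nc'

# Variables to CMORize
variables:
  gpp:
    mip: Lmon
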
Now that we have set this information correctly in the config file,
+we can call the save function. Add the following python code to your
+CMORizer script:
+
+
PYTHON
+
+
# 3. store the data with the correct filename
attributes = cfg['attributes']
variables = cfg['variables']

for short_name, variable_info in variables.items():
    all_attributes = {**attributes, **variable_info}  # add the mip to the other attributes
    utils.save_variable(cube=cube, var=short_name, outdir=out_dir, attrs=all_attributes)
+
+
Since we only have one variable (gpp), the loop is not strictly
+necessary. However, this makes it possible to add more variables later
+on.
+
+
+
+
+
+
Was the CMORization successful so far?
+
+
If you run the CMORizer again, you should see that it creates an
output file named
OBS6_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_xxxx01-yyyy12.nc
stored in your ESMValTool output directory
~/esmvaltool_output/data_formatting_YYYYMMDD_HHMMSS/Tier3/FLUXCOM/.
The “xxxx” and “yyyy” represent the start and end year of the data.
+
+
+
+
Great! So we have produced a NetCDF file with the CMORizer that
follows the naming convention for ESMValTool datasets. Let’s have a look
at the NetCDF file as it was written with the very basic CMORizer from
above.
The file contains a variable named “GPP” with three
dimensions: “time”, “lat”, “lon”. Notice the strange time units, and the
invalid_units in the global attributes section. Also, it
seems that there is no information available about the lat and lon
coordinates. These are just some of the things we’ll address in the next
section.
+
+
+
3. Implementing additional fixes
+
+
Copy the output of the CMORizer to your folder
+~/data/OBS6/Tier3/FLUXCOM/ and change the test recipe to
+look for OBS6 data instead of OBS (note: we’re upgrading the CMORizer to
+newer standards here!). Make sure the path to OBS6 is set
+correctly in our config-user file:
+
+
YAML
+
+
rootpath:
  OBS6: ~/data/OBS6
+
+
If we now run the test recipe on our newly ‘CMORized’ data,
+
+
BASH
+
+
esmvaltool run recipe_check_fluxcom.yml --config_file<path to config-user.yml> --log_level debug
+
+
it should be able to find the correct file, but it does not succeed
+yet. The first thing that the ESMValTool CMOR checker brings up is:
+
+
ERROR
+
+
iris.exceptions.UnitConversionError: Cannot convert from unknown units. The
+"units" attribute may be set directly.
+
+
If you look closely at the error messages, you can see that this
+error concerns the units of the coordinates. ESMValTool tries to fix
+them automatically, but since no units are defined on the coordinates,
+this fails.
+
The cmorizer utilities also include a function called
+fix_coords, but before we can use it, we’ll also need to
+make sure the coordinates have the correct standard name. Add the
+following code to your cmorizer:
+
+
PYTHON
+
+
# 2. Apply the necessary fixes
# 2a. Fix/add coordinate information and metadata
cube.coord('lat').standard_name = 'latitude'
cube.coord('lon').standard_name = 'longitude'
utils.fix_coords(cube)
+
+
With some additional refactoring, our cmorization function might then
+look something like this:
+
+
PYTHON
+
+
def cmorization(in_dir, out_dir, cfg, cfg_user, start_date, end_date):
    """Cmorize the dataset."""
    # Get general information from the config file
    attributes = cfg['attributes']
    variables = cfg['variables']

    for short_name, variable_info in variables.items():
        logger.info("CMORizing variable: %s", short_name)

        # 1a. Find the input data (one file for each year)
        filename_pattern = cfg['filename']
        matches = Path(in_dir).glob(filename_pattern)

        for match in matches:
            # 1b. Load the input data
            input_file = str(match)
            logger.info("found: %s", input_file)
            cube = iris.load_cube(input_file)

            # 2. Apply the necessary fixes
            # 2a. Fix/add coordinate information and metadata
            cube.coord('lat').standard_name = 'latitude'
            cube.coord('lon').standard_name = 'longitude'
            utils.fix_coords(cube)

            # 3. Save the CMORized data
            all_attributes = {**attributes, **variable_info}
            utils.save_variable(cube=cube, var=short_name, outdir=out_dir, attrs=all_attributes)
+
+
Run the CMORizer script once more. Have a look at the netCDF file,
+and confirm that the coordinates now have much more metadata added to
+them. Then, run the test recipe again with the latest CMORizer output.
+The next error is:
+
+
ERROR
+
+
esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP:
+Variable GPP units unknown can not be converted to kg m-2 s-1 in cube:
+
+
Okay, so let’s fix the units of the “GPP” variable in the CMORizer.
+Remember that you can find the correct units in the CMOR table. Add the
+following three lines to our CMORizer:
+
+
PYTHON
+
+
# 2b. Fix gpp units
logger.info("Changing units for gpp from gC/m2/day to kg/m2/s")
cube.data = cube.core_data() / (1000 * 86400)
cube.units = 'kg m-2 s-1'
+
+
If everything is okay, the test recipe should now pass. We’re getting
+there. Looking through the output though, there’s still a warning.
+
+
OUTPUT
+
+
WARNING There were warnings in variable GPP:
+Standard name for GPP changed from None to gross_primary_productivity_of_biomass_expressed_as_carbon
+Long name for GPP changed from GPP to Carbon Mass Flux out of Atmosphere Due to
+ Gross Primary Production on Land [kgC m-2 s-1]
+
+
ESMValTool is able to apply automatic fixes here, but if we are
+running a CMORizer script anyway, we might as well fix it
+immediately.
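
The exact code is not shown here, but a common pattern in existing CMORizers looks like the sketch below. It assumes the cmor_table entry of the cfg dictionary and the fix_var_metadata helper from the shared utilities; check an existing formatter script for the exact usage.

PYTHON

# 2c. Fix variable metadata using the CMOR table
cmor_info = cfg['cmor_table'].get_variable(variable_info['mip'], short_name)
utils.fix_var_metadata(cube, cmor_info)
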
You can see that we’re using the CMOR table here. This information was
passed on by ESMValTool as part of the cfg input variable. So here
we’re making sure that the cube’s metadata is updated to conform to
the CMOR table.
+
Finally, the test recipe should run without errors or warnings.
+
+
+
4. Finalizing the CMORizer
+
+
Once everything works as expected, there’s a couple of things that we
+can still do.
+
+
Add download instructions. The header of the
CMORizer contains information about where to obtain the data, when it
was last accessed, which ESMValTool “tier” it is associated
with, and more detailed information about the necessary downloading and
processing steps.
+
+
+
+
+
+
+
Fill out the header for the “FLUXCOM” dataset
+
+
Fill out the header of the new CMORizer. The different parts that
+need to be present in the header are the following:
+
+
Caption: the first line of the docstring should summarize what the
+script does.
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
+
The header for the “FLUXCOM” dataset could look something like
+this:
+
+
PYTHON
+
+
"""ESMValTool CMORizer for FLUXCOM GPP data.
+
+Tier
+ Tier 3: restricted dataset.
+
+Source
+ http://www.bgc-jena.mpg.de/geodb/BGI/Home
+
+Last access
+ 20190727
+
+Download and processing instructions
+ From the website, select FLUXCOM as the data choice and click download.
+ Two files will be displayed. One for Land Carbon Fluxes and one for
+ Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+ CRUNCEP data file has several data files for different variables.
+ The data for GPP generated using the
+ Artificial Neural Network Method will be in files with name:
+ GPP.ANN.CRUNCEPv6.monthly.\*.nc
+ A registration is required for downloading the data.
+ Users in the UK with a CEDA-JASMIN account may request access to the jules
+ workspace and access the data.
+ Note : This data may require rechunking of the netcdf files.
+ This constraint will not exist once iris is updated to
+ version 2.3.0 Aug 2019
+"""
+
+
+
+
+
+
+
Fill the dataset information list. The file
[datasets.yml](https://github.com/ESMValGroup/ESMValTool/blob/main/esmvaltool/cmorizers/data/datasets.yml)
contains the ESMValTool “tier”, the data
source, the last access time and download instructions for all supported
datasets in ESMValTool. You can simply reuse the information written in
the header of the CMORizer.
+
+
+
+
+
+
+
Fill out the FLUXCOM entry in
+datasets.yml
+
+
+
Fill out the FLUXCOM entry in datasets.yml. The
+different parts that need to be present in the entry are the
+following:
+
+
Dataset-name
+
Tier
+
Source
+
Last access
+
Download and processing instructions
+
+
+
+
+
+
+
+
+
+
The entry for the “FLUXCOM” dataset should look like:
+
+
YAML
+
+
FLUXCOM:
  tier: 3
  source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
  last_access: 2019-07-27
  info: |
    From the website, select FLUXCOM as the data choice and click download.
    Two files will be displayed. One for Land Carbon Fluxes and one for
    Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
    CRUNCEP data file has several data files for different variables.
    The data for GPP generated using the
    Artificial Neural Network Method will be in files with name:
    GPP.ANN.CRUNCEPv6.monthly.*.nc
    A registration is required for downloading the data.
    Users in the UK with a CEDA-JASMIN account may request access to the jules
    workspace and access the data.
    Note : This data may require rechunking of the netcdf files.
    This constraint will not exist once iris is updated to
    version 2.3.0 Aug 2019
+
+
+
+
+
+
Once the datasets.yml file is filled, you can check that
+ESMValTool can display information about the added dataset with:
+
+
BASH
+
+
esmvaltool data info FLUXCOM
+
+
If everything is okay, the output should look something like
+this:
+
+
OUTPUT
+
+
$ esmvaltool data info FLUXCOM
+FLUXCOM
+
+Tier: 3
+Source: http://www.bgc-jena.mpg.de/geodb/BGI/Home
+Automatic download: No
+
+From the website, select FLUXCOM as the data choice and click download.
+Two files will be displayed. One for Land Carbon Fluxes and one for
+Land Energy fluxes. The Land Carbon Flux file (RS + METEO) using
+CRUNCEP data file has several data files for different variables.
+The data for GPP generated using the
+Artificial Neural Network Method will be in files with name:
+GPP.ANN.CRUNCEPv6.monthly.*.nc
+A registration is required for downloading the data.
+Users in the UK with a CEDA-JASMIN account may request access to the jules
+workspace and access the data.
+Note : This data may require rechunking of the netcdf files.
+This constraint will not exist once iris is updated to
+version 2.3.0 Aug 2019
+
+
Note that Automatic download: No means that no automatic
+downloading script is available in ESMValTool for this dataset. The
+implementation of such a script is beyond the scope of this tutorial. To
+find out which datasets come with an automatic download script, you can
+run: esmvaltool data list to list all datasets supported in
+ESMValTool. More information about the usage of automatic downloading
+scripts can be found in the User
+Guide.
+
+
+Complete the metadata in the config file. We have
+left a few fields empty in the configuration file, such as ‘source’. By
+filling out these fields we can make sure the relevant metadata is
+passed on as attributes in the CMORized data. To make this work, add the
+following line to the CMORizer script:
+
+
+
PYTHON
+
+
# 2d. Update the cubes metadata with all info from the config file
+utils.set_global_atts(cube, attributes)
+
+
+
Add a reference. Make sure that there is a
reference file available for the dataset, see the instructions
[here](https://docs.esmvaltool.org/en/latest/community/diagnostic.html#adding-references).
+
Make a pull request. Since you have gone through
+all the trouble to reformat the dataset so that the ESMValTool can work
+with it, it would be great if you could provide the CMORizer, and
+ultimately with that the dataset, to the rest of the community. For more
+information, see the episode on Development and
+contribution.
+
Add documentation. Make sure that you have added
the info of your dataset to the User Guide so that people know it is
available for ESMValTool (see Obtaining
input data).
+
+
+
Some final comments
+
+
+
Congratulations! You have just added support for a new dataset to
ESMValTool! Adding a new CMORizer is already quite an advanced task
when working with ESMValTool. You need to have a basic understanding
of how ESMValTool works and what its internal structure looks like.
In addition, you need to have a basic understanding of NetCDF files and
a programming language. In our example we used Python for the CMORizing
script, since we advocate focusing the code development on only a few
different programming languages. This helps to maintain the code and to
ensure the compatibility of the code with possible fundamental changes
to the structure of ESMValTool and ESMValCore.
+
More information about adding observations to the ESMValTool can be
+found in the documentation.
+
+
+
+
+
+
Key Points
+
+
+
CMORizers are dataset-specific scripts that can be run once to
+generate CMOR-compliant data.
+
ESMValTool comes with a set of CMORizers readily available, but you
+can also add your own.
Every user encounters errors. Once you know why you get certain types
+of errors, they become much easier to fix. The good news is that
+ESMValTool creates a record of the output messages and stores them in
+log files. They can be used for debugging or monitoring the process.
+This lesson helps you understand the different types of errors and when
+you are likely to encounter them.
+
Log files
+
+
+
Each time we run ESMValTool, it will produce a new output directory.
This directory should contain the run folder that is
automatically generated by ESMValTool. To examine this, we run the
recipe recipe_python.yml that was introduced in the lesson Running your first recipe. Check the lesson Configuration on how to set the paths.
+
In a new terminal, run the recipe:
+
+
BASH
+
+
cd esmvaltool_tutorial
+esmvaltool run examples/recipe_python.yml
+
+
+
ERROR
+
+
esmvaltool: command not found
+
+
This error occurs because the conda environment
esmvaltool has not been activated. To fix the error, before
running the recipe, activate the environment:
+
+
BASH
+
+
conda activate esmvaltool
+
+
+
+
+
+
+
conda environment
+
+
More information about the conda environment can be found at Installation.
In the main_log_debug.txt and main_log.txt,
ESMValTool writes the output messages, warnings and possible errors that
might occur during pre-processing. To inspect them, we can look inside
the files. For example:
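
The exact path depends on your configuration; assuming the default esmvaltool_output directory used earlier in this tutorial, you could run:

BASH

cat ~/esmvaltool_output/recipe_python_*/run/main_log.txt
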
In the log.txt, ESMValTool writes the output messages,
+warnings and possible errors that are related to the diagnostic
+script.
+
If you encounter an error and don’t know what it means, it is
important to read the log information. Sometimes knowing where the error
occurred is enough to fix it, even if you don’t entirely understand the
message. However, note that you may not always be able to find the error
or fix it. In that case, the ESMValTool community can help you figure out
what went wrong.
+
+
+
+
+
+
Different log files
+
+
In the run directory, there are two log files
+main_log_debug.txt and main_log.txt. What are
+their differences?
+
+
+
+
+
+
+
+
+
The main_log_debug.txt contains the output messages from
+the pre-processor whereas the main_log.txt shows general
+errors and warnings that might happen in running the recipe and
+diagnostics script.
+
+
+
+
+
Let’s change some settings in the recipe to run a regional
+pre-processor. We use a text editor called nano to open the
+recipe file:
+
+
BASH
+
+
nano recipe_python.yml
+
+
+
+
+
+
+
Text editor side note
+
+
No matter what editor you use, you will need to know where it
+searches for and saves files. If you start it from the shell, it will
+(probably) use your current working directory as its default location.
+We use nano in examples here because it is one of the least
+complex text editors. Press ctrl + O to save the
+file, and then ctrl + X to exit
+nano.
+
+
+
+
+
+
+
+
+
+
YAML
+
+
# ESMValTool
# recipe_python.yml
---
# See https://docs.esmvaltool.org/en/latest/recipes/recipe_examples.html
# for a description of this recipe.

# See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/overview.html
# for a description of the recipe format.
---
documentation:
  description: |
    Example recipe that plots a map and timeseries of temperature.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe
    - righi_mattia

  maintainer:
    - schlund_manuel

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - {dataset: BCC-ESM1, project: CMIP6, exp: historical, ensemble: r1i1p1f1, grid: gn}
  - {dataset: bcc-csm1-1, version: v1, project: CMIP5, exp: historical, ensemble: r1i1p1}

preprocessors:
  # See https://docs.esmvaltool.org/projects/esmvalcore/en/latest/recipe/preprocessor.html
  # for a description of the preprocessor functions.

  to_degrees_c:
    convert_units:
      units: degrees_C

  annual_mean_amsterdam:
    extract_location:
      location: Amsterdam
      scheme: linear
    annual_statistics:
      operator: mean
    multi_model_statistics:
      statistics:
        - mean
      span: overlap
    convert_units:
      units: degrees_C

  annual_mean_global:
    area_statistics:
      operator: mean
    annual_statistics:
      operator: mean
    convert_units:
      units: degrees_C

diagnostics:

  map:
    description: Global map of temperature in January 2000.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas:
        mip: Amon
        preprocessor: to_degrees_c
        timerange: 2000/P1M
        caption: |
          Global map of {long_name} in January 2000 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: Reds

  timeseries:
    description: Annual mean temperature in Amsterdam and global mean since 1850.
    themes:
      - phys
    realms:
      - atmos
    variables:
      tas_amsterdam:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_amsterdam
        timerange: 1850/2000
        caption: Annual mean {long_name} in Amsterdam according to {dataset}.
      tas_global:
        short_name: tas
        mip: Amon
        preprocessor: annual_mean_global
        timerange: 1850/2000
        caption: Annual global mean {long_name} according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot
+
+
+
+
+
Keys and values in recipe settings
+
+
+
The [ESMValTool pre-processors](https://docs.esmvaltool.org/projects/ESMValCore/en/latest/
+recipe/preprocessor.html?highlight=preprocessor) cover a broad range of
+operations on the input data, like time manipulation, area manipulation,
+land-sea masking, variable derivation, etc. Let’s add the preprocessor
+extract_region to a new section
+annual_mean_regional:
+
+
YAML
+
+
preprocessors:
  annual_mean_regional:
    annual_statistics:
      operator: mean
    extract_region:
      start_longitude: -10
      end_longitude: 40
      start_latitude: 27
      end_latitude: 70
+
+
Also, we change the projects value esmval
+to tutorial:
+
+
YAML
+
+
projects:
+- tutorial
+- c3s-magic
+
+
Then, we save the file and run the recipe:
+
+
BASH
+
+
esmvaltool run recipe_python.yml
+
+
+
ERROR
+
+
ValueError: Tag 'tutorial' does not exist in section 'projects' of
+esmvaltool/config-references.yml 2020-06-29 18:09:56,641 UTC [46055] INFO If you
+have a question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The values for the keys authors, maintainer,
projects and references in the recipe must
be known to ESMValTool; they are defined in the file
esmvaltool/config-references.yml.
ESMValTool references in BibTeX format can be found in
the [ESMValTool/esmvaltool/references](https://github.com/ESMValGroup/ESMValTool/tree/master/esmvaltool/references) directory.
+
+
+
ESMValTool can’t locate the
+data
+
You are assisting a colleague with ESMValTool. The colleague changes
the dataset entry
dataset: CanESM2, project: CMIP5 to use ACCESS1-3
and runs the recipe. However, ESMValTool encounters an error like:
+
+
ERROR
+
+
ERROR No input files found for variable {'short_name': 'tas', 'mip': 'Amon',
+'preprocessor':'annual_mean_amsterdam', 'variable_group': 'tas_amsterdam',
+'diagnostic':'timeseries', 'dataset': 'ACCESS1-3', 'project': 'CMIP5', 'exp':
+'historical','ensemble': 'r1i1p1', 'recipe_dataset_index': 1, 'institute':
+['CSIRO-BOM'],'product': ['output1', 'output2'], 'timerange': '1850/2000',
+'alias':'CMIP5', 'original_short_name': 'tas', 'standard_name':
+'air_temperature','long_name': 'Near-Surface Air Temperature', 'units': 'K',
+'modeling_realm':['atmos'], 'frequency': 'mon', 'start_year': 1850,
+'end_year': 2000}
+ERROR Looked for files matching
+['tas_Amon_ACCESS1-3_historical_r1i1p1*.nc'], but did not find any existing
+input directory
+ERROR Set 'log_level' to 'debug' to get more information
+
+
+
What suggestions would you give the researcher for fixing the
+error?
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
+
Check config-user.yml to see if the correct directory
for input data has been set
+
Check the available data, regarding exp, mip, ensemble, start_year,
+and end_year
+
Check the variable names in the diagnostics section in the
+recipe
+
+
+
+
+
+
Check pre-processed data
+
+
+
The setting save_intermediary_cubes in the configuration
+file can be used to save the pre-processed data. More information about
+this setting can be found at Configuration.
+
+
+
+
+
+
save_intermediary_cubes
+
+
Note that this setting should only be used for debugging, as it
+significantly slows down the recipe and increases disk usage because a
+lot of output files need to be stored.
+
+
+
+
Check diagnostic script path
+
+
+
The result of the pre-processor is passed to the
examples/diagnostic.py script, which is introduced in the
recipe as:
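
(Snippet reproduced from the recipe shown earlier in this episode.)

YAML

scripts:
  script1:
    script: examples/diagnostic.py
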
The diagnostic scripts are located in the folder
diag_scripts in the ESMValTool installation directory
<path_to_esmvaltool>. To find where ESMValTool is
located on your system, see Installation.

Let’s see what happens if we change the script path, for example as shown below:
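
A sketch of such a change (the path below matches the one in the error message that follows):

YAML

scripts:
  script1:
    script: diag_scripts/ocean/diagnostic_timeseries.py

Running the recipe then fails with an error like:

ERROR
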
esmvalcore._task.DiagnosticError: Cannot execute script
+'diag_scripts/ocean/diagnostic_timeseries.py'
+(~/mambaforge/envs/esmvaltool2.6/lib/python3.10/site-packages/esmvaltool/
+diag_scripts/diag_scripts/ocean/diagnostic_timeseries.py):
+file does not exist. 2022-10-18 11:42:34,136 UTC [39323] INFO If you have a
+question or need help, please start a new discussion on
+https://github.com/ESMValGroup/ESMValTool/discussions If you suspect this is a
+bug, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues To
+make it easier to find out what the problem is, please consider attaching the
+files run/recipe_*.yml and run/main_log_debug.txt from the output directory.
+
+
The script path should be relative to the diag_scripts
directory. This means that the script
diagnostic_timeseries.py is located in
<path_to_esmvaltool>/diag_scripts/ocean/.
Alternatively, the script path can be an absolute path. To examine this,
we can download the script from the ESMValTool
repository and point the recipe to its absolute location.
Then, run the recipe again and examine the output to see the message
Run was successful!
+
+
+
+
+
+
Available recipe and diagnostic scripts
+
+
ESMValTool provides a broad suite of recipes
+and diagnostic scripts for different disciplines like atmosphere,
+climate metrics, future projections, IPCC, land, ocean, ….
+
+
+
+
+
Re-running a diagnostic
+
Look at the main_log.txt file and answer the following
+question: How to re-run the diagnostic script?
+
+
Solution
+
The main_log.txt file contains information on how to
+re-run the diagnostic script without re-running the pre-processors:
+
+
+
+
BASH
+
+
2020-06-29 20:36:32,844 UTC [52810] INFO To re-run this diagnostic script, run:
+
+
+
+
+
+
+
Challenge
+
+
+
+
+
+
+
+
+
+
+
If you run the command in a terminal, you will be able to re-run the
+diagnostic.
+
+
+
+
+
+
+
+
+
+
Memory issues
+
+
If you run out of memory, try setting max_parallel_tasks
to 1 in the configuration file. Then, check the amount of memory you
need by inspecting the file run/resource_usage.txt
in the output directory. Using that number, you can then increase the
number of parallel tasks again to a value that is reasonable for the
amount of memory available on your system.
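
For example, in your configuration file (a minimal sketch):

YAML

max_parallel_tasks: 1
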
+
+
+
+
+
+
+
+
+
Key Points
+
+
+
There are three different kinds of log files:
main_log.txt, main_log_debug.txt and
log.txt.
+
+
+
+


Glossary

CMIP: The Coupled Model Intercomparison Project
+(CMIP) aims at better understanding past, present and future climate
+changes arising from natural, unforced variability or in response to
+changes in radiative forcing in a multi-model context. For more
+information, check the WCR-Climate webpage.
+
CMOR: CMOR stands for Climate Model
+Output Rewriter library. It comprises a set of C-based
+functions, with bindings to both Python and FORTRAN 90, that can be used
+to produce CF-compliant NetCDF files that fulfill the requirements of
+many of the climate community’s standard model experiments.
+
dataset: In a recipe, dataset refers to the name
+of the model or observation data, e.g. HadGEM2-ES.
+
diagnostic: In a recipe, the diagnostic
+section(s) define the tasks that will be executed when running the
+recipe. It can also include a diagnostic script that converts the
+preprocessed input data to the desired output like plots or NetCDF
+files.
+
ensemble: An ensemble is a collection of
+experiments based on standardised inputs. This ensures that ensemble
+calculations use the same initial states, initialisation methods, and
+physics details. Ensemble members are named in the rip-nomenclature,
+r for realization, i for initialisation and p
+for physics, followed by an integer, e.g. r1i1p1.
+
experiment: An experiment is defined as an
+activity aimed at addressing a specific scientific problem. In CMIP,
+experiments are numerical experiments with climate models.
+
grid: In the dataset section of a recipe, a grid
+is the array of coordinates on which a variable has been
+sampled.
+
mask: Masks are used to isolate regions. For
+example, if you apply an ocean mask, then coordinates in the ocean will
+not be taken into account for the preprocessors/diagnostics. Masks can
+also be used to work with regions that have missing or invalid
+entries.
+
mip: Typically used to refer to a table of
+variables in ESMValTool. These are standardized per project, e.g. the MIP
+tables for CMIP6.
+
Omon: Monthly ocean data.
+
pr: CMIP short-name for
+‘precipitation’.
+
preprocessor: A preprocessor is an operation
+that is applied to a dataset.
+
project: In the dataset section of a recipe,
“project” typically refers to a standard experimental protocol for
global models, e.g. CMIP.
+
RCP: Representative Concentration
+Pathways are scenarios that represent the full bandwidth of
+possible future emission trajectories. Depending on population growth
+and the development of energy production, food production and land use,
+various emission trajectories are possible.
+
recipe: A recipe contains the instructions to be
+carried out by the ESMValTool. This includes four main sections:
+datasets, preprocessors, diagnostics and description.
+
ta: CMIP short-name for ‘air
+temperature’.
+
tas: CMIP short-name for ‘near-surface air
+temperature’.
+
thetao: Sea water potential temperature variable
+(assume reference height is sea level).
+
thetaoga: Global average sea water potential
+temperature
+
tos: CMIP short-name for ‘sea surface
+temperature’.
+
ts: CMIP short-name for ‘surface
+temperature’.
+
yaml: YAML is a human-readable
+data-serialization language. It is commonly used for configuration files
+and in applications where data is being stored or exchanged.
+
+
Summary and Schedule
+
+
+
This tutorial helps you to use ESMValTool.
+
The Earth System Model Evaluation Tool (ESMValTool) is a community
+developed software toolkit that aims to facilitate the diagnosis and
+evaluation of the causes and effects of model biases and inter-model
+spread within the CMIP model ensemble.
+
This tutorial is structured into basic and
+advanced topics such that episodes starting from the
+[Introduction][lesson-introduction] up to the episode on Conclusion of the basic tutorial all
+cover basic topics and can be done in one sitting.
+
The remaining episodes cover the advanced topics and each episode is
+a mini-tutorial covering an advanced aspect of working with ESMValTool.
+These mini-tutorials can be appended to the main tutorial or worked
+through independently.
+
+
+
+
+
+
What will you learn in this course
+
+
What is ESMValTool
+
How to install ESMValTool
+
How to configure ESMValTool for your local system
+
How to run ESMValTool
+
How to work with ESMValTool’s suite of preprocessors
+
How to debug your recipes
+
How to access and deploy recipes from the ESMValTools gallery
+(Advanced)
+
How to develop your own diagnostics and recipes (Advanced)
+
How to contribute your recipes and diagnostics back into ESMValTool
+(Advanced)
+
How to include new observational datasets (Advanced)
Main things you need to know before starting this course
+
This tutorial can be taken online independently or taught by one
+of our instructors.
+
Don’t be alarmed if you can’t work through the entire tutorial in
+one sitting. It may take some time to get used to working with
+ESMValTool.
+
If you get stuck, help is always available from the tutors, from
+ESMValTool developers via the github issues
+page or via the ESMValTool email list. Please see information
+on how to subscribe to user mailing list.
+
This tutorial includes several advanced lessons after the
+conclusion. These advanced lessons should be treated like
+“mini-tutorials”, and include aspects like “developing your own
+diagnostic” or “how to include observations”.
+
+The episodes in the schedule cover questions such as:
+
+What is the purpose of the quickstart guide? How do I load and check
+the ESMValTool environment? How do I configure ESMValTool? How
+do I run a recipe?
+
+How do I create a new recipe? Can I use different preprocessors for
+different variables? Can I use different datasets for different
+variables? How can I combine different preprocessor
+functions? Can I run the same recipe for multiple ensemble members?
+
+CMORization: what is it and why do we need it? How to use the
+existing CMORizer scripts shipped with ESMValTool? How to add
+support for new (observational) datasets?
+
+ The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
+
+
This page includes some information on how to prepare for
+participating in this tutorial.
+
+
+
+
+
+
Prerequisites
+
+
Minimal requirements:
+
Basic understanding of your preferred command line interface (i.e. a
+bash terminal)
+
Access to CMIP data
+
Optional, but useful:
+
Basic understanding of git
+
Access to a suitable computing system (e.g. CEDA JASMIN,
+DKRZ Levante)
+
GitHub account
+
+
+
+
Command line & git tutorials
+
We typically use the command line to interact with ESMValTool. While
+most of us are likely to have experience with the command line, novices
+may want to work through this Software Carpentry Unix shell course.
Git is a distributed version-control system for tracking changes in
+source code during software development. It’s how we distribute, share,
+and manage the ESMValTool code.
Access to CMIP and observational data and a suitable compute
+cluster
+
To complete this tutorial and use ESMValTool, you will need access to
+data in a reasonable format. Some data will be provided, but there is
+simply too much data available for your tutors to make it all available
+directly.
+
ESMValTool may be run on multiple platforms, from your local machine
+to large computing clusters. The best option is to use a computing
+cluster with an Earth System
+Grid Federation (ESGF) node. The benefit of using a compute cluster
+with an ESGF node is that data from the Coupled
+Model Intercomparison Project (CMIP) is stored locally on disk and
+directly accessible by the tool. Similarly, observational data would
+also be available at these sites. The ESGF also hosts observations for
+Model Intercomparison Projects (obs4MIPs) and reanalysis data
+(ana4MIPs).
+
Here are a few options for compute clusters with ESGF nodes: CEDA
+JASMIN and DKRZ, both described below.
+
Also note that if you are working from home, JASMIN may not be
+directly accessible. You may need to use ssh to connect
+to a machine at your institute and then on to JASMIN. Please test your
+connection before the tutorial.
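+
For example, a two-hop connection through an institute gateway could
+look like the minimal sketch below; the gateway hostname and the
+usernames are placeholders that you should replace with your own:
+
+
BASH
+
+
+# Hypothetical two-hop login via an institute gateway, then on to JASMIN.
+ssh -X -J your_username@gateway.example-institute.ac.uk JASMIN-USERNAME@login1.jasmin.ac.uk
+
+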
+
+
Jasmin-login
+
Note that you have only created an account for the web interface. To
+log into the JASMIN machines and do work, you’ll also need to create a
+login account using this
+page.
Once you have access to the data archive on CEDA, make sure to link
+your CEDA and JASMIN accounts. This can be done by checking the ‘link to
+CEDA’ box on your JASMIN
+profile page.
+
The linking may take a few hours to take effect and is necessary for
+you to access the BADC archives via JASMIN. Some CMIP5 data sets such as
+MIROC are not accessible by default and special permission has to be
+requested to access them via the
+CEDA catalogue page.
+
+
+
Test your Setup
+
Log into jasmin-login:
+
+
BASH
+
+
ssh -X JASMIN-USERNAME@login1.jasmin.ac.uk
+
+
Then log into the sci1 machine:
+
+
BASH
+
+
ssh -X jasmin-sci1
+
+
Can you see the following locations:
+
+
BASH
+
+
ls /gws/nopw/j04/esmeval/obsdata-v2/
+ls /badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES
+ls /badc/cmip6/data/CMIP6/CMIP/*/*/historical/r1i1p1f?/Omon/[ts]os/gn/latest/*.nc
+
+
Note that JASMIN is only open to certain locations (mostly
+universities and research centres). You may need a VPN if you wish to
+connect from your home network.
If you are not going to use DKRZ, please skip this section and go here.
+
+If you do not already have an account at DKRZ, then register as soon as
+possible. You can find a short introduction on how to get started at
+DKRZ here.
+
There is also a user manual for
+Levante, which is DKRZ’s current supercomputer.
+
+
Join a project
+
To use the resources on DKRZ you have to join a project. One option
+is to join an existing project by logging into https://luv.dkrz.de/ with
+your account and selecting ‘Join existing project’. Once you are accepted
+by the manager of your chosen project, your web account will be turned
+into a full LDAP account, which will allow you to log into and use the
+DKRZ’s resources. If you do not have access to an existing project,
+another option is to apply for resources at DKRZ. Here are
+some instructions on how
+to apply for resources.
+
+
+
Access to data on DKRZ
+
CMIP5 and CMIP6 data are available in these directories:
Temporary data: the scratch directory at /scratch/*/username is
+automatically deleted after 14 days (15 TiB). Please use this directory
+for all your testing! Do not use the work directory for tests. (See
+also this.)
+
Running batch jobs: information and examples for the SLURM job
+scheduling system at DKRZ can be found here.
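+
As an illustration only, a minimal SLURM batch script for running an
+ESMValTool recipe might look like the sketch below. The partition,
+account and environment set-up are placeholders and must be replaced
+with the values valid for your DKRZ project:
+
+
BASH
+
+
+#!/bin/bash
+#SBATCH --job-name=esmvaltool_run
+#SBATCH --partition=shared        # placeholder partition name
+#SBATCH --account=your_project    # placeholder DKRZ project account
+#SBATCH --time=02:00:00
+
+# Activate your ESMValTool environment first (e.g. via conda), then run a recipe.
+esmvaltool run examples/recipe_python.yml
+
+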
If you are not going to use ESMValTool on your local machine, please
+skip this section and go here.
+
If you are planning on running ESMValTool on your own machine, please
+make sure that you are able to download CMIP data and that you have a
+few GB of space available to install conda and ESMValTool, as well as
+enough to make a copy of the data (~125 MB) needed for this
+tutorial.
+
You can use ESMValTool to automatically download the data needed for
+test recipes. Please see the Configuration episode or
+the configuration file documentation for more
+information. This is the recommended option, as it has the advantage that
+data is stored in subdirectories, and features such as wildcards and
+recording the version of the data will work automatically.
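+
As an illustration only, and assuming a recent ESMValCore version in
+which the search_esgf option is available, automatic downloading can be
+switched on for a single run from the command line:
+
+
BASH
+
+
+# Sketch: search the ESGF and download any missing input data for this run
+# (assumes ESMValCore >= 2.8, where the search_esgf option exists).
+esmvaltool run examples/recipe_python.yml --search_esgf=when_missing
+
+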
+
Alternatively, you can run the following command using wget:
You don’t need a GitHub account to participate in the tutorial.
+However, if you want to raise an issue, contribute to the discussions,
+or share your code, please create a GitHub
+account.
You may hear a few of the following phrases during the tutorial.
+Don’t be alarmed, they will make sense eventually.
+
+
GitHub issues
+
Issues are GitHub’s ticketing system. They allow users and developers
+to discuss problems, identify bugs, or make suggestions. Each issue
+is assigned a number and has its own page on GitHub.
Raising an issue is the act of creating a new issue. If you are asked
+to raise an issue, please follow any instructions that you are given,
+and also make sure that you read the default issue text.
+
+
+
GitHub pull requests
+
A GitHub pull request is the act of requesting that a branch is
+merged into another branch.
+
This is an advanced feature of GitHub, and will generally be
+performed by the ESMValTool development team.
+
+
Instructor Notes
+
+
+
This page includes some tips, reminders and advice for giving this
+tutorial.
+
Tutorial Content
+
+
+
Well ahead of the planned tutorial date, please go over the available
+tutorial material in your own time at least once to note specifics such
+as areas where your cohort may need extra time, any changes you want to
+make or additional examples you would like to include.
+
If you want to add to or adapt the material, you can create your own
+copy to modify by forking the repository. See here for a guide on how
+to fork a repository.
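+
Once forked, you can obtain a local copy to work on. This is only a
+sketch: the GitHub username and repository name are placeholders, so
+use the clone URL shown on your own fork’s page:
+
+
BASH
+
+
+# Clone your fork of the tutorial repository (placeholder URL).
+git clone https://github.com/YOUR-USERNAME/tutorial.git
+cd tutorial
+
+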
+
Please use our guide for Contributing
+when creating your own material, in case you would like to add it to
+the current repository in the future.
+
Once you are satisfied with the material you would like to use and
+have an estimate of the time required, please proceed with meeting
+logistics.
+
Meeting Logistics
+
+
+
Decide if the tutorial will be held on site or online. If on site,
+make sure of venue availability before finalizing the date. If online,
+set up connection details.
+
Once you have agreed to a date, and attendees have signed up, send
+them an email with the following information:
a contact email address so they can post any questions that arise
+during setup
+
venue details (if on site) or meeting connection details (if
+online).
+
+
Also allow sufficient time for participants to get access to any
+accounts etc., which could take more than a few days in some cases (a
+good rule of thumb is to leave enough time to send a reminder email a
+few days after the first email, so that participants will still have
+time after the reminder to complete setup).
+
Prepare an estimate for how long the tutorial will last (this could
+vary based on your choice of modules and any changes you have made to
+the tutorial). Do include comfort breaks as needed.
+
If more than one person will be conducting the tutorial, discuss
+ahead and decide who will handle each component for a smooth flow of the
+material.
+
Remind people a few days before the meeting about the pre-meeting
+exercises.
+
If conducting the tutorial on site, confirm room bookings and check
+for internet connectivity or firewall issues at the venue. Coffee and
+biscuits also usually go down well with participants!
+
If conducting the tutorial online, have a backup plan for failed or
+poor internet connections. This could involve sending participants some
+of the material or previously recorded meeting videos ahead of time.
+Establish rules for questions, microphone and video usage before
+starting the meeting. Send meeting connection details (Zoom/Teams etc.)
+ahead of time and the day before the meeting.
+
Have a feedback form ready if you would like attendees to send you
+feedback. Request that they do so at the end of the tutorial and send it
+out as soon after the tutorial as possible.
+
Tips
+
+
+
+
Test all recipes, diagnostics and instructions in
+advance.
+
Make sure you don’t have issues with accessing the compute node
+via VPN or Wi-Fi.
+
Check with your computing service for any planned downtime or
+potential interruptions.
+
Download a local copy of the data, just in case.
+
Decide on one or two advanced mini-tutorials as a stretch goal
+but don’t expect to do all of them.
+
+
+
The purpose of the quickstart guide is to enable a user of
+ESMValTool to run ESMValTool as quickly as possible without having to go
+through the whole tutorial.
+
Use the module load command to load the ESMValTool
+environment (see the Installation episode for more
+details) and use esmvaltool --help to check the ESMValTool
+environment.
+
Use esmvaltool config get_config_user to create the
+ESMValTool user configuration file.
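+
Taken together, and assuming your system provides an ESMValTool module
+(the module name is site-specific; check your system’s documentation),
+the quickstart steps might look like this sketch:
+
+
BASH
+
+
+# Load the environment (module name varies between sites).
+module load esmvaltool
+
+# Check that the tool is available.
+esmvaltool --help
+
+# Create the user configuration file.
+esmvaltool config get_config_user
+
+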
+
+
+
+
diff --git a/instructor/profiles.html b/instructor/profiles.html
new file mode 100644
index 00000000..1630c7f3
--- /dev/null
+++ b/instructor/profiles.html
@@ -0,0 +1,417 @@
Learner Profiles
+
+
This is a placeholder file. Please add content here.
+
+
diff --git a/instructor/reference.html b/instructor/reference.html
new file mode 100644
index 00000000..622d964a
--- /dev/null
+++ b/instructor/reference.html
@@ -0,0 +1,466 @@
+
+ESMValTool Tutorial: Glossary
+
diff --git a/pkgdown.yml b/pkgdown.yml
new file mode 100644
index 00000000..0bf394b3
--- /dev/null
+++ b/pkgdown.yml
@@ -0,0 +1,5 @@
+pandoc: 3.1.11
+pkgdown: 2.1.1
+pkgdown_sha: ~
+articles: {}
+last_built: 2024-11-26T08:08Z
diff --git a/profiles.html b/profiles.html
new file mode 100644
index 00000000..6b673c51
--- /dev/null
+++ b/profiles.html
@@ -0,0 +1,417 @@
Learner Profiles
+
+
This is a placeholder file. Please add content here.
+
+
diff --git a/reference.html b/reference.html
new file mode 100644
index 00000000..5e0ee348
--- /dev/null
+++ b/reference.html
@@ -0,0 +1,464 @@
+
+ESMValTool Tutorial: Glossary
+