From 7303714aa03268ff73c4b15fb101c0acf2ef7463 Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 5 Jun 2024 14:47:52 +0100 Subject: [PATCH 01/22] First commit --- doc/sphinx/source/vp/api.rst | 168 +++++++++++++++++++++++++++++++++ doc/sphinx/source/vp/index.rst | 2 +- 2 files changed, 169 insertions(+), 1 deletion(-) create mode 100644 doc/sphinx/source/vp/api.rst diff --git a/doc/sphinx/source/vp/api.rst b/doc/sphinx/source/vp/api.rst new file mode 100644 index 0000000000..bc5f274c56 --- /dev/null +++ b/doc/sphinx/source/vp/api.rst @@ -0,0 +1,168 @@ +.. code:: eval_rst + + .. _vpapi: + +Using the validphys API +======================= + +Introduction +------------ + +The API functionality allows the ``validphys``/``reportengine`` +machinery to be readily leveraged in a development setting such as a +Jupyter notebook. Any action available to ``validphys`` can be invoked +by Python code, with the same parameters as a runcard. + +For example: + +.. code:: python + + from validphys.api import API + + figs = API.plot_pdfs(pdfs=["NNPDF40_nlo_as_01180"], Q=2) + for f, _ in figs: + f.show() + +The ``API`` object provides a high level interface to the validphys +code. Note that the user doesn’t need to know exactly how to load the +PDF, create the relevant grid to be plotted and then pass that to the +plotting function, this is all handled by the underlying code of +``reportengine``. This abstraction provides a convenient way to explore +the functionality of ``validphys`` (or any other ``reportengine`` +application) as well as to develop further functionality for +``validphys`` itself. + +All the actions available to ``validphys`` are translated into methods +of the ``API`` object. The arguments are the same as the parameters that +the validphys runcard would need to evaluate the action. + +Generic Example +--------------- + +An important use case of this functionality is the development Consider +that you wanted to develop a provider which depends on some expensive +providers, defined somewhere in the ``validphys`` modules, + +.. code:: python + + def expensive_provider1(pdf:PDF, Q, theoryid): + ... + + def expensive_provider2(experiments, ...): + ... + +Now in a notebook we can do + +.. code:: python + + from validphys.api import API + + expensive1 = API.expensive_provider1(pdf="NNPDF40_nlo_as_01180", Q=100, theoryid=208) + expensive2 = API.expensive_provider2(dataset_inputs={"from_": "fit"}, fit="NNPDF40_nlo_as_01180") + +We can then define and test our new function (e.g. in a separate +notebook cell), + +.. code:: python + + def developing_provider(expensive_provider1, expensive_provider2): + ... + + test_output = developing_provider(expensive1, expensive2) + +``expensive1`` and ``expensive2`` have already been evaluated using the +validphys machinery, and we just had to declare the ``validphys2`` +runcard inputs in order to use those providers. The output of these +expensive function is now saved. So for the remainder of our notebook +session we don’t need to re-run the expensive providers every time we +wish to change something with our ``developing_provider``. Clearly this +massively reduces the time to develop and test the new provider since, +the expensive providers which the new ``developing_provider`` depends on +are cached for the rest of the jupyter session. + +For ``expensive_provider2`` the input was slightly more complicated. +When using the API remember that the input is exactly the same as a +``validphys2`` runcard. 
The runcards are in a ``yaml`` format which is +then parsed as a ``dict``. If it seems more intuitive one can utilise +this when declaring the inputs for the API providers, for example: + +.. code:: python + + input2 = { + "experiments": { + "from_": "fit" + }, + "fit": "NNPDF40_nlo_as_01180" + } + expensive2 = API.expensive_provider2(**input2) + +The ``input2`` dictionary is visually almost identical the corresponding +``validphys2`` runcard, we just need to be careful the separate items +with commas, that all dict keys are strings and that the typing is +correct for the various inputs, we can always look up the appropriate +typing by using the ``validphys --help`` functionality. + +Creating figures in the ``validphys`` style +------------------------------------------- + +If a figure is created using the api, as with the first example: + +.. code:: python + + from validphys.api import API + + fig = API.some_plot(...) + fig.show() + +you might notice that the style of the plot is very different to those +produce by ``validphys``. If you want to use the same style as +``validphys`` then consider using the following commands at the top of +your script or notebook: + +.. code:: python + + import matplotlib + from validphys import mplstyles + matplotlib.style.use(str(mplstyles.smallstyle)) + +also consider using ``fig.tight_layout()`` which reportengine uses +before saving figures. For the example used earlier we would then have + +.. code:: python + + import matplotlib + from validphys import mplstyles + matplotlib.style.use(str(mplstyles.smallstyle)) + + from validphys.api import API + + figs = API.plot_pdfs(pdfs=["NNPDF40_nlo_as_01180"], Q=2) + for f, _ in figs: + f.tight_layout() + f.show() + +Mixing declarative input with custom resources (NOTE: Experimental) +------------------------------------------------------------------- + +For some actions it is possible to mix declarative input with custom +resources. + +Take for example ``xplotting_grid``, which minimally requires us to +specify ``pdf``, ``Q``. We see from ``validphys --help xplotting_grid`` +that it depends on the provider ``xgrid`` which in turn returns a tuple +of ``(scale, x_array)``. Using the API we could specify our own custom +xgrid input, but then rely on the API to collect the other relevant +resources, for example: + +.. code:: python + + import numpy as np + from validphys.api import API + + new_xgrid = ("linear", np.array([0.1, 0.2]) + pdf_grid = API.xplotting_grid(pdf="NNPDF40_nlo_as_01180", Q=2, xgrid=new_xgrid) + +The API offers flexibility to mix declarative inputs such as +``pdf=`` with python objects +``xgrid=(, )``, note that this is very dependent +on the provider in question and is not guaranteed to work all the time. 
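+
+As a closing example that pulls the earlier sections together, the sketch below
+styles, tightens and saves to disk the figures produced by an API action. The
+output file names (and the choice of PDF output) are arbitrary illustrations,
+not a ``validphys`` convention:
+
+.. code:: python
+
+    import matplotlib
+
+    from validphys import mplstyles
+    from validphys.api import API
+
+    # Use the same style file that validphys itself uses for its figures
+    matplotlib.style.use(str(mplstyles.smallstyle))
+
+    figs = API.plot_pdfs(pdfs=["NNPDF40_nlo_as_01180"], Q=2)
+    for i, (fig, _) in enumerate(figs):
+        fig.tight_layout()
+        # Illustrative file names; choose whatever suits your workflow
+        fig.savefig(f"pdf_comparison_{i}.pdf")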
diff --git a/doc/sphinx/source/vp/index.rst b/doc/sphinx/source/vp/index.rst index 2e2a5cbcec..fcde1c3e31 100644 --- a/doc/sphinx/source/vp/index.rst +++ b/doc/sphinx/source/vp/index.rst @@ -57,7 +57,7 @@ Using validphys ./datthcomp.md ./reports.rst ./scripts.rst - ./api.md + ./api.rst ./developer.rst ./tables_figs.rst ./customplots.rst From a8139bd50c7c6cebc384da759790d6f485992425 Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 10:13:43 +0200 Subject: [PATCH 02/22] Updated deprecated version of intersphinx_mapping --- doc/sphinx/source/conf.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx/source/conf.py b/doc/sphinx/source/conf.py index f6caf9cac3..4ac1fcc4d7 100644 --- a/doc/sphinx/source/conf.py +++ b/doc/sphinx/source/conf.py @@ -212,7 +212,7 @@ # -- Options for intersphinx extension --------------------------------------- # Example configuration for intersphinx: refer to the Python standard library. -intersphinx_mapping = {'https://docs.python.org/': None} +intersphinx_mapping = {'python': ('https://docs.python.org/', None)} # -- Options for todo extension ---------------------------------------------- From d500305cdafae8644802b8cb6c0398ae2cbe4570 Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 10:14:31 +0200 Subject: [PATCH 03/22] Delted api.md - added cuts.rst --- doc/sphinx/source/vp/api.md | 155 ----------------------------------- doc/sphinx/source/vp/api.rst | 4 +- 2 files changed, 1 insertion(+), 158 deletions(-) delete mode 100644 doc/sphinx/source/vp/api.md diff --git a/doc/sphinx/source/vp/api.md b/doc/sphinx/source/vp/api.md deleted file mode 100644 index eb1e59dda4..0000000000 --- a/doc/sphinx/source/vp/api.md +++ /dev/null @@ -1,155 +0,0 @@ -```eval_rst -.. _vpapi: -``` -# Using the validphys API - -## Introduction - - -The API functionality allows the `validphys`/`reportengine` machinery to be -readily leveraged in a development setting such as a Jupyter notebook. Any -action available to `validphys` can be invoked by Python code, with the same -parameters as a runcard. - -For example: - -```python -from validphys.api import API - -figs = API.plot_pdfs(pdfs=["NNPDF40_nlo_as_01180"], Q=2) -for f, _ in figs: - f.show() -``` - -The `API` object provides a high level interface to the validphys code. Note -that the user doesn't need to know exactly how to load the PDF, create the -relevant grid to be plotted and then pass that to the plotting function, this is -all handled by the underlying code of `reportengine`. This abstraction provides -a convenient way to explore the functionality of `validphys` (or any other -`reportengine` application) as well as to develop further functionality for -`validphys` itself. - -All the actions available to `validphys` are translated into methods of the `API` -object. The arguments are the same as the parameters that the validphys runcard -would need to evaluate the action. - -## Generic Example - -An important use case of this functionality is the development -Consider that you wanted to develop a provider which depends on some expensive providers, defined -somewhere in the `validphys` modules, - -```python -def expensive_provider1(pdf:PDF, Q, theoryid): - ... - -def expensive_provider2(experiments, ...): - ... 
- -``` - -Now in a notebook we can do - -```python -from validphys.api import API - -expensive1 = API.expensive_provider1(pdf="NNPDF40_nlo_as_01180", Q=100, theoryid=208) -expensive2 = API.expensive_provider2(dataset_inputs={"from_": "fit"}, fit="NNPDF40_nlo_as_01180") - -``` - -We can then define and test our new function (e.g. in a separate notebook cell), - -```python -def developing_provider(expensive_provider1, expensive_provider2): - ... - -test_output = developing_provider(expensive1, expensive2) -``` - -`expensive1` and `expensive2` have already been evaluated using the validphys machinery, and we just -had to declare the `validphys2` runcard inputs in order to use those providers. The output of these -expensive function is now saved. So for the remainder of our notebook session we don't need to -re-run the expensive providers every time we wish to change something with our `developing_provider`. -Clearly this massively reduces the time to develop and test the new provider since, the expensive -providers which the new `developing_provider` depends on are cached for the rest of the jupyter -session. - -For `expensive_provider2` the input was slightly more complicated. When using the API remember that -the input is exactly the same as a `validphys2` runcard. The runcards are in a `yaml` format which -is then parsed as a `dict`. If it seems more intuitive one can utilise this when declaring the -inputs for the API providers, for example: - -```python -input2 = { - "experiments": { - "from_": "fit" - }, - "fit": "NNPDF40_nlo_as_01180" -} -expensive2 = API.expensive_provider2(**input2) -``` - -The `input2` dictionary is visually almost identical the corresponding `validphys2` runcard, we just -need to be careful the separate items with commas, that all dict keys are strings and that -the typing is correct for the various inputs, we can always look up the appropriate typing by using -the `validphys --help` functionality. - -## Creating figures in the `validphys` style - -If a figure is created using the api, as with the first example: - -```python -from validphys.api import API - -fig = API.some_plot(...) -fig.show() -``` - -you might notice that the style of the plot is very different to those produce by `validphys`. If you -want to use the same style as `validphys` then consider using the following commands at the top of -your script or notebook: - -```python -import matplotlib -from validphys import mplstyles -matplotlib.style.use(str(mplstyles.smallstyle)) -``` - -also consider using `fig.tight_layout()` which reportengine uses before saving figures. For the -example used earlier we would then have - -```python -import matplotlib -from validphys import mplstyles -matplotlib.style.use(str(mplstyles.smallstyle)) - -from validphys.api import API - -figs = API.plot_pdfs(pdfs=["NNPDF40_nlo_as_01180"], Q=2) -for f, _ in figs: - f.tight_layout() - f.show() -``` - -## Mixing declarative input with custom resources (NOTE: Experimental) - -For some actions it is possible to mix declarative input with custom resources. - -Take for example `xplotting_grid`, which minimally requires us to specify -`pdf`, `Q`. We see from `validphys --help xplotting_grid` that it depends on the provider `xgrid` -which in turn returns a tuple of `(scale, x_array)`. 
Using the API we could specify our own custom -xgrid input, but then rely on the API to collect the other relevant resources, for example: - -```python -import numpy as np -from validphys.api import API - -new_xgrid = ("linear", np.array([0.1, 0.2]) -pdf_grid = API.xplotting_grid(pdf="NNPDF40_nlo_as_01180", Q=2, xgrid=new_xgrid) - -``` - -The API offers flexibility to mix declarative inputs such as `pdf=` with python objects -`xgrid=(, )`, note that this is very dependent on the provider in question -and is not guaranteed to work all the time. diff --git a/doc/sphinx/source/vp/api.rst b/doc/sphinx/source/vp/api.rst index bc5f274c56..aef6cf5dbf 100644 --- a/doc/sphinx/source/vp/api.rst +++ b/doc/sphinx/source/vp/api.rst @@ -1,6 +1,4 @@ -.. code:: eval_rst - - .. _vpapi: +.. _vpapi: Using the validphys API ======================= From e933e0616f4137c76b1a311c115c9bd85763525e Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 10:15:01 +0200 Subject: [PATCH 04/22] Updated index --- doc/sphinx/source/vp/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx/source/vp/index.rst b/doc/sphinx/source/vp/index.rst index fcde1c3e31..6075339b24 100644 --- a/doc/sphinx/source/vp/index.rst +++ b/doc/sphinx/source/vp/index.rst @@ -53,7 +53,7 @@ Using validphys ./upload.md ./nnprofile.md ./complex_runcards.rst - ./cuts.md + ./cuts.rst ./datthcomp.md ./reports.rst ./scripts.rst From 2f0b0b1809f131230118be21fd90ca7803cd494a Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 14:25:13 +0200 Subject: [PATCH 05/22] Progress - WIP --- doc/sphinx/source/serverconf/index.md | 269 ------------------ doc/sphinx/source/serverconf/index.rst | 268 +++++++++++++++++ .../source/vp/{datthcomp.md => datthcomp.rst} | 26 +- doc/sphinx/source/vp/index.rst | 6 +- doc/sphinx/source/vp/upload.md | 191 ------------- doc/sphinx/source/vp/upload.rst | 187 ++++++++++++ 6 files changed, 470 insertions(+), 477 deletions(-) delete mode 100644 doc/sphinx/source/serverconf/index.md create mode 100644 doc/sphinx/source/serverconf/index.rst rename doc/sphinx/source/vp/{datthcomp.md => datthcomp.rst} (60%) delete mode 100644 doc/sphinx/source/vp/upload.md create mode 100644 doc/sphinx/source/vp/upload.rst diff --git a/doc/sphinx/source/serverconf/index.md b/doc/sphinx/source/serverconf/index.md deleted file mode 100644 index e0822099a3..0000000000 --- a/doc/sphinx/source/serverconf/index.md +++ /dev/null @@ -1,269 +0,0 @@ -```eval_rst -.. _server: -``` -Servers -======= - -The NNPDF collaboration employs a storage server that host various data files, -meant for both public and internal consumption. It hosts the following URLs: - - - : Hosts **public** - NNPDF data such as PDF fits, releases etc. - - : Hosts the - [`validphys`](vp-index) report and displays an index of all of the reports. - - : Hosts the github wiki version. - - : Hosts the `conda` binary packages. - - : Hosts this documentation. - -SSH is used to interact with the server, as described in [Access](#access) -below. - - -The NNPDF server is a virtual machine (VM) maintained by -the Centro Calcolo at the physics department of the -University of Milan. The machine has 2 CPUs, 4GB of RAM, -1 TB of disk and it is running CentOS7. - -The full disk is backed up every week by the Centro Calcolo. -We perform every Sunday a `rsync` from the `/home/nnpdf` folder -to the `nnpdf@lxplus` account at CERN. - - -```eval_rst -.. 
_server-access: -``` -Access ------- - -### User access - -The access to the server is provided by -`ssh`/[`vp-upload`](upload) with the following restrictions: - -- `ssh` access to `root` is forbidden. -- There is a shared `nnpdf` user with low privileges. In order to login -the user must send his public ssh key (usually in `~/.ssh/id_rsa.pub`) to SC. -The `nnpdf` is not allowed to login with password. - -The `nnpdf` user shares a common `/home/nnpdf` folder where all NNPDF -material is stored. Public access to data is available for all files -in the `/home/nnpdf/WEB` folder. The `validphys` reports are stored in -`/home/nnpdf/validphys-reports` and the wiki in -`/home/nnpdf/WEB/wiki`. - -### Access for continuous deployment tools - -The [`conda` packages](conda) as well as the documentation are -automatically uploaded to the server by the Continous Integration service -(Travis), through an user called `dummy` which has further reduction in -privileges (it uses the [`rssh` shell](https://linux.die.net/man/1/rssh)) and it -is only allowed to run the `scp` command. An accepted private key is stored -securely in the [Travis configuration](travis-variables). The packages -are uploaded to `/home/nnpdf/packages`. - -### HTTP access - -Tools such as [conda](conda) and [vp-get](download) require access to -private URLs, which are password-protected, using HTTP basic_auth. The -access is granted by a `/.netrc` file containing the user and password -for the relevant servers. The `/.netrc` file is typically generated -at [installation](conda) time. It should look similar to -``` -machine vp.nnpdf.science - login nnpdf - password - -machine packages.nnpdf.science - login nnpdf - password -``` - -The relevant passwords can be found -[here](https://www.wiki.ed.ac.uk/pages/viewpage.action?pageId=292165461). - - -```eval_rst -.. _web-scripts: -``` -Web scripts ------------ - -Validphys2 interacts with the NNPDF server by [downloading resources](download) -and [uploading results](upload). - -The server scripts live in the validphys2 -repository under the `serverscripts` folder. - -The server side -infrastructure that makes this possible currently aims to be -minimalistic, although it may need to be expanded to a more robust web -application in time. -At the moment, only thing that is done is maintaining some index -files (currently for theories, fits, reports and LHAPDF sets) -which essentially list the files in a given directory. The indexes are -regenerated automatically when their correspondent folders are -modified. This is achieved by waiting for changes using the Linux -`inotify` API and the -[`asynwatch`](https://github.com/Zaharid/asyncwatch) module. These scripts are -often controlled by [cron jobs](#cron-jobs). - -The report index is used to display a webpage indexing the reports. It -retrieves extra information from a `meta.yaml` file in the top level -output directory, and (with lower priority) by parsing an `index.html` -page contained in the report folder. Properties like title, author and tags -are retrieved from the HTML header of this file, and are expected to -be in the same format that Pandoc would have used to write them when -`meta.yaml` is passed as a input. To produce it, the most convenient -way is setting the `main` flag of a report, as described in [Uploading -the result]. - -Additionally information from the mailing list is added to the index -page. 
Specifically we query the list for links to validphys reports -and add links to the emails next to the entries of the reports that -are mentioned. This is achieved with the `index-email.py` script. It -needs some authentication credentials to access the mailing list. The -password is stored in a file called `EMAIL_BOT_PASSWORD`, which is not -tracked by git. The script outputs two files in the root folder, -`email_mentions.json` which should be used by other applications (such -as the report indexer) and `seen_emails_cache.pkl`, which is there to -avoid downloading emails that are already indexes. These files need to -be deleted when the format of the index is updated. - -The report index uses the -[DataTables](https://datatables.net/) JS library. It provides -filtering and sorting capabilities to the indexes tables. The source -file is: -``` -serverscripts/validphys-reports/index.html -``` -in the validphys2 directory. It should be updated from time to time to -highlight the most interesting reports at a given moment. This can be -done by for example displaying in a separate table at the beginning -the reports marked with some keyword (for example 'nnpdf31'). - -The Makefile inside will synchronize them with -the server. - -The report indexing script generates thumbnails in the -`WEB/thumbnails` which are then associated to each report. This is -done by looking at the image files inside the `figures` folder of each -uploaded report (see the source of the script for more details). It is -expected that the server redirects the requests for -`vp.nnpdf.science/thumbnails` to this folder. - - -Cron jobs ---------- - -The following cron jobs are registered for the `nnpdf` user: - -- every day at 4 AM run the `index-email.py` script. -- at every reboot run `index-reports.py`, `index-fits.py`, `index-hyperscan.py`, - `index-packahes-public.sh` and `index-packages-private.sh`, which monitor - continuously the respective folders and create indexes that can be used by - various applications. The first two are homegrown scripts (see [Web - Scripts](#web-scripts)) and the later two use - [`conda-index`](https://docs.conda.io/projects/conda-build/en/latest/resources/commands/conda-index.html). - - -The following cron jobs are registered for the `root` user: - -- perform backup of `/home/nnpdf` in lxplus every Saturday at noon. -- perform a certbot renew every Monday. -- reboot every Sunday at 6am (in order to use new kernels). -- perform system update every day. - -Web server Configuration ------------------------- - -We are using `nginx` as a lightweight and simple web server engine. The -`nginx` initial configuration depends on the linux distribution in -use. Usually debian packages provide a ready-to-go version where the -`/etc/nginx/nginx.conf` is already set to work with server blocks -(subdomains). - -Other distributions like CentOS7 requires more gymnastics, here some tricks: - -- make sure the `/home/nnpdf` folder can be accessed by the `nginx` user -- folders served by `nginx` must have permission 755 -- create 2 folders in `/etc/nginx`: `sites-available` and `sites-enabled`. -- in the `/etc/nginx/nginx.conf` file indicate the new include path with `include /etc/nginx/sites-enabled/*;` and remove all location statements. -- for each server block create a new file in `/etc/nginx/sites-available` and build a symbolic link in `/etc/nginx/sites-enabled`. -- remember to perform a `sudo service nginx restart` or `sudo nginx -s reload` to update the server block configuration. 
- - -Finally, here an example of `nginx` configuration for the `vp.nnpdf.science` server block without ssl encryption: -``` -server { - listen 80; - listen [::]:80; - server_name vp.nnpdf.science; - - root /home/nnpdf/validphys-reports; - location / { - try_files $uri $uri/ =404; - auth_basic "Restricted"; - auth_basic_user_file /home/nnpdf/validphys-reports/.htpasswd; - } - - location /thumbnails { - alias /home/nnpdf/thumbnails; - try_files $uri $uri/ =404; - auth_basic "Restricted"; - auth_basic_user_file /home/nnpdf/validphys-reports/.htpasswd; - } -} -``` - -Some URLs are password protected using the HTTP `basic_auth` mechanism. This is -implemented by setting the corresponding configuration in nginx, as shown above -(specifically with the `auth_basic` and `auth_basic_user_file` keys). The -`.htpasswd` files mentioned in the configuration are generated with the -`htpasswd` tool. - -### DNS - -The domain is hosted by [Namecheap](https://namecheap.com), which also manages the -DNS entries. For each subdomain there is an `A` record always pointing to the -same server IP, currently 159.149.47.24. The subdomains are then handled as -described in [Web server](#web-server). For example, a DNS query for -`packages.nnpdf.science` returns - -``` - $ dig packages.nnpdf.science - -; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> packages.nnpdf.science -;; global options: +cmd -;; Got answer: -;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26766 -;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 - -;; OPT PSEUDOSECTION: -; EDNS: version: 0, flags:; udp: 65494 -;; QUESTION SECTION: -;packages.nnpdf.science. IN A - -;; ANSWER SECTION: -packages.nnpdf.science. 1799 IN A 159.149.47.24 - -;; Query time: 170 msec -;; SERVER: 127.0.0.53#53(127.0.0.53) -;; WHEN: Tue May 28 14:26:53 BST 2019 -;; MSG SIZE rcvd: 67 -``` - - -### SSL encryption - -SSL encription is provided by [Let's Encrypt](https://letsencrypt.org). -The certificates are created using the `certbot` program with the `nginx` module. - -In order to create new ssl certificates, first prepare the `nginx` server block -configuration file and then run the interactive command: -``` -sudo certbot --nginx -d -``` -This will ask you several questions, including if you would like to automatically -update the `nginx` server block file. We fully recommend this approach. - -The certificate is automatically renewed by a [cron job](#cron-jobs). diff --git a/doc/sphinx/source/serverconf/index.rst b/doc/sphinx/source/serverconf/index.rst new file mode 100644 index 0000000000..e03df2cb0e --- /dev/null +++ b/doc/sphinx/source/serverconf/index.rst @@ -0,0 +1,268 @@ +.. _server: + +Servers +======= + +The NNPDF collaboration employs a storage server that host various data files, +meant for both public and internal consumption. It hosts the following URLs: + + - https://data.nnpdf.science: Hosts **public** + NNPDF data such as PDF fits, releases etc. + - https://vp.nnpdf.science: Hosts the :ref:`validphys ` + report and displays an index of all of the reports. + - https://wiki.nnpdf.science: Hosts the github wiki version. + - https://packages.nnpdf.science/: Hosts the ``conda`` binary packages. + - https://docs.nnpdf.science/: Hosts this documentation. + +SSH is used to interact with the server, as described in :ref:`Access ` +below. + + +The NNPDF server is a virtual machine (VM) maintained by +the Centro Calcolo at the physics department of the +University of Milan. The machine has 2 CPUs, 4GB of RAM, +1 TB of disk and it is running CentOS7. 
+ +The full disk is backed up every week by the Centro Calcolo. +We perform every Sunday a ``rsync`` from the ``/home/nnpdf`` folder +to the ``nnpdf@lxplus`` account at CERN. + + +.. _server-access: +Access +------ + +User access +~~~~~~~~~~~ + +The access to the server is provided by +``ssh``/:ref:`vp-upload ` with the following restrictions: + +- ``ssh`` access to ``root`` is forbidden. +- There is a shared ``nnpdf`` user with low privileges. In order to login +the user must send his public ssh key (usually in ``~/.ssh/id_rsa.pub``) to SC. +The ``nnpdf`` is not allowed to login with password. + +The ``nnpdf`` user shares a common ``/home/nnpdf`` folder where all NNPDF +material is stored. Public access to data is available for all files +in the ``/home/nnpdf/WEB`` folder. The ``validphys`` reports are stored in +``/home/nnpdf/validphys-reports`` and the wiki in +``/home/nnpdf/WEB/wiki``. + +Access for continuous deployment tools +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The :ref:`conda packages` as well as the documentation are +automatically uploaded to the server by the Continous Integration service +(Travis), through an user called ``dummy`` which has further reduction in +privileges (it uses the ``rssh`` `shell `_ and it +is only allowed to run the ``scp`` command. An accepted private key is stored +securely in the :ref:`Travis configuration `. The packages +are uploaded to ``/home/nnpdf/packages``. + +HTTP access +~~~~~~~~~~~ + +Tools such as :ref:`conda ` and :ref:`vp-get` require access to +private URLs, which are password-protected, using HTTP basic_auth. The +access is granted by a ``/.netrc`` file containing the user and password +for the relevant servers. The ``/.netrc`` file is typically generated +at :ref:`installation` time. It should look similar to:: + + machine vp.nnpdf.science + login nnpdf + password + + machine packages.nnpdf.science + login nnpdf + password + +The relevant passwords can be found +`here `_. + + +.. _web-scripts: + +Web scripts +----------- + +Validphys2 interacts with the NNPDF server by :ref:`downloading resources ` +and :ref:`uploading results `. + +The server scripts live in the validphys2 +repository under the ``serverscripts`` folder. + +The server side +infrastructure that makes this possible currently aims to be +minimalistic, although it may need to be expanded to a more robust web +application in time. +At the moment, only thing that is done is maintaining some index +files (currently for theories, fits, reports and LHAPDF sets) +which essentially list the files in a given directory. The indexes are +regenerated automatically when their correspondent folders are +modified. This is achieved by waiting for changes using the Linux +``inotify`` API and the +`asynwatch `_ module. These scripts are +often controlled by :ref:`cron jobs `. + +The report index is used to display a webpage indexing the reports. It +retrieves extra information from a ``meta.yaml`` file in the top level +output directory, and (with lower priority) by parsing an ``index.html`` +page contained in the report folder. Properties like title, author and tags +are retrieved from the HTML header of this file, and are expected to +be in the same format that Pandoc would have used to write them when +``meta.yaml`` is passed as a input. To produce it, the most convenient +way is setting the ``main`` flag of a report, as described in :ref:`Uploading +the result `. + +Additionally information from the mailing list is added to the index +page. 
Specifically we query the list for links to validphys reports +and add links to the emails next to the entries of the reports that +are mentioned. This is achieved with the ``index-email.py`` script. It +needs some authentication credentials to access the mailing list. The +password is stored in a file called ``EMAIL_BOT_PASSWORD``, which is not +tracked by git. The script outputs two files in the root folder, +``email_mentions.json`` which should be used by other applications (such +as the report indexer) and ``seen_emails_cache.pkl``, which is there to +avoid downloading emails that are already indexes. These files need to +be deleted when the format of the index is updated. + +The report index uses the +`DataTables `_ JS library. It provides +filtering and sorting capabilities to the indexes tables. The source +file is: :: + + serverscripts/validphys-reports/index.html + +in the validphys2 directory. It should be updated from time to time to +highlight the most interesting reports at a given moment. This can be +done by for example displaying in a separate table at the beginning +the reports marked with some keyword (for example '`nnpdf31`'). + +The Makefile inside will synchronize them with +the server. + +The report indexing script generates thumbnails in the +``WEB/thumbnails`` which are then associated to each report. This is +done by looking at the image files inside the ``figures`` folder of each +uploaded report (see the source of the script for more details). It is +expected that the server redirects the requests for +``vp.nnpdf.science/thumbnails`` to this folder. + +.. _cron-jobs: + +Cron jobs +--------- + +The following cron jobs are registered for the ``nnpdf`` user: + +- every day at 4 AM run the ``index-email.py`` script. +- at every reboot run ``index-reports.py``, ``index-fits.py``, ``index-hyperscan.py``, ``index-packahes-public.sh`` + and ``index-packages-private.sh``, which monitor continuously the respective folders and create indexes that + can be used by various applications. The first two are homegrown scripts (see :ref:`Web Scripts `) + and the later two use `conda-index `_. + + +The following cron jobs are registered for the ``root`` user: + +- perform backup of ``/home/nnpdf`` in lxplus every Saturday at noon. +- perform a certbot renew every Monday. +- reboot every Sunday at 6am (in order to use new kernels). +- perform system update every day. + +Web server Configuration +------------------------ + +We are using ``nginx`` as a lightweight and simple web server engine. The +``nginx`` initial configuration depends on the linux distribution in +use. Usually debian packages provide a ready-to-go version where the +``/etc/nginx/nginx.conf`` is already set to work with server blocks +(subdomains). + +Other distributions like CentOS7 requires more gymnastics, here some tricks: + +- make sure the ``/home/nnpdf`` folder can be accessed by the ``nginx`` user +- folders served by ``nginx`` must have permission 755 +- create 2 folders in ``/etc/nginx``: ``sites-available`` and ``sites-enabled``. +- in the ``/etc/nginx/nginx.conf`` file indicate the new include path with + ``include /etc/nginx/sites-enabled/*;`` and remove all location statements. +- for each server block create a new file in ``/etc/nginx/sites-available`` and + build a symbolic link in ``/etc/nginx/sites-enabled``. +- remember to perform a ``sudo service nginx restart`` or ``sudo nginx -s reload`` + to update the server block configuration. 
+ + +Finally, here an example of ``nginx`` configuration for the ``vp.nnpdf.science`` server block without ssl encryption: :: + + server { + listen 80; + listen [::]:80; + server_name vp.nnpdf.science; + + root /home/nnpdf/validphys-reports; + location / { + try_files $uri $uri/ =404; + auth_basic "Restricted"; + auth_basic_user_file /home/nnpdf/validphys-reports/.htpasswd; + } + + location /thumbnails { + alias /home/nnpdf/thumbnails; + try_files $uri $uri/ =404; + auth_basic "Restricted"; + auth_basic_user_file /home/nnpdf/validphys-reports/.htpasswd; + } + } + +Some URLs are password protected using the HTTP ``basic_auth`` mechanism. This is +implemented by setting the corresponding configuration in nginx, as shown above +(specifically with the ``auth_basic`` and ``auth_basic_user_file`` keys). The +``.htpasswd`` files mentioned in the configuration are generated with the +``htpasswd`` tool. + +DNS +~~~ + +The domain is hosted by `Namecheap `_, which also manages the +DNS entries. For each subdomain there is an ``A`` record always pointing to the +same server IP, currently 159.149.47.24. The subdomains are then handled as +described in :ref:`Web server `. For example, a DNS query for +``packages.nnpdf.science`` returns:: + + $ dig packages.nnpdf.science + + ; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> packages.nnpdf.science + ;; global options: +cmd + ;; Got answer: + ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26766 + ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 + + ;; OPT PSEUDOSECTION: + ; EDNS: version: 0, flags:; udp: 65494 + ;; QUESTION SECTION: + ;packages.nnpdf.science. IN A + + ;; ANSWER SECTION: + packages.nnpdf.science. 1799 IN A 159.149.47.24 + + ;; Query time: 170 msec + ;; SERVER: 127.0.0.53#53(127.0.0.53) + ;; WHEN: Tue May 28 14:26:53 BST 2019 + ;; MSG SIZE rcvd: 67 + + +SSL encryption +~~~~~~~~~~~~~~ + +SSL encription is provided by `Let's Encrypt `_. +The certificates are created using the ``certbot`` program with the ``nginx`` module. + +In order to create new ssl certificates, first prepare the ``nginx`` server block +configuration file and then run the interactive command: :: + + sudo certbot --nginx -d + +This will ask you several questions, including if you would like to automatically +update the ``nginx`` server block file. We fully recommend this approach. + +The certificate is automatically renewed by a :ref:`cron job `. diff --git a/doc/sphinx/source/vp/datthcomp.md b/doc/sphinx/source/vp/datthcomp.rst similarity index 60% rename from doc/sphinx/source/vp/datthcomp.md rename to doc/sphinx/source/vp/datthcomp.rst index 004d8ca89f..e187b9687a 100644 --- a/doc/sphinx/source/vp/datthcomp.md +++ b/doc/sphinx/source/vp/datthcomp.rst @@ -1,31 +1,29 @@ -```eval_rst .. _data-theory-comp: -``` Comparing data and theory ------------------------- -For a tutorial on how to do a data-theory comparison, see [here](../tutorials/datthcomp.html). +For a tutorial on how to do a data-theory comparison, see :ref:`here `. + +The name of the data-theory comparison tool is ``plot_fancy``. You can +see what parameters in the runcard influence it by typing: :: + + validphys --help plot_fancy -The name of the data-theory comparison tool is `plot_fancy`. You can -see what parameters in the runcard influence it by typing: -``` -validphys --help plot_fancy -``` The basic inputs are a dataset and one or more PDFs. The way a dataset is to be plotted is controlled by one or more PLOTTING files in the -`commondata` format. 
These are simple YAML files and ideally each +``commondata`` format. These are simple YAML files and ideally each dataset should have them. It is possible to specify how to transform the kinematics stored in the commondata, what to use as x-axis or -how to group the plots. The format is described in detail in [Plotting -format specification](plotting-format). The plotting +how to group the plots. The format is described in detail in :ref:`Plotting +format specification `. The plotting specifications are supported by small amounts of Python (defining the various transformations), which are declared in the -`validphys.plotoptions` package. +``validphys.plotoptions`` package. -Note that PLOTTING files are considered part of `nnpdfcpp`, and as +Note that PLOTTING files are considered part of ``nnpdfcpp``, and as such they are assumed to be correct, so in principle they have no guarantee of failing early with a good error message. However, you can -set `check_plotting: True` in the input configurations to cause the +set ``check_plotting: True`` in the input configurations to cause the PLOTTING files to be processed as soon as the dataset is loaded. This can be useful while debugging the plotting files, but might cause a noticeable delay to the startup (due to loading datasets and fktables). diff --git a/doc/sphinx/source/vp/index.rst b/doc/sphinx/source/vp/index.rst index 6075339b24..5a7570a250 100644 --- a/doc/sphinx/source/vp/index.rst +++ b/doc/sphinx/source/vp/index.rst @@ -49,12 +49,12 @@ Using validphys :maxdepth: 1 ./getting-started.rst - ./download.md - ./upload.md + ./download.rst + ./upload.rst ./nnprofile.md ./complex_runcards.rst ./cuts.rst - ./datthcomp.md + ./datthcomp.rst ./reports.rst ./scripts.rst ./api.rst diff --git a/doc/sphinx/source/vp/upload.md b/doc/sphinx/source/vp/upload.md deleted file mode 100644 index d2f156dd3e..0000000000 --- a/doc/sphinx/source/vp/upload.md +++ /dev/null @@ -1,191 +0,0 @@ -```eval_rst -.. _upload: -``` -Uploading results to the `validphys` repository -=============================================== - -The primary method to share results within the collaboration is by uploading the -corresponding files to the [NNPDF server](server). Most commonly, results are -uploaded to the `validphys` repository, so that they are accessible from - - - -The files in this repository are backed up to two locations, indexed and cross -referenced with the [mailing list](mail). The HTTP access to the files is -password protected. - -The uploading system is designed to be integrated with `validphys`. Reports, -hopefully filled with the [appropriate metadata](#metadata) in the runcard, can -be [uploaded directly](#uploading-directly-from-validphys), or after they have -been completed using the [`vp-upload` script](#the-vp-upload-script). Arbitrary -files can be uploaded using the [`wiki-upload` script](#the-wiki-upload-script), -which will interactively ask the user to fill in the metadata. In either case an -URL will be returned with the location of the resource accessible with a web -browser. - -In order to be able to upload files, the user must have a valid SSH key -installed in the NNPDF server [access](../get-started/access), and the `rsync` -command must be present. - -Several settings relevant to uploading files are configured in [profile -files](nnprofile). 
- -## Metadata - -Currently the following information is used to index the results: - - - `title` (string) - - `author` (string) - - `keywords` (list of strings) - -The first two are self explanatory, and `keywords` is a list of tags used to -categorize the result, such as *ATLAS jets* or *nn31final*. You can see more -examples in the [webpage](https://vp.nnpdf.science). Keywords are used in -various ways to aid the discoverability of the result, and so it is important to -set them properly. Some keywords may be used to display the report in a -prominent place of the index page. - -For `validphys` runcards, this data is read from a `meta` mapping declared in -the runcard. For example - -```yaml -meta: - title: PDF comparisons - author: NNPDF Collaboration - keywords: [gallery] -``` - -`validphys` uses this mapping to write a `meta.yaml` file with the same -information to the output folder. This file is then used by the indexers. - - -### Conventions for writing metadata - -Reports endowed with the correct metadata can be retrieved from the index even -several years after they are uploaded. - - - Always fill appropriately the metadata fields for anything you upload. - - - Fill the author field with a complete form of your name, e.g. *Zahari - Kassabov* rather than *ZK*, and always use the same name. - - - Add keywords that are relevant to the result you are uploading. Use existing - tags if possible. - -#### Metadata from HTML fallback - -An `index.html` file in the uploaded output folder will serve as a source of -metadata if the `meta.yaml` file is not present (e.g. because the `meta` mapping -was not defined in the runcard). To automatically generate an `index.html` file -from a `report` action, one may set the option `main:True` (alternatively there -is the `out_filename` option, which may be used to specify the filename). In the -template, use the [pandoc-markdown -syntax](http://pandoc.org/MANUAL.html#metadata-blocks) to set the metadata at -the top of the file. In the runcard you would write: - -~~~yaml -template: mytemplate.md -actions_: - - report(main=True) -~~~ -and you would begin `mytemplate.md`, using YAML syntax, like: -```yaml ---- -title: Testing the fit {@fit@} -author: Zahari Kassabov -keywords: [nnpdf31, nolhc] -... -``` -Note that you can use the report syntax to get the parameters from the -runcard. If you only want to set title or author, you can also -prefix the two first lines of the markdown templates with `%`: -```markdown -% Template title -% Myself, the template author - -Content... -``` -This is mostly useful for sub-reports not at the top level, in -more complicated documents. - - -Uploading directly from `validphys` ----------------------------------- - -When the `--upload` flag is set in the invocation of the `validphys` command, -the contents of the output folder will be uploaded to the NNPDF data server, -after validphys is done. Use this if you have [filled the meta mapping in the -runcard](#metadata) and already know that the output is going to be good enough -to share. Otherwise use [`vp-upload`](#the-vp-upload-script) after checking the result. - -`validphys` will check the SSH connection before doing any work, and -it will fail early if it cannot be established. - -```eval_rst -.. _vpupload: -``` -The `vp-upload` script ----------------------- - -The `vp-upload` script uploads completed results to the NNPDF server, such as -reports and fits. 
To upload a completed `validphys` report, use -``` -vp-upload -``` -The output folder is expected to contain the [metadata](#metadata) (e.g. in the -form of a `meta.yaml` file). If it doesn't exist or you want to upload and index -arbitrary files, use the [`wiki-upload` command](#the-wiki-upload-script). - -```eval_rst -The script automatically detects (:py:func:`validphys.uploadutils.check_input`) the type of the input. -A `fit` is defined to be any folder structure that contains a `filter.yml` file at its root, a `PDF` is any -folder containing a `.info` file at the root and a replica 0, and a report is any such structure containing an -`index.html` file at the root. The input folder is then placed in the correct location in the -server accordingly. -``` - -```eval_rst -.. note:: - If there is already a fit or PDF on the server with the same name as the fit or PDF - you wish to upload, then this command will *not* overwrite the resource that already - exists. To overwite such a resource on the server, use the :code:`--force` option. -``` - -```eval_rst -The code is documented at :py:mod:`validphys.scripts.vp_upload`. -``` - -Note that fits are indexed separately, and can be retrieved with the [`vp-get` -command](download). - - -The `wiki-upload` script ------------------------- - -The `wiki-upload` script is a more interactive counterpart to `vp-upload`. It -allows uploading arbitrary files that do not have metadata attached. It will -construct the metadata by asking the user to fill it in before uploading the -result. The usage is - -``` -wiki-upload -``` -This will cause the user to be prompted for the various metadata fields and the -file or folder to be uploaded to the server, together with a generated -`meta.yaml` file used for indexing. - -```eval_rst -The code is documented at :py:mod:`validphys.scripts.wiki_upload`. -``` - -The `validphys` index page --------------------------- - -The source of the report index page is -``` -serverscripts/validphys-reports/index.html -``` -inside the `validphys2` directory in the main repository. This page can be -edited to reflect the current interests (the Makefile directly uploads to the -server). See the documentation on [web scripts](web-scripts) for more details. - diff --git a/doc/sphinx/source/vp/upload.rst b/doc/sphinx/source/vp/upload.rst new file mode 100644 index 0000000000..16f32c4ccd --- /dev/null +++ b/doc/sphinx/source/vp/upload.rst @@ -0,0 +1,187 @@ +.. _upload: + +Uploading results to the ``validphys`` repository +=============================================== + +The primary method to share results within the collaboration is by uploading the +corresponding files to the :ref:`NNPDF server `. Most commonly, results are +uploaded to the ``validphys`` repository, so that they are accessible from + +`https://vp.nnpdf.science `_ + +The files in this repository are backed up to two locations, indexed and cross +referenced with the :ref:`mailing list `. The HTTP access to the files is +password protected. + +The uploading system is designed to be integrated with ``validphys``. Reports, +hopefully filled with the :ref:`appropriate metadata ` in the runcard, can +be :ref:`uploaded directly `, or after they have +been completed using the ``vp-upload`` :ref:`script `. Arbitrary +files can be uploaded using the ``wiki-upload `` :ref:`script `, +which will interactively ask the user to fill in the metadata. In either case an +URL will be returned with the location of the resource accessible with a web +browser. 
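+
+As a quick orientation, a session might look roughly as follows; the folder and
+file names are placeholders and each command is described in detail in the
+sections below:
+
+.. code:: bash
+
+    # Upload a finished validphys output folder that already carries its metadata
+    vp-upload my_report_output/
+
+    # Upload an arbitrary file or folder, filling in the metadata interactively
+    wiki-upload notes_on_fits.pdf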
+ +In order to be able to upload files, the user must have a valid SSH key +installed in the NNPDF server `access `_, and the ``rsync`` +command must be present. + +Several settings relevant to uploading files are configured in :ref:`profile +files `. + +.. _vpmetadata: +Metadata +-------- + +Currently the following information is used to index the results: + + - ``title`` (string) + - ``author`` (string) + - ``keywords`` (list of strings) + +The first two are self explanatory, and ``keywords`` is a list of tags used to +categorize the result, such as *ATLAS jets* or *nn31final*. You can see more +examples in the `webpage `_. Keywords are used in +various ways to aid the discoverability of the result, and so it is important to +set them properly. Some keywords may be used to display the report in a +prominent place of the index page. + +For ``validphys`` runcards, this data is read from a `meta` mapping declared in +the runcard. For example + +.. code:: yaml + + meta: + title: PDF comparisons + author: NNPDF Collaboration + keywords: [gallery] + +``validphys`` uses this mapping to write a ``meta.yaml`` file with the same +information to the output folder. This file is then used by the indexers. + + +Conventions for writing metadata +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Reports endowed with the correct metadata can be retrieved from the index even +several years after they are uploaded. + + - Always fill appropriately the metadata fields for anything you upload. + + - Fill the author field with a complete form of your name, e.g. *Zahari Kassabov* + rather than *ZK*, and always use the same name. + + - Add keywords that are relevant to the result you are uploading. Use existing + tags if possible. + +Metadata from HTML fallback +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +An ``index.html`` file in the uploaded output folder will serve as a source of +metadata if the ``meta.yaml`` file is not present (e.g. because the ``meta`` mapping +was not defined in the runcard). To automatically generate an ``index.html`` file +from a ``report`` action, one may set the option ``main:True`` (alternatively there +is the ``out_filename`` option, which may be used to specify the filename). In the +template, use the `pandoc-markdown syntax `_ +to set the metadata at the top of the file. In the runcard you would write: + +.. code:: yaml + + template: mytemplate.md + actions_: + - report(main=True) + +and you would begin ``mytemplate.md``, using YAML syntax, like: + +.. code:: yaml + + --- + title: Testing the fit {@fit@} + author: Zahari Kassabov + keywords: [nnpdf31, nolhc] + ... + +Note that you can use the report syntax to get the parameters from the +runcard. If you only want to set title or author, you can also +prefix the two first lines of the markdown templates with ``%``: + +.. code:: markdown + + % Template title + % Myself, the template author + + Content... + +This is mostly useful for sub-reports not at the top level, in +more complicated documents. + +.. _uploading-directly-from-validphys: +Uploading directly from ``validphys`` +------------------------------------- + +When the ``--upload`` flag is set in the invocation of the ``validphys`` command, +the contents of the output folder will be uploaded to the NNPDF data server, +after validphys is done. Use this if you have :ref:`filled the meta mapping in the +runcard ` and already know that the output is going to be good enough +to share. Otherwise use :ref:`vp-upload ` after checking the result. 
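+
+As a minimal sketch (the runcard name here is only a placeholder), this amounts
+to running, for example:
+
+.. code:: bash
+
+    # Produce the report and, once validphys is done, upload the output folder
+    validphys my_runcard.yaml --upload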
+ +``validphys`` will check the SSH connection before doing any work, and +it will fail early if it cannot be established. + +.. _vpupload: +The ``vp-upload`` script +------------------------ + +The ``vp-upload`` script uploads completed results to the NNPDF server, such as +reports and fits. To upload a completed ``validphys`` report, use:: + + vp-upload + +The output folder is expected to contain the :ref:`metadata ` (e.g. in the +form of a ``meta.yaml`` file). If it doesn't exist or you want to upload and index +arbitrary files, use the ``wiki-upload`` :ref:`command `. + +The script automatically detects (:py:func:`validphys.uploadutils.check_input`) the type of the input. +A ``fit`` is defined to be any folder structure that contains a ``filter.yml`` file at its root, a ``PDF`` is any +folder containing a ``.info`` file at the root and a replica 0, and a report is any such structure containing an +``index.html`` file at the root. The input folder is then placed in the correct location in the +server accordingly. + +.. note:: + If there is already a fit or PDF on the server with the same name as the fit or PDF + you wish to upload, then this command will *not* overwrite the resource that already + exists. To overwite such a resource on the server, use the :code:`--force` option. + +The code is documented at :py:mod:`validphys.scripts.vp_upload`. + +Note that fits are indexed separately, and can be retrieved with the ``vp-get`` +:ref:`command `. + +.. _the-wiki-upload-script: +The ``wiki-upload`` script +-------------------------- + +The ``wiki-upload`` script is a more interactive counterpart to ``vp-upload``. It +allows uploading arbitrary files that do not have metadata attached. It will +construct the metadata by asking the user to fill it in before uploading the +result. The usage is:: + + wiki-upload + +This will cause the user to be prompted for the various metadata fields and the +file or folder to be uploaded to the server, together with a generated +``meta.yaml`` file used for indexing. + +The code is documented at :py:mod:`validphys.scripts.wiki_upload`. + +The ``validphys`` index page +---------------------------- + +The source of the report index page is:: + + serverscripts/validphys-reports/index.html + +inside the ``validphys2`` directory in the main repository. This page can be +edited to reflect the current interests (the Makefile directly uploads to the +server). See the documentation on :ref:`web scripts ` for more details. + From 38754727182e105acfa5ad55ebaa656bb104e192 Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 18:19:16 +0200 Subject: [PATCH 06/22] added vp/download.rst --- .../source/vp/{download.md => download.rst} | 175 +++++++++--------- 1 file changed, 88 insertions(+), 87 deletions(-) rename doc/sphinx/source/vp/{download.md => download.rst} (52%) diff --git a/doc/sphinx/source/vp/download.md b/doc/sphinx/source/vp/download.rst similarity index 52% rename from doc/sphinx/source/vp/download.md rename to doc/sphinx/source/vp/download.rst index bab42837c5..133b019b25 100644 --- a/doc/sphinx/source/vp/download.md +++ b/doc/sphinx/source/vp/download.rst @@ -1,62 +1,60 @@ -```eval_rst .. _download: -``` Downloading resources ===================== -`validphys` is designed so that, by default, resources stored in known remote +``validphys`` is designed so that, by default, resources stored in known remote locations are downloaded automatically and seamlessly used where necessary. 
Available resources include PDF sets, completed fits, theories, and results of -past `validphys` runs that have been [uploaded to the server](upload). -The `vp-get` tool, [described below](#the-vp-get-tool), +past ``validphys`` runs that have been :ref:`uploaded to the server `. +The ``vp-get`` tool, :ref:`described below `, can be used to download the same items manually. Automatic operation ------------------- -By default when some resource such as a PDF is required by `validphys` (or -derived tools such as `vp-setupfit`), the code will first look for it in some +By default when some resource such as a PDF is required by ``validphys`` (or +derived tools such as ``vp-setupfit``), the code will first look for it in some local directory specified in the [profile file](nnprofile). If it is not found there, it will try to download it from some remote repository (also specified in the profile). -For example a `validphys` runcard such as +For example a ``validphys`` runcard such as -```yaml -pdf: NNPDF40_nnlo_as_01180 -fit: NNPDF40_nlo_as_01180 +.. code:: yaml -theoryid: 208 + pdf: NNPDF40_nnlo_as_01180 + fit: NNPDF40_nlo_as_01180 -use_cuts: "fromfit" + theoryid: 208 -dataset_input: - dataset: ATLAS_DY_7TEV_36PB_ETA - cfac: [EWK] + use_cuts: "fromfit" -actions_: - - plot_fancy - - plot_chi2dist -``` + dataset_input: + dataset: ATLAS_DY_7TEV_36PB_ETA + cfac: [EWK] -Will download if necessary the fit called `NNPDF40_nlo_as_01180`, the -PDF set called `NNPDF40_nnlo_as_01180` and the theory with ID 208, when validphys + actions_: + - plot_fancy + - plot_chi2dist + + +Will download if necessary the fit called ``NNPDF40_nlo_as_01180``, the +PDF set called ``NNPDF40_nnlo_as_01180`` and the theory with ID 208, when validphys is executed with the default settings. In practice one rarely has to worry about installing resources by hand when working with NNPDF tools. The behaviour of downloading automatically can be disabled by passing the -`--no-net` flag to supported tools. In that case, failure to find a given +``--no-net`` flag to supported tools. In that case, failure to find a given resource locally will result in an error and exiting the program. The ``--net`` flag makes the default behaviour explicit and has no effect otherwise. - +.. _what-can-be-downloaded: What can be downloaded ---------------------- The following resources are found automatically: -```eval_rst Fits Fits (specified by the ``fit`` key) can be downloaded if they have previously been uploaded with :ref:`vp-upload `. The corresponding PDF @@ -79,11 +77,8 @@ Theories the :ref:`bootstrap script `. Output files are not specified by any top level config key, but instead actions can specify their own logic, for example for using an existing file instead of computing it. -``` -```eval_rst .. _vp-get: -``` The `vp-get` tool ----------------- @@ -92,38 +87,41 @@ The ``vp-get`` tool can be used to download resources manually, in the same way ``validphys`` would do. The basic syntax is -```bash -vp-get -``` -The available options for `` can be seen with `vp-get --list`. -They correspond to the resources described [above](#what-can-be-downloaded). - -```bash -$ vp-get --list -Available resource types: - - fit - - pdf - - theoryID - - vp_output_file -``` + +.. code:: bash + + vp-get + +The available options for ```` can be seen with ``vp-get --list``. +They correspond to the resources described :ref:`above `. + +.. 
code:: bash + + $ vp-get --list + Available resource types: + - fit + - pdf + - theoryID + - vp_output_file + For example to download the fit ``NNPDF31_nlo_as_0118_1000`` we would write -```bash -$ vp-get fit NNPDF31_nlo_as_0118_1000 -``` +.. code:: bash + + $ vp-get fit NNPDF31_nlo_as_0118_1000 If the resource is already installed locally, the tool will display some information on it and bail out: -```bash -$ vp-get fit NNPDF31_nlo_as_0118_1000 -FitSpec(name='NNPDF31_nlo_as_0118_1000', path=PosixPath('/home/zah/anaconda3/envs/nnpdf-dev/share/NNPDF/results/NNPDF31_nlo_as_0118_1000')) -``` +.. code:: bash + + $ vp-get fit NNPDF31_nlo_as_0118_1000 + FitSpec(name='NNPDF31_nlo_as_0118_1000', path=PosixPath('/home/zah/anaconda3/envs/nnpdf-dev/share/NNPDF/results/NNPDF31_nlo_as_0118_1000')) Downloading resources in code (``validphys.loader``) ---------------------------------------------------- -```eval_rst + The automatic download logic is implemented in the :py:mod:`validphys.loader`, specifically by the :py:class:`validphys.loader.RemoteLoader` and :py:class:`validphys.loader.FallbackLoader` classes. @@ -138,55 +136,58 @@ of ``FallbackLoader`` (which is generated dynamically) will intercept the installed in such a way that the subsequent call of the ``Loader.check_`` method succeeds. That is it should downoad the resource to the relevant search path, and uncompress it if needed. -``` + In practice one can get a download aware loader by using a ``FallbackLoader`` instance, which will try to obtain all the required resources from remote locations. -```python -from validphys.loader import FallbackLoader as Loader +.. code:: python -l = Loader() -#Will download theory 151 if needed. -l.check_dataset('NMC', theoryid=151) -``` + from validphys.loader import FallbackLoader as Loader + + l = Loader() + #Will download theory 151 if needed. + l.check_dataset('NMC', theoryid=151) Conversely the ``Loader`` class will only search locally. -```python -from validphys.loader import Loader +.. code:: python + + from validphys.loader import Loader -l = Loader() + l = Loader() -l.check_dataset('NMC', theoryid=151) ---------------------------------------------------------------------------- -TheoryNotFound Traceback (most recent call last) - in -----> 1 l.check_dataset('NMC', theoryid=151) + l.check_dataset('NMC', theoryid=151) + --------------------------------------------------------------------------- + TheoryNotFound Traceback (most recent call last) + in + ----> 1 l.check_dataset('NMC', theoryid=151) -~/nngit/nnpdf/validphys2/src/validphys/loader.py in check_dataset(self, name, rules, sysnum, theoryid, cfac, frac, cuts, use_fitcommondata, fit, weight) - 416 - 417 if not isinstance(theoryid, TheoryIDSpec): ---> 418 theoryid = self.check_theoryID(theoryid) - 419 - 420 theoryno, _ = theoryid + ~/nngit/nnpdf/validphys2/src/validphys/loader.py in check_dataset(self, name, rules, sysnum, theoryid, cfac, frac, cuts, use_fitcommondata, fit, weight) + 416 + 417 if not isinstance(theoryid, TheoryIDSpec): + --> 418 theoryid = self.check_theoryID(theoryid) + 419 + 420 theoryno, _ = theoryid -~/nngit/nnpdf/validphys2/src/validphys/loader.py in check_theoryID(self, theoryID) - 288 if not theopath.exists(): - 289 raise TheoryNotFound(("Could not find theory %s. 
" ---> 290 "Folder '%s' not found") % (theoryID, theopath) ) - 291 return TheoryIDSpec(theoryID, theopath) - 292 + ~/nngit/nnpdf/validphys2/src/validphys/loader.py in check_theoryID(self, theoryID) + 288 if not theopath.exists(): + 289 raise TheoryNotFound(("Could not find theory %s. " + --> 290 "Folder '%s' not found") % (theoryID, theopath) ) + 291 return TheoryIDSpec(theoryID, theopath) + 292 -TheoryNotFound: Could not find theory 151. Folder '/home/zah/anaconda3/share/NNPDF/data/theory_151' not found -``` + TheoryNotFound: Could not find theory 151. Folder '/home/zah/anaconda3/share/NNPDF/data/theory_151' not found -Output files uploaded to the `validphys` can be retrieved specifying their path + +Output files uploaded to the ``validphys`` can be retrieved specifying their path (starting from the report ID). They will be either downloaded (when using -`FallbackLoader`) or retrieved from the cache: -```python -from validphys.loader import FallbackLoader as Loader -l = Loader() -l.check_vp_output_file('qTpvLZLwS924oAsmpMzhFw==/figures/f_ns0_fitunderlyinglaw_plot_closure_pdf_histograms_0.pdf') -PosixPath('/home/zah/anaconda3/share/NNPDF/vp-cache/qTpvLZLwS924oAsmpMzhFw==/figures/f_ns0_fitunderlyinglaw_plot_closure_pdf_histograms_0.pdf') -``` +``FallbackLoader``) or retrieved from the cache: + +.. code:: python + + from validphys.loader import FallbackLoader as Loader + l = Loader() + l.check_vp_output_file('qTpvLZLwS924oAsmpMzhFw==/figures/f_ns0_fitunderlyinglaw_plot_closure_pdf_histograms_0.pdf') + PosixPath('/home/zah/anaconda3/share/NNPDF/vp-cache/qTpvLZLwS924oAsmpMzhFw==/figures/f_ns0_fitunderlyinglaw_plot_closure_pdf_histograms_0.pdf') + From c22669ca37df5913b5354e9813075b841b291fba Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 18:25:16 +0200 Subject: [PATCH 07/22] added vp/nnprofile.rst --- doc/sphinx/source/vp/index.rst | 2 +- .../source/vp/{nnprofile.md => nnprofile.rst} | 25 ++++++++----------- 2 files changed, 11 insertions(+), 16 deletions(-) rename doc/sphinx/source/vp/{nnprofile.md => nnprofile.rst} (89%) diff --git a/doc/sphinx/source/vp/index.rst b/doc/sphinx/source/vp/index.rst index 5a7570a250..5d5eda67a4 100644 --- a/doc/sphinx/source/vp/index.rst +++ b/doc/sphinx/source/vp/index.rst @@ -51,7 +51,7 @@ Using validphys ./getting-started.rst ./download.rst ./upload.rst - ./nnprofile.md + ./nnprofile.rst ./complex_runcards.rst ./cuts.rst ./datthcomp.rst diff --git a/doc/sphinx/source/vp/nnprofile.md b/doc/sphinx/source/vp/nnprofile.rst similarity index 89% rename from doc/sphinx/source/vp/nnprofile.md rename to doc/sphinx/source/vp/nnprofile.rst index 10bfde066d..2a62f89e0b 100644 --- a/doc/sphinx/source/vp/nnprofile.md +++ b/doc/sphinx/source/vp/nnprofile.rst @@ -1,25 +1,23 @@ -```eval_rst .. _nnprofile: -``` -The `nnprofile.yaml` file +The ``nnprofile.yaml`` file ========================= -The NNPDF code stores some configuration options (mostly various URLs and paths) in a `.yaml` file +The NNPDF code stores some configuration options (mostly various URLs and paths) in a ``.yaml`` file which is installed alongside the code. The default values can be consulted in ``validphys/default_nnprofile.yaml``. -This configuration is used by `validphys` to locate, -[upload](upload) and [download](download) resources. +This configuration is used by ``validphys`` to locate, +:ref:`upload ` and :ref:`download ` resources. 
Altering profile settings -------------------------- -It is possible to set up a custom profile file in: -``` +It is possible to set up a custom profile file in: :: + ${XDG_CONFIG_HOME}/NNPDF/nnprofile.yaml -``` -such that it will be used by every NNPDF installation (note that `${XDG_CONFIG_HOME}` defaults to `~/.config`) + +such that it will be used by every NNPDF installation (note that ``${XDG_CONFIG_HOME}`` defaults to ``~/.config``) or by defining the environment variable ``NNPDF_PROFILE_PATH`` to point to a different profile file, which will be loaded instead by the code. Specifying a custom profile could be useful to add repositories for specific projects or @@ -33,9 +31,8 @@ Options The following settings in the profile file are interpreted by different parts of the code. These should be specified in YAML format. -```eval_rst -``nnpdf_share``` +``nnpdf_share`` Main folder for NNPDF shared resources: theories, fits, hyperscans, etc. Ex: ``nnpdf_share: ~/.local/share/NNPDF``. All other paths are defined relative to ``nnpdf_share``. @@ -79,8 +76,7 @@ the code. These should be specified in YAML format. The name of the remote PDF index. Shouldn't be changed. ``upload_host`` - The SSH host (with user name as in ``user@host``) used to upload - ``validphys`` reports and fits. + The SSH host (with user name as in ``user@host``) used to upload ``validphys`` reports and fits. ``reports_target_dir`` The file path in the server where reports are uploaded to. @@ -93,4 +89,3 @@ the code. These should be specified in YAML format. ``fits_root_url`` The HTTP URL where to download fits from. -``` From c9a2d57fb0f59672835c162f33109884e3d39b9c Mon Sep 17 00:00:00 2001 From: achiefa Date: Fri, 27 Sep 2024 19:14:21 +0200 Subject: [PATCH 08/22] added ci/index.rst --- doc/sphinx/source/ci/index.md | 107 ------------------------------- doc/sphinx/source/ci/index.rst | 112 +++++++++++++++++++++++++++++++++ 2 files changed, 112 insertions(+), 107 deletions(-) delete mode 100644 doc/sphinx/source/ci/index.md create mode 100644 doc/sphinx/source/ci/index.rst diff --git a/doc/sphinx/source/ci/index.md b/doc/sphinx/source/ci/index.md deleted file mode 100644 index e1b18baca6..0000000000 --- a/doc/sphinx/source/ci/index.md +++ /dev/null @@ -1,107 +0,0 @@ -```eval_rst -.. _CI: -``` -# Continuous integration and deployment - -The NNPDF code base makes use of externally hosted services, to aid development -and testing. These are typically called *Continuous integration (CI)* or -*Continuous deployment* services. Their main task is to execute automated tests -on the code and produce [binary builds](conda) which allow it to be -automatically deployed. The services are configured so that they react to -[git](git) pushes to the GitHub server. - -Currently we are using actively [GitHub Actions](https://help.github.com/en/actions). -In the past, the [Travis CI](https://travis-ci.com/) service was used, but owing to timeout failures on Mac we have decided to move the CI to GitHub Actions. -The [Gitlab CI service hosted at -CERN](https://gitlab.cern.ch/) was also used in the past, but support was -discontinued due to the burden of requiring everyone to have a CERN account. - -Furthermore, we implement a self-hosted runner for GitHub Actions for long duration workflows, such as running a full fit pipeline. - -## Operation of CI tools - -Our CI service works roughly as follows: - - 1. Every time a commit is made to a given branch, a request to process the - code in that branch is made automatically. - 2. 
The code for the branch is downloaded to the CI server, and some action is - taken based on configuration found both in the git repository itself and in - the settings on the CI service. These actions include: - * Compiling the code. - * Running the tests. - * Possibly, uploading the compiled binaries and documentation to the - [NNPDF server](server). - We use [Conda-build](https://docs.conda.io/projects/conda-build/en/latest/) - to - do much of the heavy lifting for these actions. - 3. The CI service reports whether it has *succeeded* or *failed* to the GitHub - server, which displays that information next to the relevant pull request or - commit. Some logs are generated, which can aid in determining the cause of - errors. - 4. Generate docker images for tag releases with ready-to-run NNPDF installation. - -The progress reports of the various jobs at GitHub Actions, as well as the -corresponding logs are available at , upon logging in -with an authorized GitHub account. - - -## Configuration of GitHub Actions - -GitHub Actions uses both files found in the NNPDF repository and settings stored in the tab `Settings -> Secrets` of the repository itself. - -### Secrets stored in the GitHub Actions configuration - -To build and upload the packages GitHub Actions needs to be able to access some -secrets, which should not be stored in the git repository. These are represented -as environment variables, under the [secrets for the NNPDF -repository](https://github.com/NNPDF/nnpdf/settings/secrets). The secrets are encoded -using `base64` for simplicity. To use -them, do something like `echo "$" | base64 --decode`. - -The secrets are. - - - `NETRC_FILE` a base64 encoded string containing a `~/.netrc` file with - the [credentials](server-access) to the private conda repository - - - `NNPDF_SSH_KEY` a base64 string containing a private SSH key which is - authorized to access the [upload account of the NNPDF server](server-access). - -### Repository configuration - -The entry point for GitHub Actions are yaml rules files that can be found in the -[`.github/workflows/`](https://github.com/NNPDF/nnpdf/blob/master/.github/workflows/) folder. -They specify which operating systems and versions are tested, which -versions of Python, some environment variables, and command instructions for linux and macos. The commands basically call `conda build` and upload the relevant packages if required. - -By default only packages corresponding to commits to the master branch get -uploaded. For other branches, the build and testing happens, but the results are -discarded in the end. This behavior can be changed by (temporarily) commenting the lines starting with `if: github.ref == 'refs/heads/master'` in the `.github/workflows/rules.yml` file. This can be -useful to test modifications to the uploading. - -When a new tag is created the github action stored in -`.github/workflows/docker.yml` is executed. This action generates a new docker -image containing the tagged code. This docker image is then uploaded to the -[NNPDF GitHub Package -registry](https://github.com/NNPDF/nnpdf/pkgs/container/nnpdf). Finally, the -action stores the conda environment file created during the installation process -and opens a pull request placing the file inside `n3fit/runcards`. This feature -allows recovering the code and results obtained with specific tag of the code. - -## Operation of automatic fit bot - -Our GitHub Action service implements: - - 1. 
Every time the label `run-fit-bot` is added to a pull request, a request to process the code in that branch is made automatically. - 2. The code for the branch is downloaded to the CI self-hosted runner, and some action is - taken based on configuration found both in the git repository itself in `.github/workflow/rules.yml`. These actions include: - * Compiling and installing the code. - * Running a complete fit using the `n3fit/runcards/development.yml` runcard. - * Produces a report and upload results to the [NNPDF server](server). - 3. The CI service reports whether it has *succeeded* or *failed* to the GitHub - server, which displays that information next to the relevant pull request or - commit. Some logs are generated, which can aid in determining the cause of - errors. - 4. If the workflow succeeds, a comment to the initial pull request will appear with link references to the generated report and fit. - -The progress reports of the various jobs at [GitHub Actions](https://github.com/NNPDF/actions), upon logging in -with an authorized GitHub account. diff --git a/doc/sphinx/source/ci/index.rst b/doc/sphinx/source/ci/index.rst new file mode 100644 index 0000000000..35be98cdb1 --- /dev/null +++ b/doc/sphinx/source/ci/index.rst @@ -0,0 +1,112 @@ +.. _CI: + +Continuous integration and deployment +===================================== + +The NNPDF code base makes use of externally hosted services, to aid development +and testing. These are typically called *Continuous integration (CI)* or +*Continuous deployment* services. Their main task is to execute automated tests +on the code and produce :ref:`binary builds ` which allow it to be +automatically deployed. The services are configured so that they react to +:ref:`git ` pushes to the GitHub server. + +Currently we are using actively `GitHub Actions `_. +In the past, the `Travis CI `_ service was used, but owing to timeout failures on Mac we have decided to move the CI to GitHub Actions. +The `Gitlab CI service hosted at CERN `_ was also used in the past, but support was +discontinued due to the burden of requiring everyone to have a CERN account. + +Furthermore, we implement a self-hosted runner for GitHub Actions for long duration workflows, such as running a full fit pipeline. + +Operation of CI tools +--------------------- + +Our CI service works roughly as follows: + + 1. Every time a commit is made to a given branch, a request to process the + code in that branch is made automatically. + 2. The code for the branch is downloaded to the CI server, and some action is + taken based on configuration found both in the git repository itself and in + the settings on the CI service. These actions include: + + - Compiling the code. + - Running the tests. + - Possibly, uploading the compiled binaries and documentation to the :ref:`NNPDF server `. + We use `Conda-build `_ to do much of the heavy lifting for these actions. + 3. The CI service reports whether it has *succeeded* or *failed* to the GitHub, + server, which displays that information next to the relevant pull request or + commit. Some logs are generated, which can aid in determining the cause of errors. + 4. Generate docker images for tag releases with ready-to-run NNPDF installation. + +The progress reports of the various jobs at GitHub Actions, as well as the +corresponding logs are available at https://github.com/NNPDF/nnpdf/actions, upon logging in +with an authorized GitHub account. 
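
For orientation, the "heavy lifting" mentioned above ultimately amounts to
invoking ``conda build`` on the recipe shipped with the repository. A rough
local approximation (run from the repository root; the exact channels and
options are configured in the workflow files, so this is only a sketch):

.. code:: bash

    # Build the conda package (and run its tests) from the in-tree recipe
    conda build conda-recipe/
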

Configuration of GitHub Actions
-------------------------------

GitHub Actions uses both files found in the NNPDF repository and settings stored in the ``Settings -> Secrets`` tab of the repository itself.

Secrets stored in the GitHub Actions configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To build and upload the packages, GitHub Actions needs to be able to access some
secrets, which should not be stored in the git repository. These are represented
as environment variables, under the
`secrets for the NNPDF repository `_. The secrets are encoded
using ``base64`` for simplicity. To use
them, do something like::

    echo "$" | base64 --decode

The secrets are:

 - ``NETRC_FILE`` a base64 encoded string containing a ``~/.netrc`` file with
   the :ref:`credentials ` to the private conda repository
   https://packages.nnpdf.science/conda-private/
 - ``NNPDF_SSH_KEY`` a base64 string containing a private SSH key which is
   authorized to access the :ref:`upload account of the NNPDF server `.

Repository configuration
~~~~~~~~~~~~~~~~~~~~~~~~

The entry points for GitHub Actions are the YAML rules files that can be found in the
`.github/workflows/ `_ folder.
They specify which operating systems and Python versions are tested, set some
environment variables, and give the command instructions for Linux and macOS. The commands basically call ``conda build`` and upload the relevant packages if required.

By default only packages corresponding to commits to the master branch get
uploaded. For other branches, the build and testing happens, but the results are
discarded in the end. This behavior can be changed by (temporarily) commenting the lines starting with ``if: github.ref == 'refs/heads/master'`` in the ``.github/workflows/rules.yml`` file. This can be
useful to test modifications to the uploading.

When a new tag is created the GitHub action stored in
``.github/workflows/docker.yml`` is executed. This action generates a new docker
image containing the tagged code. This docker image is then uploaded to the
`NNPDF GitHub Package
registry `_. Finally, the
action stores the conda environment file created during the installation process
and opens a pull request placing the file inside ``n3fit/runcards``. This feature
allows recovering the code and results obtained with a specific tag of the code.

Operation of automatic fit bot
------------------------------

Our GitHub Actions service implements the following workflow:

 1. Every time the label ``run-fit-bot`` is added to a pull request, a request to process the code in that branch is made automatically.
 2. The code for the branch is downloaded to the CI self-hosted runner, and some action is
    taken based on the configuration found in the git repository itself, in ``.github/workflows/rules.yml``. These actions include:

    * Compiling and installing the code.
    * Running a complete fit using the ``n3fit/runcards/development.yml`` runcard.
    * Producing a report and uploading the results to the :ref:`NNPDF server `.

 3. The CI service reports whether it has *succeeded* or *failed* to the GitHub
    server, which displays that information next to the relevant pull request or
    commit. Some logs are generated, which can aid in determining the cause of errors.
 4. If the workflow succeeds, a comment to the initial pull request will appear with link references to the generated report and fit.

The progress reports of the various jobs are available at `GitHub Actions `_, upon logging in
with an authorized GitHub account.

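
As a complement to the decoding command shown in the secrets section above, the
encoded values can be prepared locally before being pasted into ``Settings ->
Secrets``. A rough sketch, assuming GNU coreutils (the output file names here
are arbitrary):

.. code:: bash

    # Encode the netrc credentials and the SSH key as single-line base64 strings
    base64 -w0 ~/.netrc > netrc_file.b64
    base64 -w0 ~/.ssh/id_rsa > nnpdf_ssh_key.b64
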
+ From e7df4ca991e8f8b835b64972d065ab95b1b950de Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 12:10:15 +0100 Subject: [PATCH 09/22] Converted vp/cuts --- doc/sphinx/source/vp/cuts.md | 184 --------------------------------- doc/sphinx/source/vp/cuts.rst | 185 ++++++++++++++++++++++++++++++++++ 2 files changed, 185 insertions(+), 184 deletions(-) delete mode 100644 doc/sphinx/source/vp/cuts.md create mode 100644 doc/sphinx/source/vp/cuts.rst diff --git a/doc/sphinx/source/vp/cuts.md b/doc/sphinx/source/vp/cuts.md deleted file mode 100644 index ff2921fc80..0000000000 --- a/doc/sphinx/source/vp/cuts.md +++ /dev/null @@ -1,184 +0,0 @@ -Specifying data cuts --------------------- - -The experimental ``CommonData`` files contain more data points than we -actually fit. Some data points are excluded for reasons such as the -instability of the perturbative expansion in their corresponding -kinematic regions. - -There are four possibilities for handling the experimental cuts -within validphys, which are controlled with the ``use_cuts`` -configuration setting: - -``use_cuts: 'nocuts'`` - * This causes the content of the data files to be taken unmodified. - Note that some theory predictions may be ill defined in this - situation. - -``use_cuts: 'fromfit'`` - * The cuts are read from the masks given as input to [``n3fit``](../n3fit/index.html), and - generated by [``vp-setupfit``](scripts.html). An existing fit is required, to load the - cuts, and must contain the masks for all the datasets analyzed in - the active namespace. - -``use_cuts: 'internal'`` - * Compute the cut masks as ``vp-setupfit`` would do. Currently the - parameters ``q2min`` and ``w2min`` must be given. These can in turn be - set to the same as the fit values by loading the ``datacuts`` - namespace from the fit. In this case, the cuts will normally - coincide with the ones loaded with the ``fromfit`` setting. - -``use_cuts: 'fromintersection'`` - * Compute the internal cuts as per ``use_cuts: 'internal'`` - within each namespace in a [namespace list](#multiple-inputs-and-namespaces) called - ``cuts_intersection_spec`` and take the intersection of the results as - the cuts for the given dataset. This is useful for example for - requiring the common subset of points that pass the cuts at NLO and - NNLO. - -``use_cuts: 'fromsimilarpredictions'`` - * Compute the intersection between two namespaces (similar to for - ``fromintersection``) but additionally require that the predictions computed for - each dataset across the namespaces are *similar*, specifically that the ratio - between the absolute difference in the predictions and the total experimental - uncertainty is smaller than a given value, ``cut_similarity_threshold`` that - must be provided. Note that for this to work with different C-factors across - the namespaces, one must provide a different ``dataset_inputs`` list for each. - * This mechanism can be ignored selectively for specific datasets. To do - that, add their names to a list called ``do_not_require_similarity_for``. The - datasets in the list do not need to appear in the ``cuts_intersection_spec`` - namespace and will be filtered according to the internal cuts unconditionally. 
- - -The following example demonstrates the first three options: - -```yaml -meta: - title: Test the various options for CutsPolicy - author: Zahari Kassabov - keywords: [test, debug] - -fit: NNPDF40_nlo_as_01180 - -theory: - from_: fit - -theoryid: - from_: theory - -#Load q2min and w2min from the fit -datacuts: - from_: fit - - -# Used for intersection cuts -cuts_intersection_spec: - - theoryid: 208 - - theoryid: 162 - -dataset_input: {dataset: ATLASDY2D8TEV} - -dataspecs: - - speclabel: "No cuts" - use_cuts: "nocuts" - - - speclabel: "Fit cuts" - use_cuts: "fromfit" - - - speclabel: "Internal cuts" - use_cuts: "internal" - - - speclabel: "Intersected cuts" - use_cuts: "fromintersection" - -template_text: | - {@with fitpdf::datacuts@} - # Plot - - {@fitpdf::datacuts plot_fancy_dataspecs@} - - # χ² plots - - {@with dataspecs@} - ## {@speclabel@} - - {@plot_chi2dist@} - - {@endwith@} - {@endwith@} - - -actions_: - - report(main=True) -``` - -Here we put together the results with the different filtering policies -in a [data-theory comparison](data-theory-comp) plot and then plot the χ² distribution -for each one individually. With these settings the latter three -[dataspecs](#general-data-specification-the-dataspec-api) give the -same result. - -The following example demonstrates the use of `fromsimilarpredictions`: - -```yaml -meta: - title: "Test similarity cuts: Threshold 1,2" - author: Zahari Kassabov - keywords: [test] - -show_total: True - -NNLODatasts: &NNLODatasts -- {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_13TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_7TEV_T-Y-NORM, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_7TEV_TBAR-Y-NORM, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_8TEV_T-RAP-NORM, frac: 0.75, variant: legacy} # N - -NLODatasts: &NLODatasts -- {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_13TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_7TEV_T-Y-NORM, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_7TEV_TBAR-Y-NORM, frac: 1.0, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_8TEV_T-RAP-NORM, frac: 0.75, variant: legacy} # N -- {dataset: ATLAS_SINGLETOP_8TEV_TBAR-RAP-NORM, frac: 0.75, variant: legacy} # N - -do_not_require_similarity_for: [ATLAS_SINGLETOP_8TEV_TBAR-RAP-NORM] - - -dataset_inputs: *NLODatasts - -cuts_intersection_spec: - - theoryid: 208 - pdf: NNPDF40_nlo_as_01180 - dataset_inputs: *NLODatasts - - - theoryid: 200 - pdf: NNPDF40_nnlo_as_01180 - dataset_inputs: *NNLODatasts - - -theoryid: 208 -pdf: NNPDF40_nlo_as_01180 - -dataspecs: - - - use_cuts: internal - speclabel: "No cuts" - - - - cut_similarity_threshold: 2 - speclabel: "Threshold 2" - use_cuts: fromsimilarpredictions - - - - cut_similarity_threshold: 1 - speclabel: "Threshold 1" - use_cuts: fromsimilarpredictions - -template_text: | - {@dataspecs_chi2_table@} - -actions_: - - report(main=True) -``` diff --git a/doc/sphinx/source/vp/cuts.rst b/doc/sphinx/source/vp/cuts.rst new file mode 100644 index 0000000000..34f8eea7ff --- /dev/null +++ b/doc/sphinx/source/vp/cuts.rst @@ -0,0 +1,185 @@ +Specifying data cuts +-------------------- + +The experimental :code:`CommonData` files contain more data points than we +actually fit. Some data points are excluded for reasons such as the +instability of the perturbative expansion in their corresponding +kinematic regions. 
+ +There are four possibilities for handling the experimental cuts +within validphys, which are controlled with the ``use_cuts`` +configuration setting: + +:code:`use_cuts: 'nocuts'` + * This causes the content of the data files to be taken unmodified. + Note that some theory predictions may be ill defined in this + situation. + +:code:`use_cuts: 'fromfit'` + * The cuts are read from the masks given as input to :ref:`n3fit `, and + generated by :ref:`vp-setupfit `. An existing fit is required, to load the + cuts, and must contain the masks for all the datasets analyzed in + the active namespace. + +:code:`use_cuts: 'internal'` + * Compute the cut masks as ``vp-setupfit`` would do. Currently the + parameters ``q2min`` and ``w2min`` must be given. These can in turn be + set to the same as the fit values by loading the ``datacuts`` + namespace from the fit. In this case, the cuts will normally + coincide with the ones loaded with the ``fromfit`` setting. + +:code:`use_cuts: 'fromintersection'` + * Compute the internal cuts as per ``use_cuts: 'internal'`` + within each namespace in a [namespace list](#multiple-inputs-and-namespaces) called + ``cuts_intersection_spec`` and take the intersection of the results as + the cuts for the given dataset. This is useful for example for + requiring the common subset of points that pass the cuts at NLO and + NNLO. + +:code:`use_cuts: 'fromsimilarpredictions'` + * Compute the intersection between two namespaces (similar to for + ``fromintersection``) but additionally require that the predictions computed for + each dataset across the namespaces are *similar*, specifically that the ratio + between the absolute difference in the predictions and the total experimental + uncertainty is smaller than a given value, ``cut_similarity_threshold`` that + must be provided. Note that for this to work with different C-factors across + the namespaces, one must provide a different ``dataset_inputs`` list for each. + * This mechanism can be ignored selectively for specific datasets. To do + that, add their names to a list called ``do_not_require_similarity_for``. The + datasets in the list do not need to appear in the ``cuts_intersection_spec`` + namespace and will be filtered according to the internal cuts unconditionally. + + +The following example demonstrates the first three options: + +.. code:: yaml + + meta: + title: Test the various options for CutsPolicy + author: Zahari Kassabov + keywords: [test, debug] + + fit: NNPDF40_nlo_as_01180 + + theory: + from_: fit + + theoryid: + from_: theory + + #Load q2min and w2min from the fit + datacuts: + from_: fit + + + # Used for intersection cuts + cuts_intersection_spec: + - theoryid: 208 + - theoryid: 162 + + dataset_input: {dataset: ATLASDY2D8TEV} + + dataspecs: + - speclabel: "No cuts" + use_cuts: "nocuts" + + - speclabel: "Fit cuts" + use_cuts: "fromfit" + + - speclabel: "Internal cuts" + use_cuts: "internal" + + - speclabel: "Intersected cuts" + use_cuts: "fromintersection" + + template_text: | + {@with fitpdf::datacuts@} + # Plot + + {@fitpdf::datacuts plot_fancy_dataspecs@} + + # χ² plots + + {@with dataspecs@} + ## {@speclabel@} + + {@plot_chi2dist@} + + {@endwith@} + {@endwith@} + + + actions_: + - report(main=True) + +Here we put together the results with the different filtering policies +in a [data-theory comparison](data-theory-comp) plot and then plot the χ² distribution +for each one individually. 
With these settings the latter three +[dataspecs](#general-data-specification-the-dataspec-api) give the +same result. + +The following example demonstrates the use of `fromsimilarpredictions`: + +.. code:: yaml + + meta: + title: "Test similarity cuts: Threshold 1,2" + author: Zahari Kassabov + keywords: [test] + + show_total: True + + NNLODatasts: &NNLODatasts + - {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_13TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_7TEV_T-Y-NORM, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_7TEV_TBAR-Y-NORM, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_8TEV_T-RAP-NORM, frac: 0.75, variant: legacy} # N + + NLODatasts: &NLODatasts + - {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_13TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_7TEV_T-Y-NORM, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_7TEV_TBAR-Y-NORM, frac: 1.0, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_8TEV_T-RAP-NORM, frac: 0.75, variant: legacy} # N + - {dataset: ATLAS_SINGLETOP_8TEV_TBAR-RAP-NORM, frac: 0.75, variant: legacy} # N + + do_not_require_similarity_for: [ATLAS_SINGLETOP_8TEV_TBAR-RAP-NORM] + + + dataset_inputs: *NLODatasts + + cuts_intersection_spec: + - theoryid: 208 + pdf: NNPDF40_nlo_as_01180 + dataset_inputs: *NLODatasts + + - theoryid: 200 + pdf: NNPDF40_nnlo_as_01180 + dataset_inputs: *NNLODatasts + + + theoryid: 208 + pdf: NNPDF40_nlo_as_01180 + + dataspecs: + + - use_cuts: internal + speclabel: "No cuts" + + + - cut_similarity_threshold: 2 + speclabel: "Threshold 2" + use_cuts: fromsimilarpredictions + + + - cut_similarity_threshold: 1 + speclabel: "Threshold 1" + use_cuts: fromsimilarpredictions + + template_text: | + {@dataspecs_chi2_table@} + + actions_: + - report(main=True) + From ca7670eed81fd0117cb120369b8eba4f3fc472ad Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 14:05:51 +0100 Subject: [PATCH 10/22] Converted contributing/git --- .../source/contributing/{git.md => git.rst} | 43 ++++++++++--------- 1 file changed, 23 insertions(+), 20 deletions(-) rename doc/sphinx/source/contributing/{git.md => git.rst} (52%) diff --git a/doc/sphinx/source/contributing/git.md b/doc/sphinx/source/contributing/git.rst similarity index 52% rename from doc/sphinx/source/contributing/git.md rename to doc/sphinx/source/contributing/git.rst index 476ef8a8f0..618adc5782 100644 --- a/doc/sphinx/source/contributing/git.md +++ b/doc/sphinx/source/contributing/git.rst @@ -1,44 +1,47 @@ -```eval_rst .. _git: -``` -# Git, GitHub and GitLab +Git, GitHub and GitLab +====================== -## Git -[Git](https://git-scm.com/) is the version control system adopted within the NNPDF Collaboration. +Git +--- + +`Git `_ is the version control system adopted within the NNPDF Collaboration. Among other things, Git allows multiple people to edit code simultaneously; it allows users to track changes to the code, i.e. a user can see who edited what and when; and, it allows a user to view any past version of the code. The latter two points are particularly useful for running tests should a bug be introduced to the code, for example. -### Learning Git +Learning Git +~~~~~~~~~~~~ Using Git can be slightly confusing at first, but by mastering a few basic commands you will be able to do most of the tasks that you need to do day-to-day. 
Git also has the advantage that at the moment it is probably the most popular version control system out there, so any time you in invest in learning Git will most likely be useful for projects outside of NNPDF. Many online tutorials and -guides exist for learning Git, but here are two that I have used before: a [quick -guide](http://rogerdudler.github.io/git-guide/) and a more in depth -[tutorial](https://www.codecademy.com/learn/learn-git). The [official -documentation](https://git-scm.com/docs) might be useful as well. +guides exist for learning Git, but here are two that I have used before: a `quick +guide `_ and a more in depth +`tutorial `_. The +`official documentation `_ might be useful as well. -### GitHub development workflow +GitHub development workflow +~~~~~~~~~~~~~~~~~~~~~~~~~~~ GitHub provides the following workflow: * Users can create Projects and Milestones for each project. * For each project users can open issues, which can be used to request bug fixes, new features, new -documentation, or simply to facilitate a general discussion. + documentation, or simply to facilitate a general discussion. * When it is clear how an issue should be dealt with, a -[branch](https://github.com/NNPDF/nnpdf/branches) can be opened where a user can implement the -requested feature. - -* Once a feature is ready to be considered for merging into the master version of the code, a [pull -request](https://github.com/NNPDF/nnpdf/pulls) (PR) can be opened. At least two code reviewers -must then be assigned, after which the code will be reviewed and discussed. The modification will -then be accepted or rejected. Further general information on PRs can found -[here](https://help.github.com/en/articles/about-pull-requests). + `branch `_ can be opened where a user can implement the + requested feature. + +* Once a feature is ready to be considered for merging into the master version of the code, a + `pull request `_ (PR) can be opened. At least two code reviewers + must then be assigned, after which the code will be reviewed and discussed. The modification will + then be accepted or rejected. Further general information on PRs can found + `here `_. From 4eb3667a1d385498e08521328cc8f286656c57db Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 14:06:14 +0100 Subject: [PATCH 11/22] Converted contributing/python-tools --- .../{python-tools.md => python-tools.rst} | 166 +++++++++--------- 1 file changed, 85 insertions(+), 81 deletions(-) rename doc/sphinx/source/contributing/{python-tools.md => python-tools.rst} (52%) diff --git a/doc/sphinx/source/contributing/python-tools.md b/doc/sphinx/source/contributing/python-tools.rst similarity index 52% rename from doc/sphinx/source/contributing/python-tools.md rename to doc/sphinx/source/contributing/python-tools.rst index fe9fc93180..1913786698 100644 --- a/doc/sphinx/source/contributing/python-tools.md +++ b/doc/sphinx/source/contributing/python-tools.rst @@ -1,195 +1,199 @@ -```eval_rst .. _pytools: -``` -# Tools for developing with the Python programming language +Tools for developing with the Python programming language +========================================================= This page summarizes auxiliary Python tools that we commonly use to develop the project. Note that this page is meant to be a quick index. Consult the documentation on each specific tool for details. 
-## Python editors +Python editors +-------------- - - The [Spyder editor](https://www.spyder-ide.org/) is good for getting started + - The `Spyder editor `_ is good for getting started with scientific Python coding, because of various inspection and interactive features. - - [`vscode`](https://code.visualstudio.com/) is a more full featured editor. + - `vscode `_ is a more full featured editor. - In the long run, the most efficient approach is to learn a terminal based - editor such as [`vim`](https://www.vim.org/). Note that `vim` editing modes can be - added as extensions to graphical editors such as `vscode`. + editor such as `vim `_. Note that `vim` editing modes + can be added as extensions to graphical editors such as :code:`vscode`. -## Interactive development +Interactive development +----------------------- Python code can be evaluated interactively, which can speed up the development. - - [IPython shell](https://ipython.org/): It is notably nicer to use than the + - `IPython shell `_: It is notably nicer to use than the standard interactive interpreter. - - [Jupyter notebook](https://jupyter.org/): Interactive development + - `Jupyter notebook `_: Interactive development environment running on the browser. Useful for bigger experiments. -```eval_rst .. note:: When developing :ref:`validphys ` related code interactively, be sure to read about the :ref:`API functionality `. -``` -## Testing +Testing +------- - - [`pytest`](https://docs.pytest.org/en/latest/): It is a framework for + - `pytest `_: It is a framework for writing and running tests. It finds tests in the codebase (basically - modules and functions that start with `test`), enhances the `assert` + modules and functions that start with ``test``), enhances the ``assert`` statement to provide rich error reporting and allows to structure - dependencies between the tests (in a way similar to `reportengine`). + dependencies between the tests (in a way similar to ``reportengine``). Tests are stored in the codebase and executed by pytest either manually or as a part of the continuous integration process. - - [`coverage.py`](https://coverage.readthedocs.io/en/coverage-5.2.1/) is a + - `coverage.py `_ is a program that traces which lines of code have been executed when a given Python program (notably pytest) is running. The main use case is to verify that tests probe our code paths. - -```eval_rst .. _pytoolsqa: -``` -## Code quality and reviewing -See also [*Reviewing pull requests*](reviews). Note that these can typically be +Code quality and reviewing +-------------------------- + +See also :ref:`reviewing pull requests `. Note that these can typically be integrated with your editor of choice. - - The [`pylint`](https://www.pylint.org/) tool allows for the catching of + - The `pylint `_ tool allows for the catching of common problems in Python code. The top level - [`.pylintrc` file](https://github.com/NNPDF/nnpdf/blob/master/.pylintrc) + ``.pylintrc`` `file `_ comes with a useful and not overly noisy configuration. - New Python code should come formatted with - [`black` tool](https://github.com/psf/black) with [our default - configuration](https://github.com/NNPDF/nnpdf/blob/master/pyproject.toml) - - The [`isort`](https://pycqa.github.io/isort/) library sorts imports + ``black`` `tool `_ with `our default + configuration `_ + - The ``isort`` `library `_ sorts imports alphabetically, and automatically separated into sections and by type. 
- - [`pre-commit`](https://pre-commit.com/) is a tool that, can automatically + - `pre-commit `_ is a tool that, can automatically check for stylistic problems in code such as trailing whitespaces or forgotten debug statements. Our configuration can be found in - [`.pre-commit-configuration.yaml`](https://github.com/NNPDF/nnpdf/blob/master/.pre-commit-configuration.yaml) - and also ensures that `black` and `isort` are run. + `.pre-commit-configuration.yaml `_ + and also ensures that ``black`` and ``isort`` are run. -## Debugging +Debugging +--------- Usually the most efficient way to debug a piece of Python code, such as a -`validphys` action is to insert `print` statements to check the state at various +``validphys`` action is to insert ``print`` statements to check the state at various places in the code. A few alternatives exists when that is not enough: - - [IPython embed](https://ipython.readthedocs.io/en/stable/api/generated/IPython.terminal.embed.html): - The [IPython](https://ipython.org/) shell can be easily dropped at any + - `IPython embed `_: + The `IPython `_ shell can be easily dropped at any arbitrary point in the code. Write - ```python - import IPython - IPython.embed() - ``` + .. code:: python + + import IPython + + IPython.embed() + at the location of the code you want to debug. You will then be able to query (and manipulate) the state of the code using a rich shell. - - PDB: The standard [Python debugger](https://docs.python.org/3/library/pdb.html) - can be used as an alternative. Compared to `IPython` it has the advantage that + - PDB: The standard `Python debugger `_ + can be used as an alternative. Compared to ``IPython`` it has the advantage that it allows to automatically step in the execution of the code, but the disadvantage that the interface is somewhat more complex and often surprising (hint: always - [prefix interpreter commands with `!`](https://docs.python.org/3/library/pdb.html#pdbcommand-!)). + `prefix interpreter commands with `_ ``!``. -## Performance profiling +Performance profiling +--------------------- Sometimes a piece of code runs slower than expected. The reasons can often be surprising. It is a good idea to measure where the problems actually are. - - [`py-spy`](https://github.com/benfred/py-spy): A performance measuring + - `py-spy `_: A performance measuring program (*profiler*) that provides good information and little overhead. - Prefer it to the standard `cProfile`. The output is typically presented in + Prefer it to the standard ``cProfile``. The output is typically presented in the form of "Flamegraphs" that show the relative time spent on each piece of code. -## Documentation +Documentation +------------- - - We use the [Sphinx tool](https://www.sphinx-doc.org/) to document code + - We use the `Sphinx tool `_ to document code projects. It can render and organize special purpose documentation files as well as read Python source files to automatically document interfaces. It supports extensive customization and plugins. In particular because the default formatting for docstrings is somewhat unwieldy, it is recommended - to enable the `napoleon` extension which allows for a more lenient - [`numpydoc`](https://numpydoc.readthedocs.io/en/latest/format.html) style. + to enable the ``napoleon`` extension which allows for a more lenient + `numpydoc `_ style. Similarly the default RST markup language can be overwhelming for simple documents. 
We enable the - [recommonmark](https://recommonmark.readthedocs.io/en/latest/) extension to + `recommonmark `_ extension to be able to compose files also in markdown format. -## Python static checks and code style +Python static checks and code style -We use [Pylint](https://www.pylint.org/) to provide static checking e.g. +We use `Pylint `_ to provide static checking e.g. finding basic errors that a compiler would catch in compiled languages. An example is using an unknown variable name. Pylint also provides basic guidelines on the structure of the code (e.g. avoid functions that are to complicated). Because Pylint is way too pedantic by default, we limit the checks to only those -considered useful. The `.pylintrc` file at the top level configures Pylint to +considered useful. The ``.pylintrc`` file at the top level configures Pylint to only mind those checks. Most Python IDEs and editors have some kind of support for Pylint. It is strongly recommended to configure the editor to show the problematic pieces of code proactively. -New code should use the [Black](https://black.readthedocs.io/en/stable/>) tool to +New code should use the `Black `_ tool to format the code. This tool should not be used to aggressively reformat existing files. -## Matplotlib Image Comparison Tests +Matplotlib Image Comparison Tests +--------------------------------- It is possible to create tests which perform an image comparison between a generated plot and a pre-existing baseline plot. Clearly this allows one to check consistency in figure generation. Before beginning you will need to ensure that you have the tests dependencies, -which can be checked in `nnpdf/conda-recipe/meta.yml`. +which can be checked in :code:`nnpdf/conda-recipe/meta.yml`. The next step is to write the test function. It is highly recommended to use the -[validphys API](../vp/api.md) for this, both to simplify the code and to make it agnostic to the +:ref:`validphys API ` for this, both to simplify the code and to make it agnostic to the structure of backend providers - provided that they produce the same results. See -for example a function which tests the `plot_pdfs` provider: - -```python -@pytest.mark.mpl_image_compare -def test_plotpdfs(): - pdfs = ["NNPDF31_nnlo_as_0118"] - Q = 10 - flavours = ["g"] - # plot_pdfs returns a generator with (figure, name_hint) - return next(API.plot_pdfs(pdfs=pdfs, Q=Q, flavours=flavours))[0] -``` +for example a function which tests the ``plot_pdfs`` provider: + +.. code:: python + + @pytest.mark.mpl_image_compare + def test_plotpdfs(): + pdfs = ["NNPDF31_nnlo_as_0118"] + Q = 10 + flavours = ["g"] + # plot_pdfs returns a generator with (figure, name_hint) + return next(API.plot_pdfs(pdfs=pdfs, Q=Q, flavours=flavours))[0] We see that the function needs to return a valid matplotlib figure, and should -be decorated with `@pytest.mark.mpl_image_compare`. +be decorated with :code:`@pytest.mark.mpl_image_compare`. Now the baseline figure needs to be generated, this can be achieved by running -``` -pytest -k --mpl-generate-path=baseline -``` +.. code:: bash + + pytest -k --mpl-generate-path=baseline + -which will generate a PNG of the figure in the `src/validphys/tests/baseline` +which will generate a PNG of the figure in the :code:`src/validphys/tests/baseline` directory. It is recommended to put all baseline plots in this directory so that they are automatically installed, and so will be in the correct location when -the [CI](../ci/index.md) runs the test suite. +the :ref:`CI ` runs the test suite. 
Now that the baseline figure exists you can check that your test works: -``` -pytest -k --mpl -``` +.. code:: bash + pytest -k --mpl Also you can check that the test has been added to the full test suite: -``` -pytest --pyargs --mpl validphys -``` +.. code:: bash + pytest --pyargs --mpl validphys -Just note that if you do not put the `--mpl` flag then the test will just check +Just note that if you do not put the :code:`--mpl` flag then the test will just check that the function runs without error, and won't check that the output matches to baseline. From f6e06530efe4533cc35c5509edf1b84729101f2b Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 15:19:58 +0100 Subject: [PATCH 12/22] Converted contributing/rules --- .../contributing/{rules.md => rules.rst} | 66 ++++++++++--------- 1 file changed, 34 insertions(+), 32 deletions(-) rename doc/sphinx/source/contributing/{rules.md => rules.rst} (72%) diff --git a/doc/sphinx/source/contributing/rules.md b/doc/sphinx/source/contributing/rules.rst similarity index 72% rename from doc/sphinx/source/contributing/rules.md rename to doc/sphinx/source/contributing/rules.rst index 7f37cca83b..2ca1ee6ee5 100644 --- a/doc/sphinx/source/contributing/rules.md +++ b/doc/sphinx/source/contributing/rules.rst @@ -1,58 +1,58 @@ -```eval_rst .. _rules: -``` -# Code development +Code development +================ Code development is carried out using Github. -For more information on the Git workflow that NNPDF adopts, see the [Git and GitHub](./git.md) section. +For more information on the Git workflow that NNPDF adopts, see the :ref:`Git and GitHub] ` section. -## Code contributions +Code contributions +------------------ -Code contributions should be presented in the form of [Pull -Requests](https://github.com/NNPDF/nnpdf/pulls)(PRs) to the repository. +Code contributions should be presented in the form of `Pull +Requests `_ (PRs) to the repository. Avoid committing modifications directly to the master version of the code. Instead, create a new branch and make modifications on it. This PR should adhere to the following rules: * **A clear explanation of the aims of the PR** should be given, i.e. what issue(s) are you trying to -address? If the reason for the PR has already been detailed in an issue, then this issue should be -linked in the PR. + address? If the reason for the PR has already been detailed in an issue, then this issue should be + linked in the PR. -* The PR should contain **[documentation](../sphinx-documentation.md) describing - the new features**, if applicable. +* The PR should contain **documentation describing the new features**, if applicable. * If the PR is fixing a bug, information should be given such that a reviewer can reproduce the bug. -* The PR should have **at least one developer assigned to it**, whose task it is to [review](reviews) the -code. The PR cannot be merged into master before the reviewer has approved it. +* The PR should have **at least one developer assigned to it**, whose task it is to :ref:`review ` the + code. The PR cannot be merged into master before the reviewer has approved it. -* Before a PR can be merged into master, the **Travis build for it must pass** (see [here](../ci/index.md)). -Practically, this means that you should find a green tick next to your PR on the relevant [PR -page](https://github.com/NNPDF/nnpdf/pulls). If you instead find a red cross next to your PR, the -reason for the failure must be investigated and dealt with appropriately. 
+* Before a PR can be merged into master, the **Travis build for it must pass** (see :ref:`here `). + Practically, this means that you should find a green tick next to your PR on the relevant `PR + page `_. If you instead find a red cross next to your PR, the + reason for the failure must be investigated and dealt with appropriately. * When writing examples, please use the recommended resources detailed -[here](vpexamples). + :ref:`here `. -## Example pull request +Example pull request +-------------------- You may find it instructive to go though this pull request that implements new convolution methods: - + https://github.com/NNPDF/nnpdf/pull/708/ It demonstrates how to add a new feature, together with relevant tests and documentation, and refine it based on the discussion. -```eval_rst .. _reviews: -``` -## Reviewing pull requests -All changes to the code [should](rules) be reviewed by at least one person (and ideally +Reviewing pull requests +----------------------- + +All changes to the code :ref:`should ` be reviewed by at least one person (and ideally at least two). The expected benefits of the policy are: - It should improve the overall quality of the code. @@ -65,7 +65,8 @@ at least two). The expected benefits of the policy are: and maintain in the future, and conform to the structure of the rest of the project. -### Guidelines for reviewing +Guidelines for reviewing +~~~~~~~~~~~~~~~~~~~~~~~~ The following approach has been found helpful for reviewers, when reviewing pull requests: @@ -101,13 +102,14 @@ requests: code should not break them. Some commits corresponding to major cosmetic changes have been collected in - [`.git-blame-ignore-revs`]( - https://docs.github.com/en/repositories/working-with-files/using-files/viewing-a-file#ignore-commits-in-the-blame-view - ). It is possible to configure the local git to ignore these commits when - running `git blame`: - ``` - git config blame.ignoreRevsFile .git-blame-ignore-revs - ``` + `.git-blame-ignore-revs + `_. + It is possible to configure the local git to ignore these commits when + running ``git blame``: + + .. code:: bash + + git config blame.ignoreRevsFile .git-blame-ignore-revs - Regardless of automated tests, always run code with the new changes From a2a3534187cb5373b7701932644ff2957ed963db Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 15:36:27 +0100 Subject: [PATCH 13/22] Converted external-code/cross-secs --- .../{cross-secs.md => cross-secs.rst} | 18 ++++++++++-------- doc/sphinx/source/external-code/index.rst | 6 +++--- 2 files changed, 13 insertions(+), 11 deletions(-) rename doc/sphinx/source/external-code/{cross-secs.md => cross-secs.rst} (69%) diff --git a/doc/sphinx/source/external-code/cross-secs.md b/doc/sphinx/source/external-code/cross-secs.rst similarity index 69% rename from doc/sphinx/source/external-code/cross-secs.md rename to doc/sphinx/source/external-code/cross-secs.rst index 355e1795b1..354feb4591 100644 --- a/doc/sphinx/source/external-code/cross-secs.md +++ b/doc/sphinx/source/external-code/cross-secs.rst @@ -1,4 +1,5 @@ -# Partonic cross section generation +Partonic cross section generation +================================= Many programmes exist to evaluate partonic cross sections. 
Some are general purpose, such as MadGraph5\_aMC@NLO and MCFM, in that they compute predictions for a variety of physical processes @@ -10,26 +11,27 @@ supplements these with NNLO QCD corrections which are computed with a code with These C-factors are often provided to the collaboration by external parties, rather than the code being run in-house. -[MadGraph5\_aMC@NLO](https://launchpad.net/mg5amcnlo) is the programme that will be used for most of +`MadGraph5\_aMC@NLO `_ is the programme that will be used for most of the future NNPDF calculations of partonic cross sections. This is in large part due to its ability to compute predictions at NLO in QCD with NLO EW corrections. To generate APPLgrids from -MadGraph5\_aMC@NLO, one can use [aMCfast](https://amcfast.hepforge.org/), which interfaces between +MadGraph5\_aMC@NLO, one can use `aMCfast `_, which interfaces between the two formats. -## Other codes +Other codes +----------- -[MCFM](https://mcfm.fnal.gov/) ('Monte Carlo for FeMtobarn processes') is an alternative programme +`MCFM `_ ('Monte Carlo for FeMtobarn processes') is an alternative programme to MadGraph5\_aMC@NLO, which instead uses mcfm-bridge as an interface to generate APPLgrids. -[FEWZ](https://arxiv.org/abs/1011.3540) ('Fully Exclusive W and Z Production') is a programme for +`FEWZ `_ ('Fully Exclusive W and Z Production') is a programme for calculating (differential) cross sections for the Drell-Yan production of lepton pairs up to NNLO in QCD. -[NLOjet++](http://www.desy.de/~znagy/Site/NLOJet++.html) is a programme that can compute cross +`NLOjet++ `_ is a programme that can compute cross sections for a variety of processes up to NLO. The processes include electron-positron annihilation, deep-inelastic scattering (DIS), photoproduction in electron-proton collisions, and a variety of processes in hadron-hadron collisions. -[Top++](http://www.precision.hep.phy.cam.ac.uk/top-plus-plus/) is a programme for computing top +`Top++ `_ is a programme for computing top quark pair production inclusive cross sections at NNLO in QCD with soft gluon resummation included up to next-to-next-to-leading log (NNLL). diff --git a/doc/sphinx/source/external-code/index.rst b/doc/sphinx/source/external-code/index.rst index 824dbb10b5..eaa43b51f0 100644 --- a/doc/sphinx/source/external-code/index.rst +++ b/doc/sphinx/source/external-code/index.rst @@ -15,6 +15,6 @@ various external codes that you will frequently encounter are described. .. toctree:: :maxdepth: 1 - ./pdf-codes.md - ./grids.md - ./cross-secs.md + ./pdf-codes.rst + ./grids.rst + ./cross-secs.rst From cd0f6331664c5dd267f16516e3424409301c8c45 Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 15:36:43 +0100 Subject: [PATCH 14/22] Converted external-code/grids --- .../external-code/{grids.md => grids.rst} | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) rename doc/sphinx/source/external-code/{grids.md => grids.rst} (71%) diff --git a/doc/sphinx/source/external-code/grids.md b/doc/sphinx/source/external-code/grids.rst similarity index 71% rename from doc/sphinx/source/external-code/grids.md rename to doc/sphinx/source/external-code/grids.rst index d7d2e95924..c37018807b 100644 --- a/doc/sphinx/source/external-code/grids.md +++ b/doc/sphinx/source/external-code/grids.rst @@ -1,4 +1,5 @@ -# Grid generation +Grid generation +=============== Grids play a crucial role in NNPDF fits. This is because they enable otherwise very time consuming computations to be computed on the fly during an NNPDF fit. 
The guiding principle behind producing @@ -10,7 +11,7 @@ functions) while FK tables combine APPLgrids with DGLAP evolution kernels from A means that FK tables can simply be combined with PDFs at the fitting scale to produce predictions for observables at the scale of the process. -[APPLgrid](https://applgrid.hepforge.org/) is a C++ programme that allows the user to change certain +`APPLgrid `_ is a C++ programme that allows the user to change certain settings within observable calculations a posteriori. Most importantly, the user can change the PDF set used, but they can also alter the renormalisation scale, factorisation scale and the strong coupling constant. Without APPLgrids, such changes would usually require a full rerun of the code, @@ -18,15 +19,14 @@ which is very time consuming. Moreover, these features are crucial for PDF fits, sections must be convolved with different PDFs on the fly many times. APPLgrid works for hadron collider processes up to NLO in QCD, although work is ongoing to also include NLO electroweak corrections in the APPLgrid format. In addition to the standard version of APPLgrid, a modified -version of APPLgrid exists which includes photon channels. This is known as APPLgridphoton. To -learn how to generate APPLgrids, please see the tutorial [here](../tutorials/APPLgrids.md). +version of APPLgrid exists which includes photon channels. This is known as APPLgridphoton. -APFELcomb generates FK tables for NNPDF fits. Information on how to use it can be found -[here](./apfelcomb.md). You can read about the mechanism behind APFELcomb -[here](https://arxiv.org/abs/1605.02070) and find more information about the theory behind FK tables -in the [Theory section](../Theory/FastInterface.rst). +APFELcomb generates FK tables for NNPDF fits. You can read about the mechanism behind APFELcomb +`here `_ and find more information about the theory behind FK tables +in the :ref:`Theory section `. -## Other codes +Other codes +----------- -[fastNLO](https://fastnlo.hepforge.org/) is an alternative code to APPLgrid, which is currently not +`fastNLO `_ is an alternative code to APPLgrid, which is currently not used by NNPDF, since the grids produced by fastNLO are not interfaced with the NNPDF code. 
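Whichever code produces the interpolation grid, the reason grids make on-the-fly
predictions affordable is that the expensive perturbative part is precomputed: once an
FK table is available, a theory prediction during the fit is just a contraction of that
table with the PDF sampled at the fitting scale (one PDF copy for DIS-like observables,
two for hadronic ones). A toy sketch of the DIS-like case, with made-up shapes and
random numbers standing in for a real FK table:

.. code:: python

    import numpy as np

    # toy dimensions: number of data points, flavours and x-grid points
    ndata, nflav, nx = 5, 14, 30

    rng = np.random.default_rng(0)
    fk_table = rng.random((ndata, nflav, nx))   # stands in for a real FK table
    pdf_at_q0 = rng.random((nflav, nx))         # PDF values at the fitting scale

    # one prediction per data point, with no DGLAP evolution needed at fit time
    predictions = np.einsum("dfx,fx->d", fk_table, pdf_at_q0)
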
From ae74c267f00f0ccf4ce521801c89e4fa74ad5a6d Mon Sep 17 00:00:00 2001 From: achiefa Date: Wed, 9 Oct 2024 16:32:59 +0100 Subject: [PATCH 15/22] converted tutorial docs to rst --- .../{pdf-codes.md => pdf-codes.rst} | 35 +-- doc/sphinx/source/theory/index.rst | 2 + .../{closuretest.md => closuretest.rst} | 190 +++++++------ .../{compare-fits.md => compare-fits.rst} | 23 +- .../source/tutorials/{conda.md => conda.rst} | 9 +- doc/sphinx/source/tutorials/datthcomp.md | 54 ---- doc/sphinx/source/tutorials/datthcomp.rst | 58 ++++ doc/sphinx/source/tutorials/index.rst | 16 +- doc/sphinx/source/tutorials/list-resources.md | 43 --- .../source/tutorials/list-resources.rst | 45 +++ doc/sphinx/source/tutorials/report.md | 83 ------ doc/sphinx/source/tutorials/report.rst | 82 ++++++ .../source/tutorials/reportengine_parallel.md | 172 ------------ .../tutorials/reportengine_parallel.rst | 203 ++++++++++++++ doc/sphinx/source/tutorials/run-fit.md | 257 ------------------ doc/sphinx/source/tutorials/run-fit.rst | 255 +++++++++++++++++ 16 files changed, 788 insertions(+), 739 deletions(-) rename doc/sphinx/source/external-code/{pdf-codes.md => pdf-codes.rst} (66%) rename doc/sphinx/source/tutorials/{closuretest.md => closuretest.rst} (62%) rename doc/sphinx/source/tutorials/{compare-fits.md => compare-fits.rst} (74%) rename doc/sphinx/source/tutorials/{conda.md => conda.rst} (60%) delete mode 100644 doc/sphinx/source/tutorials/datthcomp.md create mode 100644 doc/sphinx/source/tutorials/datthcomp.rst delete mode 100644 doc/sphinx/source/tutorials/list-resources.md create mode 100644 doc/sphinx/source/tutorials/list-resources.rst delete mode 100644 doc/sphinx/source/tutorials/report.md create mode 100644 doc/sphinx/source/tutorials/report.rst delete mode 100644 doc/sphinx/source/tutorials/reportengine_parallel.md create mode 100644 doc/sphinx/source/tutorials/reportengine_parallel.rst delete mode 100644 doc/sphinx/source/tutorials/run-fit.md create mode 100644 doc/sphinx/source/tutorials/run-fit.rst diff --git a/doc/sphinx/source/external-code/pdf-codes.md b/doc/sphinx/source/external-code/pdf-codes.rst similarity index 66% rename from doc/sphinx/source/external-code/pdf-codes.md rename to doc/sphinx/source/external-code/pdf-codes.rst index e5e5f97fe7..2df016bc5d 100644 --- a/doc/sphinx/source/external-code/pdf-codes.md +++ b/doc/sphinx/source/external-code/pdf-codes.rst @@ -1,45 +1,48 @@ -```eval_rst .. _lhapdf: -``` -# PDF set storage and interpolation -[LHAPDF](https://lhapdf.hepforge.org/) is a C++ library that evaluates PDFs by interpolating the +PDF set storage and interpolation +================================= + +`LHAPDF `_ is a C++ library that evaluates PDFs by interpolating the discretised PDF 'grids' that PDF collaborations produce. It also gives its users access to proton and nuclear PDF sets from a variety of PDF collaborations, including NNPDF, MMHT and CTEQ. A list of all currently available PDF sets can be found on their -[website](https://lhapdf.hepforge.org/pdfsets.html). Particle physics programmes that typically make +`website `_. Particle physics programmes that typically make use of PDFs, such as Monte Carlo event generators, will usually be interfaced with LHAPDF, to allow a user to easily specify the PDF set that they wish to use in their calculations. You can read more -about LHAPDF by reading the [paper](https://arxiv.org/abs/1412.7420) that marked their latest +about LHAPDF by reading the `paper `_ that marked their latest release. 
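LHAPDF also ships Python bindings, which are convenient for inspecting a set
interactively. A minimal sketch, assuming the NNPDF4.0 NNLO set is already installed
locally:

.. code:: python

    import lhapdf

    # load member 0 (the central member) of an installed set
    pdf = lhapdf.mkPDF("NNPDF40_nnlo_as_01180", 0)

    # xf(x, Q) for the gluon (PDG id 21) and the strong coupling at Q = 100 GeV
    print(pdf.xfxQ(21, 1e-3, 100.0))
    print(pdf.alphasQ(100.0))
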
-## PDF evolution +PDF evolution +------------- -[APFEL](https://apfel.hepforge.org/) ('A PDF Evolution Library') is the PDF evolution code currently +`APFEL `_ ('A PDF Evolution Library') is the PDF evolution code currently used by the NNPDF Collaboration. In addition to its PDF evolution capabilities, it also produces predictions of deep-inelastic scattering structure functions. In recent years it has been developed alongside NNPDF, and so it therefore contains the features and settings required in an NNPDF fit. That is, it includes quark masses in the MSbar scheme, the various FONLL heavy quark schemes, scale variations up to NLO, etc. Note that at the time of writing, a more streamlined code is being written to replace APFEL, which is currently dubbed EKO ('Evolution Kernel Operator'). To find more -general information about PDF evolution and the DGLAP equations, you can go to the [Theory -section](dglap.md). +general information about PDF evolution and the DGLAP equations, you can go to the :ref:`Theory section `. + +PDF compression +--------------- -## PDF compression PDF compression seeks to maintain the statistical accuracy of a large sample of replicas produced by a fit when using a PDF set with a smaller number of replicas (and thus fewer convolutions required to compute cross sections with PDF uncertainties). For example the main published PDFs are typically based on a 1000 replica fit, which can then be compressed to around a 100 replicas PDF set while maintaining good accuracy of most relevant statistical estimators. -This is done with the [pyCompressor](https://n3pdf.github.io/pycompressor/) library, +This is done with the `pyCompressor `_ library, a python compression code that extracts, from an initial PDF set of replicas, the subset that most truthfully reproduces the underlying probability distribution of the prior. -[pyCompressor](https://n3pdf.github.io/pycompressor/) is an updated python version of -[compressor](https://github.com/scarrazza/compressor), which was used in previous releases. +`pyCompressor `_ is an updated python version of +`compressor `_, which was used in previous releases. -### Other codes +Other codes +~~~~~~~~~~~ -[Hoppet](https://hoppet.hepforge.org/) ('Higher Order Perturbative Parton Evolution Toolkit') is an +`Hoppet `_ ('Higher Order Perturbative Parton Evolution Toolkit') is an alternative PDF evolution code which is capable of evolving unpolarised PDFs to NNLO and linearly polarised PDFs to NLO. The unpolarised evolution includes heavy-quark thresholds in the MSbar scheme. diff --git a/doc/sphinx/source/theory/index.rst b/doc/sphinx/source/theory/index.rst index 87a70ba17f..5f8a288305 100644 --- a/doc/sphinx/source/theory/index.rst +++ b/doc/sphinx/source/theory/index.rst @@ -1,3 +1,5 @@ +.. _theory: + Theory ====== diff --git a/doc/sphinx/source/tutorials/closuretest.md b/doc/sphinx/source/tutorials/closuretest.rst similarity index 62% rename from doc/sphinx/source/tutorials/closuretest.md rename to doc/sphinx/source/tutorials/closuretest.rst index 04f06621c6..9d7368ecb4 100644 --- a/doc/sphinx/source/tutorials/closuretest.md +++ b/doc/sphinx/source/tutorials/closuretest.rst @@ -1,17 +1,18 @@ -```eval_rst .. _tut_closure: -``` -# How to run a closure test + +How to run a closure test +========================= Closure tests are a way to validate methodology by fitting on pseudodata generated from pre-existing PDFs. There are different levels of closure tests which aim to validate different components of the fitting toolchain. 
-## Brief background +Brief background +---------------- For more detailed information on the conception of closure tests, see the -[NNPDF3.0 paper](https://arxiv.org/abs/1410.8849). +`NNPDF3.0 paper `_. Each closure test defines a ``fakepdf`` in the runcard, which will be referred to here as the underlying law. For the purpose of the closure test it can be thought @@ -22,17 +23,17 @@ There are three levels of closure test: 1. level 0 - central pseudodata is given by central predictions of the underlying law - no MC noise is added on top of the central data, each replica is fitting - the same set of data + the same set of data 2. level 1 - central pseudodata is shifted by some noise η which is drawn - from the experimental covariance matrix and represents - 'real' central values provided by experimentalists which do not sit exactly - on the underlying law but are consistent with it according to their own - uncertainty + from the experimental covariance matrix and represents + 'real' central values provided by experimentalists which do not sit exactly + on the underlying law but are consistent with it according to their own + uncertainty - no MC noise is added, each replica fits a subset of the same shifted data. - There is however a difference in the training/validation split used for - stopping, the spread on replicas can be thought of as the spread due to this - split in addition to any methodological uncertainty. + There is however a difference in the training/validation split used for + stopping, the spread on replicas can be thought of as the spread due to this + split in addition to any methodological uncertainty. 3. level 2 - central pseudodata is shifted by level 1 noise η - MC noise is added on top of the level 1 shift @@ -44,19 +45,21 @@ methodology is extracting this from shifted data, using closure test estimators. The main obvious disadvantage is that a pre-existing PDF may not be a suitable proxy for the underlying law. -## Preparing the closure test runcard +.. _prep_ct_runcard: +Preparing the closure test runcard +---------------------------------- To run a closure test we require a standard fit runcard. The main section which controls closure test specific behaviour can be found under ``closuretest``. Before you've made any changes, a typical ``closuretest`` section will be as follows: -```yaml -closuretest: - filterseed : 0 # Random seed to be used in filtering data partitions - fakedata : False - fakepdf : MMHT2014nnlo68cl - fakenoise : False -``` +.. code:: yaml + + closuretest: + filterseed : 0 # Random seed to be used in filtering data partitions + fakedata : False + fakepdf : MMHT2014nnlo68cl + fakenoise : False Setting ``fakedata`` to ``True`` will cause closure test pseudodata to be generated and subsequently fitted. The PDf which the pseudodata will be generated from @@ -70,102 +73,109 @@ add to the pseudodata during the filtering step, this is require for An example of a typical level 1 or level 2 ``closuretest`` specification is given -```yaml -closuretest: - filterseed : 0 # Random seed to be used in filtering data partitions - fakedata : True - fakepdf : MMHT2014nnlo68cl - fakenoise : True -``` +.. code:: yaml + + closuretest: + filterseed : 0 # Random seed to be used in filtering data partitions + fakedata : True + fakepdf : MMHT2014nnlo68cl + fakenoise : True + Note that it is *critical* that two closure tests which are to be compared have the same ``filterseed``. 
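As a toy illustration of how the three levels are related, and of why the seed matters,
the pseudodata can be thought of as being built as follows (made-up numbers, not the
actual implementation):

.. code:: python

    import numpy as np

    rng = np.random.default_rng(0)           # toy stand-in for the seeded generator

    law = np.array([1.0, 2.0, 3.0])          # central predictions of the underlying law
    covmat = 0.05 * np.eye(3)                # experimental covariance matrix

    level0 = law                             # level 0: no noise at all
    eta = rng.multivariate_normal(np.zeros(3), covmat)       # level 1 shift, set by filterseed
    level1 = law + eta
    epsilon = rng.multivariate_normal(np.zeros(3), covmat)   # MC noise, drawn per replica
    level2 = level1 + epsilon                # level 2: one replica's pseudodata

Two closure tests generated with different seeds therefore fit different level 1 data and
cannot be compared directly.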
They should also both have been run during a time where no major changes were made to data generation. This is because fits with different level 1 noise produce different closure test estimators. See for -example a [report](https://vp.nnpdf.science/mbcTUd6-TQmQFvaGd37bkg==/) +example a `report `_ comparing two level 2 closure tests with identical settings apart from ``filterseed``. There are still some relevant settings to the closure test. For the above example we would choose that the t0 set was the same as the underlying law: -```yaml -datacuts: - t0pdfset : MMHT2014nnlo68cl # PDF set to generate t0 covmat - ... -``` +.. code:: yaml + + datacuts: + t0pdfset : MMHT2014nnlo68cl # PDF set to generate t0 covmat + ... Finally we need to specify whether or not MC replicas will be generated in the fit, differentiating between a level 1 and level 2 closure test. This can be achieved by setting ``genrep`` under ``fitting`` to be ``True`` -```yaml -fitting: - ... - genrep : True - ... -``` +.. code:: yaml + + fitting: + ... + genrep : True + ... -### Summary for each level of closure test + +Summary for each level of closure test +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See below for the keys which specify each level of closure test, other keys can be chosen by the user. -#### Level 0 - -```yaml -fitting: - ... - genrep : False - ... -closuretest: - ... - fakedata : True - fakenoise : False - ... -``` - -#### Level 1 - -```yaml -fitting: - ... - genrep : False - ... -closuretest: - ... - fakedata : True - fakenoise : True - ... -``` - -#### Level 2 - -```yaml -fitting: - ... - genrep : True - ... -closuretest: - ... - fakedata : True - fakenoise : True - ... -``` - -## Running a closure test with ``n3fit`` +Level 0 +^^^^^^^ + +.. code:: yaml + + fitting: + ... + genrep : False + ... + closuretest: + ... + fakedata : True + fakenoise : False + ... + +Level 1 +^^^^^^^ + +.. code:: yaml + + fitting: + ... + genrep : False + ... + closuretest: + ... + fakedata : True + fakenoise : True + ... + + +Level 2 +^^^^^^^ + +.. code:: yaml + fitting: + ... + genrep : True + ... + closuretest: + ... + fakedata : True + fakenoise : True + ... + +Running a closure test with ``n3fit`` +------------------------------------- Running a closure test with ``n3fit`` will require a valid ``n3fit`` runcard, with the closure test settings modified as shown -[above](#preparing-the-closure-test-runcard). The difference +:ref:`above `. The difference between running a closure fit in ``n3fit`` and a standard fit is that the user is required to run ``vp-setupfit`` on the runcard before running ``n3fit``. This is because the filtering of the data is required to generate the pseudodata central values. The workflow is as follows: -```bash -$ vp-setupfit fitname.yml -$ n3fit fitname.yml -``` +.. code:: bash + + $ vp-setupfit fitname.yml + $ n3fit fitname.yml You will still need to evolve the fit and run ``postfit`` as with a standard -[``n3fit``](../n3fit/usage.md). +:ref:`n3fit `. diff --git a/doc/sphinx/source/tutorials/compare-fits.md b/doc/sphinx/source/tutorials/compare-fits.rst similarity index 74% rename from doc/sphinx/source/tutorials/compare-fits.md rename to doc/sphinx/source/tutorials/compare-fits.rst index 374a847b9a..e75e30dd90 100644 --- a/doc/sphinx/source/tutorials/compare-fits.md +++ b/doc/sphinx/source/tutorials/compare-fits.rst @@ -1,18 +1,18 @@ -```eval_rst .. 
_compare-fits: -``` -# How to compare two fits + +How to compare two fits +======================= After running a fit, one will usually want to check that the fit gives expected results and to see how the new fit compares to an older fit, perhaps the -baseline `t0`. One may of course write their own `validphys` runcard such that +baseline ``t0``. One may of course write their own ``validphys`` runcard such that they can directly look at the statistical estimators and plots that they are -interested in (see the [validphys](../vp/index.html) section of the docs for +interested in (see the :ref:`validphys ` section of the docs for help in doing this). However, a convenient and zero-thinking script exists for comparing two fits, which contains all the estimators and plots that one will usually want to see when looking at a new fit. This script can be run with the -command `vp-comparefits -i`, where the `-i` flag runs the script in interactive +command :code:`vp-comparefits -i`, where the :code:`-i` flag runs the script in interactive mode. Once launched, the user is then prompted to enter the name of the current fit @@ -22,15 +22,14 @@ to), a label for the reference fit, the title of the report (though a sensible default is suggested), the name of the author, any keywords for the report, and a choice of whether or not to use a theory covariance matrix for the statistical estimators that will be used in the report (the default is to include the -contribution due to the theory covariance matrix). The `keywords` field is for -`validphys` indexing and as such similar reports should make use of the same +contribution due to the theory covariance matrix). The ``keywords`` field is for +``validphys`` indexing and as such similar reports should make use of the same keywords, which are relevant to the project in question. The resulting report produces a summary of the two fits and can be uploaded to -the server by using `vp-upload `, where the folder is called -`output` by default. +the server by using :code:`vp-upload `, where the folder is called +``output`` by default. -```eval_rst The `vp-comparefits` application is implemented as a small wrapper on top of a specific `validphys` report. The wrapper code is defined in the :py:mod:`validphys.scripts.vp_comparefits` module, and the specific templates @@ -38,4 +37,4 @@ are in :py:mod:`validphys.comparefittemplates`. The template sets some reasonable defaults such as the energy scale of the PDF comparisons or the type of covariance matrix used for χ² comparisons (experimental and ignoring the weights). -``` + diff --git a/doc/sphinx/source/tutorials/conda.md b/doc/sphinx/source/tutorials/conda.rst similarity index 60% rename from doc/sphinx/source/tutorials/conda.md rename to doc/sphinx/source/tutorials/conda.rst index 483f1b6a6e..efe08cac2a 100644 --- a/doc/sphinx/source/tutorials/conda.md +++ b/doc/sphinx/source/tutorials/conda.rst @@ -1,10 +1,11 @@ -### How to build conda-packages +How to build conda-packages +--------------------------- A conda package is a compressed tarball containing the module to be installed and the information on how to install it. 
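In practice the procedure boils down to writing a recipe directory (a ``meta.yaml`` plus
any build scripts) and pointing ``conda build`` at it. A sketch of driving this from a
Python or CI script, assuming ``conda-build`` is installed and ``recipe/`` is an
illustrative recipe path:

.. code:: python

    import subprocess

    # equivalent to running "conda build recipe/" in a shell
    subprocess.run(["conda", "build", "recipe/"], check=True)
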
A detailed tutorial on how build conda-packages ca be found here -``` -https://docs.conda.io/projects/conda-build/en/latest/index.html -``` + + https://docs.conda.io/projects/conda-build/en/latest/index.html + diff --git a/doc/sphinx/source/tutorials/datthcomp.md b/doc/sphinx/source/tutorials/datthcomp.md deleted file mode 100644 index 563a7b4b78..0000000000 --- a/doc/sphinx/source/tutorials/datthcomp.md +++ /dev/null @@ -1,54 +0,0 @@ -```eval_rst -.. _datthcomp: -``` -# How to do a data theory comparison - -This tutorial explains how to compare the data and theory for a given data set or list of data sets. - -You need to provide: - -1. A PDF which includes your data set; -2. A valid theory ID; -3. A choice of cuts policy; -4. A list of data sets to do the comparison for. - -Below is an example runcard for a data theory comparison for BCDMSP, `runcard.yaml`: -```yaml -meta: - title: BCDMSP data/theory comparison - keywords: [example] - author: Rosalyn Pearson - -pdfs: - - id: NNPDF31_nnlo_as_0118 - label: NNPDF31_nnlo_as_0118 - -theoryid: 53 - -use_cuts: false - -dataset_inputs: - - { dataset: BCDMSP} - -template: dthcomparison.md - -actions_: - - report(main=true) -``` - -The corresponding template, `dthcomparison.md`, looks like this: -```yaml -%BCDMSP (theory ID 52) - -{@ dataset_inputs plot_fancy @} -{@ dataset_inputs::pdfs plot_fancy(normalize_to=data)@} -{@ dataset_inputs::pdfs plot_chi2dist @} -{@ dataset_inputs::pdfs group_result_table @} -``` - -1. `plot_fancy` produces data-theory comparison plots for the data. This is called twice to produce both normalised and unnormalised sets of plots. -2. `plot_chi2dist` gives the chi2 distribution between the theory and data. -3. `group_result_table` gives the numerical values which appear in the plots. - -Running `validphys runcard.yaml` should produce a `validphys` report of the data-theory comparison like the one [here](https://vp.nnpdf.science/ErmVZEPGT42GCfreWwzalg==/) - see the -[vp-guide](https://data.nnpdf.science/validphys-docs/guide.html#development-installs). diff --git a/doc/sphinx/source/tutorials/datthcomp.rst b/doc/sphinx/source/tutorials/datthcomp.rst new file mode 100644 index 0000000000..8bfdd2f81e --- /dev/null +++ b/doc/sphinx/source/tutorials/datthcomp.rst @@ -0,0 +1,58 @@ +.. _datthcomp: + +How to do a data theory comparison +================================== + +This tutorial explains how to compare the data and theory for a given data set or list of data sets. + +You need to provide: + +1. A PDF which includes your data set; +2. A valid theory ID; +3. A choice of cuts policy; +4. A list of data sets to do the comparison for. + +Below is an example runcard for a data theory comparison for BCDMSP, ``runcard.yaml``: + +.. code:: yaml + + meta: + title: BCDMSP data/theory comparison + keywords: [example] + author: Rosalyn Pearson + + pdfs: + - id: NNPDF31_nnlo_as_0118 + label: NNPDF31_nnlo_as_0118 + + theoryid: 53 + + use_cuts: false + + dataset_inputs: + - { dataset: BCDMSP} + + template: dthcomparison.md + + actions_: + - report(main=true) + +The corresponding template, ``dthcomparison.md``, looks like this: + +.. code:: yaml + + %BCDMSP (theory ID 52) + + {@ dataset_inputs plot_fancy @} + {@ dataset_inputs::pdfs plot_fancy(normalize_to=data)@} + {@ dataset_inputs::pdfs plot_chi2dist @} + {@ dataset_inputs::pdfs group_result_table @} + +1. ``plot_fancy`` produces data-theory comparison plots for the data. This is called + twice to produce both normalised and unnormalised sets of plots. +2. 
``plot_chi2dist`` gives the chi2 distribution between the theory and data. +3. ``group_result_table`` gives the numerical values which appear in the plots. + +Running :code:`validphys runcard.yaml` should produce a ``validphys`` report of the data-theory +comparison like the one `here `_ - see the +`vp-guide `_. diff --git a/doc/sphinx/source/tutorials/index.rst b/doc/sphinx/source/tutorials/index.rst index 0fd9b49a98..a9902db814 100644 --- a/doc/sphinx/source/tutorials/index.rst +++ b/doc/sphinx/source/tutorials/index.rst @@ -11,7 +11,7 @@ Running fits .. toctree:: :maxdepth: 1 - ./run-fit.md + ./run-fit.rst ./run-iterated-fit.rst ./run-qed-fit.rst ./polarized_fits.rst @@ -23,12 +23,12 @@ Analysing results .. toctree:: :maxdepth: 1 - ./compare-fits.md - ./report.md - ./reportengine_parallel.md + ./compare-fits.rst + ./report.rst + ./reportengine_parallel.rst ./plot_pdfs.rst ./pdfbases.rst - ./datthcomp.md + ./datthcomp.rst ./overfit_metric.rst Closure tests @@ -36,7 +36,7 @@ Closure tests .. toctree:: :maxdepth: 1 - ./closuretest.md + ./closuretest.rst ./closureestimators.rst Special PDF sets @@ -53,9 +53,9 @@ Miscellaneous .. toctree :: :maxdepth: 1 - ./list-resources.md + ./list-resources.rst ./pseudodata.rst ./newplottingfn.rst ./addspecialgrouping.rst - ./conda.md + ./conda.rst ./futuretests.rst diff --git a/doc/sphinx/source/tutorials/list-resources.md b/doc/sphinx/source/tutorials/list-resources.md deleted file mode 100644 index 1ebb83ba44..0000000000 --- a/doc/sphinx/source/tutorials/list-resources.md +++ /dev/null @@ -1,43 +0,0 @@ -# How to list the available resources - -```eval_rst -.. _vp-list: -``` - -## Using `vp-list` - -In order to check what resources are available locally and for download, use -`vp-list` which will print out the names of resources. - -```bash -vp-list -``` - -The options for resource type can be seen with `vp-list --help`. - -```bash -$ vp-list --help -usage: vp-list [-h] [-r | -l] resource - -vp-list Script which lists available resources locally and remotely - -positional arguments: - resource The type of resource to check availability for (locally - and/or remotely). Choose from: theories, fits, pdfs, - datasets. - -``` - -You can use the options `-l/--local-only` or `-r/--remote-only` to only check -for resources available locally or remotely respectively. - -## Manually checking server - example with fits - -You can also check manually on the storage servers for these resources. For example, -the bulk of the available fits can be found by going to the fits folder of the -NNPDF data server. Some other fits may be found in standalone folders in the home -folder of this server, but of course finding a specific fit here may require some -digging. For help in accessing the server, -please see [here](NNPDF-server). For information on how to download fits and -other resources, -please see the [Downloading resources](download) section of the vp-guide. \ No newline at end of file diff --git a/doc/sphinx/source/tutorials/list-resources.rst b/doc/sphinx/source/tutorials/list-resources.rst new file mode 100644 index 0000000000..ce7d4c81c1 --- /dev/null +++ b/doc/sphinx/source/tutorials/list-resources.rst @@ -0,0 +1,45 @@ +How to list the available resources +=================================== + +.. _vp-list: + + +Using ``vp-list`` +---------------- + +In order to check what resources are available locally and for download, use +``vp-list`` which will print out the names of resources. + +.. 
code:: bash + + vp-list + +The options for resource type can be seen with :code:`vp-list --help`. + +.. code:: bash + + $ vp-list --help + usage: vp-list [-h] [-r | -l] resource + + vp-list Script which lists available resources locally and remotely + + positional arguments: + resource The type of resource to check availability for (locally + and/or remotely). Choose from: theories, fits, pdfs, + datasets. + + +You can use the options :code:`-l/--local-only` or :code:`-r/--remote-only` to only check +for resources available locally or remotely respectively. + +Manually checking server - example with fits +-------------------------------------------- + +You can also check manually on the storage servers for these resources. For example, +the bulk of the available fits can be found by going to the fits folder of the +NNPDF data server. Some other fits may be found in standalone folders in the home +folder of this server, but of course finding a specific fit here may require some +digging. For help in accessing the server, +please see :ref:`here `. For information on how to download fits and +other resources, please see the :ref:`Downloading resources ` section +of the vp-guide. \ No newline at end of file diff --git a/doc/sphinx/source/tutorials/report.md b/doc/sphinx/source/tutorials/report.md deleted file mode 100644 index 4b7f432bd3..0000000000 --- a/doc/sphinx/source/tutorials/report.md +++ /dev/null @@ -1,83 +0,0 @@ -```eval_rst -.. _tut_report: -``` - -# How to generate a report - -Suppose that we want to generate a custom report that includes plots and -statistics that are not included as part of the report generated by -[`vp-comparefits`](./compare-fits.md). We may be lucky enough to find an example -runcard that produces what we need in -[`validphys2/examples`](https://github.com/NNPDF/nnpdf/tree/master/validphys2/examples). -However, we may need to write our own own [`yaml`](https://yaml.org/) runcard -from scratch. - -Suppose we want to have histograms of the χ2 per replica for some set of -experimental datasets. - -The calling of [`validphys`](vp-index) actions are used as normal. The action we -are looking for is -[`plot_chi2dist`](https://github.com/NNPDF/nnpdf/blob/d79059975e4ef97063c6bdd9f19dfb908586e453/validphys2/src/validphys/dataplots.py#L50). -Here's an example report that does what we're looking for: - -```yaml -meta: - title: Distribution plots per replica across experiments - author: Shayan Iranipour - keywords: [chi2, replica, distribution, DISonly] - -fit: NNPDF31_nnlo_as_0118_DISonly - -pdf: - from_: "fit" - -experiments: - from_: "fit" - -theoryid: 53 - -use_cuts: "fromfit" - -template_text: | - # Histograms of χ2 - ## DIS only distributions - {@experiments::experiment plot_chi2dist@} - -actions_: - - report(main=True) -``` - -The `report(main=True)` command is what generates the report. We can customize -the formatting of the report using -[`markdown`](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) -syntax. Note for example that `# Histograms of χ2` gives an appropriate title -to the section where we will find our plot. - -If the `template_text` section of the runcard becomes large and unwieldy, it may -be preferable to put the information from this section in a separate file. 
In -such cases one can create a `markdown` template file, usually called `template.md` -such as - -```md -# Histograms of χ2 -## DIS only distributions -{@experiments::experiment plot_chi2dist@} -``` - -and change the `template_text` section of the runcard to the following - -```yaml -template: template.md -``` - -where this assumes that `template.md` is in the same folder as that in which you -execute the `validphys` command. - -The `meta` field is important for retrieving the report once it has been -uploaded to the [validphys server](https://vp.nnpdf.science/). `title` and -`author` are fields that appear when browsing through reports while the -`keywords` allow quick retrieval of the report by the search functionality of -the server. Setting appropriate keywords is especially important when working on -a big project, within which it is likely that many reports will be produced. In -such cases a `keyword` should be chosen for the project and set in each uploaded -report. diff --git a/doc/sphinx/source/tutorials/report.rst b/doc/sphinx/source/tutorials/report.rst new file mode 100644 index 0000000000..b96fabc35a --- /dev/null +++ b/doc/sphinx/source/tutorials/report.rst @@ -0,0 +1,82 @@ +.. _tut_report: + +How to generate a report +======================== + +Suppose that we want to generate a custom report that includes plots and +statistics that are not included as part of the report generated by +:ref:`vp-comparefits `. We may be lucky enough to find an example +runcard that produces what we need in +`validphys2/examples `_. +However, we may need to write our own own `yaml `_ runcard +from scratch. + +Suppose we want to have histograms of the χ2 per replica for some set of +experimental datasets. + +The calling of :ref:`validphys ` actions are used as normal. The action we +are looking for is +`plot_chi2dist `_. +Here's an example report that does what we're looking for: + +.. code:: yaml + + meta: + title: Distribution plots per replica across experiments + author: Shayan Iranipour + keywords: [chi2, replica, distribution, DISonly] + + fit: NNPDF31_nnlo_as_0118_DISonly + + pdf: + from_: "fit" + + experiments: + from_: "fit" + + theoryid: 53 + + use_cuts: "fromfit" + + template_text: | + # Histograms of χ2 + ## DIS only distributions + {@experiments::experiment plot_chi2dist@} + + actions_: + - report(main=True) + +The ``report(main=True)`` command is what generates the report. We can customize +the formatting of the report using +`markdown `_ +syntax. Note for example that ``# Histograms of χ2`` gives an appropriate title +to the section where we will find our plot. + +If the ``template_text`` section of the runcard becomes large and unwieldy, it may +be preferable to put the information from this section in a separate file. In +such cases one can create a ``markdown`` template file, usually called ``template.md`` +such as + +.. code:: md + + # Histograms of χ2 + ## DIS only distributions + {@experiments::experiment plot_chi2dist@} + +and change the `template_text` section of the runcard to the following + +.. code:: yaml + + template: template.md + +where this assumes that ``template.md`` is in the same folder as that in which you +execute the ``validphys`` command. + +The ``meta`` field is important for retrieving the report once it has been +uploaded to the `validphys server `_. ``title`` and +``author`` are fields that appear when browsing through reports while the +``keywords`` allow quick retrieval of the report by the search functionality of +the server. 
Setting appropriate keywords is especially important when working on +a big project, within which it is likely that many reports will be produced. In +such cases a ``keyword`` should be chosen for the project and set in each uploaded +report. diff --git a/doc/sphinx/source/tutorials/reportengine_parallel.md b/doc/sphinx/source/tutorials/reportengine_parallel.md deleted file mode 100644 index 22f9c371cd..0000000000 --- a/doc/sphinx/source/tutorials/reportengine_parallel.md +++ /dev/null @@ -1,172 +0,0 @@ -# How to run an analysis in parallel - - -In this tutorial, we will demonstrate how to use the parallel resource executor of reportengine to run a `validphys` analysis (or any other `reportengine` app analysis). Typically, when running a `validphys` script, `reportengine` creates a directed acyclic graph (DAG) that is executed sequentially, meaning that each node must wait for the previous node to complete before it can be evaluated. This approach is not very efficient, especially if the nodes are independent of each other. -The parallel execution of a reportengine task is based on `dask.distributed` ([dask-distributed](https://distributed.dask.org/en/stable/)). - -The main steps to follow when running a task in parallel are: - -1. Initialize a dask-scheduler (this can be done, for instance, on a separate screen opened from command line) - - ```$ dask-scheduler &``` - - Note that the above command should output the scheduler address (e.g. Scheduler at: tcp://171.24.141.13:8786). - -2. Assign some workers to the scheduler and specify the amount of memory each of them can use. For example: - - ```$ dask-worker --nworkers 10 --nthreads 1 --memory-limit "14 GiB" &``` - - Note that in the above example we also fixed the number of threads at disposal by each worker to be 1, this is important in order to avoid possible racing conditions triggered by the matplotlib library. - -3. Run the process using the correct flags: - - ```$ validphys --parallel --scheduler ``` - - running the process this way should output a client dashboard link (e.g.: http://172.24.142.17:8787/status) from which the status of the job can be monitored. Note that for this to work you will need the package `bokeh` with version `>=2.4.3`. This can be easily obtained, e.g., - ```pip install bokeh==2.4.3```. - -The main thing to take care of when running a certain process is point 2 and, in particular, the amount of memory that is being assigned to each worker. In the following we will give some explicit examples of the use of memory for some standard validphys scripts. 
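If you want to double check what each worker actually received before launching a long
analysis, the running cluster can be queried with the `dask.distributed` client (a small
sketch; the address is the one printed by `dask-scheduler`, and the exact info keys can
vary between dask versions):

```python
from dask.distributed import Client

client = Client("tcp://171.24.141.13:8786")  # example scheduler address

# one entry per worker, reporting its thread count and memory limit (in bytes)
for address, info in client.scheduler_info()["workers"].items():
    print(address, info["nthreads"], info["memory_limit"])
```
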
- - - -Example 1: PDF plots, `validphys2/examples/plot_pdfs.yaml` ----------------------------------------------------------- - -Suppose we have the following runcard - -```yaml -meta: - title: PDF plot example - author: Rosalyn Pearson - keywords: [parallel, example] - - -pdfs: - - {id: "NNPDF40_nlo_as_01180", label: "4.0 NLO"} - - {id: "NNPDF40_nnlo_lowprecision", label: "4.0 NNLO low precision"} - - {id: "NNPDF40_nnlo_as_01180", label: "4.0 NNLO"} - - -pdfs_noband: ["NNPDF40_nnlo_as_01180"] # Equivalently [3] - -show_mc_errors: True - -Q: 10 - -PDFnormalize: - - normtitle: Absolute # normalize_to default is None - - normalize_to: 1 # Specify index in list of PDFs or name of PDF - normtitle: Ratio - -Basespecs: - - basis: flavour - basistitle: Flavour basis - - basis: evolution - basistitle: Evolution basis - -PDFscalespecs: - - xscale: log - xscaletitle: Log - - xscale: linear - xscaletitle: Linear - -template_text: | - {@with PDFscalespecs@} - {@xscaletitle@} scale - ===================== - {@with Basespecs@} - {@basistitle@} - ------------- - {@with PDFnormalize@} - {@normtitle@} - {@plot_pdfs@} - {@plot_pdf_uncertainties@} - {@plot_pdfreplicas@} - {@endwith@} - {@endwith@} - {@endwith@} - -actions_: - - report(main=True) - -``` - -As an example we can run the above job by assigning 5 workers to the dask scheduler each of which has access to 5 GiB of memory for a total of 25 GiB: - -```$ dask-worker tcp://172.24.142.17:8786 --nworkers 5 --nthreads 1 --memory-limit "5 GiB"``` - -We then run the task as - -`$ validphys plot_pdfs.yaml --parallel --scheduler tcp://172.24.142.17:8786` - -The time needed for this task (on a machine with 8 cores and 32 GiB of RAM) is - -```console -real 0m43.464s -user 0m2.419s -sys 0m0.607s -``` - -as compared to the sequential execution which gives - -```console -real 2m0.531s -user 8m20.506s -sys 1m45.868s -``` - - - - -Example 2: Comparison of Fits ------------------------------ - -This example shows how to perform a comparison between two fits, that is, how to perform a `vp-comparefits` analysis using the parallel implementation. -Note that this example is computationally more expensive, so it is recommended to run it on a computer with large memory availability. 
- -Once a `dask-scheduler` has been initialised we assign to it the following workers - -```$ dask-worker --nworkers 15 --nthreads 1 --memory-limit '13 GiB'``` - -As a toy example we then compare the `NNPDF40_nnlo_as_01180_1000` fit to itself: - -```$ vp-comparefits NNPDF40_nnlo_as_01180_1000 NNPDF40_nnlo_as_01180_1000 --title example --author mnc --keywords example --parallel --scheduler ``` - -The time needed for this task on a computer with the following attributes - -```console -========================================================================= - Ubuntu 20.04.6 LTS (focal) in DAMTP - Host: zprime, Group: HEP, Kernel: Linux 5.4 - Memory: 515890M, Swap: 16383M - Arch: x86_64, AMD EPYC 7453 28-Core Processor [28 cores] - Make: Giga Computing, Model: R182-Z91-00 Rack Mount Chassis -========================================================================= -``` - -is: - -```console -real 5m21.546s -user 0m17.064s -sys 0m4.401s -``` - -The time needed on the same machine when running the job sequentially is - -```console -real 30m22.245s -user 57m9.356s -sys 15m40.624s -``` - -Using dask without a Scheduler -============================== - -It is possible to run validphys scripts without having to explicitly initialise a dask scheduler by simply adding a `--parallel` flag to the task: - -```validphys --parallel``` - -this method, however, should not be used for analyses that are computationally more expensive than `plot_pdfs.yaml` since the default memory limit that is assigned to each worker could potentially not be enough to carry out the task. - - diff --git a/doc/sphinx/source/tutorials/reportengine_parallel.rst b/doc/sphinx/source/tutorials/reportengine_parallel.rst new file mode 100644 index 0000000000..310a378944 --- /dev/null +++ b/doc/sphinx/source/tutorials/reportengine_parallel.rst @@ -0,0 +1,203 @@ +How to run an analysis in parallel +================================== + + +In this tutorial, we will demonstrate how to use the parallel resource executor +of reportengine to run a ``validphys`` analysis (or any other ``reportengine`` app analysis). +Typically, when running a ``validphys`` script, ``reportengine`` creates a directed acyclic +graph (DAG) that is executed sequentially, meaning that each node must wait for the previous +node to complete before it can be evaluated. This approach is not very efficient, especially +if the nodes are independent of each other. The parallel execution of a reportengine task is +based on ``dask.distributed`` (`dask-distributed `_). + +The main steps to follow when running a task in parallel are: + +1. Initialize a dask-scheduler (this can be done, for instance, on a separate screen opened from + command line) + + .. code:: bash + + $ dask-scheduler & + + Note that the above command should output the scheduler address (e.g. Scheduler at: tcp://171.24.141.13:8786). + +2. Assign some workers to the scheduler and specify the amount of memory each of them can use. For example: + + .. code:: bash + + $ dask-worker --nworkers 10 --nthreads 1 --memory-limit "14 GiB" & + + Note that in the above example we also fixed the number of threads at disposal by each worker to + be 1, this is important in order to avoid possible racing conditions triggered by the matplotlib + library. + +3. Run the process using the correct flags: + + .. code:: bash + + $ validphys --parallel --scheduler + + running the process this way should output a client dashboard link (e.g.: + http://172.24.142.17:8787/status) from which the status of the job can be + monitored. 
Note that for this to work you will need the package ``bokeh`` + with version ``>=2.4.3``. This can be easily obtained, e.g., ```pip install bokeh==2.4.3```. + +The main thing to take care of when running a certain process is point 2 and, in particular, +the amount of memory that is being assigned to each worker. In the following we will give some +explicit examples of the use of memory for some standard validphys scripts. + + +Example 1: PDF plots, ``validphys2/examples/plot_pdfs.yaml`` +---------------------------------------------------------- + +Suppose we have the following runcard + +.. code:: yaml + meta: + title: PDF plot example + author: Rosalyn Pearson + keywords: [parallel, example] + + + pdfs: + - {id: "NNPDF40_nlo_as_01180", label: "4.0 NLO"} + - {id: "NNPDF40_nnlo_lowprecision", label: "4.0 NNLO low precision"} + - {id: "NNPDF40_nnlo_as_01180", label: "4.0 NNLO"} + + + pdfs_noband: ["NNPDF40_nnlo_as_01180"] # Equivalently [3] + + show_mc_errors: True + + Q: 10 + + PDFnormalize: + - normtitle: Absolute # normalize_to default is None + - normalize_to: 1 # Specify index in list of PDFs or name of PDF + normtitle: Ratio + + Basespecs: + - basis: flavour + basistitle: Flavour basis + - basis: evolution + basistitle: Evolution basis + + PDFscalespecs: + - xscale: log + xscaletitle: Log + - xscale: linear + xscaletitle: Linear + + template_text: | + {@with PDFscalespecs@} + {@xscaletitle@} scale + ===================== + {@with Basespecs@} + {@basistitle@} + ------------- + {@with PDFnormalize@} + {@normtitle@} + {@plot_pdfs@} + {@plot_pdf_uncertainties@} + {@plot_pdfreplicas@} + {@endwith@} + {@endwith@} + {@endwith@} + + actions_: + - report(main=True) + +As an example we can run the above job by assigning 5 workers to the dask scheduler +each of which has access to 5 GiB of memory for a total of 25 GiB: + +.. code:: bash + + $ dask-worker tcp://172.24.142.17:8786 --nworkers 5 --nthreads 1 --memory-limit "5 GiB" + +We then run the task as + +.. code:: bash + + $ validphys plot_pdfs.yaml --parallel --scheduler tcp://172.24.142.17:8786 + +The time needed for this task (on a machine with 8 cores and 32 GiB of RAM) is + +.. code:: console + + real 0m43.464s + user 0m2.419s + sys 0m0.607s + +as compared to the sequential execution which gives + +.. code:: console + + real 2m0.531s + user 8m20.506s + sys 1m45.868s + + +Example 2: Comparison of Fits +----------------------------- + +This example shows how to perform a comparison between two fits, +that is, how to perform a ``vp-comparefits`` analysis using the +parallel implementation. Note that this example is computationally more +expensive, so it is recommended to run it on a computer with large memory availability. + +Once a ``dask-scheduler`` has been initialised we assign to it the following workers + +.. code:: bash + + $ dask-worker --nworkers 15 --nthreads 1 --memory-limit '13 GiB' + +As a toy example we then compare the `NNPDF40_nnlo_as_01180_1000` fit to itself: + +.. code:: bash + + $ vp-comparefits NNPDF40_nnlo_as_01180_1000 NNPDF40_nnlo_as_01180_1000 --title example --author mnc --keywords example --parallel --scheduler + +The time needed for this task on a computer with the following attributes + +.. 
code:: + + ========================================================================= + Ubuntu 20.04.6 LTS (focal) in DAMTP + Host: zprime, Group: HEP, Kernel: Linux 5.4 + Memory: 515890M, Swap: 16383M + Arch: x86_64, AMD EPYC 7453 28-Core Processor [28 cores] + Make: Giga Computing, Model: R182-Z91-00 Rack Mount Chassis + ========================================================================= + + +is: + +.. code:: + + real 5m21.546s + user 0m17.064s + sys 0m4.401s + +The time needed on the same machine when running the job sequentially is + +.. code:: + + real 30m22.245s + user 57m9.356s + sys 15m40.624s + +Using dask without a Scheduler +============================== + +It is possible to run validphys scripts without having to explicitly +initialise a dask scheduler by simply adding a ``--parallel`` flag to the task: + +.. code:: bash + + validphys --parallel + +this method, however, should not be used for analyses that are computationally +more expensive than ``plot_pdfs.yaml`` since the default memory limit that +is assigned to each worker could potentially not be enough to carry out the task. + + diff --git a/doc/sphinx/source/tutorials/run-fit.md b/doc/sphinx/source/tutorials/run-fit.md deleted file mode 100644 index 0c6b0ef4b4..0000000000 --- a/doc/sphinx/source/tutorials/run-fit.md +++ /dev/null @@ -1,257 +0,0 @@ -```eval_rst -.. _n3fit-usage: -``` - -How to run a PDF fit -==================== - - -The user should perform the steps documented below in order to obtain a complete -PDF fit using the latest release of the NNPDF fitting code: ``n3fit``. -The fitting methodology is detailed in the [Methodology](methodology) page. - -- [Preparing a fit runcard](#preparing-a-fit-runcard) -- [Running the fitting code](#running-the-fitting-code) -- [Upload and analyse the fit](#upload-and-analyse-the-fit) - - -```eval_rst -.. _prepare-fits: -``` - -Preparing a fit runcard ------------------------ - -The runcard is written in YAML. The runcard is the unique identifier of a fit -and contains all required information to perform a fit, which includes the -experimental data, the theory setup and the fitting setup. - -A detailed explanation on the parameters accepted by the ``n3fit`` runcards -can be found in the [detailed guide](runcard-detailed). - -For newcomers, it is recommended to start from an already existing runcard, -example runcards (and runcard used in NNPDF releases) are available at -[``n3fit/runcards``](https://github.com/NNPDF/nnpdf/tree/master/n3fit/runcards). -The runcards are mostly self explanatory, see for instance below an -example of the ``parameter`` dictionary that defines the Machine Learning framework. - -```yaml -# runcard example -... -parameters: - nodes_per_layer: [15, 10, 8] - activation_per_layer: ['sigmoid', 'sigmoid', 'linear'] - initializer: 'glorot_normal' - optimizer: - optimizer_name: 'RMSprop' - learning_rate: 0.01 - clipnorm: 1.0 - epochs: 900 - positivity: - multiplier: 1.05 - threshold: 1e-5 - stopping_patience: 0.30 # Ratio of the number of epochs - layer_type: 'dense' - dropout: 0.0 -... -``` - -The runcard system is designed such that the user can utilize the program without having to -tinker with the codebase. -One can simply modify the options in ``parameters`` to specify the -desired architecture of the Neural Network as well as the settings for the optimization algorithm. 
- -An important feature of ``n3fit`` is the ability to perform [hyperparameter scans](hyperoptimization), -for this we have also introduced a ``hyperscan_config`` key which specifies -the trial ranges for the hyperparameter scan procedure. -See the following self-explanatory example: -```yaml -hyperscan_config: - stopping: # setup for stopping scan - min_epochs: 5e2 # minimum number of epochs - max_epochs: 40e2 # maximum number of epochs - min_patience: 0.10 # minimum stop patience - max_patience: 0.40 # maximum stop patience - positivity: # setup for the positivity scan - min_multiplier: 1.04 # minimum lagrange multiplier coeff. - max_multiplier: 1.1 # maximum lagrange multiplier coeff. - min_initial: 1.0 # minimum initial penalty - max_initial: 5.0 # maximum initial penalty - optimizer: # setup for the optimizer scan - - optimizer_name: 'Adadelta' - learning_rate: - min: 0.5 - max: 1.5 - - optimizer_name: 'Adam' - learning_rate: - min: 0.5 - max: 1.5 - architecture: # setup for the architecture scan - initializers: 'ALL' # Use all implemented initializers from keras - max_drop: 0.15 # maximum dropout probability - n_layers: [2,3,4] # number of layers - min_units: 5 # minimum number of nodes - max_units: 50 # maximum number of nodes - activations: ['sigmoid', 'tanh'] # list of activation functions -``` - -It is also possible to take the configuration of the hyperparameter scan from a previous -run in the NNPDF server by using the key `from_hyperscan`: -```yaml -hyperscan_config: - from_hyperscan: 'some_previous_hyperscan' -``` - -or to directly take the trials from said hyperscan: -```yaml -hyperscan_config: - use_tries_from: 'some_previous_hyperscan' -``` - - -```eval_rst -.. _run-n3fit-fit: -``` - -Running the fitting code ------------------------- - -After successfully installing the ``n3fit`` package and preparing a runcard -following the points presented above you can proceed with a fit. - -1. Prepare the fit: ``vp-setupfit runcard.yml``. This command will generate a - folder with the same name as the runcard (minus the file extension) in the - current directory, which will contain a copy of the original YAML runcard. - The required resources (such as the theory and t0 PDF set) will be - downloaded automatically. Alternatively they can be obtained with the - ``vp-get`` tool. - -2. The ``n3fit`` program takes a ``runcard.yml`` as input and a replica number, e.g. -```n3fit runcard.yml replica``` where ``replica`` goes from 1-n where n is the -maximum number of desired replicas. Note that if you desire, for example, a 100 -replica fit you should launch more than 100 replicas (e.g. 130) because not -all of the replicas will pass the checks in ``postfit`` -([see here](postfit-selection-criteria) for more info). - -3. Wait until you have fit results. Then run the ``evolven3fit`` program once to -evolve all replicas using DGLAP. The arguments are ``evolven3fit evolve -runcard_folder``. - -4. Wait until you have results, then use ``postfit number_of_replicas -runcard_folder`` to finalize the PDF set by applying post selection criteria. -This will produce a set of ``number_of_replicas + 1`` replicas. This time the -number of replicas should be that which you desire in the final fit (100 in the -above example). Note that the -standard behaviour of ``postfit`` can be modified by using various flags. -More information can be found at [Processing a fit](postfit). - -It is possible to run more than one replica in one single run of ``n3fit`` by -using the ``--replica_range`` option. 
Running ``n3fit`` in this way increases the -memory usage as all replicas need to be stored in memory but decreases disk load -as the reading of the datasets and fktables is only done once for all replicas. - - -If you are planning to perform a hyperparameter scan just perform exactly the -same steps by adding the ``--hyperopt number_of_trials`` argument to ``n3fit``, -where ``number_of_trials`` is the maximum allowed value of trials required by the -fit. Usually when running hyperparameter scan we switch-off the MC replica -generation so different replicas will correspond to different initial points for -the scan, this approach provides faster results. We provide the ``vp-hyperoptplot`` -script to analyse the output of the hyperparameter scan. - - -Output of the fit ------------------ -Every time a replica is finalized, the output is saved to the ```runcard/nnfit/replica_$replica``` -folder, which contains a number of files: - -- ``chi2exps.log``: a json log file with the χ² of the training every 100 epochs. -- ``runcard.exportgrid``: a file containing the PDF grid. -- ``runcard.json``: Includes information about the fit (metadata, parameters, times) in json format. - -``` note:: The reported χ² refers always to the actual χ², i.e., without positivity loss or other penalty terms. -``` - - - -```eval_rst -.. _upload-fit: -``` - -Upload and analyse the fit --------------------------- -After obtaining the fit you can proceed with the fit upload and analisis by: - -1. Uploading the results using ``vp-upload runcard_folder`` then install the -fitted set with ``vp-get fit fit_name``. - -2. Analysing the results with ``validphys``, see the [vp-guide](../vp/index). -Consider using the ``vp-comparefits`` tool. - - - -Performance of the fit ----------------------- -The ``n3fit`` framework is currently based on [Tensorflow](https://www.tensorflow.org/) and as such, to -first approximation, anything that makes Tensorflow faster will also make ``n3fit`` faster. - -``` note:: Tensorflow only supports the installation via pip. Note, however, that the TensorFlow pip package has been known to break third party packages. Install it at your own risk. Only the conda tensorflow-eigen package is tested by our CI systems. -``` - -When you install the nnpdf conda package, you get the [tensorflow-eigen](https://anaconda.org/anaconda/tensorflow-eigen) package, which is not the default. -This is due to a memory explosion found in some of the conda mkl builds. - -If you want to disable MKL without installing ``tensorflow-eigen`` you can always set the environment variable ``TF_DISABLE_MKL=1`` before running ``n3fit``. -When running ``n3fit`` all versions of the package show similar performance. - - -When using the MKL version of tensorflow you gain more control of the way Tensorflow will use -the multithreading capabilities of the machine by using the following environment variables: - -```bash - -KMP_BLOCKTIME=0 -KMP_AFFINITY=granularity=fine,verbose,compact,1,0 -``` - -These are the best values found for ``n3fit`` when using the mkl version of Tensorflow from conda -and were found for TF 2.1 as the default values were suboptimal. -For a more detailed explanation on the effects of ``KMP_AFFINITY`` on the performance of -the code please see [here](https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support/openmp-library-support/thread-affinity-interface-linux-and-windows.html). 
- -By default, ``n3fit`` will try to use as many cores as possible, but this behaviour can be overriden -from the runcard with the ``maxcores`` parameter. In our tests the point of diminishing returns is found -at ``maxcores=4``. - -Note that everything stated above is machine dependent so the best parameters for you might be -very different. When testing, it is useful to set the environmental variable ``KMP_SETTINGS`` to 1 -to obtain detailed information about the current variables being used by OpenMP. - -Below we present a benchmark that have been run for the Global NNPDF 3.1 case, as found in the -example runcards [folder](https://github.com/NNPDF/nnpdf/tree/master/n3fit/runcards). - -Settings of the benchmark: - - TF version: tensorflow-eigen from conda, TF 2.2 - - NNPDF commit: [f878fc95a4f32e8c3b4c454fc12d438cbb87ea80](https://github.com/NNPDF/nnpdf/commit/f878fc95a4f32e8c3b4c454fc12d438cbb87ea80) - - Number of epochs: 5000 - - maxcores: 4 - - no early stopping - -Hardware: - - Intel(R) Core(TM) i7-6700 CPU @ 4.00GHz - - 16 GB RAM 3000 MHz DDR4 - -Timing for a fit: - - Walltime: 397s - - CPUtime: 1729s - -Iterate the fit ---------------- - -It may be desirable to iterate a fit to achieve a higher degree of convergence/stability in the fit. -To read more about this, see [How to run an iterated fit](run-iterated-fit). - -QED fit -------- - -In order to run a QED fit see [How to run a QED fit](run-qed-fit) diff --git a/doc/sphinx/source/tutorials/run-fit.rst b/doc/sphinx/source/tutorials/run-fit.rst new file mode 100644 index 0000000000..4293563fb2 --- /dev/null +++ b/doc/sphinx/source/tutorials/run-fit.rst @@ -0,0 +1,255 @@ +.. _n3fit-usage: + +How to run a PDF fit +==================== + + +The user should perform the steps documented below in order to obtain a complete +PDF fit using the latest release of the NNPDF fitting code: ``n3fit``. +The fitting methodology is detailed in the [Methodology](methodology) page. + +- :ref:`Preparing a fit runcard ` +- :ref:`Running the fitting code ` +- :ref:`Upload and analyse the fit ` + + +.. _prepare-fits: +Preparing a fit runcard +----------------------- + +The runcard is written in YAML. The runcard is the unique identifier of a fit +and contains all required information to perform a fit, which includes the +experimental data, the theory setup and the fitting setup. + +A detailed explanation on the parameters accepted by the ``n3fit`` runcards +can be found in the :ref:`detailed guide `. + +For newcomers, it is recommended to start from an already existing runcard, +example runcards (and runcard used in NNPDF releases) are available at +`n3fit/runcards `_. +The runcards are mostly self explanatory, see for instance below an +example of the ``parameter`` dictionary that defines the Machine Learning framework. + +.. code:: yaml + + # runcard example + ... + parameters: + nodes_per_layer: [15, 10, 8] + activation_per_layer: ['sigmoid', 'sigmoid', 'linear'] + initializer: 'glorot_normal' + optimizer: + optimizer_name: 'RMSprop' + learning_rate: 0.01 + clipnorm: 1.0 + epochs: 900 + positivity: + multiplier: 1.05 + threshold: 1e-5 + stopping_patience: 0.30 # Ratio of the number of epochs + layer_type: 'dense' + dropout: 0.0 + ... + +The runcard system is designed such that the user can utilize the program +without having to tinker with the codebase. +One can simply modify the options in ``parameters`` to specify the +desired architecture of the Neural Network as well as the settings for the optimization algorithm. 
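Since the runcard is plain YAML, simple variations such as a different architecture can
also be generated programmatically rather than edited by hand. A minimal sketch using
PyYAML, with illustrative file names and touching only keys shown in the example above:

.. code:: python

    import yaml

    with open("runcard.yml") as f:
        runcard = yaml.safe_load(f)

    # change only the network architecture, leaving the rest of the setup untouched
    runcard["parameters"]["nodes_per_layer"] = [25, 20, 8]
    runcard["parameters"]["activation_per_layer"] = ["tanh", "tanh", "linear"]

    with open("runcard_variant.yml", "w") as f:
        yaml.safe_dump(runcard, f)
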
+ +An important feature of ``n3fit`` is the ability to perform :ref:`hyperparameter scans `, +for this we have also introduced a ``hyperscan_config`` key which specifies +the trial ranges for the hyperparameter scan procedure. +See the following self-explanatory example: + +.. code:: yaml + + hyperscan_config: + stopping: # setup for stopping scan + min_epochs: 5e2 # minimum number of epochs + max_epochs: 40e2 # maximum number of epochs + min_patience: 0.10 # minimum stop patience + max_patience: 0.40 # maximum stop patience + positivity: # setup for the positivity scan + min_multiplier: 1.04 # minimum lagrange multiplier coeff. + max_multiplier: 1.1 # maximum lagrange multiplier coeff. + min_initial: 1.0 # minimum initial penalty + max_initial: 5.0 # maximum initial penalty + optimizer: # setup for the optimizer scan + - optimizer_name: 'Adadelta' + learning_rate: + min: 0.5 + max: 1.5 + - optimizer_name: 'Adam' + learning_rate: + min: 0.5 + max: 1.5 + architecture: # setup for the architecture scan + initializers: 'ALL' # Use all implemented initializers from keras + max_drop: 0.15 # maximum dropout probability + n_layers: [2,3,4] # number of layers + min_units: 5 # minimum number of nodes + max_units: 50 # maximum number of nodes + activations: ['sigmoid', 'tanh'] # list of activation functions + +It is also possible to take the configuration of the hyperparameter scan from a previous +run in the NNPDF server by using the key ``from_hyperscan``: + +.. code:: yaml + + hyperscan_config: + from_hyperscan: 'some_previous_hyperscan' + +or to directly take the trials from said hyperscan: + +.. code:: yaml + + hyperscan_config: + use_tries_from: 'some_previous_hyperscan' + + +.. _run-n3fit-fit: +Running the fitting code +------------------------ + +After successfully installing the ``n3fit`` package and preparing a runcard +following the points presented above you can proceed with a fit. + +1. Prepare the fit: ``vp-setupfit runcard.yml``. This command will generate a + folder with the same name as the runcard (minus the file extension) in the + current directory, which will contain a copy of the original YAML runcard. + The required resources (such as the theory and t0 PDF set) will be + downloaded automatically. Alternatively they can be obtained with the + ``vp-get`` tool. + +2. The ``n3fit`` program takes a ``runcard.yml`` as input and a replica number, e.g. + :code:`n3fit runcard.yml replica` where ``replica`` goes from 1-n where n is the + maximum number of desired replicas. Note that if you desire, for example, a 100 + replica fit you should launch more than 100 replicas (e.g. 130) because not + all of the replicas will pass the checks in ``postfit`` + (:ref:`see here ` for more info). + +3. Wait until you have fit results. Then run the ``evolven3fit`` program once to + evolve all replicas using DGLAP. The arguments are ``evolven3fit evolve + runcard_folder``. + +4. Wait until you have results, then use ``postfit number_of_replicas + runcard_folder`` to finalize the PDF set by applying post selection criteria. + This will produce a set of ``number_of_replicas + 1`` replicas. This time the + number of replicas should be that which you desire in the final fit (100 in the + above example). Note that the + standard behaviour of ``postfit`` can be modified by using various flags. + More information can be found at :ref:`Processing a fit `. + +It is possible to run more than one replica in one single run of ``n3fit`` by +using the ``--replica_range`` option. 
Running ``n3fit`` in this way increases the
+memory usage as all replicas need to be stored in memory but decreases disk load
+as the reading of the datasets and fktables is only done once for all replicas.
+
+
+If you are planning to perform a hyperparameter scan, simply perform exactly the
+same steps adding the ``--hyperopt number_of_trials`` argument to ``n3fit``,
+where ``number_of_trials`` is the maximum number of trials to run.
+Usually when running a hyperparameter scan we switch off the MC replica
+generation, so that different replicas correspond to different initial points for
+the scan; this approach provides faster results. We provide the ``vp-hyperoptplot``
+script to analyse the output of the hyperparameter scan.
+
+
+Output of the fit
+-----------------
+Every time a replica is finalized, the output is saved to the ``runcard/nnfit/replica_$replica``
+folder, which contains a number of files:
+
+- ``chi2exps.log``: a json log file with the χ² of the training every 100 epochs.
+- ``runcard.exportgrid``: a file containing the PDF grid.
+- ``runcard.json``: Includes information about the fit (metadata, parameters, times) in json format.
+
+.. note::
+
+   The reported χ² always refers to the actual χ², i.e., without positivity loss or other penalty terms.
+
+
+.. _upload-fit:
+Upload and analyse the fit
+--------------------------
+After obtaining the fit you can proceed with the fit upload and analysis by:
+
+1. Uploading the results using ``vp-upload runcard_folder`` and then installing the
+   fitted set with ``vp-get fit fit_name``.
+
+2. Analysing the results with ``validphys``, see the :ref:`vp-guide `.
+   Consider using the ``vp-comparefits`` tool.
+
+
+
+Performance of the fit
+----------------------
+The ``n3fit`` framework is currently based on `Tensorflow <https://www.tensorflow.org/>`_ and as such, to
+first approximation, anything that makes Tensorflow faster will also make ``n3fit`` faster.
+
+.. note::
+
+   Tensorflow only supports the installation via pip. Note, however, that the TensorFlow
+   pip package has been known to break third party packages. Install it at your own risk.
+   Only the conda tensorflow-eigen package is tested by our CI systems.
+
+When you install the nnpdf conda package, you get the
+`tensorflow-eigen <https://anaconda.org/anaconda/tensorflow-eigen>`_ package,
+which is not the default. This is due to a memory explosion found in some of
+the conda mkl builds.
+
+If you want to disable MKL without installing ``tensorflow-eigen`` you can always
+set the environment variable ``TF_DISABLE_MKL=1`` before running ``n3fit``.
+When running ``n3fit`` all versions of the package show similar performance.
+
+
+When using the MKL version of tensorflow you gain more control over the way Tensorflow will use
+the multithreading capabilities of the machine by using the following environment variables:
+
+.. code:: bash
+
+   KMP_BLOCKTIME=0
+   KMP_AFFINITY=granularity=fine,verbose,compact,1,0
+
+These are the best values found for ``n3fit`` when using the mkl version of Tensorflow from conda
+and were found for TF 2.1, as the default values were suboptimal.
+For a more detailed explanation of the effects of ``KMP_AFFINITY`` on the performance of
+the code please see
+`here <https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support/openmp-library-support/thread-affinity-interface-linux-and-windows.html>`_.
+
+By default, ``n3fit`` will try to use as many cores as possible, but this behaviour can be overridden
+from the runcard with the ``maxcores`` parameter. In our tests the point of diminishing returns is found
+at ``maxcores=4``.
+
+Note that everything stated above is machine dependent so the best parameters for you might be
+very different.
When testing, it is useful to set the environmental variable ``KMP_SETTINGS`` to 1 +to obtain detailed information about the current variables being used by OpenMP. + +Below we present a benchmark that have been run for the Global NNPDF 3.1 case, as found in the +example runcards `folder `_. + +Settings of the benchmark: + - TF version: tensorflow-eigen from conda, TF 2.2 + - NNPDF commit: `f878fc95a4f32e8c3b4c454fc12d438cbb87ea80 `_ + - Number of epochs: 5000 + - maxcores: 4 + - no early stopping + +Hardware: + - Intel(R) Core(TM) i7-6700 CPU @ 4.00GHz + - 16 GB RAM 3000 MHz DDR4 + +Timing for a fit: + - Walltime: 397s + - CPUtime: 1729s + +Iterate the fit +--------------- + +It may be desirable to iterate a fit to achieve a higher degree of convergence/stability in the fit. +To read more about this, see :ref:`How to run an iterated fit `. + +QED fit +------- + +In order to run a QED fit see :ref:`How to run a QED fit `. From 1322b5718fdeccea1d805794c3c3503c571a5c61 Mon Sep 17 00:00:00 2001 From: achiefa Date: Thu, 10 Oct 2024 19:24:27 +0100 Subject: [PATCH 16/22] Removing recommonmark from dependencies --- conda-recipe/meta.yaml | 1 - doc/sphinx/source/conf.py | 1 - 2 files changed, 2 deletions(-) diff --git a/conda-recipe/meta.yaml b/conda-recipe/meta.yaml index 3ae31bca4b..a3496538ec 100644 --- a/conda-recipe/meta.yaml +++ b/conda-recipe/meta.yaml @@ -40,7 +40,6 @@ requirements: - eko >=0.14.2 - fiatlux - sphinx >=5.0.2,<6 # documentation. Needs pinning temporarily due to markdown - - recommonmark - sphinx_rtd_theme >0.5 - sphinxcontrib-bibtex - ruamel.yaml <0.18 diff --git a/doc/sphinx/source/conf.py b/doc/sphinx/source/conf.py index 4ac1fcc4d7..2c34bee2dc 100644 --- a/doc/sphinx/source/conf.py +++ b/doc/sphinx/source/conf.py @@ -53,7 +53,6 @@ # particularly in markdown. See # https://recommonmark.readthedocs.io/en/latest/#linking-to-headings-in-other-files 'sphinx.ext.autosectionlabel', - 'recommonmark', ] From fecc30f180a13c773a7e1a1fe8a27983b1130c2f Mon Sep 17 00:00:00 2001 From: achiefa Date: Mon, 14 Oct 2024 14:55:55 +0100 Subject: [PATCH 17/22] Removed dependence on recommonmark --- doc/sphinx/source/conf.py | 11 ----------- pyproject.toml | 3 +-- 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/doc/sphinx/source/conf.py b/doc/sphinx/source/conf.py index 2c34bee2dc..713cf0d480 100644 --- a/doc/sphinx/source/conf.py +++ b/doc/sphinx/source/conf.py @@ -15,7 +15,6 @@ # import sys # # -from recommonmark.transform import AutoStructify # -- Project information ----------------------------------------------------- @@ -217,13 +216,3 @@ # If true, `todo` and `todoList` produce output, else they produce nothing. 
todo_include_todos = True - -# Adapted this from -# https://github.com/readthedocs/recommonmark/blob/ddd56e7717e9745f11300059e4268e204138a6b1/docs/conf.py -# app setup hook -def setup(app): - app.add_config_value('recommonmark_config', { - #'url_resolver': lambda url: github_doc_root + url, - 'enable_eval_rst': True, - }, True) - app.add_transform(AutoStructify) diff --git a/pyproject.toml b/pyproject.toml index 104b979d69..baf60ae30a 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -86,7 +86,6 @@ pytest = {version = "*", optional = true} pytest-mpl = {version = "*", optional = true} hypothesis = {version = "*", optional = true} # docs -recommonmark = {version = "*", optional = true} sphinxcontrib-bibtex = {version = "*", optional = true} sphinx_rtd_theme = {version = "*", optional = true} sphinx = {version = "^5.0", optional = true} @@ -100,7 +99,7 @@ lhapdf-management = {version = "^0.5", optional = true} # Optional dependencies [tool.poetry.extras] tests = ["pytest", "pytest-mpl", "hypothesis"] -docs = ["recommonmark", "sphinxcontrib", "sphinx-rtd-theme", "sphinx", "tabulate"] +docs = ["sphinxcontrib", "sphinx-rtd-theme", "sphinx", "tabulate"] qed = ["fiatlux"] nolha = ["pdfflow", "lhapdf-management"] From c289d4a7d63afb25be8112f198ddb135be80bd65 Mon Sep 17 00:00:00 2001 From: achiefa Date: Mon, 14 Oct 2024 14:56:21 +0100 Subject: [PATCH 18/22] Corrected typo in link --- doc/sphinx/source/releases.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sphinx/source/releases.rst b/doc/sphinx/source/releases.rst index d5f3a07044..f073fb851c 100644 --- a/doc/sphinx/source/releases.rst +++ b/doc/sphinx/source/releases.rst @@ -21,7 +21,7 @@ developments and to mark versions used to produce main results. The significant releases since the code was made public are: `Version 4.0.9 `_ - Release for 4.0 `N3LO `; + Release for 4.0 `N3LO `_; last release fully backwards-compatible with 4.0 pipeline. 4.0 runcards will still work but external tools, and data and theory not used in the 4.0 family of fits will no longer be guaranteed to work from 4.0.10 onwards Last release compatible with the old commondata format From 9699fa94f54f83520870173a4a03362e732672dd Mon Sep 17 00:00:00 2001 From: achiefa Date: Mon, 14 Oct 2024 15:04:44 +0100 Subject: [PATCH 19/22] Corrected code block --- doc/sphinx/source/contributing/python-tools.rst | 2 ++ doc/sphinx/source/tutorials/reportengine_parallel.rst | 1 + doc/sphinx/source/vp/pydataobjs.rst | 3 ++- 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/doc/sphinx/source/contributing/python-tools.rst b/doc/sphinx/source/contributing/python-tools.rst index 1913786698..bc18636f27 100644 --- a/doc/sphinx/source/contributing/python-tools.rst +++ b/doc/sphinx/source/contributing/python-tools.rst @@ -187,11 +187,13 @@ the :ref:`CI ` runs the test suite. Now that the baseline figure exists you can check that your test works: .. code:: bash + pytest -k --mpl Also you can check that the test has been added to the full test suite: .. 
code:: bash + pytest --pyargs --mpl validphys Just note that if you do not put the :code:`--mpl` flag then the test will just check diff --git a/doc/sphinx/source/tutorials/reportengine_parallel.rst b/doc/sphinx/source/tutorials/reportengine_parallel.rst index 310a378944..975be07851 100644 --- a/doc/sphinx/source/tutorials/reportengine_parallel.rst +++ b/doc/sphinx/source/tutorials/reportengine_parallel.rst @@ -53,6 +53,7 @@ Example 1: PDF plots, ``validphys2/examples/plot_pdfs.yaml`` Suppose we have the following runcard .. code:: yaml + meta: title: PDF plot example author: Rosalyn Pearson diff --git a/doc/sphinx/source/vp/pydataobjs.rst b/doc/sphinx/source/vp/pydataobjs.rst index ba8d5013d6..8c2693135f 100644 --- a/doc/sphinx/source/vp/pydataobjs.rst +++ b/doc/sphinx/source/vp/pydataobjs.rst @@ -261,7 +261,8 @@ LHAPDF python interface. This is also the output of the ``pdf.load()`` method. For example, the following will return the values for all 100 members of NNPDF4.0 for the gluon and the d-quark, at three values of ``x`` at ``Q=91.2``. -.. code-block:: python +.. code:: python + from validphys.api import API pdf = API.pdf(pdf="NNPDF40_nnlo_as_01180") l_pdf = pdf.load() From 380165f4627d7d3ec40bef9ff9539a16a1dc24d9 Mon Sep 17 00:00:00 2001 From: Roy Stegeman Date: Mon, 14 Oct 2024 15:45:03 +0100 Subject: [PATCH 20/22] remove references to markdown from the docs --- .pre-commit-config.yaml | 26 ++-- conda-recipe/meta.yaml | 2 +- doc/README.md | 110 +---------------- doc/sphinx/source/conf.py | 98 +++++++-------- .../source/contributing/python-tools.rst | 6 +- .../contributing/sphinx-documentation.rst | 116 +----------------- 6 files changed, 66 insertions(+), 292 deletions(-) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index ee9e500b0f..15e348faed 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,24 +1,24 @@ # See https://pre-commit.com for more information # See https://pre-commit.com/hooks.html for more hooks repos: - - repo: https://github.com/pre-commit/pre-commit-hooks - rev: 'v4.5.0' +- repo: https://github.com/pre-commit/pre-commit-hooks + rev: 'v5.0.0' hooks: - - id: check-merge-conflict - - id: check-toml - - id: check-yaml - - id: debug-statements - - id: end-of-file-fixer - - id: trailing-whitespace + - id: check-merge-conflict + - id: check-toml + - id: check-yaml + - id: debug-statements + - id: end-of-file-fixer + - id: trailing-whitespace - - repo: https://github.com/psf/black-pre-commit-mirror - rev: '24.3.0' +- repo: https://github.com/psf/black-pre-commit-mirror + rev: '24.10.0' hooks: - - id: black + - id: black args: ['--config=./pyproject.toml'] - - repo: https://github.com/pycqa/isort +- repo: https://github.com/pycqa/isort rev: '5.13.2' hooks: - - id: isort + - id: isort args: ['--settings-path=./pyproject.toml'] diff --git a/conda-recipe/meta.yaml b/conda-recipe/meta.yaml index a3496538ec..bb5069ba69 100644 --- a/conda-recipe/meta.yaml +++ b/conda-recipe/meta.yaml @@ -39,7 +39,7 @@ requirements: - pineappl >=0.8.2 - eko >=0.14.2 - fiatlux - - sphinx >=5.0.2,<6 # documentation. Needs pinning temporarily due to markdown + - sphinx >=5.0.2 - sphinx_rtd_theme >0.5 - sphinxcontrib-bibtex - ruamel.yaml <0.18 diff --git a/doc/README.md b/doc/README.md index 67abe94a7b..c73b0d1e11 100644 --- a/doc/README.md +++ b/doc/README.md @@ -13,21 +13,13 @@ documentation, navigate to the `sphinx/` directory and execute the command `make html`. This produces the documentation in the `build/index/` directory. 
The `index.html` can be viewed with any appropriate browser. -It is required to install `recommonmark` to interpret markdown. To add the -dependencies to your environment, run - -``` -conda install sphinx recommonmark -``` - ### Adding to the Documentation -New documentation can be added in markdown (`.md` or `.txt` suffices) or -restructured text (`.rst` suffix) formats. To add a new section to the -documentation, create an appropriately named directory in the `sphinx/source/` -directory. Inside the new directory, add all relevant documentation in the -markdown or restructured text formats. In addition to these files, create an -`index.rst` file containing: +New documentation can be added in restructured text (`.rst`) format. To +add a new section to the documentation, create an appropriately named directory +in the `sphinx/source/` directory. Inside the new directory, add all relevant +documentation in the restructured text formats. In addition to these +files, create an `index.rst` file containing: ``` Chapter Name @@ -73,98 +65,6 @@ Indices and tables * :ref:`search` ``` -### Useful Markdown and Restructured Text Tools - -Various -[markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) and -[restructured text](http://docutils.sourceforge.net/docs/user/rst/quickref.html) -cheatsheets exist online. - -In restructured text, a $\LaTeX$ block can be generated using - -``` -.. math:: - - \frac{1}{2} -``` - -while inline maths is generated using - -``` -:math:`\frac{1}{2}` -``` -with attention being brought to the backticks. Note: the markdown intepreter -being used here does not support inline maths, so if formula dense documentation -is being implemented, it is advised to use restructured text instead. - -One can cross reference various parts of their markdown file using `anchors`, -which provide clickable pieces of text which transport the reader to a -particular part of the document. - -To do this: add an anchor point in the text. This may look like the following: -``` -Lorem ipsum dolor sit amet consectetur adipiscing elit, sed do -``` - -we can then jump to `label` from an arbitrary point in the text by using -`[text](#label)` - -As an example, clicking [this](#top) will take the reader to the top of the -page. - -This was done by having the following lines of code: - -``` -For example, clicking [this](#top) will take the reader to the top of the page. -``` -as well as -``` -# NNPDF code and standards documentation -``` -at the top of this file. - - -In addition, one can link to other pages within the documentation by -`[text](.)`. - -One can define "lables" for RestructuredText, which can be referred to from -anywhere, like this: -``` - .. _my-reference-label: - - Section to cross-reference - -------------------------- - - This is the text of the section. - - It refers to the section itself, see :ref:`my-reference-label`. -``` - -Such labels can also be defined in Markdown by using `rst` syntax embedded in -code markers in markdown: - - - ```eval_rst - .. _my-reference-label: - ``` - -Labels can be linked to from anywhere using the syntax - -``` -[lint text](my-refence-label) -``` -for Markdown and - -``` -:ref:`my-reference-label`. -``` -for RestructuredText, as described in its -[documentation](https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html?highlight=cross%20reference#role-ref). - -## Installation using conda - - - ### Adding indices for modules Sphinx has the capability of automatically documenting any python package. 
It diff --git a/doc/sphinx/source/conf.py b/doc/sphinx/source/conf.py index 713cf0d480..a9e8627265 100644 --- a/doc/sphinx/source/conf.py +++ b/doc/sphinx/source/conf.py @@ -1,4 +1,3 @@ -# -*- coding: utf-8 -*- # # Configuration file for the Sphinx documentation builder. # @@ -18,14 +17,14 @@ # -- Project information ----------------------------------------------------- -project = 'NNPDF' -copyright = '2021, NNPDF collaboration' -author = 'NNPDF collaboration' +project = "NNPDF" +copyright = "2021, NNPDF collaboration" +author = "NNPDF collaboration" # The short X.Y version -version = '' +version = "" # The full version, including alpha/beta/rc tags -release = '' +release = "" # -- General configuration --------------------------------------------------- @@ -38,45 +37,32 @@ # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = [ - 'sphinx.ext.autodoc', - 'sphinx.ext.doctest', - 'sphinx.ext.intersphinx', - 'sphinx.ext.todo', - 'sphinx.ext.coverage', - 'sphinx.ext.mathjax', - 'sphinx.ext.ifconfig', - 'sphinx.ext.viewcode', - 'sphinx.ext.napoleon', - 'sphinxcontrib.bibtex', - # To generate section headings, - # particularly in markdown. See - # https://recommonmark.readthedocs.io/en/latest/#linking-to-headings-in-other-files - 'sphinx.ext.autosectionlabel', + "sphinx.ext.autodoc", + "sphinx.ext.doctest", + "sphinx.ext.intersphinx", + "sphinx.ext.todo", + "sphinx.ext.coverage", + "sphinx.ext.mathjax", + "sphinx.ext.ifconfig", + "sphinx.ext.viewcode", + "sphinx.ext.napoleon", + "sphinxcontrib.bibtex", + "sphinx.ext.autosectionlabel", ] -bibtex_bibfiles = ['references.bib'] +bibtex_bibfiles = ["references.bib"] # Add any paths that contain templates here, relative to this directory. -templates_path = ['_templates'] - -# Markdown configuration +templates_path = ["_templates"] # The suffix(es) of source filenames. -# You can specify multiple suffix as a list of string: -# -source_suffix = { - '.rst': 'restructuredtext', - '.txt': 'markdown', - '.md': 'markdown', -} +source_suffix = [".rst"] autosectionlabel_prefix_document = True -# Allow to embed rst syntax in markdown files. -enable_eval_rst = True # The master toctree document. -master_doc = 'index' +master_doc = "index" # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. @@ -99,23 +85,20 @@ # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # -html_theme = 'sphinx_rtd_theme' +html_theme = "sphinx_rtd_theme" # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. # -html_theme_options = {'logo_only' : True, - 'display_version' : False} +html_theme_options = {"logo_only": True, "display_version": False} # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ["_static"] -html_css_files = [ - 'css/custom.css', -] +html_css_files = ["css/custom.css"] html_logo = "_static/LogoNNPDF.png" @@ -133,7 +116,7 @@ # -- Options for HTMLHelp output --------------------------------------------- # Output file base name for HTML help builder. 
-htmlhelp_basename = 'NNPDFDocumentationdoc' +htmlhelp_basename = "NNPDFDocumentationdoc" # -- Options for LaTeX output ------------------------------------------------ @@ -142,15 +125,12 @@ # The paper size ('letterpaper' or 'a4paper'). # # 'papersize': 'letterpaper', - # The font size ('10pt', '11pt' or '12pt'). # # 'pointsize': '10pt', - # Additional stuff for the LaTeX preamble. # # 'preamble': '', - # Latex figure (float) alignment # # 'figure_align': 'htbp', @@ -160,8 +140,13 @@ # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ - (master_doc, 'NNPDFDocumentation.tex', 'NNPDF Documentation Documentation', - 'NNPDF collaboration', 'manual'), + ( + master_doc, + "NNPDFDocumentation.tex", + "NNPDF Documentation Documentation", + "NNPDF collaboration", + "manual", + ) ] @@ -169,10 +154,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - (master_doc, 'nnpdfdocumentation', 'NNPDF Documentation Documentation', - [author], 1) -] +man_pages = [(master_doc, "nnpdfdocumentation", "NNPDF Documentation Documentation", [author], 1)] # -- Options for Texinfo output ---------------------------------------------- @@ -181,9 +163,15 @@ # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ - (master_doc, 'NNPDFDocumentation', 'NNPDF Documentation Documentation', - author, 'NNPDFDocumentation', 'One line description of project.', - 'Miscellaneous'), + ( + master_doc, + "NNPDFDocumentation", + "NNPDF Documentation Documentation", + author, + "NNPDFDocumentation", + "One line description of project.", + "Miscellaneous", + ) ] @@ -202,7 +190,7 @@ # epub_uid = '' # A list of files that should not be packed into the epub file. -epub_exclude_files = ['search.html'] +epub_exclude_files = ["search.html"] # -- Extension configuration ------------------------------------------------- @@ -210,7 +198,7 @@ # -- Options for intersphinx extension --------------------------------------- # Example configuration for intersphinx: refer to the Python standard library. -intersphinx_mapping = {'python': ('https://docs.python.org/', None)} +intersphinx_mapping = {"python": ("https://docs.python.org/", None)} # -- Options for todo extension ---------------------------------------------- diff --git a/doc/sphinx/source/contributing/python-tools.rst b/doc/sphinx/source/contributing/python-tools.rst index bc18636f27..5b749ceb32 100644 --- a/doc/sphinx/source/contributing/python-tools.rst +++ b/doc/sphinx/source/contributing/python-tools.rst @@ -16,7 +16,7 @@ Python editors features. - `vscode `_ is a more full featured editor. - In the long run, the most efficient approach is to learn a terminal based - editor such as `vim `_. Note that `vim` editing modes + editor such as `vim `_. Note that `vim` editing modes can be added as extensions to graphical editors such as :code:`vscode`. @@ -123,9 +123,7 @@ Documentation to enable the ``napoleon`` extension which allows for a more lenient `numpydoc `_ style. Similarly the default RST markup language can be overwhelming for simple - documents. We enable the - `recommonmark `_ extension to - be able to compose files also in markdown format. + documents. 
Python static checks and code style diff --git a/doc/sphinx/source/contributing/sphinx-documentation.rst b/doc/sphinx/source/contributing/sphinx-documentation.rst index f8ed880a3a..ee082f87e5 100644 --- a/doc/sphinx/source/contributing/sphinx-documentation.rst +++ b/doc/sphinx/source/contributing/sphinx-documentation.rst @@ -11,22 +11,9 @@ the appropriate ``nnpdf`` conda environment. This produces the documentation in the ``build/index/`` directory. The ``index.html`` can be viewed with any appropriate browser. -New documentation can be added in markdown, naming the source files with -the ``.md`` suffix, or restructured text, with the ``.rst`` suffix -formats. - - -.. note:: - The ``md`` format is now deprecated and only supported for legacy reasons. - The reStructured Text format natively supports equation displaying as well as - directives such as this note and is thus the preferred format for `NNPDF` - documentation. Despite this, it is possible to evaluate inline ``rst`` in a - ``md`` file using the ``eval_rst`` command in legacy files written in - markdown. - To add a new section to the documentation, create an appropriately named directory in the ``sphinx/source/`` directory. Inside the new directory, -add all relevant documentation in the markdown or restructured text +add all relevant documentation in restructured text formats. In addition to these files, create an ``index.rst`` file containing: @@ -75,105 +62,6 @@ The next step is to reference the newly made ``index.rst`` in the main * :ref:`modindex` * :ref:`search` -Useful Markdown and Restructured Text Tools -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Various -`markdown `__ -and `restructured -text `__ -cheatsheets exist online. - -In restructured text, a :math:`\LaTeX` block can be generated using - -:: - - .. math:: - - \frac{1}{2} - -while inline maths is generated using - -:: - - :math:`\frac{1}{2}` - -with attention being brought to the backticks. Note: the markdown -interpreter being used here does not support inline maths, so if formula -dense documentation is being implemented, it is advised to use -restructured text instead. - -One can cross reference various parts of their markdown file using -``anchors``, which provide clickable pieces of text which transport the -reader to a particular part of the document. - -To do this: add an anchor point in the text. This may look like the -following: - -:: - - Lorem ipsum dolor sit amet consectetur adipiscing elit, sed do - -we can then jump to ``label`` from an arbitrary point in the text by -using ``[text](#label)`` - -As an example, clicking `this <#top>`__ will take the reader to the top -of the page. - -This was done by having the following lines of code: - -:: - - For example, clicking [this](#top) will take the reader to the top of the page. - -as well as - -:: - - # NNPDF code and standards documentation - -at the top of this file. - -In addition, one can link to other pages within the documentation by -``[text](.)``. - -One can define “labels” for RestructuredText, which can be referred to -from anywhere, like this: - -:: - - .. _my-reference-label: - - Section to cross-reference - -------------------------- - - This is the text of the section. - - It refers to the section itself, see :ref:`my-reference-label`. - -Such labels can also be defined in Markdown by using ``rst`` syntax -embedded in code markers in markdown: - -:: - - ```eval_rst - .. 
_my-reference-label: - ``` - -Labels can be linked to from anywhere using the syntax - -:: - - [link text](my-reference-label) - -for Markdown and - -:: - - :ref:`my-reference-label` - -for RestructuredText, as described in its -`documentation `__. Adding BibTeX references ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -222,7 +110,7 @@ under: :: %: Makefile - @if test $@ != "clean"; then + @if test $@ != "clean"; then sphinx-apidoc -o ./source/modules/validphys ../../validphys2/src/validphys/ ; \ sphinx-apidoc -o ./source/modules/ ;\ fi From a8d3796db418229c149114ad6ae18e63421ce42a Mon Sep 17 00:00:00 2001 From: Roy Stegeman Date: Mon, 14 Oct 2024 15:49:19 +0100 Subject: [PATCH 21/22] replace documentation README with link to documentation itself --- doc/README.md | 80 ++------------------------------------------------- 1 file changed, 2 insertions(+), 78 deletions(-) diff --git a/doc/README.md b/doc/README.md index c73b0d1e11..e9a9239e91 100644 --- a/doc/README.md +++ b/doc/README.md @@ -3,81 +3,5 @@ Here we store the [documentation](https://docs.nnpdf.science/) (user / developer guides) -## Sphinx Documentation - -### Generating the Documentation - -The NNPDF documentation is produced by the -[sphinx](http://www.sphinx-doc.org/en/master/) resource. To generate the sphinx -documentation, navigate to the `sphinx/` directory and execute the command `make -html`. This produces the documentation in the `build/index/` directory. The -`index.html` can be viewed with any appropriate browser. - -### Adding to the Documentation - -New documentation can be added in restructured text (`.rst`) format. To -add a new section to the documentation, create an appropriately named directory -in the `sphinx/source/` directory. Inside the new directory, add all relevant -documentation in the restructured text formats. In addition to these -files, create an `index.rst` file containing: - -``` -Chapter Name -============ - -.. toctree:: - :maxdepth: 1 - - ./file1.md - ./file2.rst -``` -ensuring that the number of `=` signs is the same as the number of characters in -`Chapter Name`. - -The next step is to reference the newly made `index.rst` in the main -`sphinx/source/index.rst` file: - -``` -.. NNPDF documentation master file, created by - sphinx-quickstart on Mon Oct 29 10:53:50 2018. - You can adapt this file completely to your liking, but it should at least - contain the root `toctree` directive. - -NNPDF documentation -=================== - -.. toctree:: - :maxdepth: 2 - - get-started/index - theory/index - vp/index - code/index - tutorials/index - QA/index - /index - -Indices and tables -================== - -* :ref:`genindex` -* :ref:`modindex` -* :ref:`search` -``` - -### Adding indices for modules - -Sphinx has the capability of automatically documenting any python package. It -produces these under the `index` and `module index` sections. The functions and -modules are documented using their corresponding docstrings. 
- -To add a new module to document, add a new line in `sphinx/Makefile` under: - -``` -%: Makefile - @if test $@ != "clean"; then - sphinx-apidoc -o ./source/modules/validphys ../../validphys2/src/validphys/ ; \ - sphinx-apidoc -o ./source/modules/ ;\ - fi - -``` +For instructions on how to contribute to the documentation see the +[documentation](https://docs.nnpdf.science/contributing/sphinx-documentation.html) From 640367dddbf496fbaf170c7f6866dd42d9c419a5 Mon Sep 17 00:00:00 2001 From: Roy Stegeman Date: Mon, 14 Oct 2024 17:23:05 +0100 Subject: [PATCH 22/22] address issue #1577 --- doc/sphinx/source/tutorials/general_th_covmat.rst | 6 +++--- doc/sphinx/source/vp/developer.rst | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/sphinx/source/tutorials/general_th_covmat.rst b/doc/sphinx/source/tutorials/general_th_covmat.rst index dec91fe310..2593a0eda5 100644 --- a/doc/sphinx/source/tutorials/general_th_covmat.rst +++ b/doc/sphinx/source/tutorials/general_th_covmat.rst @@ -31,8 +31,8 @@ Instructions .. warning:: Make a note of the upload address returned to you, but without the initial part of the address, i.e. you should save - "https://vp.nnpdf.science/IeGM9CY8RxGcb5r6bIEYlQ==/shrek_covmat.csv" - as "IeGM9CY8RxGcb5r6bIEYlQ==/shrek_covmat.csv" + "https://vp.nnpdf.science/IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv" + as "IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv" 3. In the runcard under ``theorycovmatconfig`` you need to add the following (using the address above as an example) @@ -43,7 +43,7 @@ Instructions theorycovmatconfig: use_scalevar_uncertainties: False use_user_uncertainties: True - user_covmat_path: "IeGM9CY8RxGcb5r6bIEYlQ==/shrek_covmat.csv" + user_covmat_path: "IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv" use_thcovmat_in_sampling: True use_thcovmat_in_fitting: True ############################################################################ diff --git a/doc/sphinx/source/vp/developer.rst b/doc/sphinx/source/vp/developer.rst index 9e77b97832..10f50a932a 100644 --- a/doc/sphinx/source/vp/developer.rst +++ b/doc/sphinx/source/vp/developer.rst @@ -22,7 +22,7 @@ Some of the most important modules are - `validphys.core` Core data structures that represent objects such as PDFs and data -sets. +sets. - `validphys.loader` Tools to obtain NNPDF resources locally or remotely. See :ref:`upload`