diff --git a/docs/source/development/pyrealm_build_data.md b/docs/source/development/pyrealm_build_data.md
index b59ec2b2..41035bb8 100644
--- a/docs/source/development/pyrealm_build_data.md
+++ b/docs/source/development/pyrealm_build_data.md
@@ -11,138 +11,92 @@ kernelspec:
   name: python3
 ---
 
-# The `pyrealm_build_data` package
-
-The `pyrealm` repository includes both the `pyrealm` package and the
-`pyrealm_build_data` package. The `pyrealm_build_data` package contains datasets that
-are used in the `pyrealm` build and testing process. This includes:
-
-* Example datasets that are used in the package documentation, such as simple spatial
-  datasets for showing the use of the P Model.
-* "Golden" datasets for regression testing `pyrealm` implementations against the outputs
-  of other implementations. These datasets will include a set of input data and then
-  output predictions from other implementations.
-* Datasets for providing profiling of `pyrealm` code and for benchmarking new versions
-  of the package code against earlier implementations to check for performance issues.
-
-Note that `pyrealm_build_data` is a source distribution only (`sdist`) component of
-`pyrealm`, so is not included in binary distributions (`wheel`) that are typically
-installed by end users. This means that files in `pyrealm_build_data` are not available
-if a user has simply used `pip install pyrealm`: please *do not* use
-`pyrealm_build_data` within the main `pyrealm` code.
-
-## Package contents
-
-The package is organised into submodules that reflect the data use or previous
-implementation.
-
-### The `bigleaf` submodule
-
-This submodule contains benchmark outputs from the `bigleaf` package in `R`, which has
-been used as the basis for core hygrometry functions. The `bigleaf_conversions.R` R
-script runs a set of test values through `bigleaf`. The first part of the file prints
-out some simple test values that have been used in package doctests and then the second
-part of the file generates more complex benchmarking inputs that are saved, along with
-`bigleaf` outputs as `bigleaf_test_values.json`.
-
-Running `bigleaf_conversions.R` requires an installation of R along with the `jsonlite`
-and `bigleaf` packages, and the script can then be run from within the submodule folder
-as:
-
-```sh
-Rscript bigleaf_conversions.R
-```
-
-### The `rpmodel` submodule
-
-This submodule contains benchmark outputs from the `rpmodel` package in `R`, which has
-been used as the basis for initial development of the standard P Model.
-
-#### Test inputs
+# The {mod}`~pyrealm_build_data` module
 
-The `generate_test_inputs.py` file defines a set of constants for running P Model
-calculations and then defines a set of scalar and array inputs for the forcing variables
-required to run the P Model. The array inputs are set of 100 values sampled randomly
-across the ranges of plausible forcing value inputs in order to benchmark the
-calculations of the P Model implementation. All of these values are stored in the
-`test_inputs.json` file.
-
-It requires `python` and the `numpy` package and can be run as:
-
-```sh
-python generate_test_inputs.py
+```{eval-rst}
+.. automodule:: pyrealm_build_data
+    :autosummary:
+    :members:
+    :special-members: __init__
 ```
 
-#### Simple `rpmodel` benchmarking
-
-The `test_outputs_rpmodel.R` contains R code to run the test input data set, and store
-the expected predictions from the `rpmodel` package as `test_outputs_rpmodel.json`. It
-requires an installation of `R` and the `rpmodel` package and can be run as:
+## The `bigleaf` submodule
 
-```sh
-Rscript test_outputs_rpmodel.R
+```{eval-rst}
+.. automodule:: pyrealm_build_data.bigleaf
+    :autosummary:
+    :members:
+    :special-members: __init__
 ```
 
-#### Global array test
+## The `community` submodule
 
-The remaining files in the submodule are intended to provide a global test dataset for
-benchmarking the use of `rpmodel` on a global time-series, so using 3 dimensional arrays
-with latitude, longitude and time coordinates. It is currently not used in testing
-because of issues with the `rpmodel` package in version 1.2.0. It may also be replaced
-in testing with the `uk_data` submodule, which is used as an example dataset in the
-documentation.
+```{eval-rst}
+.. automodule:: pyrealm_build_data.community
+    :autosummary:
+    :members:
+    :special-members: __init__
+```
 
-The files are:
+## The `rpmodel` submodule
 
-* pmodel_global.nc: An input global NetCDF file containing forcing variables at 0.5°
-  spatial resolution and for two time steps.
-* test_global_array.R: An R script to run `rpmodel` using the dataset.
-* rpmodel_global_gpp_do_ftkphio.nc: A NetCDF file containing `rpmodel` predictions using
-  corrections for temperature effects on the `kphio` parameter.
-* rpmodel_global_gpp_no_ftkphio.nc: A NetCDF file containing `rpmodel` predictions with
-  fixed `kphio`.
+```{eval-rst}
+.. automodule:: pyrealm_build_data.rpmodel
+    :autosummary:
+    :members:
+    :special-members: __init__
+```
 
-To generate the predicted outputs again requires an R installation with the `rpmodel`
-package:
+## The `sandoval_kphio` submodule
 
-```sh
-Rscript test_global_array.R
+```{eval-rst}
+.. automodule:: pyrealm_build_data.sandoval_kphio
+    :autosummary:
+    :members:
+    :special-members: __init__
 ```
 
-### The `subdaily` submodule
+## The `splash` submodule
 
-At present, this submodule only contains a single file containing the predictions for
-the `BE_Vie` fluxnet site from the original implementation of the `subdaily` module,
-published in {cite}`mengoli:2022a`. Generating these predictions requires an
-installation of R and then code from the following repository:
+```{eval-rst}
+.. automodule:: pyrealm_build_data.splash
+    :autosummary:
+    :members:
+    :special-members: __init__
+```
 
-[https://github.com/GiuliaMengoli/P-model_subDaily](https://github.com/GiuliaMengoli/P-model_subDaily)
+## The `subdaily` submodule
 
-TODO - This submodule should be updated to include the required code along with the
-settings files and a runner script to reproduce this code. Or possibly to checkout the
-required code as part of a shell script.
+```{eval-rst}
+.. automodule:: pyrealm_build_data.subdaily
+    :autosummary:
+    :members:
+    :special-members: __init__
+```
 
-### The `t_model` submodule
+## The `t_model` submodule
 
-The `t_model.r` contains the original implementation of the T Model calculations in R
-{cite:p}`Li:2014bc`. The `rtmodel_test_outputs.r` script sources this file and then
-generates some simple bencmarking predictions, which are saved as `rtmodel_output.csv`.
+```{eval-rst}
+.. automodule:: pyrealm_build_data.t_model
+    :autosummary:
+    :members:
+    :special-members: __init__
+```
 
-To generate the predicted outputs again requires an R installation
+## The `two_leaf` submodule
 
-```sh
-Rscript rtmodel_test_outputs.r
+```{eval-rst}
+.. automodule:: pyrealm_build_data.two_leaf
+    :autosummary:
+    :members:
+    :special-members: __init__
 ```
 
-### The `uk_data` submodule
-
-This submodule contains the Python script `create_2D_uk_inputs.py`, which is used to
-generate the NetCDF output file `UK_WFDE5_FAPAR_2018_JuneJuly.nc`. This contains P Model
-forcings for the United Kingdom at 0.5° spatial resolution and hourly temporal
-resolution over 2 months (1464 temporal observations). It is used for demonstrating the
-use of the subdaily P Model.
+## The `uk_data` submodule
 
-The script is currently written with a hard-coded set of paths to key source data - the
-WFDE5 v2 climate data and a separate source of interpolated hourly fAPAR. This should
-probably be rewritten to generate reproducible content from publically available sources
-of these datasets.
+```{eval-rst}
+.. automodule:: pyrealm_build_data.uk_data
+    :autosummary:
+    :members:
+    :special-members: __init__
+```
diff --git a/pyrealm_build_data/__init__.py b/pyrealm_build_data/__init__.py
index 13c9caca..1b5e9bc0 100644
--- a/pyrealm_build_data/__init__.py
+++ b/pyrealm_build_data/__init__.py
@@ -10,6 +10,9 @@
 * Datasets for providing profiling of ``pyrealm`` code and for benchmarking new versions
   of the package code against earlier implementations to check for performance issues.
 
+The package is organised into submodules that reflect the data use or previous
+implementation.
+
 Note that ``pyrealm_build_data`` is a source distribution only (``sdist``) component of
 ``pyrealm``, so is not included in binary distributions (``wheel``) that are typically
 installed by end users. This means that files in ``pyrealm_build_data`` are not
diff --git a/pyrealm_build_data/community/__init__.py b/pyrealm_build_data/community/__init__.py
new file mode 100644
index 00000000..fc2675a9
--- /dev/null
+++ b/pyrealm_build_data/community/__init__.py
@@ -0,0 +1,5 @@
+"""The :mod:`pyrealm_build_data.community` submodule provides a set of input files for
+the :mod:`pyrealm.demography` module that are used both in unit testing for the module
+and as inputs for generating documentation of the module. The files provide definitions
+of plant functional types and plant communities in a range of formats.
+""" # noqa: D205