LFRicPlans

PSyclone work and plans for LFRic

This page outlines the current work and near-term proposed work in PSyclone that is related directly to LFRic.

[TBD] Use algorithm-layer field names to better determine the function space of a field statically.
[TBD] Add support for mixed precision. We need to read the field datatypes from the algorithm layer and use them when declaring the associated fields in the PSy layer.
[TBD] Add support for i-first kernels.

[in progress] Add support for checking that a kernel does not modify a field that its metadata declares as read-only. This is done using the DataAPI.
[TBD] Use algorithm field name analysis (see functionality) to check that fields are used correctly.
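
The intent of the read-only check can be illustrated with a small stand-alone sketch. It does not use PSyclone or the DataAPI: the function name and the two input structures are hypothetical, and only the access names (gh_read, gh_inc) are taken from LFRic kernel metadata conventions.

```python
# Hypothetical illustration only: this is not PSyclone's API.
# Access modes that forbid any modification of the argument.
READ_ONLY = {"gh_read"}


def find_read_only_violations(declared_access, written_args):
    """Return, sorted, the arguments that are written but declared read-only.

    declared_access: dict mapping argument name -> access mode from metadata.
    written_args: set of argument names the kernel was observed to modify.
    """
    return sorted(
        name for name in written_args
        if declared_access.get(name) in READ_ONLY
    )


declared = {"x": "gh_read", "y": "gh_inc", "m": "gh_read"}
written = {"y", "m"}
violations = find_read_only_violations(declared, written)
print(violations)
```

Here the kernel modifies `m` even though its metadata declares it as read-only, so `m` is the only argument reported.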

[TBD] Add run-time checks that operator function spaces are consistent with the associated kernel metadata. This complements the existing support for fields.
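
A minimal sketch of what such a consistency check could look like is given below. It is written in plain Python purely to show the logic; the function names are hypothetical, and the wildcard rule (any declared name starting with any_space accepts every space) is a simplifying assumption rather than the full LFRic rule set.

```python
# Hypothetical illustration only: this is not generated PSy-layer code.
def space_matches(declared, actual):
    """True if the actual function space satisfies the declared one.

    Simplifying assumption: a declared name starting with 'any_space'
    accepts every actual space; all other names must match exactly.
    """
    declared, actual = declared.lower(), actual.lower()
    return declared.startswith("any_space") or declared == actual


def check_operator_spaces(declared_to, declared_from, actual_to, actual_from):
    """Return a list of error messages; an empty list means consistent."""
    errors = []
    if not space_matches(declared_to, actual_to):
        errors.append(f"'to' space: expected {declared_to}, got {actual_to}")
    if not space_matches(declared_from, actual_from):
        errors.append(f"'from' space: expected {declared_from}, got {actual_from}")
    return errors


ok = check_operator_spaces("w3", "w2", "w3", "w2")
bad = check_operator_spaces("w3", "w2", "w3", "w0")
wild = check_operator_spaces("any_space_1", "w2", "w0", "w2")
print(ok, bad, wild)
```

An operator has both a 'to' and a 'from' function space, so the check has to compare two spaces per operator argument, whereas a field has only one.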

[TBD] Use algorithm field name analysis (see functionality) to determine how many more field function spaces would be known at compile time. In particular, look for cases where any_space can be resolved to a discontinuous space. Check whether any of these occur in significant routines and, if so, check for any performance benefit.

[in progress] Add Wolfgang's libxsmm matvec optimisation to PSycloneBench and test the benefits on Skylake (SfP) and Broadwell (Scarf). Also try it out in LFRic, with and without his optimisation that computes the result directly when both ndofs are 1. Report results to the Met Office and TING. Add support in PSyclone if the benefit is worth it.
[TBD] Check Sergi's matvec optimisations: a) code restructuring (k-inner), b) data-layout re-ordering, c) block-colouring and d) kernel constants on matrix-vector. Confirm the previous Skylake results using SfP. Check the Broadwell results on Scarf. Report results to the Met Office and TING.
[TBD] Add a transformation to support multiple specialised versions of matvec that are selected at runtime, allowing them to use the Kernel Constant optimisations. Test on the matvec kernel and on the full code on Skylake (SfP). Report results to the Met Office and TING.
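
The idea of selecting a specialised matvec version at runtime can be sketched as follows. This is a plain-Python illustration under stated assumptions, not PSyclone output or LFRic kernel code: all function names and the lookup table are hypothetical, and the local matrix is modelled as a simple list of rows.

```python
# Hypothetical illustration only: names and data layout are assumptions.
def matvec_generic(matrix, x, ndf1, ndf2):
    """General case: y = A x for a dense ndf1 x ndf2 local matrix."""
    return [sum(matrix[i][j] * x[j] for j in range(ndf2)) for i in range(ndf1)]


def matvec_1x1(matrix, x, ndf1, ndf2):
    """Specialised case: both ndofs are 1, so compute the product directly."""
    return [matrix[0][0] * x[0]]


# Specialised versions keyed on the (ndf1, ndf2) sizes they were built for.
SPECIALISED = {(1, 1): matvec_1x1}


def matvec(matrix, x):
    """Dispatch to a specialised version if one exists, else the generic one."""
    ndf1, ndf2 = len(matrix), len(x)
    kernel = SPECIALISED.get((ndf1, ndf2), matvec_generic)
    return kernel(matrix, x, ndf1, ndf2)


small = matvec([[2.0]], [3.0])
general = matvec([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
print(small, general)
```

In a compiled setting the benefit would come from each specialised version having its loop bounds fixed as constants; the dispatch itself only needs to happen once per kernel call.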

[TBD] Recreate the matvec benchmark examples from Alan Gray, test their performance on Glados and add them to the PSycloneBench repo. This will give us baseline performance results for a hand-tuned version of the matvec benchmark.
[TBD] Make use of the existing PSyclone OpenACC transformations (Kernels, parallel, region, loop) to generate a naive OpenACC version of the matvec benchmark. Get it working correctly and extend PSyclone if necessary.
[TBD] Add transformations to PSyclone to re-create the optimised version by Alan Gray. Test performance as we go along.
[TBD] Try out additional code modifications manually. In particular, 1) test restructured and data-reordered versions of matvec and 2) remove colouring, replacing it with locks.
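
For context on item 2), the sketch below shows the role that colouring plays: cells that share a degree of freedom are given different colours, so all cells of one colour can be updated in parallel without write conflicts. It is a generic greedy colouring in plain Python, not the scheme used by LFRic or PSyclone, and the function name and mesh representation are hypothetical.

```python
# Hypothetical illustration only: a generic greedy colouring, not LFRic's.
def colour_cells(cell_dofs):
    """Assign each cell the lowest colour not used by an earlier cell
    that shares at least one dof with it.

    cell_dofs: list of sets, one set of dof indices per cell.
    """
    colours = []
    for i, dofs in enumerate(cell_dofs):
        used = {colours[j] for j in range(i) if dofs & cell_dofs[j]}
        colour = 0
        while colour in used:
            colour += 1
        colours.append(colour)
    return colours


# A 1D strip of four cells, each sharing one dof with its neighbour.
cells = [{0, 1}, {1, 2}, {2, 3}, {3, 4}]
colours = colour_cells(cells)
print(colours)
```

Replacing colouring with locks would instead let cells be processed in any order and protect each update to a shared dof individually, trading the per-colour synchronisation for locking overhead.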