Conference call notes 20211013

Kenneth Hoste edited this page Oct 26, 2021 · 5 revisions

Notes on the 183rd EasyBuild conference call, Wednesday Oct 13th 2021 (15:00 UTC)

Attendees

Alphabetical list of attendees (10):

  • Sebastian Achilles (Jülich Supercomputing Centre, Germany)
  • Simon Branford (Univ. of Birmingham, UK)
  • Jasper Grimm (University of York, UK)
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Kurt Lust (Univ. of Antwerp, Belgium + LUMI User Support Team)
  • Robert Mijakovic (LuxProvide)
  • Mikael Öhman (Chalmers University of Technology, Sweden)
  • Bart Oldeman (Compute Canada)
  • Åke Sandgren (Umeå University, Sweden)
  • Alexandre Strube (Jülich Supercomputing Centre, Germany)

Agenda

  • overview of recent developments
  • support for using Intel SVML in numpy
  • broken PMIx detection in OpenMPI 4.x
  • Perl-minimal as wrapper around system Perl in the style of OpenSSL?
  • Q&A

Recent developments

  • release timeline
    • latest release: EasyBuild v4.4.2 (Sept 7th 2021)
    • next release
  • recent changes
    • framework
      • bug fixes
        • fix copy_file so it doesn't fail when copying a symbolic link if the target path is an existing directory (PR #3855)
      • enhancements
        • add support for collecting GPU info (via nvidia-smi), and include it in --show-system-info and test report (PR #3851)
      • changes
        • ...
    • easyblocks
      • bug fixes
        • fix installation of libcp2k for recent CP2K versions (PR #2585)
      • enhancements
        • update MotionCor2 easyblock to consider two locations for built binary (PR #2541 + PR #2598)
        • enhance FlexiBLAS easyblock to support building with Intel MKL (imkl) as backend (PR #2588)
          • related easyconfigs PR: #14082
          • requires pulling imkl to compiler-only toolchain, to avoid having to use an MPI toolchain for FlexiBLAS...
          • add separate easyconfig for MKL FFTW wrappers (imkl-fftw?)
          • is it worth renaming imkl to mkl or intel-mkl?
            • is possible, but rather painful
          • imkl should only be added as a dependency for FlexiBLAS on x86_64
            • can be done via arch= in dependency version
        • update MATLAB easyblock for 2021b (jre is no longer included) (PR #2589)
        • update OpenCV easyblock to detect GTK3 and GTK2 dependencies (next to GTK+) (PR #2591)
        • enhance COMSOL easyblock to change permission on build directory during extract step (to allow using patches) (PR #2598)
      • new easyblocks
        • (none)
      • changes
        • (none)
    • easyconfigs
      • ~25 easyconfig PRs merged since last conf call
      • bug fixes
        • also add location to MPI startup tests to $PATH in OSU-Micro-Benchmarks easyconfigs (PR #14126)
      • enhancements
        • ...
      • new software
        • ...
      • noteworthy software updates
      • changes
        • ...
  • to merge/fix/tackle soon
    • framework
      • reported bugs / bug fixes
        • sources for extensions are still downloaded with --module-only (issue #3849)
      • enhancements
        • use separate progress bars for different aspects of the installations being performed (WIP) (PR #3844)
      • changes
        • ...
    • easyblocks
      • reported bugs / bug fixes
        • restore RPATH wrappers for OpenMPI sanity check (WIP) (PR #2582)
        • avoid that path to CUDA install directory is added to $PATH (PR #2593)
      • enhancements
        • enhance GCC easyblock to add support for AMD GPU offloading (PR #2578)
      • changes
        • don't use --config=mkl for TensorFlow 2.4+ (PR #2583)
          • cfr. reported performance problems for CPU-only TensorFlow installations (issue #2577), which can be worked around via export OMP_NUM_THREADS=1
          • blocked by broken TensorFlow tests when not using --config=mkl (see https://github.com/tensorflow/tensorflow/issues/52151)
          • should we make not using --config=mkl opt-in for now, so we can switch to it for selected (latest) TensorFlow versions?
      • new software
        • (nothing major?)
    • easyconfigs
      • bug reports & fixes
        • remove superfluous -DCMAKE_BUILD_TYPE (PR #13384)
        • TensorFlow tf.matmul ends up using CPU backend for 32-bit floats (issue #14120)
          • low/wrong performance for matrix multiplication with certain data types (32-bit float, probably also 16-bit)
          • TensorFlow seems to favor MKL on CPU over GPU, unclear why...
          • should we auto-disable use of mkl-dnn when building on GPU?
          • working fine in TensorFlow 2.6.0 (check changelog?)
      • enhancements
        • Add CI check for CMAKE_BUILD_TYPE (PR #14008)
      • changes
        • update to UCX 1.11.2 as dependency for OpenMPI 4.1.1 (PR #14090)
      • new software
      • noteworthy software updates
        • SciPy-bundle with intel/2021a (PR #12964)
          • need to look into handful of failing tests...
        • intel/2021.09 (PR #14085)
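
The idea noted above of adding imkl as a FlexiBLAS dependency only on x86_64, via arch= in the dependency version, could look roughly like the sketch below. This is plain Python written for illustration, not EasyBuild's own implementation: the helper names (pick_version, resolve_dependencies), the "arch=*" fallback key and the version strings are all assumptions.

```python
ARCH_PREFIX = "arch="  # assumed key prefix, following the "arch=" wording in the notes


def pick_version(version_spec, arch):
    """Return the version to use on `arch`, or None if the dependency should be skipped."""
    # A plain string applies to every architecture.
    if not isinstance(version_spec, dict):
        return version_spec
    # Prefer an exact architecture match.
    key = ARCH_PREFIX + arch
    if key in version_spec:
        return version_spec[key]
    # Otherwise use an optional catch-all entry (hypothetical "arch=*" key).
    return version_spec.get(ARCH_PREFIX + "*")


def resolve_dependencies(deps, arch):
    """Keep only the dependencies that have a version for `arch`."""
    resolved = []
    for name, version_spec in deps:
        version = pick_version(version_spec, arch)
        if version:
            resolved.append((name, version))
    return resolved


# Placeholder dependency list; the version strings are illustrative only.
flexiblas_deps = [
    ("OpenBLAS", "0.3.15"),
    ("imkl", {"arch=x86_64": "2021.4.0"}),
]

print(resolve_dependencies(flexiblas_deps, "x86_64"))
print(resolve_dependencies(flexiblas_deps, "aarch64"))
```

With this layout imkl is listed only once, and architectures without a matching entry simply do not get the dependency.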

Common toolchains

2021b (WIP!)

  • for now: foss/2021.07 and intel/2021.09 (candidates for 2021b after testing confirms they work well)
    • foss/2021.07: included with EasyBuild v4.4.2 release
    • intel/2021.09: WIP at PR #14085
      • includes intel-compilers 2021.4 release, which supports GCC 11.2
      • includes impi 2021.4 on top of UCX 1.11.2
  • PR #14090 (+ PR #14091) suggests bumping UCX to 1.11.2 (from 1.11.0) for the OpenMPI involved in foss/2021.07
  • failing tests for SciPy-bundle with foss/2021.07 (PR #13789)
  • toolchain working group to follow up on this (?)

Support for using Intel SVML in numpy

Broken PMIx detection in OpenMPI 4.x

Perl-minimal as wrapper around system Perl in the style of OpenSSL?

  • only relevant for stuff building with system toolchain
  • assume that necessary Perl modules are available
  • only symlink bin/perl to system Perl?
  • fall back to building Perl from source
  • should be hidden
  • Kurt: other option could be to add a module that should be installed if OS dependency is missing
  • we should hide more build dependencies like binutils with system, Python-bare, etc.
    • only going forward, since there would be impact for existing stuff
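
The "symlink bin/perl to system Perl, else fall back to building from source" approach discussed above could be sketched as follows. This is only an illustration of the idea, not an existing easyblock: the function name link_system_perl and its search_path parameter are hypothetical, and the fallback is represented by returning None.

```python
import os
import shutil
import tempfile


def link_system_perl(install_dir, search_path=None):
    """Create <install_dir>/bin/perl as a symlink to the system Perl.

    Returns the symlink path, or None if no Perl is found on the search path,
    in which case the caller would fall back to building Perl from source.
    """
    # search_path=None makes shutil.which look in $PATH.
    system_perl = shutil.which("perl", path=search_path)
    if system_perl is None:
        return None

    bin_dir = os.path.join(install_dir, "bin")
    os.makedirs(bin_dir, exist_ok=True)

    link_path = os.path.join(bin_dir, "perl")
    # Replace a stale link or file left behind by a previous installation.
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(os.path.realpath(system_perl), link_path)
    return link_path


if __name__ == "__main__":
    install_dir = tempfile.mkdtemp()
    link = link_system_perl(install_dir)
    if link is None:
        print("no system Perl found, would build Perl from source instead")
    else:
        print("%s -> %s" % (link, os.path.realpath(link)))
```

Only bin/perl is linked here; whether the necessary Perl modules are available on the system is assumed, as in the notes.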

Q&A

  • Anyone tried to build TensorFlow on top of ROCm with EasyBuild?
    • Kurt: not yet in LUMI...