Release Version 19.12.0 (December 31, 2019) · StanfordLegion/legion

Build
- Both builds (Make and CMake) now generate legion_defines.h and
  realm_defines.h. By default these headers are generated in
  the source directory (Make) or build directory (CMake). This
  means that languages such as Regent and Python no longer
  require MAX_DIM to be specified explicitly.
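
As a rough illustration of why a generated header removes the need to pass MAX_DIM by hand, a language binding could read the value back out of the header. The sketch below is only a minimal model: the header text, the `USE_CUDA` line, and the helper names `parse_defines` and `read_max_dim` are all hypothetical, and the real contents of legion_defines.h may differ.

```python
import re

# Hypothetical header contents, for illustration only. The real generated
# legion_defines.h may use different macro names and layout.
SAMPLE_HEADER = """\
/* generated file (illustrative contents only) */
#define MAX_DIM 3
#define USE_CUDA
"""

# Matches "#define NAME" with an optional trailing value.
_DEFINE = re.compile(r"^\s*#\s*define\s+(\w+)(?:\s+(.*?))?\s*$")


def parse_defines(text):
    """Return {macro_name: value_string} for every #define line in text."""
    defines = {}
    for line in text.splitlines():
        match = _DEFINE.match(line)
        if match:
            name, value = match.group(1), match.group(2)
            defines[name] = value if value else ""
    return defines


def read_max_dim(text):
    """Return MAX_DIM as an integer, as a binding might at build time."""
    return int(parse_defines(text)["MAX_DIM"])


if __name__ == "__main__":
    print(read_max_dim(SAMPLE_HEADER))
```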
Regent
- Support for CUDA 10
- Support for field polymorphic tasks
- Substantially improved the generality of the index launch
  optimization. Task arguments of the form p[i+k] may now be
  used, where k is a variable defined outside of the loop.
- Added the flag -foverride-demand-index-launch, which can be used to
  force loops to be index launched in cases where the compiler
  cannot prove the disjointness of read-write region
  arguments
- Added reductions for complex64
- The scripts install.py and setup_env.py now use CMake to
  build Terra by default, which should improve portability on
  most machines
- The behavior of -fcuda 1 has changed: this flag will now issue
  an error if CUDA cannot be enabled (e.g. because the build
  does not support CUDA, or because the machine has no
  GPUs). Omitting this flag will now enable CUDA if it is
  available (and will not error if it is not available).
  The behavior of -fopenmp 1 has changed similarly.
- The behavior of __demand(__cuda) has changed. This will now
  issue an error if a loop is not eligible for the CUDA
  transformation, regardless of whether CUDA is actually
  available on the current machine or not. The behavior of
  __demand(__openmp) has changed similarly.
- The annotation __allow(__cuda) is now permitted, and permits
  (but does not require) tasks to be optimized with CUDA.
- Experimental support for 2D kernel launch in the CUDA code generation
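
The new -fcuda and -fopenmp behavior described above amounts to a small decision table. The following Python sketch models it under stated assumptions: `resolve_backend` is a hypothetical helper, not part of Regent, and the treatment of a flag value of 0 as "disabled" is an assumption, since these notes only describe the value 1 and the omitted case.

```python
def resolve_backend(flag, available):
    """Decide whether a backend (e.g. CUDA or OpenMP) ends up enabled.

    flag:      None if the flag was omitted, otherwise 0 or 1.
    available: True if the build and the machine support the backend.
    """
    if flag is None:
        # Flag omitted: enable when available, never raise an error.
        return available
    if flag == 1:
        # Flag set to 1: the backend is required, so failure is an error.
        if not available:
            raise RuntimeError("backend requested but cannot be enabled")
        return True
    if flag == 0:
        # Assumption: 0 disables the backend (not stated in these notes).
        return False
    raise ValueError("flag must be None, 0, or 1")


if __name__ == "__main__":
    print(resolve_backend(None, available=False))
    print(resolve_backend(1, available=True))
```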
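
To see why an argument of the form p[i+k] is a safe candidate for index launching, note that for a fixed k, distinct iterations i select distinct subregions of p. The sketch below checks that property by brute force for a given loop range. It is only an illustration of the disjointness condition, not the analysis the Regent compiler performs, and `launches_are_disjoint` is a hypothetical name.

```python
def launches_are_disjoint(indices, projection):
    """Return True if no two loop iterations select the same subregion.

    indices:    iterable of loop index values.
    projection: function mapping a loop index to a subregion color.
    """
    seen = set()
    for i in indices:
        color = projection(i)
        if color in seen:
            return False
        seen.add(color)
    return True


if __name__ == "__main__":
    k = 4  # defined outside the loop, so it is the same for every iteration
    print(launches_are_disjoint(range(8), lambda i: i + k))
    # A projection that maps two iterations to the same subregion fails.
    print(launches_are_disjoint(range(8), lambda i: i // 2))
```

For projections where a static proof is out of reach, the notes above describe -foverride-demand-index-launch as the way to force the index launch anyway.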
Python
- Add support for copies
- Copies and fills now support multiple fields
- Tasks (including index launches) now support setting the mapper
  ID and tag
Legion
- A major overhaul of the Legion physical analysis to use an
  approach based on bounding volume hierarchies. The change is
  not visible to users, but will likely impact performance. Most
  programs will get faster; programs that create many partitions
  frequently on the fly may get slower. The latter case will be fixed
  in an upcoming release.
- Added support for indirect copy operations such as gather and
  scatter to existing copy launchers
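
For readers unfamiliar with the terms, gather and scatter differ in which side of the copy is indexed indirectly. The sketch below shows the semantics on plain Python lists. It does not use the Legion API, and the function names are illustrative only.

```python
def gather(src, indices):
    """Gather: dst[i] = src[indices[i]] (indirection on the source)."""
    return [src[j] for j in indices]


def scatter(src, indices, dst):
    """Scatter: dst[indices[i]] = src[i] (indirection on the destination).

    Returns a new list and leaves dst unmodified.
    """
    out = list(dst)
    for value, j in zip(src, indices):
        out[j] = value
    return out


if __name__ == "__main__":
    print(gather([10, 20, 30, 40], [3, 0, 2]))
    print(scatter([1, 2, 3], [2, 0, 1], [0, 0, 0]))
```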
Realm
- Event::subscribe allows polling via Event::has_triggered to
  (eventually) succeed
- Addition of CompletionQueue objects that allow multiple unordered
  Event triggers to be efficiently handled by a single consumer
- Support for omp_get_level, omp_in_parallel, and
  omp_set_num_threads in tasks running on OpenMP processors
- Support for unstructured scatter and/or gather in copies. (Handling
  structured cases as well as fills/reductions remains a work in
  progress.)
- Removed all calls to Event::wait from inside other Realm API calls.
  Applications now must make sure that index spaces and instance
  metadata are valid before use. For details, see: #465
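
The idea behind a completion queue, where one consumer handles many events that trigger in no particular order, can be modeled with a thread-safe queue. The sketch below is a conceptual stand-in written with the Python standard library. `ToyCompletionQueue` and its methods are hypothetical and do not reflect the actual Realm CompletionQueue interface.

```python
import queue
import threading


class ToyCompletionQueue:
    """Conceptual model only: finished event IDs are pushed onto a queue."""

    def __init__(self):
        self._done = queue.Queue()

    def add_event(self, event_id, work):
        """Run work() on a thread and report event_id once it finishes."""
        def run():
            work()
            self._done.put(event_id)

        thread = threading.Thread(target=run)
        thread.start()
        return thread

    def pop(self, timeout=5.0):
        """Block until some event has completed and return its ID."""
        return self._done.get(timeout=timeout)


def drain(cq, count):
    """Single consumer: collect count completions in the order they occur."""
    return [cq.pop() for _ in range(count)]


if __name__ == "__main__":
    cq = ToyCompletionQueue()
    threads = [cq.add_event(i, lambda: None) for i in range(4)]
    finished = drain(cq, 4)
    for thread in threads:
        thread.join()
    # Completion order is not deterministic, so only the set is stable.
    print(sorted(finished))
```

The consumer never waits on any one specific event. It simply takes whichever completion arrives next, which is the property the CompletionQueue entry above describes.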
