Releases: nv-legate/legate

v24.11.01

07 Dec 04:10
409967c

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.11/eula.pdf.

Linux x86 and ARM conda packages with multi-node support (based on UCX or GASNet) are available at https://anaconda.org/legate/legate (GASNet-based packages are under the gex label).

Documentation for this release can be found at https://docs.nvidia.com/legate/24.11/.

New features

  • Bug fixes for release 24.11.00

v24.11.00

17 Nov 00:49
c624a46

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.11/eula.pdf.

Linux x86 and ARM conda packages with multi-node support (based on UCX or GASNet) are available at https://anaconda.org/legate/legate (GASNet-based packages are under the gex label).

Documentation for this release can be found at https://docs.nvidia.com/legate/24.11/.

New features

  • Provide an MPI wrapper that users can compile against their local MPI installation and integrate with an existing build of Legate. This is useful when a user needs an MPI installation different from the one Legate was compiled against.
  • Add support for using GASNet as the networking backend, which is useful on platforms not currently supported by UCX, e.g. Slingshot11. Provide scripts for users to compile GASNet on their local machine and integrate it with an existing build of Legate.
  • Automatic machine configuration: Legate now detects the available hardware resources at startup and no longer needs to be given information such as the amount of memory to allocate.
  • Print more information on what data is taking up memory when Legate encounters an out-of-memory error.
  • Support scalar parameters, default arguments and reduction privileges in Python tasks.
  • Add support for a concurrent_task_barrier, useful in preventing NCCL deadlocks.
  • Allow tasks to specify that CUDA context synchronization at task exit can be skipped, reducing latency.
  • Experimental support for distributed hdf5 and zarr I/O.
  • Experimental support for single-CPU/GPU fast-path task execution (skipping the tasking runtime dependency analysis).
  • Experimental implementation of a "bloated" instance prefetching API, which instructs the runtime to create instances encompassing multiple slices of a store ahead of time, potentially reducing intermediate memory usage.

Known issues

The GPUDirectStorage backend of the hdf5 I/O module (off by default, and enabled with LEGATE_IO_USE_VFD_GDS=1) is not currently working (enabling it will result in a crash). We are working on a fix.
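Until a fix is available, a launcher script could check for this setting before starting a job. The following is a minimal sketch and not part of Legate: the helper name `gds_vfd_enabled` is hypothetical, and it assumes only what the note above states, namely that the value `1` enables the backend.

```python
import os


def gds_vfd_enabled(env=None):
    """Return True if LEGATE_IO_USE_VFD_GDS is set to "1" in `env`.

    `env` defaults to the current process environment. Only the exact
    value "1" is treated as enabled, matching the release note; how
    Legate itself parses other values is not assumed here.
    """
    env = os.environ if env is None else env
    return env.get("LEGATE_IO_USE_VFD_GDS", "").strip() == "1"
```

A wrapper could call `gds_vfd_enabled()` and exit with a clear message, or unset the variable, rather than letting the hdf5 I/O module crash later.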

Legate's auto-configuration heuristics will attempt to split CPU cores and system memory evenly across all instantiated OpenMP processors, not accounting for the actual core count and memory limits of each NUMA domain. In cases where the number of OpenMP groups does not evenly divide the number of NUMA domains, this bug may cause unsatisfiable core and memory allocations, resulting in error messages such as:

  • not enough cores in NUMA domain 0 (72 < 284)
  • reservation ('OMP0 proc 1d00000000000005 (worker 8)') cannot be satisfied
  • insufficient memory in NUMA node 4 (102533955584 > 102005473280 bytes) - skipping allocation

These issues should only affect performance if you are actually running computations on the OpenMP cores (rather than using the GPUs for computation). You can always adjust the automatically derived configuration values through LEGATE_CONFIG; see https://docs.nvidia.com/legate/latest/usage.html#resource-allocation.
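To illustrate the arithmetic behind such errors, consider a hypothetical machine with four NUMA domains of 72 cores each. The sketch below is not Legate's actual heuristic. It assumes that each OpenMP group must fit inside a single NUMA domain, and the function names as well as the `--omps` and `--ompthreads` flag names in the override string are assumptions that should be checked against the usage page linked above.

```python
def even_split(cores_per_domain, num_omp_groups):
    """Split all cores evenly across OpenMP groups.

    Returns (cores_per_group, fits), where `fits` is True only if one
    group's share is no larger than the smallest NUMA domain.
    """
    if num_omp_groups <= 0:
        raise ValueError("num_omp_groups must be positive")
    per_group = sum(cores_per_domain) // num_omp_groups
    return per_group, per_group <= min(cores_per_domain)


def omp_override(num_omp_groups, threads_per_group):
    """Format a candidate LEGATE_CONFIG value (flag names are assumed)."""
    return f"--omps {num_omp_groups} --ompthreads {threads_per_group}"


# Hypothetical machine: 4 NUMA domains with 72 cores each.
domains = [72, 72, 72, 72]

# Three groups do not line up with four domains: each asks for 96 cores.
print(even_split(domains, 3))  # (96, False)

# One group per NUMA domain keeps every request satisfiable.
per_group, fits = even_split(domains, len(domains))
print(per_group, fits)  # 72 True
if fits:
    print(omp_override(len(domains), per_group))  # --omps 4 --ompthreads 72
```

A string like the one printed last could then be exported as `LEGATE_CONFIG` before launching the program, if the flag names match those in the documentation.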

v24.06.01

10 Sep 20:11
65a6acf

This is a patch release, and includes the following fixes:

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.06/eula.pdf. x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/legate-core.

Documentation for this release can be found at https://docs.nvidia.com/legate/24.06/.

v24.06.00

03 Jul 21:45
65a6acf

This release re-implements the Legate API in C++, which significantly reduces the overhead of the control code. This release also introduces the following major features:

  • As a result of the C++ re-implementation of the API, an entire Legate program can now be written in C++ (previously the control code had to be written in Python).
  • The Legate Array API, which extends Legate Stores with support for struct-type and nullable containers, and even containers of variable-length elements (e.g. string containers and sparse array representations).
  • An implementation of STL algorithms based on the Legate API, which allows users to easily express common parallelism patterns without needing to write custom tasks.
  • Support for writing leaf tasks in Python (previously only leaf task implementations in C++ were supported)
  • Integration with NSight Systems (initial support)

This release bumps the minimum supported CUDA version to 12.0.

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.06/eula.pdf. x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/legate-core.

Documentation for this release can be found at https://docs.nvidia.com/legate/24.06/.

v23.11.00

17 Nov 23:49
fd45636

This release focuses on bugfixes and documentation improvements, in particular a formally documented support matrix.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🐛 Bug Fixes

  • Avoid gc infinite loop at runtime destruction time by @manopapad in #842
  • Add missing 12.0 CUDA libraries to env generation script by @manopapad in #850
  • Set Mypy version downloaded in CI by @Jacobfaib in #859
  • Remove numpy from conda build dependencies. by @bdice in #855
  • Control ucx presence in install_info more carefully by @bryevdv in #882

Full Changelog: v23.09.00...v23.11.00

v23.09.00

03 Oct 15:24
21ea7b3

This release includes a number of bug fixes for multi-process execution, and quality-of-life improvements to the build system and driver script.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

📖 Documentation

  • Update CUDA Toolkit version in documentation by @ipdemes in #822

🐛 Bug Fixes

  • Pre-seed random number generators deterministically, to guard against control replication violations by @ipdemes in #809
  • Enable shard-local future creation for IO by @ipdemes in #835
  • Respect user-supplied PYTHONPATH by @bryevdv in #836
  • Use unordered detach operations by @ipdemes in #823
  • Fix oversubscription support in sharding functors by @ipdemes in #819
  • Respect the type of passed storage in create_store by @manopapad in #834
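The first fix in this list relates to control replication, where every shard runs the same control code and is expected to make identical decisions. As a rough illustration in plain Python (not Legate's implementation; `FIXED_SEED` and `shard_decisions` are hypothetical names), seeding each shard's generator with the same value keeps random-driven decisions identical across shards:

```python
import random

FIXED_SEED = 1234  # hypothetical constant shared by all shards


def shard_decisions(shard_id, seed, n=5):
    """Return the first `n` random draws a shard would make.

    `shard_id` is deliberately unused for seeding: the generator depends
    only on `seed`, so every shard sees the same sequence.
    """
    rng = random.Random(seed)
    return [rng.randrange(100) for _ in range(n)]


# All four simulated shards draw the same sequence from the shared seed.
sequences = {tuple(shard_decisions(s, FIXED_SEED)) for s in range(4)}
assert len(sequences) == 1
```

If each shard instead seeded its generator from a shard-specific source, such as the clock or its own ID, the sequences could diverge and the shards could issue different operations.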

Full Changelog: v23.07.00...v23.09.00

v23.07.00

25 Jul 04:51
2b91db4

This release introduces support for resource scoping annotations, which allow parts of the program to be assigned to a subset of the available processors/GPUs. This release also includes more examples of writing Legate libraries, improved logging and safety checks, and a refactoring of legate.core's internals.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

Full Changelog: v23.03.00...v23.07.00

v23.03.00

15 Mar 20:02
5de57a8

This is the beta release of Legate Core.

This release focuses on making it easier for developers to get started building libraries on top of Legate Core, including features like updated API documentation, helper CMake functions for bootstrapping new Legate library projects, and a new "Hello World" library example that demonstrates the use of fundamental Legate API calls.

This release also adds support for using the standard Python interpreter for running Legate programs (in addition to using the custom legate driver script).

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🚀 New Features

  • Default python interpreter support for Legate by @eddy16112 in #539
  • Build helper functions for legate projects, legate-hello example by @jjwilke in #571

Full Changelog: v23.01.00...v23.03.00

v23.01.00

31 Jan 03:38
e05983c

This release adds initial support for using the UCX Realm networking backend (for more efficient multi-node communication) and using Legion's new "collective views" feature (for improved scheduling of reduction operations). Both of these features are currently in preview mode, and not enabled by default. They are planned to become the default by the next release, following further verification and tuning.

This release improves the build experience for developers, with fixes to corner cases in the cmake configuration, a rewrite of the build documentation, and a script for generating complete conda environments for development, covering all supported platforms.

This release also introduces improvements in user interface (improved jupyter support, more CLI options for debugging and profiling), memory usage (through better instance management in the mapper) and the Legate programming model (allowing libraries to add custom profiler annotations, and use arbitrary communicator libraries in their tasks).

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🐛 Bug Fixes

  • Legion bug WAR: don't instantiate futures on framebuffer by @manopapad in #413
  • Handle conflicts for library-level args by @bryevdv in #416
  • Fix Transform class hierarchy by @manopapad in #427
  • Handle scalar outputs correctly in manual tasks by @magnatelee in #432
  • Explicitly build Legion if legion_dir or legion_src_dir is not provided by @trxcllnt in #411
  • Fix GPU shard computation by @bryevdv in #433
  • Only set default CMake generator if Ninja is available: Issue #374 by @jjwilke in #379
  • Fix an issue with editable installs by @bryevdv in #434
  • Allow only one of --legion-dir and --legion-src-dir by @jjwilke in #387
  • legate/util: fix a mypy error on MacOS by @rohany in #438
  • Improvements to legate.jupyter by @bryevdv in #425
  • Fix for cunumeric#668 by @manopapad in #453
  • Only keep traceback reprs, to avoid cycles by @bryevdv in #447
  • Fix returned legion paths for editable install with separate legion b… by @jjwilke in #442
  • Make install.py reconfigure editable installs when build type changes by @trxcllnt in #455
  • fix for -ll:networks none, we will init MPI if it has not been initialized by @eddy16112 in #465
  • Construct region-backed 0D stores in a correct way by @magnatelee in #450
  • Pass a sufficiently high default value for gasnet's ibv-max-hcas by @manopapad in #477
  • Make overlap check tight by @manopapad in #479
  • Conda env script fixes by @manopapad in #481
  • Fix some typos by @manopapad in #485
  • fix several reference cycle / leak related bugs by @rohany in #488
  • legate/core: fix FutureMap leak in communicator shutdown by @rohany in #495
  • src/core/mapping: adjust indirect copy mapping for GPUs by @rohany in #499
  • Don't access stream pools unless we're on GPUs by @magnatelee in #503
  • Update env gen script so OS type works for mac by @m3vaz in #523
  • Don't check for collective behavior when we have WRITE privilege by @manopapad in #526
  • All NCCL ranks on the same node must get the same NCCL_IB_HCA by @manopapad in #528
  • legate/core/_legion: add default new argument to dep part functions by @rohany in #527
  • Don't turn on Legate debug checks on debug-rel builds by @manopapad in #533
  • src/core: guard against missing projection functors in collective check by @rohany in #534
  • Erase cached reduction instances that cannot be acquired by @magnatelee in #536

New Contributors

  • @robinw0928 made their first contribution in #431
  • @SeyedMir made their first contribution in #516

Full Changelog: v22.10.00...v23.01.00

v22.10.00

13 Oct 22:30
c50fdd4

Release 22.10 contains several improvements to memory management, which recycle memory from garbage-collected Legate stores more eagerly so that new stores can reuse it. Another big change in this release is a new build infrastructure for the Legate ecosystem, based on CMake and scikit-build, which is a big leap over the previous ad-hoc build system. The release also includes two useful debugging features: 1) provenance tracking for tasks and other operator kinds issued by client libraries and 2) detailed logging for client library mappers.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

Full Changelog: v22.08.00...v22.10.00