release: v1.13.0
Signed-off-by: Nicholas Sielicki <[email protected]>
aws-nslick committed Nov 19, 2024
1 parent 0673a92 commit cf7606e
Showing 2 changed files with 51 additions and 7 deletions.
56 changes: 50 additions & 6 deletions RELEASENOTES.md
@@ -1,7 +1,51 @@
-This file is a placeholder on the primary development branch of the
-OFI NCCL Plugin so that "make dist" works properly. Release branches
-will have an accurate release history in this location, and each
-release tarball will also have up to date release notes.
# AWS OFI NCCL Release notes

-If you're looking for Plugin releases, please see the [Releases
-Page](https://github.com/aws/aws-ofi-nccl/releases).
# Supported Distributions

- Amazon Linux 2
- Amazon Linux 2023
- Ubuntu 20.04 LTS and 22.04 LTS

For releases before v1.6.0, we generally created releases from two separate
branches, an AWS-specific branch and a general release branch. With v1.6.0, we
have unified the code into a single branch, and made the AWS-specific parts a
compile-time option. When a feature (or entire release) only supports one of
the two variants, we note that in the release notes.

# v1.13.0-aws (2024-11-18)

This release is intended only for use on AWS P\* instances. A general release
that supports other libfabric networks may be made in the near future.

With this release, building with platform-aws requires
[1.22.0amzn4.0](https://github.com/aws/libfabric/commits/1.22.0amzn4.0/)
or greater. We generally recommend that AWS customers track
[the latest available EFA Installer](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-verify.html)
for performance improvements and bug fixes.
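
The version requirement above can be checked before building. The following is a minimal sketch, not the check that `configure` itself performs: `version_at_least` is a hypothetical helper, and it assumes that GNU `sort -V` orders `amzn`-suffixed version strings the way the requirement intends.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin or of libfabric):
# succeeds when version $1 is greater than or equal to version $2,
# using the version ordering of GNU sort.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Minimum libfabric version named in the release notes for platform-aws.
required="1.22.0amzn4.0"

# Assumption: `fi_info --version` prints a line of the form
# "libfabric: <version>". Uncomment on a host with libfabric installed.
# installed=$(fi_info --version | awk '/^libfabric:/ { print $2 }')
# version_at_least "$installed" "$required" ||
#     echo "libfabric $installed is older than $required" >&2
```

The two versions are sorted and the smaller one is compared against the requirement, so equal versions pass. Treat the result as a hint only, since the build's own checks remain authoritative.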

The 1.13.x release series supports
[NCCL 2.23.4-1](https://github.com/NVIDIA/nccl/releases/tag/v2.23.4-1)
while maintaining backward compatibility with older NCCL versions
([NCCL v2.17.1](https://github.com/NVIDIA/nccl/releases/tag/v2.17.1-1) and later).

New features:

- AWS `P5en` platform support was added.

- Support was added for the NCCL v3 tuner API. The tuner now supports multiple
platforms and multiple collectives.

- Scheduling improvements were made to the plugin RDMA protocol. In multirail
configurations, this is expected to balance traffic better across rails.

- dmabuf memory registration support was added. Users facing problems with
dmabuf may disable dmabuf with `OFI_NCCL_DISABLE_DMABUF=1`.
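
As a sketch of how the dmabuf switch mentioned above might be applied to a single job rather than to the whole shell session: `run_without_dmabuf` is a hypothetical wrapper, and `./my_nccl_app` is a placeholder for your own application.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the given command with dmabuf memory
# registration disabled in the plugin, leaving the calling shell's
# environment untouched.
run_without_dmabuf() {
    env OFI_NCCL_DISABLE_DMABUF=1 "$@"
}

# Placeholder launch line, commented out because the binary is hypothetical:
# run_without_dmabuf ./my_nccl_app
```

When a job is started through a launcher that spawns processes on other hosts, the variable also has to reach those processes. How to forward it depends on the launcher.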

Breaking changes:

- As mentioned above, building with support for platform-aws now requires
libfabric version 1.22.0amzn4.0 or greater.

- Under CUDA, the plugin now statically links the CUDA runtime by default.
Packagers preferring to dynamically link CUDA may pass
`--enable-cudart-dynamic` at configure time to disable this.
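
For packagers, a rough sketch of opting back into dynamic linking and then confirming the result. Only `--enable-cudart-dynamic` comes from the notes above. The helper `links_cudart_dynamically` and the library path are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads ldd-style output on stdin and succeeds
# when a shared libcudart appears among the dependencies.
links_cudart_dynamically() {
    grep -q 'libcudart\.so'
}

# Build steps, commented out so the snippet can be sourced safely.
# Any other configure options your build needs are omitted here.
#   ./configure --enable-cudart-dynamic
#   make
#
# Check the built plugin (the path is an assumption, adjust it to your tree):
#   ldd path/to/libnccl-net.so | links_cudart_dynamically &&
#       echo "CUDA runtime is linked dynamically"
```

With the default static link, no `libcudart.so` entry is expected in the `ldd` output, so the helper returns a non-zero status.
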
2 changes: 1 addition & 1 deletion configure.ac
@@ -6,7 +6,7 @@
#

# Initialization
-AC_INIT([aws-ofi-nccl], [1.13.0pre], [[email protected]], , [http://github.com/aws/aws-ofi-nccl])
+AC_INIT([aws-ofi-nccl], [1.13.0], [[email protected]], , [http://github.com/aws/aws-ofi-nccl])
AC_PREREQ([2.69])
AC_CONFIG_SRCDIR([src/nccl_ofi_net.c])
AC_CONFIG_AUX_DIR([build-aux])