[NCCL] Don't override waitUntilInitialized's setting of `comm->initialized_` (pytorch#136155)

pytorch#133630 sets `initialized_` to `true`, which causes previous wait codepaths to skip necessary waits. See also pytorch#136151.

CC @shuqiangzhang @wconstab

Pull Request resolved: pytorch#136155
Approved by: https://github.com/fduwjj, https://github.com/kwen2501, https://github.com/c-p-i-o, https://github.com/shuqiangzhang
eqy authored and pytorchmergebot committed Sep 17, 2024
1 parent a4e9a1c commit e3aa5e2
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion torch/csrc/distributed/c10d/NCCLUtils.cpp
@@ -84,7 +84,9 @@ std::shared_ptr<NCCLComm> NCCLComm::split(
       std::nullopt);
   ++source->ncclCommSplitCounter_;
   comm->rank_ = rank;
-  comm->initialized_ = true;
+  if (!nccl_use_nonblocking()) {
+    comm->initialized_ = true;
+  }
   return comm;
 }
 #endif
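
The effect of the change can be illustrated with a toy model. The sketch below is not the PyTorch implementation: `ToyComm`, `toySplit`, `pendingPolls`, `pollsDone`, and `applyFix` are hypothetical names, and the body of `waitUntilInitialized` here is an assumed simplification (return early when the flag is set, otherwise poll until ready). It only shows why setting `initialized_` unconditionally in nonblocking mode could let a wait path return before initialization has finished.

```cpp
#include <cassert>

// Hypothetical stand-in for NCCLComm; NOT the real PyTorch class.
struct ToyComm {
  bool initialized_ = false;  // mirrors the flag named in the commit
  int pendingPolls = 0;       // polls left until the simulated async init completes
  int pollsDone = 0;          // number of polls the wait path actually performed

  // Assumed, simplified wait path: skip if already marked initialized,
  // otherwise poll until the simulated init completes, then set the flag.
  void waitUntilInitialized() {
    if (initialized_) {
      return;  // early exit: no waiting happens
    }
    while (pendingPolls > 0) {
      --pendingPolls;
      ++pollsDone;
    }
    initialized_ = true;
  }
};

// Toy split. In nonblocking mode the simulated init is still in flight on return.
// applyFix == false models setting the flag unconditionally (pre-commit behaviour);
// applyFix == true models setting it only in blocking mode (post-commit behaviour).
ToyComm toySplit(bool nonblocking, bool applyFix) {
  ToyComm comm;
  comm.pendingPolls = nonblocking ? 3 : 0;
  if (!applyFix || !nonblocking) {
    comm.initialized_ = true;
  }
  return comm;
}
```

In this model, `toySplit(true, false)` returns a communicator already marked initialized, so a later `waitUntilInitialized()` performs no polls even though work is still pending. With `toySplit(true, true)` the flag stays `false` and the wait path does the polling and sets the flag itself.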
