Point-to-point communication¶
(Since NCCL 2.7)
Point-to-point communication can be used to express any communication pattern between ranks. Any point-to-point communication needs two NCCL calls: a call to ncclSend() on one rank and a corresponding ncclRecv() on the other rank, with the same count and data type.
Multiple calls to ncclSend() and ncclRecv() targeting different peers can be fused together with ncclGroupStart() and ncclGroupEnd() to form more complex communication patterns such as one-to-all (scatter), all-to-one (gather), all-to-all or communication with neighbors in an N-dimensional space.
Point-to-point calls within a group block until that group of calls completes, but calls within a group can be seen as progressing independently, and hence should never block each other. It is therefore important to merge calls that need to progress concurrently into the same group to avoid deadlocks.
Below are a few examples of classic point-to-point communication patterns used by parallel
applications. NCCL semantics allow for all variants with different sizes,
datatypes, and buffers, per rank.
Sendrecv¶
In MPI terms, a sendrecv operation is when two ranks exchange data, both sending and receiving at the same time. This can be done by merging the ncclSend and ncclRecv calls into one group:
ncclGroupStart();
ncclSend(sendbuff, sendcount, sendtype, peer, comm, stream);
ncclRecv(recvbuff, recvcount, recvtype, peer, comm, stream);
ncclGroupEnd();
One-to-all (scatter)¶
A one-to-all operation from a root rank can be expressed by merging all send and receive operations in a group:
ncclGroupStart();
if (rank == root) {
  for (int r=0; r<nranks; r++)
    ncclSend(sendbuff[r], size, type, r, comm, stream);
}
ncclRecv(recvbuff, size, type, root, comm, stream);
ncclGroupEnd();
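The scatter example indexes sendbuff[r] without saying how those per-destination pointers are obtained. One possibility, assuming the root keeps all outgoing data in a single contiguous allocation holding the same number of elements for each rank, is to derive them by pointer arithmetic. The helper below is a hypothetical sketch and not part of NCCL; the names chunk_pointers, base, count and elem_size are introduced here for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an NCCL function): fill ptrs[r] with the address
 * of the r-th chunk of one contiguous buffer.
 *   base      : start of the buffer (assumed to hold nranks * count elements)
 *   count     : number of elements destined for each rank
 *   elem_size : size in bytes of one element of the chosen data type */
static void chunk_pointers(void *base, size_t count, size_t elem_size,
                           int nranks, void **ptrs) {
  assert(base != NULL && ptrs != NULL && nranks > 0);
  for (int r = 0; r < nranks; r++) {
    /* Byte offset of chunk r; the pointer is only offset, never dereferenced. */
    ptrs[r] = (char *)base + (size_t)r * count * elem_size;
  }
}
```

The resulting ptrs[r] could then play the role of sendbuff[r] on the root. Because the helper only offsets the pointer and never dereferences it, it should work equally well when base is a device pointer.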
All-to-one (gather)¶
Similarly, an all-to-one operation to a root rank would be implemented this way:
ncclGroupStart();
if (rank == root) {
  for (int r=0; r<nranks; r++)
    ncclRecv(recvbuff[r], size, type, r, comm, stream);
}
ncclSend(sendbuff, size, type, root, comm, stream);
ncclGroupEnd();
All-to-all¶
An all-to-all operation would be a merged loop of send/recv operations to/from all peers:
ncclGroupStart();
for (int r=0; r<nranks; r++) {
  ncclSend(sendbuff[r], sendcount, sendtype, r, comm, stream);
  ncclRecv(recvbuff[r], recvcount, recvtype, r, comm, stream);
}
ncclGroupEnd();
Neighbor exchange¶
Finally, exchanging data with neighbors in an N-dimensional space could be done with:
ncclGroupStart();
for (int d=0; d<ndims; d++) {
  ncclSend(sendbuff[d], sendcount, sendtype, next[d], comm, stream);
  ncclRecv(recvbuff[d], recvcount, recvtype, prev[d], comm, stream);
}
ncclGroupEnd();
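The neighbor exchange above takes next[d] and prev[d] as given. As a minimal sketch of how they might be filled in, the helper below assumes ranks are laid out in row-major order on a periodic (wrap-around) grid whose extents are stored in dims. That layout is an assumption made for this illustration, not something NCCL prescribes, and grid_neighbors is a hypothetical name, not an NCCL function.

```c
#include <assert.h>

/* Hypothetical helper (not an NCCL function): for a rank placed in row-major
 * order on a periodic grid with extents dims[0..ndims-1], compute the rank
 * one step forward (next[d]) and one step backward (prev[d]) along each
 * dimension d. */
static void grid_neighbors(int rank, int ndims, const int *dims,
                           int *next, int *prev) {
  int stride = 1;
  /* Walk from the fastest-varying (last) dimension to the slowest. */
  for (int d = ndims - 1; d >= 0; d--) {
    assert(dims[d] > 0);
    int coord = (rank / stride) % dims[d];      /* coordinate along d */
    int up = (coord + 1) % dims[d];             /* wraps at the upper edge */
    int down = (coord + dims[d] - 1) % dims[d]; /* wraps at the lower edge */
    next[d] = rank + (up - coord) * stride;
    prev[d] = rank + (down - coord) * stride;
    stride *= dims[d];
  }
}
```

For example, on a 2 x 3 grid, rank 4 sits at coordinates (1, 1); its neighbors along the last dimension are ranks 5 and 3, and along the first dimension both directions wrap around to rank 1.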