pub struct Communicator { /* private fields */ }
Wraps an NCCL communicator and provides a Tensor-based interface to it.
This implements a subset of the c10d::ProcessGroup API.
Implementations§
impl Communicator
pub fn new(
    device: CudaDevice,
    world_size: i32,
    unique_id: UniqueId,
    rank: i32,
) -> Result<Self, NcclError>
Create a new communicator. This function must be called by a different thread/process per rank.
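A minimal setup sketch under stated assumptions: only the Communicator::new signature above comes from this page; UniqueId::new, a Clone impl on UniqueId, and the CudaDevice constructor are hypothetical, and in practice the unique id must be shared with every rank out of band.

use std::thread;

let world_size = 2;
let unique_id = UniqueId::new()?; // hypothetical constructor; share this id with every rank
let handles: Vec<_> = (0..world_size)
    .map(|rank| {
        let unique_id = unique_id.clone(); // assumes UniqueId: Clone
        thread::spawn(move || {
            let device = CudaDevice::new(rank as usize); // hypothetical constructor
            // Each rank creates its own Communicator from its own thread.
            Communicator::new(device, world_size, unique_id, rank)
        })
    })
    .collect();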
pub fn split_all(
    &mut self,
    config: Option<NcclConfig>,
) -> Result<Self, NcclError>
Split off a new communicator from this one, preserving the same world size.
pub fn split_from(
    &mut self,
    ranks: Vec<i32>,
    config: Option<NcclConfig>,
) -> Result<Option<Self>, NcclError>
Split off a new communicator from this one. Only the ranks listed in ranks will be present on the new communicator.
If ranks is empty, ncclCommSplit will be called with NCCL_SPLIT_NOCOLOR. This is useful when ranks excluded from the split don’t even know which ranks will be included.
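A sketch of splitting ranks 0 and 1 into a sub-communicator; comm is an existing Communicator on the calling rank (assumed), and ranks outside the list receive Ok(None).

// Every rank of the parent communicator participates in the split call.
let maybe_sub = comm.split_from(vec![0, 1], None)?;
if let Some(mut sub_comm) = maybe_sub {
    // Only ranks 0 and 1 get a new communicator; run sub-group collectives here.
}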
pub fn all_reduce(
    &mut self,
    tensor: &TensorCell,
    reduce_op: ReduceOp,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Reduce the tensor data across all ranks, with each rank receiving the final result in-place.
See torch.distributed.all_reduce for more detailed documentation.
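A call sketch, assuming an existing comm, a TensorCell holding this rank's data, a CUDA Stream, and a Sum variant on ReduceOp (the variant name is an assumption):

// Element-wise sum across all ranks; every rank sees the result in `tensor` in-place.
comm.all_reduce(&tensor, ReduceOp::Sum, &stream)?;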
pub fn broadcast(
    &mut self,
    tensor: &TensorCell,
    root: i32,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Broadcast the tensor data on the root rank to all the others.
See torch.distributed.broadcast for more detailed documentation.
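A call sketch with the same assumed comm, tensor, and stream:

// Rank 0 supplies the data; every other rank's `tensor` is overwritten with it.
comm.broadcast(&tensor, 0, &stream)?;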
pub fn reduce(
    &mut self,
    tensor: &TensorCell,
    reduce_op: ReduceOp,
    root: i32,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Reduce the tensor data across all ranks, writing the result out to tensor on the root rank.
See torch.distributed.reduce for more detailed documentation.
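A call sketch with the same assumptions as above; unlike all_reduce, only the root rank ends up with the reduced values:

// Sum across all ranks; only rank 0's `tensor` holds the result afterwards.
comm.reduce(&tensor, ReduceOp::Sum, 0, &stream)?;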
pub fn all_gather(
    &mut self,
    output_cells: &[TensorCell],
    input_cell: &TensorCell,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Gather tensors from all ranks into a list of output tensors.
See torch.distributed.all_gather for more detailed documentation.
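A call sketch; outputs is assumed to be a Vec<TensorCell> with one cell per rank, each shaped like input:

// After the call, outputs[r] holds rank r's input tensor on every rank.
comm.all_gather(&outputs, &input, &stream)?;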
pub fn all_gather_into_tensor(
    &mut self,
    output_cell: &TensorCell,
    input_cell: &TensorCell,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Gather tensors from all ranks into a single output tensor.
See torch.distributed.all_gather_into_tensor for more detailed documentation.
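A call sketch; output is assumed to be pre-allocated to hold world_size copies of input concatenated along the first dimension:

// output = [input_rank0, input_rank1, ...] concatenated, identical on every rank.
comm.all_gather_into_tensor(&output, &input, &stream)?;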
pub fn reduce_scatter_tensor(
    &mut self,
    output_cell: &TensorCell,
    input_cell: &TensorCell,
    reduce_op: ReduceOp,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Reduce, then scatter the result to all ranks in the group.
See torch.distributed.reduce_scatter_tensor for more detailed documentation.
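A call sketch; this is the shape inverse of all_gather_into_tensor, so input is assumed to hold world_size chunks and output one chunk (ReduceOp::Sum is an assumed variant name):

// Each rank receives the sum-reduced chunk corresponding to its own rank index.
comm.reduce_scatter_tensor(&output, &input, ReduceOp::Sum, &stream)?;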
pub fn send(
    &mut self,
    tensor_cell: &TensorCell,
    dst: i32,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Send a tensor to the rank dst.
pub fn recv(
    &mut self,
    tensor_cell: &TensorCell,
    src: i32,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Receive a tensor from the rank src.
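A point-to-point sketch between ranks 0 and 1; comm, tensor, stream, and the rank variable are assumed to exist on each rank, and every send must be matched by a recv on the peer:

if rank == 0 {
    comm.send(&tensor, 1, &stream)?; // matched by the recv issued on rank 1
} else if rank == 1 {
    comm.recv(&tensor, 0, &stream)?; // writes rank 0's data into `tensor`
}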
pub fn all_to_all_single(
    &mut self,
    output_cell: &TensorCell,
    input_cell: &TensorCell,
    stream: &Stream,
) -> Result<NcclStatus, NcclError>
Split the input tensor, then scatter the splits to all processes in the group. The received splits are concatenated into the output tensor.
See torch.distributed.all_to_all_single for more detailed documentation.
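A call sketch; input and output are assumed to be pre-allocated with matching total sizes divisible by the world size:

// input is split into world_size chunks along dim 0; chunk r is sent to rank r.
// output receives one chunk from every rank, concatenated in rank order.
comm.all_to_all_single(&output, &input, &stream)?;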
Trait Implementations§
impl Debug for Communicator
impl Send for Communicator
SAFETY: ncclComm_t is okay to access from multiple threads, but each communicator must issue NCCL calls in the same order. It is up to the user to ensure this.
impl Sync for Communicator
SAFETY: ncclComm_t is okay to access from multiple threads, but each communicator must issue NCCL calls in the same order. It is up to the user to ensure this.
Auto Trait Implementations§
impl Freeze for Communicator
impl RefUnwindSafe for Communicator
impl Unpin for Communicator
impl UnwindSafe for Communicator
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> FutureExt for T
fn with_context(self, otel_cx: Context) -> WithContext<Self>
fn with_current_context(self) -> WithContext<Self>
impl<A, M> Handler<IndexedErasedUnbound<M>> for A
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.