A hyperactor-based implementation of a PyTorch worker actor.
The worker is responsible for executing PyTorch operations on a local device. It assumes it has exclusive access to device resources, and manages concurrency internally via device-specific constructs (CUDA stream, threads, etc.).
This is a port of monarch/python/controller/worker.py, but it has gaps due to drift that need to be reconciled. This mainly includes:
- Support for record and replay
- Debugger support
- General drift in existing messages
Modules§
Structs§
- WorkerActor - A PyTorch runtime instance, operating on a single accelerator device, controlled via hyperactor messaging.
Enums§
- AssignRankMessage - Worker messages. These define the observable behavior of the worker, so the documentation here describes the worker's expected behavior.
Traits§
- AssignRankMessageClient - The custom client trait for this message type.
- AssignRankMessageHandler - The custom handler trait for this message type.