Monarch 🦋
Monarch is a distributed programming framework for PyTorch based on scalable actor messaging. It provides:
Remote actors with scalable messaging: Actors are grouped into collections called meshes and messages can be broadcast to all members.
Fault tolerance through supervision trees: Actors and processes form a tree and failures propagate up the tree, providing good default error behavior and enabling fine-grained fault recovery.
Point-to-point RDMA transfers: Cheap registration of any GPU or CPU memory in a process, with one-sided transfers based on libibverbs.
Distributed tensors: Actors can work with tensor objects sharded across processes.
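To make the supervision-tree idea concrete, here is a minimal, self-contained sketch in plain Python. It does not use Monarch and does not reflect Monarch's actual supervision API; the `Node` class, its `on_child_failure` hook, and the node names are all hypothetical, chosen only to illustrate how a failure can travel up a tree until some ancestor decides to handle it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """A process or actor in a toy supervision tree (hypothetical, not Monarch's API)."""

    name: str
    parent: Optional[Node] = None
    # Optional hook: returns True if this node recovers from a descendant's failure.
    on_child_failure: Optional[Callable[[str, Exception], bool]] = None
    log: list = field(default_factory=list)

    def spawn(self, name: str, on_child_failure=None) -> Node:
        """Create a child whose failures are reported to this node."""
        return Node(name, parent=self, on_child_failure=on_child_failure)

    def fail(self, error: Exception) -> str:
        """Propagate a failure upward; return the name of the ancestor that handled it."""
        node = self.parent
        while node is not None:
            node.log.append(f"{self.name} failed: {error}")
            if node.on_child_failure and node.on_child_failure(self.name, error):
                return node.name
            node = node.parent
        # No ancestor handled the failure, so it escapes the root.
        raise RuntimeError(f"unhandled failure in {self.name}") from error


root = Node("root")
# The host recovers from timeouts only; anything else keeps propagating.
host = root.spawn("host", on_child_failure=lambda who, err: isinstance(err, TimeoutError))
trainer = host.spawn("trainer_0")

handled_by = trainer.fail(TimeoutError("step timed out"))
print(handled_by)  # host
```

In this toy model, a node without a handler simply passes the failure to its parent, which is the "good default" behavior, while installing a handler at a chosen level is what allows finer-grained recovery.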
Monarch code imperatively describes how to create processes and actors using a simple Python API:
from monarch.actor import Actor, endpoint, this_host

# spawn 8 trainer processes, one for each GPU
training_procs = this_host().spawn_procs({"gpus": 8})

# define the actor to run on each process
class Trainer(Actor):
    @endpoint
    def train(self, step: int): ...

# create the trainers
trainers = training_procs.spawn("trainers", Trainer)

# tell all the trainers to take a step
fut = trainers.train.call(step=0)

# wait for all trainers to complete
fut.get()
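The call-then-wait pattern above can be modeled without Monarch to show what "broadcast to all members, then gather" means. The following is only a conceptual sketch built on the standard library: `ToyMesh` and `ToyTrainer` are hypothetical stand-ins that run in threads inside a single process, and they make no claim about how Monarch implements meshes, endpoints, or futures.

```python
from __future__ import annotations

from concurrent.futures import Future, ThreadPoolExecutor


class ToyTrainer:
    """Hypothetical stand-in for an actor; one instance per mesh member."""

    def __init__(self, rank: int):
        self.rank = rank

    def train(self, step: int) -> str:
        return f"rank {self.rank} finished step {step}"


class ToyMesh:
    """Hypothetical stand-in for an actor mesh: a fixed collection of members."""

    def __init__(self, actors: list):
        self._actors = actors
        self._pool = ThreadPoolExecutor(max_workers=len(actors))

    def call(self, method: str, **kwargs) -> list[Future]:
        # Send the same message to every member; return one future per member.
        return [self._pool.submit(getattr(a, method), **kwargs) for a in self._actors]

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


mesh = ToyMesh([ToyTrainer(rank) for rank in range(4)])
futures = mesh.call("train", step=0)       # returns immediately
results = [f.result() for f in futures]    # block until every member replies
mesh.shutdown()

print(results[0])  # rank 0 finished step 0
```

The point of the sketch is the shape of the interaction: one call fans out to every member, and the caller decides when to block on the results.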
Note: Monarch is currently only supported on Linux systems.
Getting Started
Here are some suggested steps to get started with Monarch:
Installation: Check out the Install guide to get Monarch installed.
Getting Started: The Getting Started guide provides an introduction to Monarch’s core API.
Explore Examples: Review the Examples to see Monarch in action.
Dive Deeper: Explore the API Documentation for more detailed information:
Deep Understanding of Actors: Gain comprehensive knowledge of Actors, the foundational building blocks of Monarch.
Monitoring Tools: Inspect running meshes with the Admin TUI (terminal) or the Monarch Dashboard (web GUI).
License
Monarch is BSD-3 licensed, as found in the LICENSE file.
Terms of Use: https://opensource.fb.com/legal/terms
Privacy Policy: https://opensource.fb.com/legal/privacy
Community
We welcome contributions from the community! If you’re interested in contributing, please:
Check the GitHub repository
Review existing issues or create a new one
Discuss your proposed changes before starting work
Submit a pull request with your changes