Hyperactor is an actor system intended for managing large scale compute.
§Actor data model
Hyperactor is designed to support large scale (millions of nodes) machine learning workloads where actor topologies communicate through high fanout multicast messaging.
Supporting this scale requires us to impose additional structure at the level of the framework, so that we can efficiently refer to gangs of actors that implement the same worker runtimes.
Similarly, Hyperactor must gang-schedule actors in order to support collective communication between actors.
Hyperactor is organized into a hierarchy, wherein parents manage the lifecycle of their children:
- Each world represents a fixed number of procs, scheduled as a gang.
- Each proc represents a single actor runtime instance, and hosts zero or more actors.
- Actors are spawned into worlds and assigned a global name. Actors spawned in this way are assigned a local PID (pid) of 0. Actors can in turn spawn local actors. These inherit the global name of their parent, but receive a unique pid.
Actors that share a name within a world are called a gang.
This scheme confers several benefits:
- Routing of messages can be performed by prefix. For example, we can route a message to an actor based on the world the actor belongs to; from there, we can identify the proc of the actor and send the message to it, which can then in turn be routed locally.
- We can represent gangs of actors in a uniform and compact way. This is the basis on which we implement efficient multicasting within the system.
| Entity | Identifier |
|---|---|
| World | worldid |
| Proc | worldid[rank] |
| Actor | worldid[rank].name[pid] |
| Gang | worldid.name |
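The identifier scheme in the table above can be sketched with a few plain Rust types. This is a minimal illustration only: the struct layouts, the `root`, `child`, `gang_id` and `routing_prefixes` helpers, and the example world name `train` are assumptions made for this sketch, not the crate's actual `reference::WorldId`, `ProcId`, `ActorId` or `GangId` definitions.

```rust
use std::fmt;

// Hypothetical stand-ins for illustration; not the crate's real types.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct WorldId(String);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ProcId {
    world: WorldId,
    rank: usize,
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ActorId {
    proc_id: ProcId,
    name: String,
    pid: usize,
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct GangId {
    world: WorldId,
    name: String,
}

impl fmt::Display for WorldId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0) // worldid
    }
}

impl fmt::Display for ProcId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}[{}]", self.world, self.rank) // worldid[rank]
    }
}

impl fmt::Display for ActorId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}[{}]", self.proc_id, self.name, self.pid) // worldid[rank].name[pid]
    }
}

impl fmt::Display for GangId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}", self.world, self.name) // worldid.name
    }
}

impl ActorId {
    /// An actor spawned into a world under a global name; its pid is 0.
    fn root(world: &str, rank: usize, name: &str) -> Self {
        ActorId {
            proc_id: ProcId { world: WorldId(world.to_string()), rank },
            name: name.to_string(),
            pid: 0,
        }
    }

    /// A locally spawned actor: same proc and name as its parent, new pid.
    fn child(&self, pid: usize) -> Self {
        ActorId { pid, ..self.clone() }
    }

    /// The gang is identified by world and name only, independent of rank and pid.
    fn gang_id(&self) -> GangId {
        GangId { world: self.proc_id.world.clone(), name: self.name.clone() }
    }

    /// The successive prefixes a router could resolve: world, proc, then actor.
    fn routing_prefixes(&self) -> [String; 3] {
        [
            self.proc_id.world.to_string(),
            self.proc_id.to_string(),
            self.to_string(),
        ]
    }
}
```

Under these assumptions, `ActorId::root("train", 3, "worker")` formats as `train[3].worker[0]`, and actors named `worker` on different ranks of `train` map to the same gang identifier `train.worker`, which is what makes a gang a compact multicast target.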
Re-exports§
pub use actor::Actor;
pub use actor::ActorHandle;
pub use actor::Handler;
pub use actor::RemoteHandles;
pub use data::Named;
pub use mailbox::Data;
pub use mailbox::Mailbox;
pub use mailbox::Message;
pub use mailbox::OncePortHandle;
pub use mailbox::PortHandle;
pub use mailbox::RemoteMessage;
pub use proc::Context;
pub use proc::Instance;
pub use reference::ActorId;
pub use reference::ActorRef;
pub use reference::GangId;
pub use reference::GangRef;
pub use reference::OncePortRef;
pub use reference::PortId;
pub use reference::PortRef;
pub use reference::ProcId;
pub use reference::WorldId;
Modules§
- accum
- Defines the accumulator trait and some common accumulators.
- actor
- This module contains all the core traits required to define and manage actors.
- attrs
- Attribute dictionary for type-safe, heterogeneous key-value storage with serde support.
- cap
- Capabilities used in various public APIs.
- channel
- One-way, multi-process, typed communication channels. These are used to send messages between mailboxes residing in different processes.
- checkpoint
- Checkpoint functionality for various objects to save and load states.
- clock
- The clock allows us to control the behaviour of all time-dependent events in both real and simulated time throughout the system.
- config
- Configuration for Hyperactor.
- data
- This module contains core traits and implementation to manage remote data types in Hyperactor.
- mailbox
- Mailboxes are the central message-passing mechanism in Hyperactor.
- message
- This module provides a framework for mutating serialized messages without the need to deserialize them. This capability is useful when sending messages to a remote destination through intermediate nodes, where the intermediate nodes do not contain the message’s type information.
- metrics
- Hyperactor metrics.
- panic_handler
- Used to capture the backtrace from a panic and store it in a task_local, so that it can be retrieved later when the panic is caught.
- proc
- This module provides Proc, which is the runtime used within a single proc.
- reference
- References for different resources in Hyperactor.
- simnet
- A simulator capable of simulating Hyperactor’s network channels (see: [channel]). The simulator can simulate message delivery delays and failures, and is used for testing and development of message distribution techniques.
- supervision
- Messages used in supervision.
- sync
- Synchronization primitives that are used by Hyperactor.
- test_utils
- Test utilities
Macros§
- alias
- Create a [RemoteActor] handling a specific set of message types. This is used to create an [ActorRef] without having to depend on the actor’s implementation. If a message type needs to be cast, add the castable flag to that type. For example, the following creates an alias with 5 message types, 4 of which need to be cast.
- declare_attrs
- Declares attribute keys using a lazy_static! style syntax.
- declare_static_counter
- Create a thread-safe static counter that can be incremented or decremented. This is useful to avoid creating temporary counters. You can safely create counters with the same name. They will be joined by the underlying runtime and are thread safe.
- declare_static_gauge
- Create a thread-safe static gauge that can be set to a specific value. This is useful to avoid creating temporary gauges. You can safely create gauges with the same name. They will be joined by the underlying runtime and are thread safe.
- declare_static_histogram
- Create a thread-safe static histogram that can be incremented or decremented. This is useful to avoid creating temporary histograms. You can safely create histograms with the same name. They will be joined by the underlying runtime and are thread safe.
- declare_static_timer
- Create a thread-safe static timer that can be used to measure durations. This macro creates a histogram with predefined boundaries appropriate for the specified time unit. Supported units are “ms” (milliseconds), “us” (microseconds), and “ns” (nanoseconds).
- id
- Statically create a WorldId, ProcId, ActorId or GangId, given the concrete syntax documented in Reference:
- key_value
- Create key-value pairs for use in OpenTelemetry. These pairs can be stored and used multiple times. OpenTelemetry adds key-value attributes when you bump counters and histograms, so MY_COUNTER.add(42, &[key_value!("key", "value")]) and MY_COUNTER.add(42, &[key_value!("key", "other_value")]) will actually bump two separate counters.
- kv_pairs
- Construct the key-value attribute slice using mapping syntax. Example:
- register_type
- Register a (concrete) type so that it may be looked up by name or hash. Type registration is required only to improve diagnostics, as it allows a binary to introspect serialized payloads under type erasure.
- remote
- Register an actor type so that it can be spawned remotely. The actor type must implement crate::data::Named, which will be used to identify the actor globally.
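The key_value entry above notes that the same counter bumped with different attribute values yields two separate counters. The following toy model illustrates that behaviour in plain Rust without depending on OpenTelemetry; `ToyCounter`, `Attr` and their methods are hypothetical names invented for this sketch and say nothing about how the real macros or the OpenTelemetry SDK are implemented.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one key-value attribute; not an OpenTelemetry type.
type Attr = (&'static str, &'static str);

/// A toy counter that keeps one running total per distinct attribute set.
#[derive(Default)]
struct ToyCounter {
    series: BTreeMap<Vec<Attr>, u64>,
}

impl ToyCounter {
    /// Normalize an attribute slice so that attribute order does not matter.
    fn key(attrs: &[Attr]) -> Vec<Attr> {
        let mut key = attrs.to_vec();
        key.sort();
        key
    }

    /// Add `value` to the series selected by `attrs`, creating it on first use.
    fn add(&mut self, value: u64, attrs: &[Attr]) {
        *self.series.entry(Self::key(attrs)).or_insert(0) += value;
    }

    /// Current total of the series selected by `attrs` (0 if it was never bumped).
    fn total(&self, attrs: &[Attr]) -> u64 {
        self.series.get(&Self::key(attrs)).copied().unwrap_or(0)
    }

    /// Number of distinct series this counter has produced so far.
    fn series_count(&self) -> usize {
        self.series.len()
    }
}
```

In this model, adding 42 with `("key", "value")` and 42 with `("key", "other_value")` leaves two series of 42 each rather than one series of 84, which is why attribute values with many distinct values multiply the number of exported series.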
Structs§
- SignalCleanupGuard
- RAII guard that automatically unregisters a signal cleanup callback when dropped.
Enums§
- SignalDisposition
- This type describes how a signal is currently handled by the process.
Functions§
- initialize
- Initialize the Hyperactor runtime. Specifically:
- initialize_with_current_runtime
- Initialize the Hyperactor runtime using the current tokio runtime handle.
- initialize_with_log_prefix
- Initialize the Hyperactor runtime. Specifically:
- query_signal_disposition
- Query the current disposition of a signal (signum).
- register_signal_cleanup
- Register a cleanup callback to be executed on SIGINT/SIGTERM. Returns a unique ID that can be used to unregister the callback.
- register_signal_cleanup_scoped
- Register a scoped cleanup callback to be executed on SIGINT/SIGTERM. Returns a guard that automatically unregisters the callback when dropped.
- sigpipe_disposition
- Returns the current SignalDisposition of SIGPIPE.
- unregister_signal_cleanup
- Unregister a previously registered cleanup callback.
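The relationship between the ID-based and the scoped registration functions listed above can be sketched as a small RAII pattern. This is an assumption-laden illustration: `CleanupRegistry`, `CleanupGuard`, `run_all` and every signature below are hypothetical, no real signal handler is installed, and the crate's own functions may take different arguments.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type Callback = Box<dyn FnOnce() + Send>;

/// Hypothetical table of cleanup callbacks, keyed by a unique ID.
#[derive(Default)]
struct Registry {
    next_id: u64,
    callbacks: HashMap<u64, Callback>,
}

/// Shared handle to the table; cloning it shares the same underlying state.
#[derive(Clone, Default)]
struct CleanupRegistry(Arc<Mutex<Registry>>);

impl CleanupRegistry {
    /// Register a callback and return the ID needed to unregister it later.
    fn register(&self, cb: impl FnOnce() + Send + 'static) -> u64 {
        let mut reg = self.0.lock().unwrap();
        let id = reg.next_id;
        reg.next_id += 1;
        reg.callbacks.insert(id, Box::new(cb));
        id
    }

    /// Remove a callback; returns false if the ID was unknown or already removed.
    fn unregister(&self, id: u64) -> bool {
        self.0.lock().unwrap().callbacks.remove(&id).is_some()
    }

    /// Register a callback that is unregistered when the returned guard is dropped.
    fn register_scoped(&self, cb: impl FnOnce() + Send + 'static) -> CleanupGuard {
        let id = self.register(cb);
        CleanupGuard { registry: self.clone(), id }
    }

    /// Run and remove every callback, as a SIGINT/SIGTERM handler might.
    /// Returns how many callbacks ran.
    fn run_all(&self) -> usize {
        let mut reg = self.0.lock().unwrap();
        let drained: Vec<Callback> = reg.callbacks.drain().map(|(_, cb)| cb).collect();
        drop(reg); // release the lock before running user code
        let count = drained.len();
        for cb in drained {
            cb();
        }
        count
    }

    /// Number of callbacks currently registered.
    fn len(&self) -> usize {
        self.0.lock().unwrap().callbacks.len()
    }
}

/// RAII guard: dropping it unregisters the associated callback.
struct CleanupGuard {
    registry: CleanupRegistry,
    id: u64,
}

impl Drop for CleanupGuard {
    fn drop(&mut self) {
        self.registry.unregister(self.id);
    }
}
```

The scoped form trades explicit bookkeeping for lexical lifetime: a callback registered through the guard cannot outlive the scope that owns the resources it cleans up, while the ID-based form suits callbacks whose lifetime is managed elsewhere.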
Attribute Macros§
- export
- Exports handlers for this actor. The set of exported handlers determines the messages that may be sent to remote references of the actor ([hyperactor::ActorRef]). Only messages that implement [hyperactor::RemoteMessage] may be exported.
- forward
- Forward messages of the provided type to this handler implementation.
- instrument
- Use this macro in place of tracing::instrument to prevent spamming our tracing table. We set a default level of INFO while always setting ERROR if the function returns Result::Err, giving us consistent and high quality structured logs. Because this wraps around tracing::instrument, all parameters mentioned in https://fburl.com/9jlkb5q4 should be valid. For functions that don’t return a [Result] type, use [instrument_infallible].
- instrument_infallible
- Use this macro in place of tracing::instrument to prevent spamming our tracing table. Because this wraps around tracing::instrument, all parameters mentioned in https://fburl.com/9jlkb5q4 should be valid.
Derive Macros§
- Bind
- Derive a custom implementation of the [hyperactor::message::Bind] trait for a struct or enum. This macro is normally used in tandem with [fn derive_unbind] to make the applied struct or enum castable.
- HandleClient
- Derives a client implementation on ActorHandle<Actor>. See Handler documentation for details.
- Handler
- Derive a custom handler trait for a given enum containing tuple structs. The handler trait defines a method corresponding to each of the enum’s variants, and a handle function that dispatches messages to the correct method. The macro supports two messaging patterns: “call” and “oneway”. A call is a request-response message; a [hyperactor::mailbox::OncePortRef] or [hyperactor::mailbox::OncePortHandle] in the last position is used to send the return value.
- Named
- Derive the [hyperactor::data::Named] trait for a struct with the provided type URI. The name of the type is its fully-qualified Rust path. The name may be overridden by providing a string value for the name attribute.
- RefClient
- Derives a client implementation on ActorRef<Actor>. See Handler documentation for details.
- Unbind
- Derive a custom implementation of the [hyperactor::message::Unbind] trait for a struct or enum. This macro is normally used in tandem with [fn derive_bind] to make the applied struct or enum castable.
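The Handler entry above describes a trait with one method per enum variant plus a dispatching handle function, with "oneway" and "call" patterns. A hand-written, synchronous analogue of that shape might look like the sketch below. Everything in it is an assumption made for illustration: `CounterMessage`, `CounterMessageHandler`, `Counter` and `call_get` are invented names, `std::sync::mpsc::Sender` stands in for a once-port in the last position, and the code the derive macro actually generates is not shown on this page and may differ substantially (for instance in asynchrony or extra context arguments).

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical message enum with one variant per messaging pattern.
enum CounterMessage {
    /// "oneway": no reply is expected.
    Add(u64),
    /// "call": the last field carries the channel used to send the return value.
    Get(Sender<u64>),
}

/// Hand-written analogue of a derived handler trait: one method per variant,
/// plus a `handle` function that dispatches each message to its method.
trait CounterMessageHandler {
    fn add(&mut self, n: u64);
    fn get(&mut self) -> u64;

    fn handle(&mut self, message: CounterMessage) {
        match message {
            CounterMessage::Add(n) => self.add(n),
            CounterMessage::Get(reply) => {
                let value = self.get();
                // A send error only means the caller stopped waiting; ignore it.
                let _ = reply.send(value);
            }
        }
    }
}

/// A trivial actor-like type implementing the handler.
struct Counter {
    total: u64,
}

impl CounterMessageHandler for Counter {
    fn add(&mut self, n: u64) {
        self.total += n;
    }

    fn get(&mut self) -> u64 {
        self.total
    }
}

/// Client-side view of the "call" pattern: send a request, then wait for the reply.
fn call_get(actor: &mut impl CounterMessageHandler) -> u64 {
    let (tx, rx) = channel();
    actor.handle(CounterMessage::Get(tx));
    rx.recv().expect("handler dropped the reply channel without answering")
}
```

In this sketch the client helper plays the role that the derived HandleClient and RefClient implementations are described as playing: it hides the reply channel so that a call reads like an ordinary method invocation.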