Introspection protocol for hyperactor actors.
Every actor has a dedicated introspect task that handles
IntrospectMessage by reading InstanceCell state directly,
without going through the actor’s message loop. This means:
- Stuck actors can be introspected (the task runs independently).
- Introspection does not perturb observed state (no Heisenberg).
- Live status is reported accurately.
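The independence of the introspect task from the actor's message loop can be illustrated with a minimal sketch. It uses standard-library threads and channels in place of the real tokio task and mailbox, and the `Cell`, `Query`, and `Snapshot` types are hypothetical stand-ins, not the hyperactor API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the shared per-actor state (`InstanceCell`).
struct Cell {
    messages_processed: AtomicU64,
    last_handler: Mutex<Option<String>>,
}

/// Hypothetical read-only view returned to the caller.
#[derive(Debug, PartialEq)]
struct Snapshot {
    messages_processed: u64,
    last_handler: Option<String>,
}

/// Hypothetical query: carries a one-shot reply channel.
struct Query {
    reply: Sender<Snapshot>,
}

/// Spawn the introspect task. It owns its own receiver, so queries never
/// pass through the actor's work queue, and it only reads the cell.
fn spawn_introspect(cell: Arc<Cell>) -> Sender<Query> {
    let (tx, rx) = channel::<Query>();
    thread::spawn(move || {
        for query in rx {
            let snapshot = Snapshot {
                messages_processed: cell.messages_processed.load(Ordering::Relaxed),
                last_handler: cell.last_handler.lock().unwrap().clone(),
            };
            // The caller may have gone away; ignore send errors.
            let _ = query.reply.send(snapshot);
        }
    });
    tx
}

/// Query an actor whose message loop is never run (a "stuck" actor).
fn introspect_stuck_actor() -> Snapshot {
    let cell = Arc::new(Cell {
        messages_processed: AtomicU64::new(3),
        last_handler: Mutex::new(Some("handle_work".to_string())),
    });
    let port = spawn_introspect(Arc::clone(&cell));
    let (reply_tx, reply_rx) = channel();
    port.send(Query { reply: reply_tx }).unwrap();
    reply_rx.recv().unwrap()
}
```

Because the sketch's introspect thread never writes to the cell, the reported `last_handler` is whatever the actor last recorded, not the introspection query itself.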
Infrastructure actors publish domain-specific metadata via
publish_attrs(), which the introspect task reads for
Entity-view queries. Non-addressable children (e.g., system procs)
are resolved via a callback registered on InstanceCell.
Callers navigate topology by fetching an IntrospectResult
and following its children references.
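Topology navigation as described above amounts to a graph walk over child references. The following is a rough sketch under stated assumptions: `Node` is a hypothetical, trimmed-down stand-in for IntrospectResult, and the `fetch` closure stands in for sending a query and awaiting its reply.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical, trimmed-down result: an identity plus child references.
#[derive(Clone, Debug)]
struct Node {
    identity: String,
    children: Vec<String>,
}

/// Walk the topology breadth-first from `root`. `fetch` is a hypothetical
/// stand-in for issuing an introspection query for one reference.
fn walk<F>(root: &str, mut fetch: F) -> Vec<String>
where
    F: FnMut(&str) -> Option<Node>,
{
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut order = Vec::new();
    queue.push_back(root.to_string());
    while let Some(reference) = queue.pop_front() {
        if !seen.insert(reference.clone()) {
            continue; // already visited
        }
        // A reference that fails to resolve is skipped rather than fatal.
        let Some(node) = fetch(&reference) else { continue };
        order.push(node.identity.clone());
        queue.extend(node.children);
    }
    order
}

/// Build a small in-memory topology for demonstration purposes.
fn sample_topology() -> HashMap<String, Node> {
    let mut nodes = HashMap::new();
    for (id, children) in [
        ("root", vec!["host0"]),
        ("host0", vec!["proc0", "proc1"]),
        ("proc0", vec![]),
        ("proc1", vec!["missing"]),
    ] {
        let node = Node {
            identity: id.to_string(),
            children: children.into_iter().map(String::from).collect(),
        };
        nodes.insert(id.to_string(), node);
    }
    nodes
}
```

A caller could run `walk("root", |r| nodes.get(r).cloned())` against the sample map; the `seen` set guards against revisiting a node if two parents reference the same child.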
§Design Invariants
The introspection subsystem maintains twelve invariants (S1–S12). Each is documented at the code site that enforces it.
- S1. Introspection must not depend on actor responsiveness – a wedged actor can still be introspected (runtime task, not actor loop).
- S2. Introspection must not perturb observed state – reading InstanceCell never sets last_message_handler to IntrospectMessage.
- S3. Sender routing is unchanged – senders target the same PortId (IntrospectMessage::port()) across processes.
- S4. IntrospectMessage never produces a WorkCell – pre-registration via open_message_port gives the introspect port its own channel, independent of the actor’s work queue.
- S5. Replies never use PanickingMailboxSender – the introspect task replies via Mailbox::serialize_and_send_once.
- S6. View semantics are stable – Actor view uses live structural state + supervision children; Entity view uses published properties + domain children.
- S7. QueryChild must work without actor handlers – system procs are resolved via a per-actor callback on InstanceCell.
- S8. Published properties are constrained – actors cannot publish Root or Error payloads (only Host and Proc variants).
- S9. Port binding is single source of truth – the introspect port is bound exactly once via bind_actor_port() in Instance::new().
- S10. Introspect receiver lifecycle – created in Instance::new(), spawned in start(), dropped in child_instance().
- S11. Terminated snapshots do not keep actors resolvable – store_terminated_snapshot writes to the proc’s snapshot map, not the instances map. resolve_actor_ref checks terminal status independently and is unaffected by snapshot storage.
- S12. Introspection must not impair actor liveness – introspection queries (including DashMap reads for actor enumeration) must not cause convoy starvation or scheduling delays that stall concurrent actor spawn/stop operations.
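As an illustration of S8, one way to constrain published payloads is to reject the disallowed variants at the publish site. This is only a sketch: the `Payload` enum, its fields, and the `Published` slot are hypothetical, and the real crate may enforce the constraint differently (for example, through the type system).

```rust
/// Hypothetical payload variants, named after the four mentioned in S8.
/// The fields are invented for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Payload {
    Root,
    Host { addr: String },
    Proc { name: String },
    Error { code: String },
}

/// Hypothetical per-actor slot holding the most recently published payload.
#[derive(Default)]
struct Published {
    slot: Option<Payload>,
}

impl Published {
    /// Accept only Host and Proc payloads; Root and Error are rejected
    /// and leave the previously published value untouched.
    fn publish(&mut self, payload: Payload) -> Result<(), String> {
        match payload {
            Payload::Host { .. } | Payload::Proc { .. } => {
                self.slot = Some(payload);
                Ok(())
            }
            Payload::Root | Payload::Error { .. } => {
                Err(format!("actors may not publish {:?}", payload))
            }
        }
    }
}
```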
§Introspection key invariants (IK-*)
- IK-1 (metadata completeness): Every actor-runtime introspection key must carry @meta(INTROSPECT = ...) with non-empty name and desc.
- IK-2 (short-name uniqueness): No two introspection keys may share the same IntrospectAttr.name. Duplicates would break the FQ-to-short HTTP remap and schema output.
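A check for IK-1 and IK-2 could look roughly like the following. The struct layout, the `validate_keys` function, and the example key names are assumptions made for illustration; only the `name` and `desc` fields come from the text above.

```rust
use std::collections::HashSet;

/// Hypothetical shape of the per-key metadata; only `name` and `desc`
/// are mentioned in the invariants.
struct IntrospectAttr {
    name: &'static str,
    desc: &'static str,
}

/// Validate a list of (fully-qualified key, metadata) pairs.
/// Returns the first violation found, tagged with the invariant it breaks.
fn validate_keys(keys: &[(&str, IntrospectAttr)]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for (fq_name, attr) in keys {
        // IK-1: both the short name and the description must be non-empty.
        if attr.name.is_empty() || attr.desc.is_empty() {
            return Err(format!("IK-1: {fq_name} has an empty name or desc"));
        }
        // IK-2: short names must be unique across all keys.
        if !seen.insert(attr.name) {
            return Err(format!("IK-2: short name {:?} is used more than once", attr.name));
        }
    }
    Ok(())
}
```

Running such a check in a unit test over every registered key would catch a duplicate short name before it could break the FQ-to-short remap.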
§Failure introspection invariants (FI-*)
The FailureInfo presentation type lives in
hyperactor_mesh::introspect; these invariants are documented
here because the enforcement sites are in hyperactor
(proc.rs serve(), live_actor_payload).
- FI-1 (event-before-status): All InstanceCell state that live_actor_payload reads must be written BEFORE change_status() transitions to terminal.
- FI-2 (write-once): InstanceCellState::supervision_event is written at most once per actor lifetime.
- FI-3 (failure attrs <-> status): Failure attrs are present iff status is "failed".
- FI-4 (is_propagated <-> root_cause_actor): failure_is_propagated == true iff failure_root_cause_actor != this_actor_id.
- FI-5 (is_poisoned <-> failed_actor_count): is_poisoned == true iff failed_actor_count > 0.
- FI-6 (clean stop = no artifacts): When an actor stops cleanly, supervision_event is None, failure attrs are absent, and the actor does not contribute to failed_actor_count.
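The biconditional invariants FI-3, FI-4, and FI-5 can be expressed as simple consistency checks over a decoded snapshot. In this sketch the `ActorSnapshot` and `ProcSnapshot` structs are hypothetical, and placing `is_poisoned` on a proc-level snapshot is an assumption, since the text above does not say where that flag lives.

```rust
/// Hypothetical subset of the failure fields relevant to FI-4.
#[derive(Debug, Clone, PartialEq)]
struct FailureAttrs {
    root_cause_actor: String,
    is_propagated: bool,
}

/// Hypothetical decoded actor snapshot.
struct ActorSnapshot {
    actor_id: String,
    status: String,
    failure: Option<FailureAttrs>,
}

/// Hypothetical proc-level snapshot (assumed home of `is_poisoned`).
struct ProcSnapshot {
    is_poisoned: bool,
    failed_actor_count: usize,
}

/// Check FI-3 and FI-4; returns the label of the first violated invariant.
fn check_actor(s: &ActorSnapshot) -> Result<(), &'static str> {
    // FI-3: failure attrs are present iff status is "failed".
    if s.failure.is_some() != (s.status == "failed") {
        return Err("FI-3");
    }
    // FI-4: propagated iff the root cause is some other actor.
    if let Some(f) = &s.failure {
        if f.is_propagated != (f.root_cause_actor != s.actor_id) {
            return Err("FI-4");
        }
    }
    Ok(())
}

/// Check FI-5: poisoned iff at least one actor has failed.
fn check_proc(p: &ProcSnapshot) -> Result<(), &'static str> {
    if p.is_poisoned != (p.failed_actor_count > 0) {
        return Err("FI-5");
    }
    Ok(())
}
```

A cleanly stopped actor (FI-6) passes `check_actor` with `status` set to "stopped" and `failure` set to `None`.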
§Attrs view invariants (AV-*)
These govern the typed view layer (ActorAttrsView). The
full AV-* / DP-* family is documented in
hyperactor_mesh::introspect; the subset relevant to this
crate:
- AV-1 (view-roundtrip): For each view V, V::from_attrs(&v.to_attrs()) == Ok(v).
- AV-2 (required-key-strictness): from_attrs fails iff required keys for that view are missing.
- AV-3 (unknown-key-tolerance): Unknown attrs keys must not affect successful decode outcome.
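To make the three AV-* properties concrete, here is a small sketch of a typed view over a string-keyed map. The `StatusView` type, the `ViewError` enum, and the use of a plain `HashMap<String, String>` as the attrs bag are simplifying assumptions; the real Attrs type and views are richer.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the attrs bag (assumption: string values only).
type Attrs = HashMap<String, String>;

/// Hypothetical view with one required key and one optional key.
#[derive(Debug, Clone, PartialEq)]
struct StatusView {
    status: String,
    status_reason: Option<String>,
}

/// Hypothetical decode error.
#[derive(Debug, PartialEq)]
enum ViewError {
    MissingKey(&'static str),
}

impl StatusView {
    /// AV-2: fails only when the required `status` key is missing.
    /// AV-3: keys this view does not know about are ignored.
    fn from_attrs(attrs: &Attrs) -> Result<Self, ViewError> {
        let status = attrs
            .get("status")
            .cloned()
            .ok_or(ViewError::MissingKey("status"))?;
        let status_reason = attrs.get("status_reason").cloned();
        Ok(Self { status, status_reason })
    }

    /// Encode exactly the keys that `from_attrs` reads, so that
    /// decoding the result yields the original view (AV-1).
    fn to_attrs(&self) -> Attrs {
        let mut attrs = Attrs::new();
        attrs.insert("status".to_string(), self.status.clone());
        if let Some(reason) = &self.status_reason {
            attrs.insert("status_reason".to_string(), reason.clone());
        }
        attrs
    }
}
```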
Structs§
- ActorAttrsView - Typed view over attrs for an actor node.
- FailureAttrs - Structured failure fields decoded from FAILURE_* attrs.
- IntrospectResult - Internal introspection result. Carries attrs as a JSON string. The mesh layer constructs the API-facing NodePayload (with properties) from this via derive_properties.
- RecordedEvent - Structured tracing event from the actor-local flight recorder.
Enums§
- AttrsViewError - Error from decoding an Attrs bag into a typed view.
- IntrospectMessage - Introspection query sent to any actor.
- IntrospectView - Context for introspection query - what aspect of the actor to describe.
Statics§
- ACTOR_TYPE - Fully-qualified actor type name.
- CHILDREN - Child reference strings for tree navigation. Published by infrastructure actors (HostMeshAgent, ProcAgent) so the Entity view can return children without parsing mesh-layer keys.
- CREATED_AT - Timestamp when this actor was created.
- ERROR_CODE - Machine-readable error code for error nodes.
- ERROR_MESSAGE - Human-readable error message for error nodes.
- FAILURE_ERROR_MESSAGE - Failure error message.
- FAILURE_IS_PROPAGATED - Whether the failure was propagated from a child.
- FAILURE_OCCURRED_AT - Timestamp when failure occurred.
- FAILURE_ROOT_CAUSE_ACTOR - Actor that caused the failure (root cause).
- FAILURE_ROOT_CAUSE_NAME - Name of root cause actor.
- FLIGHT_RECORDER - Flight recorder JSON (recent trace events).
- IS_SYSTEM - Whether this actor is infrastructure/system.
- LAST_HANDLER - Name of the last message handler invoked.
- MESSAGES_PROCESSED - Number of messages processed by this actor.
- STATUS - Actor lifecycle status: “running”, “stopped”, “failed”.
- STATUS_REASON - Reason for stop/failure (absent when running).
- TOTAL_PROCESSING_TIME_US - Total CPU time in message handlers (microseconds).
Functions§
- format_timestamp - Format a SystemTime as an ISO 8601 timestamp with millisecond precision.
- live_actor_payload - Build an IntrospectResult from live InstanceCell state.
- serve_introspect - Introspect task: runs on a dedicated tokio task per actor, handling IntrospectMessage by reading InstanceCell directly and replying via the actor’s Mailbox.