
Generator#

The Generator (Policy) is the core inference engine in TorchForge, built on top of vLLM. It manages model serving, text generation, and weight updates for reinforcement learning workflows.

Generator#

class forge.actors.generator.Generator(engine_args=<factory>, sampling_params=<factory>, use_dcp_for_weight_sync=None, prefetch_weights_to_shm=True, n_fetcher_procs=8)[source]#

Instance of a vLLM-based generator.

This class manually recreates a vLLM engine that mirrors the design of AsyncLLMEngine in vLLM v1. The main difference is that all communication is controlled here via Monarch’s proc meshes.

Parameters:
  • engine_args (EngineArgs) – The engine arguments to use for the vLLM engine.

  • sampling_params (SamplingParams) – The sampling parameters to use for the vLLM engine.

  • use_dcp_for_weight_sync (bool) – Whether to use DCP for NFS-based weight sync. When left as None, the default depends on whether RDMA is enabled in torchstore: DCP is disabled if RDMA is enabled, and enabled otherwise.
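The default rule for use_dcp_for_weight_sync described above can be sketched as a small function. This is only an illustration of the documented behavior, not the TorchForge implementation: the helper name `resolve_use_dcp` and the `rdma_enabled` argument are hypothetical, and the assumption that an explicit True or False is honored unchanged is not stated in the source.

```python
from typing import Optional


def resolve_use_dcp(use_dcp_for_weight_sync: Optional[bool], rdma_enabled: bool) -> bool:
    """Hypothetical helper: resolve the effective DCP setting.

    `rdma_enabled` stands in for however torchstore reports RDMA support.
    """
    if use_dcp_for_weight_sync is None:
        # No explicit choice: DCP is used only when RDMA is not enabled.
        return not rdma_enabled
    # Assumption: an explicit True/False is passed through as given.
    return use_dcp_for_weight_sync


if __name__ == "__main__":
    # Print the resolved value for every combination of inputs.
    for explicit in (None, True, False):
        for rdma in (True, False):
            print(explicit, rdma, resolve_use_dcp(explicit, rdma))
```

Keeping None as a distinct "not specified" value lets the caller override the transport-dependent default in either direction.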

Example:

    >>> generator = await Generator.options(procs=1, num_replicas=1, with_gpus=True).as_service(
    ...     engine_args=EngineArgs(...),
    ...     sampling_params=SamplingParams(...),
    ... )
    >>> await generator.generate("Tell me a joke")
    Completion(prompt="Tell me a joke", text="A: Why did the chicken cross the road? B: To get to the other side.", token_ids=[...], logprobs=[...])
    >>> await generator.shutdown()

engine_args#
generate#
get_version#
get_vllm_config#
n_fetcher_procs = 8#
prefetch_weights_to_shm = True#
register_worker#
async run()[source]#

Schedule, execute, and produce output. See vllm-project/vllm.

Return type:

None
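To make the "schedule, execute, and produce output" cycle concrete, here is a toy asyncio loop with the same three phases. It is a sketch under stated assumptions, not the Generator's actual code: `ToyEngineLoop`, its methods, the batch size, and the upper-casing "model" are all hypothetical stand-ins that do not use vLLM or Monarch.

```python
import asyncio
from collections import deque


class ToyEngineLoop:
    """Hypothetical stand-in for an engine loop; not the Generator itself."""

    def __init__(self, max_batch_size: int = 2) -> None:
        self.pending = deque()  # queue of (prompt, future) pairs
        self.max_batch_size = max_batch_size
        self.running = True

    def add_request(self, prompt: str) -> asyncio.Future:
        # Each request gets a future that the loop resolves later.
        future = asyncio.get_running_loop().create_future()
        self.pending.append((prompt, future))
        return future

    def schedule(self) -> list:
        # Phase 1: pick up to max_batch_size pending requests.
        batch = []
        while self.pending and len(batch) < self.max_batch_size:
            batch.append(self.pending.popleft())
        return batch

    async def execute(self, batch: list) -> list:
        # Phase 2: stand-in for a model forward pass.
        await asyncio.sleep(0)
        return [prompt.upper() for prompt, _ in batch]

    async def run(self) -> None:
        while self.running:
            batch = self.schedule()
            if not batch:
                # Nothing to do: yield so other tasks can enqueue work.
                await asyncio.sleep(0)
                continue
            outputs = await self.execute(batch)
            # Phase 3: deliver each output to its waiting caller.
            for (_, future), text in zip(batch, outputs):
                future.set_result(text)

    def stop(self) -> None:
        self.running = False


async def main() -> list:
    engine = ToyEngineLoop()
    loop_task = asyncio.create_task(engine.run())
    futures = [engine.add_request(p) for p in ("a", "b", "c")]
    results = await asyncio.gather(*futures)
    engine.stop()
    await loop_task
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The idle branch polls with `asyncio.sleep(0)` for brevity; a real loop would more likely wait on an event or queue instead of spinning.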

sampling_params#
save_model_params#
setup#
async classmethod shutdown(actor)[source]#
stop#
update_weights#
use_dcp_for_weight_sync = None#
validate_model_params#

GeneratorWorker#

class forge.actors.generator.GeneratorWorker(vllm_config)[source]#

Bases: ForgeActor

Mirrors a vLLM GPUWorker (see vllm-project/vllm).

In general, this class should not be instantiated or called directly. Instead, the Generator controls the creation and invocation of all GeneratorWorker instances.

execute_model#
save_model_params#
setup#
setup_kv_cache#
update_weights#
validate_model_params#
vllm_config#