Generator
The Generator (Policy) is the core inference engine in TorchForge, built on top of vLLM. It manages model serving, text generation, and weight updates for reinforcement learning workflows.
Generator
- class forge.actors.generator.Generator(engine_args=<factory>, sampling_params=<factory>, use_dcp_for_weight_sync=None, prefetch_weights_to_shm=True, n_fetcher_procs=8)
Instance of a vLLM-based generator.
This class manually recreates a vLLM engine that mirrors the design of AsyncLLMEngine in vLLM v1. The main difference is that all communication is controlled here via Monarch’s proc meshes.
- Parameters:
engine_args (EngineArgs) – The engine arguments to use for the vLLM engine.
sampling_params (SamplingParams) – The sampling parameters to use for the vLLM engine.
use_dcp_for_weight_sync (bool) – Whether to use DCP for NFS-based weight sync. When left as None, the default depends on whether RDMA is enabled in torchstore: DCP is disabled if RDMA is enabled, and enabled otherwise.
Example:
>>> generator = await Generator.options(procs=1, num_replicas=1, with_gpus=True).as_service(
...     engine_args=EngineArgs(...),
...     sampling_params=SamplingParams(...),
... )
>>> await generator.generate("Tell me a joke")
Completion(prompt="Tell me a joke", text="A: Why did the chicken cross the road? B: To get to the other side.", token_ids=[...], logprobs=[...])
>>> await generator.shutdown()
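The default described for use_dcp_for_weight_sync can be sketched as a small resolver. This is only an illustration under stated assumptions: `resolve_use_dcp` and its `rdma_enabled` argument are hypothetical names, not part of the forge or torchstore API, and this page does not show how RDMA availability is actually detected.

```python
from typing import Optional


def resolve_use_dcp(use_dcp_for_weight_sync: Optional[bool], rdma_enabled: bool) -> bool:
    """Hypothetical helper: pick the effective DCP setting.

    An explicit True/False is returned unchanged. None falls back to the
    documented default: DCP is used only when RDMA is not enabled.
    """
    if use_dcp_for_weight_sync is not None:
        return use_dcp_for_weight_sync
    return not rdma_enabled
```

Under this sketch, passing None with RDMA enabled yields False, and passing None without RDMA yields True; an explicit value is assumed to override the default.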
- engine_args
- generate
- get_version
- get_vllm_config
- n_fetcher_procs = 8
- prefetch_weights_to_shm = True
- register_worker
- async run()
Schedule, execute, and make output (see vllm-project/vllm).
- Return type:
None
- sampling_params
- save_model_params
- setup
- stop
- update_weights
- use_dcp_for_weight_sync = None
- validate_model_params
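The run() entry in the list above is described as "schedule, execute, and make output". The general shape of such a loop can be sketched with plain asyncio. This is a toy stand-in, not the Generator's implementation: `ToyEngineLoop`, `submit`, `schedule`, `execute`, and `demo` are hypothetical names, and the "model" simply upper-cases each prompt.

```python
import asyncio
from collections import deque


class ToyEngineLoop:
    """Toy schedule/execute/output loop (hypothetical, for illustration only)."""

    def __init__(self, max_batch_size: int = 2):
        self.max_batch_size = max_batch_size
        self.pending = deque()  # (prompt, future) pairs waiting to be scheduled
        self.running = True

    def submit(self, prompt: str) -> asyncio.Future:
        # Queue a request and return a future that resolves to its output.
        fut = asyncio.get_running_loop().create_future()
        self.pending.append((prompt, fut))
        return fut

    def schedule(self):
        # Schedule: take up to max_batch_size requests off the queue.
        batch = []
        while self.pending and len(batch) < self.max_batch_size:
            batch.append(self.pending.popleft())
        return batch

    async def execute(self, prompts):
        # Execute: placeholder for a model forward pass.
        await asyncio.sleep(0)
        return [p.upper() for p in prompts]

    async def run(self):
        while self.running:
            batch = self.schedule()
            if not batch:
                await asyncio.sleep(0)  # nothing to do; yield to other tasks
                continue
            outputs = await self.execute([p for p, _ in batch])
            # Make output: resolve each caller's future with its result.
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)

    def stop(self):
        self.running = False


async def demo():
    engine = ToyEngineLoop(max_batch_size=2)
    loop_task = asyncio.create_task(engine.run())
    futures = [engine.submit(p) for p in ["a", "b", "c"]]
    results = await asyncio.gather(*futures)
    engine.stop()
    await loop_task
    return results
```

Calling `asyncio.run(demo())` drives three requests through two batches. The real Generator distributes execution to workers rather than running it inline, which this sketch does not attempt to model.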
GeneratorWorker
- class forge.actors.generator.GeneratorWorker(vllm_config)
Bases: ForgeActor
Mirrors a vLLM GPUWorker (see vllm-project/vllm).
In general, this class should not be instantiated or called directly. Rather, the Generator controls the creation and invocation of all GeneratorWorker instances.
- execute_model
- save_model_params
- setup
- setup_kv_cache
- update_weights
- validate_model_params
- vllm_config