RL Framework Integration
Note
Coming soon: this page is under construction.
Use OpenEnv with popular RL frameworks like TRL, torchforge, and SkyRL.
Overview
OpenEnv environments are designed to integrate seamlessly with RL training frameworks. The standard step(), reset(), and state() API makes it straightforward to drop an environment into a training loop.
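To make that contract concrete, here is a toy environment sketch that follows the same three-method shape. The `CountdownEnv` and `StepResult` classes are illustrative stand-ins, not part of the OpenEnv API:

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    # Mirrors the fields a training loop relies on
    observation: int
    reward: float
    terminated: bool


class CountdownEnv:
    """Toy environment: counts down from 3; the episode ends at 0."""

    def __init__(self):
        self._count = 3

    def reset(self) -> StepResult:
        # Start a new episode and return the initial observation
        self._count = 3
        return StepResult(observation=self._count, reward=0.0, terminated=False)

    def step(self, action: int) -> StepResult:
        # Apply the action, return the new observation and reward
        self._count -= 1
        done = self._count == 0
        return StepResult(
            observation=self._count,
            reward=1.0 if done else 0.0,
            terminated=done,
        )

    def state(self) -> dict:
        # Expose internal episode state for inspection
        return {"count": self._count}


env = CountdownEnv()
result = env.reset()
steps = 0
while not result.terminated:
    result = env.step(0)
    steps += 1
```

A real OpenEnv client exposes the same loop shape over a served environment; only the transport differs.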
TRL Integration
TRL (Transformer Reinforcement Learning) is the recommended framework for training language models with RL.
from trl import GRPOTrainer
from openenv import AutoEnv, AutoAction

# Connect to the environment and resolve its action type
env = AutoEnv.from_env("textarena")
TextAction = AutoAction.from_env("textarena")

# Use with TRL's GRPO trainer
trainer = GRPOTrainer(
    model=model,
    reward_funcs=[reward_fn],  # GRPO takes reward callables, not a reward model
    # ... TRL config
)
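GRPO needs a per-completion reward signal; one pattern is to derive it from the environment by stepping each sampled completion through a client. The sketch below is self-contained, so a stub client and a hypothetical `message` action field stand in for the real environment; the factory shape, not the names, is the point:

```python
from dataclasses import dataclass


@dataclass
class _Result:
    reward: float


@dataclass
class _Action:
    message: str


class _StubClient:
    """Stand-in for an OpenEnv client; rewards guesses longer than 3 chars."""

    def step(self, action):
        return _Result(reward=float(len(action.message) > 3))


def make_env_reward_fn(client, action_cls):
    """Build a TRL-style reward callable that scores each completion
    by playing it into the environment (sketch, not the official API)."""

    def reward_fn(prompts, completions, **kwargs):
        rewards = []
        for completion in completions:
            result = client.step(action_cls(message=completion))
            rewards.append(result.reward)
        return rewards

    return reward_fn


reward_fn = make_env_reward_fn(_StubClient(), _Action)
scores = reward_fn(prompts=["p", "p"], completions=["hi", "hello"])
```

With a real environment, you would pass the OpenEnv client and action class in place of the stubs.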
See the Wordle with GRPO tutorial for a complete example.
torchforge Integration
torchforge provides optimized RL training utilities.
# Coming soon - integration example
SkyRL Integration
SkyRL is another RL framework compatible with OpenEnv.
# Coming soon - integration example
Generic Training Loop
For custom training setups:
from openenv import AutoEnv, AutoAction

env = AutoEnv.from_env("my-env")
Action = AutoAction.from_env("my-env")

with env.sync() as client:
    for episode in range(num_episodes):
        result = client.reset()

        while not result.terminated:
            # Get an action from your policy
            action = policy(result.observation)

            # Take a step in the environment
            result = client.step(action)

            # Update the policy with the reward
            policy.update(result.reward)
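The loop only assumes a policy that can be called on an observation and updated with a reward; that interface is implied by the loop, not mandated by OpenEnv. A minimal stand-in (the `RandomPolicy` class is illustrative):

```python
import random


class RandomPolicy:
    """Minimal policy matching the loop's interface:
    callable on an observation, plus update(reward)."""

    def __init__(self, actions):
        self._actions = actions
        self.total_reward = 0.0

    def __call__(self, observation):
        # Ignore the observation; pick a uniformly random action
        return random.choice(self._actions)

    def update(self, reward):
        # A real policy would take a gradient step here
        if reward is not None:
            self.total_reward += reward


policy = RandomPolicy(actions=["up", "down"])
action = policy(observation=None)
policy.update(1.0)
policy.update(None)  # some environments return no reward mid-episode
```

Swap in a learned policy with the same two methods and the generic loop above works unchanged.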
Next Steps
Reward Design - Design effective reward functions
Wordle with GRPO - Complete TRL example