# RL Framework Integration

:::{note}
**Coming Soon:** This page is under construction.

Use OpenEnv with popular RL frameworks like TRL, torchforge, and SkyRL.

## Overview

OpenEnv environments are designed to plug into RL training frameworks: every environment exposes the same `reset()`, `step()`, and `state()` API, which a training loop can call directly.

## TRL Integration

[TRL (Transformer Reinforcement Learning)](https://huggingface.co/docs/trl) is the recommended framework for training language models with RL.

```python
from trl import GRPOTrainer
from openenv import AutoEnv, AutoAction

env = AutoEnv.from_env("textarena")
TextAction = AutoAction.from_env("textarena")

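# Use with TRL's GRPO trainer. `model` and `reward_func` are placeholders you
# define; GRPOTrainer takes reward functions via `reward_funcs`.
trainer = GRPOTrainer(
    model=model,
    reward_funcs=reward_func,
    # ... TRL config
)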
```

See the [Wordle with GRPO](../tutorials/wordle-grpo.md) tutorial for a complete example.
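
TRL's `GRPOTrainer` accepts custom reward functions through `reward_funcs`; each one receives the batch of completions and returns one float per completion. As a minimal sketch, assuming plain-string completions and a hypothetical guess format of five letters in square brackets (for example `[crane]`), a format-checking reward might look like the following. The name `format_reward` and the regular expression are illustrative assumptions, not part of OpenEnv or TRL.

```python
import re

# Hypothetical format: a guess is five letters in square brackets, e.g. "[crane]".
GUESS_PATTERN = re.compile(r"\[([a-zA-Z]{5})\]")


def format_reward(completions, **kwargs):
    """Return 1.0 for each completion containing a bracketed five-letter guess, else 0.0.

    Assumes completions are plain strings (not conversational message lists).
    Extra keyword arguments passed by the trainer are accepted and ignored.
    """
    return [1.0 if GUESS_PATTERN.search(text) else 0.0 for text in completions]


if __name__ == "__main__":
    print(format_reward(["I will guess [crane]", "no guess here"]))
```

A function like this could be passed as `reward_funcs=format_reward`, or in a list alongside an environment-derived reward.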

## torchforge Integration

[torchforge](https://github.com/pytorch-labs/torchforge) provides optimized RL training utilities.

```python
# Coming soon - integration example
```

## SkyRL Integration

[SkyRL](https://github.com/NovaSky-AI/SkyRL) is another RL framework that can be used with OpenEnv.

```python
# Coming soon - integration example
```

## Generic Training Loop

For custom training setups:

```python
from openenv import AutoEnv, AutoAction

env = AutoEnv.from_env("my-env")
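Action = AutoAction.from_env("my-env")

num_episodes = 10  # placeholder: number of episodes to run

# `policy` is a placeholder for your own object: it maps an observation to an
# action for this environment and exposes an `update(reward)` method.
with env.sync() as client: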
    for episode in range(num_episodes):
        result = client.reset()

        while not result.terminated:
            # Get action from your policy
            action = policy(result.observation)

            # Take step
            result = client.step(action)

            # Update policy with reward
            policy.update(result.reward)
```
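
The loop above can be tried without a running environment by swapping in a stand-in client. The sketch below is self-contained and makes several assumptions: `StepResult`, `CountdownClient`, and `run_episode` are hypothetical names, and the result fields (`observation`, `reward`, `terminated`) simply mirror the ones used in the loop above rather than a confirmed OpenEnv type.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical result shape mirroring the fields used in the loop above."""
    observation: int
    reward: float
    terminated: bool


class CountdownClient:
    """Stand-in for a synchronous client: each episode ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.steps_left = horizon

    def reset(self):
        # Start a fresh episode; the observation is the number of steps remaining.
        self.steps_left = self.horizon
        return StepResult(observation=self.steps_left, reward=0.0, terminated=False)

    def step(self, action):
        # Reward 1.0 for the action "good", 0.0 for anything else.
        self.steps_left -= 1
        reward = 1.0 if action == "good" else 0.0
        return StepResult(self.steps_left, reward, self.steps_left == 0)


def run_episode(client, policy):
    """Roll out one episode and return the list of per-step rewards."""
    result = client.reset()
    rewards = []
    while not result.terminated:
        action = policy(result.observation)
        result = client.step(action)
        rewards.append(result.reward)
    return rewards


rewards = run_episode(CountdownClient(horizon=3), policy=lambda obs: "good")
episode_return = sum(rewards)
```

Collecting per-step rewards into a list, instead of updating the policy inside the loop, keeps the rollout separate from the learning step, which is the shape most batch-based trainers expect.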

## Next Steps

- [Reward Design](rewards.md) - Design effective reward functions
- [Wordle with GRPO](../tutorials/wordle-grpo.md) - Complete TRL example
:::