Note
Go to the end to download the full example code.
Introduction & Quick Start#
Part 1 of 5 in the OpenEnv Getting Started Series
This notebook introduces OpenEnv, explains why it exists, and gets you running your first environment.
Note
Time: ~10 minutes | Difficulty: Beginner | GPU Required: No
What You'll Learn#
What is OpenEnv: The unified framework for RL environments
Why OpenEnv: How it compares to traditional solutions like Gym
RL Basics: The observe-act-reward loop in 60 seconds
Quick Start: Connect to and interact with your first environment
Setup: Enable nested async event loops#
This is needed when running in environments like Sphinx-Gallery or Jupyter that already have an event loop running.
import nest_asyncio
nest_asyncio.apply()
What is OpenEnv?#
OpenEnv is a unified framework for building, sharing, and interacting with reinforcement learning environments. It's a collaborative effort between Meta, Hugging Face, Unsloth, GPU Mode, and other industry leaders.
The Goal: Make environment creation as easy and standardized as model sharing on Hugging Face.
Key Features#
Standardized API: Gymnasium-style reset(), step(), state()
Type-Safe: Full IDE autocomplete and error checking
Containerized: Environments run in Docker for isolation and reproducibility
Shareable: Push to Hugging Face Hub with one command
Language-Agnostic: HTTP/WebSocket API works from any language
RL in 60 Seconds#
Reinforcement Learning is simpler than you think. It's just a loop:
┌─────────────────────────────────────────────────────────────┐
│                         THE RL LOOP                         │
│                                                             │
│    ┌─────────┐              ┌─────────────┐                 │
│    │  AGENT  │──action─────▶│ ENVIRONMENT │                 │
│    │         │◀──reward─────│             │                 │
│    │         │◀────obs──────│             │                 │
│    └─────────┘              └─────────────┘                 │
│                                                             │
│   1. Agent observes the environment                         │
│   2. Agent chooses an action                                │
│   3. Environment returns reward + new observation           │
│   4. Repeat until done                                      │
└─────────────────────────────────────────────────────────────┘
In code, it looks like this:
result = env.reset()  # Start episode
while not result.done:
    action = agent.choose(result.observation)
    result = env.step(action)  # Take action, get reward
    agent.learn(result.reward)
That's it. That's RL!
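Here, env is any environment client and agent is any object with a choose() and a learn() method. As a minimal sketch (this RandomAgent class is illustrative, not part of OpenEnv), the simplest possible agent ignores the observation and acts at random:

import random

class RandomAgent:
    """Baseline agent: picks uniformly among the allowed actions."""

    def __init__(self, actions):
        self.actions = actions  # e.g. [0, 1, 2]

    def choose(self, observation):
        # A real policy would inspect the observation; this baseline doesn't
        return random.choice(self.actions)

    def learn(self, reward):
        # A learning agent would update its policy here; this one does nothing
        pass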
Why OpenEnv? (vs. Traditional Solutions)#
Traditional RL environments (like OpenAI Gym/Gymnasium) have been the backbone of RL research for years. They provide a simple API for interacting with environments, and the community has built thousands of environments on top of them.
However, as RL moves from research to production, several challenges emerge:
The Problem with Traditional Approaches#
No Type Safety: Observations are numpy arrays like obs[0][3]. What does index 3 mean? You have to read documentation or source code to find out.
Same-Process Execution: The environment runs in your training process. A bug in the environment can crash your entire training run.
Dependency Hell: Sharing environments means copying files and hoping the recipient has the same dependencies installed.
Python Lock-in: Want to use Rust or C++ for your agent? Too bad: Gym is Python-only.
"Works on My Machine": Environments behave differently on different systems due to floating-point differences, library versions, or OS quirks.
How OpenEnv Solves These Problems#
| Challenge | Traditional (Gym) | OpenEnv |
|---|---|---|
| Type Safety | Raw numpy arrays (obs[0][3]) | Typed observations with named fields |
| Isolation | Same process (can crash) | Docker container (isolated) |
| Deployment | "Works on my machine" | Same container everywhere |
| Sharing | Copy files, manage deps | openenv push (one command) |
| Language | Python only | Any language (HTTP/WebSocket) |
| Scaling | Single machine | Deploy to Kubernetes |
| Debugging | Cryptic numpy index errors | Clear, typed error messages |
Side-by-Side Code Comparison#
Let's compare the same workflow in both approaches:
Traditional Gym approach:
import gym

# Create environment - runs in your process
env = gym.make("CartPole-v1")

# Reset returns numpy arrays
obs, info = env.reset()
# obs = array([0.01, 0.02, -0.03, 0.01])
# What do these numbers mean? You have to check docs!

# Step returns a 5-tuple of loosely named values
action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
# No IDE autocomplete, easy to mix up return values
# If env crashes, your whole training crashes
# Sharing requires: pip install gym[atari], hope versions match
OpenEnv approach:
from openenv import AutoEnv, AutoAction

# Load environment and action classes via auto-discovery
OpenSpielEnv = AutoEnv.get_env_class("openspiel")
OpenSpielAction = AutoAction.from_env("openspiel")

# Connect to containerized environment
with OpenSpielEnv(base_url="http://localhost:8000") as env:
    # Reset returns typed StepResult
    result = env.reset()
    # result.observation.legal_actions - IDE autocompletes!
    # result.observation.info_state - you know exactly what this is

    # Step with typed action
    action = OpenSpielAction(action_id=1, game_name="catch")
    result = env.step(action)
    # result.reward, result.done - all typed

# Environment runs in Docker - isolated from your code
# Share via: openenv push my-env (one command!)
Part 1: Environment Setup#
Let's set up our environment. This works in Google Colab, locally, or anywhere Python runs.
import subprocess
import sys
from pathlib import Path

# Detect environment
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    print("=" * 70)
    print(" GOOGLE COLAB DETECTED - Installing OpenEnv...")
    print("=" * 70)

    # Install OpenEnv
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-q", "openenv-core"],
        capture_output=True,
    )

    print(" OpenEnv installed!")
    print("=" * 70)
else:
    print("=" * 70)
    print(" RUNNING LOCALLY")
    print("=" * 70)
    print()
    print("If you haven't installed OpenEnv yet:")
    print("  pip install openenv-core")
    print()

    # Add src to path for local development (when running from docs folder)
    src_path = Path.cwd().parent.parent.parent / "src"
    if src_path.exists():
        sys.path.insert(0, str(src_path))

    # Add envs to path
    envs_path = Path.cwd().parent.parent.parent / "envs"
    if envs_path.exists():
        sys.path.insert(0, str(envs_path.parent))

    print("=" * 70)

print()
print("Ready to explore OpenEnv!")
======================================================================
RUNNING LOCALLY
======================================================================
If you haven't installed OpenEnv yet:
pip install openenv-core
======================================================================
Ready to explore OpenEnv!
Part 2: Your First Environment - OpenSpiel#
What is OpenSpiel?#
OpenSpiel is an open-source collection of 70+ game environments developed by DeepMind for research in reinforcement learning, game theory, and multi-agent systems.
It includes:
Classic board games: Chess, Go, Backgammon, Tic-Tac-Toe
Card games: Poker variants, Blackjack, Bridge
Simple RL benchmarks: Catch, Cliff Walking, 2048
Multi-agent games: Hanabi, Kuhn Poker, Negotiation games
OpenSpiel is widely used in RL research because it provides consistent, well-tested implementations with support for both single-player and multi-player scenarios.
How OpenSpiel Connects to OpenEnv#
OpenEnv wraps OpenSpiel games as containerized, type-safe environments. This means:
You get all the benefits of OpenSpielโs game library
Plus type-safe Python clients with IDE autocomplete
Plus Docker isolation for reproducibility
Plus easy sharing via Hugging Face Hub
Currently, OpenEnv includes wrappers for 6 OpenSpiel games:
| Game | Players | Description |
|---|---|---|
| Catch | 1 | Catch a falling ball with a paddle |
| 2048 | 1 | Slide tiles to combine numbers |
| Blackjack | 1 | Classic card game against dealer |
| Cliff Walking | 1 | Navigate a grid while avoiding cliffs |
| Tic-Tac-Toe | 2 | Classic 3×3 grid game |
| Kuhn Poker | 2 | Simplified 3-card poker |
The Catch Game#
For this tutorial, we'll use Catch, one of the simplest RL environments. It's perfect for learning because:
Simple rules (easy to understand)
Fast episodes (10 steps each)
Clear success metric (did you catch the ball?)
Optimal strategy is learnable (move toward the ball)
Game Rules:
⬜ ⬜ 🔴 ⬜ ⬜   <- Ball starts at random column (row 0)
⬜ ⬜ ⬜ ⬜ ⬜
⬜ ⬜ ⬜ ⬜ ⬜   The ball falls down one row
⬜ ⬜ ⬜ ⬜ ⬜   each time step
⬜ ⬜ ⬜ ⬜ ⬜
⬜ ⬜ ⬜ ⬜ ⬜
⬜ ⬜ ⬜ ⬜ ⬜
⬜ ⬜ ⬜ ⬜ ⬜
⬜ ⬜ ⬜ ⬜ ⬜
⬜ ⬜ 🏓 ⬜ ⬜   <- Paddle at bottom (row 9)
Grid Size: 10 rows × 5 columns
Ball: Starts at a random column in row 0, falls one row per step
Paddle: Starts at center column, you control it
Episode Length: 10 steps (ball reaches bottom)
Actions:
| Action ID | Movement |
|---|---|
| 0 | Move LEFT |
| 1 | STAY (no move) |
| 2 | Move RIGHT |
Rewards:
+1.0 if the paddle is in the same column as the ball when it lands
0.0 if you miss the ball
Optimal Strategy: Track the ball's column and move toward it. A perfect policy wins 100% of the time since the paddle can always reach any column in 10 steps (the grid is only 5 columns wide).
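As a concrete illustration of that strategy, here is a minimal sketch (the helper name greedy_action is ours, and it assumes the one-hot info_state layout used by the local simulation later in this notebook: a row-major 10×5 grid with the paddle in the bottom row):

def greedy_action(info_state, grid_width=5):
    """Move the paddle toward the ball's column (0=LEFT, 1=STAY, 2=RIGHT)."""
    # The first active cell scanning from the top row is the ball;
    # the last one (in the bottom row) is the paddle.
    active = [i for i, v in enumerate(info_state) if v == 1.0]
    ball_col = active[0] % grid_width
    paddle_col = active[-1] % grid_width
    if paddle_col > ball_col:
        return 0  # move LEFT
    if paddle_col < ball_col:
        return 2  # move RIGHT
    return 1  # STAY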
Importing OpenEnv#
First, let's import the OpenSpiel environment client and models:
# Real imports from OpenEnv
try:
    # Direct imports from the openspiel_env package
    from openspiel_env.client import OpenSpielEnv
    from openspiel_env.models import OpenSpielAction, OpenSpielObservation, OpenSpielState

    OPENENV_AVAILABLE = True
    print("✓ OpenEnv imports successful!")
    print(f"  - OpenSpielEnv: {OpenSpielEnv}")
    print(f"  - OpenSpielAction: {OpenSpielAction}")
except ImportError as e:
    OPENENV_AVAILABLE = False
    print(f"✗ OpenEnv not fully installed: {e}")
    print("  Run: pip install openenv-core")
    print("  And: pip install -e ./envs/openspiel_env")
✓ OpenEnv imports successful!
- OpenSpielEnv: <class 'openspiel_env.client.OpenSpielEnv'>
- OpenSpielAction: <class 'openspiel_env.models.OpenSpielAction'>
Connecting to an Environment#
OpenEnv provides three ways to connect to environments:
From Hugging Face Hub (auto-downloads and starts container)
From Docker image (uses local image)
From URL (connects to running server)
Let's examine the actual methods available on the client class:
print("=" * 70)
print(" THREE WAYS TO CONNECT")
print("=" * 70)
print()
if OPENENV_AVAILABLE:
# Show actual method signatures from the class
import inspect
print("Connection methods available on OpenSpielEnv:")
print()
# Method 1: from_hub
if hasattr(OpenSpielEnv, "from_hub"):
sig = inspect.signature(OpenSpielEnv.from_hub)
print(f"1. OpenSpielEnv.from_hub{sig}")
print(" โ Auto-downloads from Hugging Face, starts container, connects")
print(" Example: env = OpenSpielEnv.from_hub('openenv/openspiel-env')")
print()
# Method 2: from_docker_image
if hasattr(OpenSpielEnv, "from_docker_image"):
sig = inspect.signature(OpenSpielEnv.from_docker_image)
print(f"2. OpenSpielEnv.from_docker_image{sig}")
print(" โ Starts container from local image, connects")
print(" Example: env = OpenSpielEnv.from_docker_image('openspiel-env:latest')")
print()
# Method 3: Direct connection
sig = inspect.signature(OpenSpielEnv.__init__)
print(f"3. OpenSpielEnv.__init__{sig}")
print(" โ Connects to already-running server")
print(" Example: env = OpenSpielEnv(base_url='http://localhost:8000')")
print()
print("-" * 70)
print("All three give you the same API - just different ways to start!")
else:
print("(OpenEnv not installed - showing expected methods)")
print()
print("1. OpenSpielEnv.from_hub(repo_id, *, use_docker=True, ...)")
print(" โ Auto-downloads from Hugging Face, starts container, connects")
print()
print("2. OpenSpielEnv.from_docker_image(image, provider=None, ...)")
print(" โ Starts container from local image, connects")
print()
print("3. OpenSpielEnv(base_url, connect_timeout_s=10.0, ...)")
print(" โ Connects to already-running server")
======================================================================
THREE WAYS TO CONNECT
======================================================================
Connection methods available on OpenSpielEnv:
2. OpenSpielEnv.from_docker_image(image: 'str', provider: "Optional['ContainerProvider']" = None, **kwargs: 'Any') -> 'EnvClientT'
→ Starts container from local image, connects
Example: env = OpenSpielEnv.from_docker_image('openspiel-env:latest')
3. OpenSpielEnv.__init__(self, base_url: 'str', connect_timeout_s: 'float' = 10.0, message_timeout_s: 'float' = 60.0, max_message_size_mb: 'float' = 100.0, provider: "Optional['ContainerProvider | RuntimeProvider']" = None, mode: 'Optional[str]' = None)
→ Connects to already-running server
Example: env = OpenSpielEnv(base_url='http://localhost:8000')
----------------------------------------------------------------------
All three give you the same API - just different ways to start!
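Condensed into code, the three styles look like this (commented out so nothing tries to download or start a container here; the repo and image names are the ones from the printed examples above):

# 1. From the Hugging Face Hub: downloads the image, starts a container, connects
# env = OpenSpielEnv.from_hub("openenv/openspiel-env")

# 2. From a local Docker image: starts a container, connects
# env = OpenSpielEnv.from_docker_image("openspiel-env:latest")

# 3. Direct connection to an already-running server
# env = OpenSpielEnv(base_url="http://localhost:8000")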
Part 3: Playing the Catch Game#
Now let's actually play! This code attempts to connect to a real server. If no server is running, we'll show what the interaction looks like.
import random

# Check if we can connect to a server
SERVER_URL = "http://localhost:8000"
SERVER_AVAILABLE = False

if OPENENV_AVAILABLE:
    try:
        # Try to connect using sync wrapper
        env = OpenSpielEnv(base_url=SERVER_URL)
        with env.sync() as client:
            # Quick test to verify connection
            pass
        SERVER_AVAILABLE = True
        print(f"✓ Connected to server at {SERVER_URL}")
    except Exception as e:
        print(f"✗ No server running at {SERVER_URL}")
        print(f"  Error: {e}")
        print()
        print("To start a server, run one of these:")
        print("  docker run -p 8000:8000 openenv/openspiel-env:latest")
        print("  # OR")
        print("  cd envs/openspiel_env && openenv serve")
✗ No server running at http://localhost:8000
Error: Failed to connect to ws://localhost:8000/ws: Multiple exceptions: [Errno 111] Connect call failed ('::1', 8000, 0, 0), [Errno 111] Connect call failed ('127.0.0.1', 8000)
To start a server, run one of these:
docker run -p 8000:8000 openenv/openspiel-env:latest
# OR
cd envs/openspiel_env && openenv serve
Playing with a Real Server#
When connected to a real server, here's how the interaction works:
if OPENENV_AVAILABLE and SERVER_AVAILABLE:
    print("=" * 70)
    print(" PLAYING CATCH - LIVE!")
    print("=" * 70)

    env = OpenSpielEnv(base_url=SERVER_URL)
    with env.sync() as client:
        # Reset to start a new episode
        result = client.reset()
        print("\nEpisode started!")
        print(f"  Observation type: {type(result.observation).__name__}")
        print(f"  Legal actions: {result.observation.legal_actions}")
        print(f"  Done: {result.done}")

        # Play until the episode ends
        step_count = 0
        while not result.done:
            # Choose a random action from legal actions
            action_id = random.choice(result.observation.legal_actions)
            action = OpenSpielAction(action_id=action_id, game_name="catch")

            # Take the action
            result = client.step(action)
            step_count += 1

            print(f"\nStep {step_count}:")
            print(f"  Action: {action_id} ({'LEFT' if action_id == 0 else 'STAY' if action_id == 1 else 'RIGHT'})")
            print(f"  Reward: {result.reward}")
            print(f"  Done: {result.done}")

        # Get final state
        state = client.state()
        print("\nEpisode complete!")
        print(f"  Total steps: {state.step_count}")
        print(f"  Final reward: {result.reward}")
        print(f"  Result: {'CAUGHT!' if result.reward > 0 else 'MISSED!'}")
else:
    # Run a local simulation to demonstrate the gameplay
    print("=" * 70)
    print(" PLAYING CATCH - LOCAL SIMULATION")
    print("=" * 70)
    print()
    print("No server running - demonstrating with local simulation.")
    print("(This shows exactly what happens when playing the real game)")
    print()

    # Simulate the Catch game locally
    GRID_HEIGHT = 10
    GRID_WIDTH = 5

    # Initialize game state
    ball_col = random.randint(0, GRID_WIDTH - 1)
    paddle_col = GRID_WIDTH // 2  # Start in center

    print("Game initialized:")
    print(f"  Ball starting column: {ball_col}")
    print(f"  Paddle starting column: {paddle_col}")
    print(f"  Grid size: {GRID_HEIGHT} rows × {GRID_WIDTH} columns")
    print()

    # Simulate episode
    for step in range(GRID_HEIGHT):
        # Create observation (matching OpenSpiel format)
        info_state = [0.0] * (GRID_HEIGHT * GRID_WIDTH)
        info_state[step * GRID_WIDTH + ball_col] = 1.0  # Ball position
        info_state[(GRID_HEIGHT - 1) * GRID_WIDTH + paddle_col] = 1.0  # Paddle

        legal_actions = [0, 1, 2]  # LEFT, STAY, RIGHT

        # Choose random action
        action_id = random.choice(legal_actions)
        action_name = {0: "LEFT", 1: "STAY", 2: "RIGHT"}[action_id]

        # Execute action
        old_paddle = paddle_col
        if action_id == 0:  # LEFT
            paddle_col = max(0, paddle_col - 1)
        elif action_id == 2:  # RIGHT
            paddle_col = min(GRID_WIDTH - 1, paddle_col + 1)

        print(f"Step {step + 1}: Ball at row {step}, col {ball_col} | "
              f"Paddle: {old_paddle}→{paddle_col} ({action_name})")

    # Determine result
    caught = (paddle_col == ball_col)
    reward = 1.0 if caught else 0.0

    print()
    print("Episode complete!")
    print(f"  Ball landed at column: {ball_col}")
    print(f"  Paddle final column: {paddle_col}")
    print(f"  Reward: {reward}")
    print(f"  Result: {'CAUGHT! 🎉' if caught else 'MISSED! 😢'}")
    print()
    print("-" * 70)
    print("This is exactly how the real OpenSpielEnv works,")
    print("just running locally instead of via WebSocket to a server.")
======================================================================
PLAYING CATCH - LOCAL SIMULATION
======================================================================
No server running - demonstrating with local simulation.
(This shows exactly what happens when playing the real game)
Game initialized:
Ball starting column: 0
Paddle starting column: 2
Grid size: 10 rows × 5 columns
Step 1: Ball at row 0, col 0 | Paddle: 2→3 (RIGHT)
Step 2: Ball at row 1, col 0 | Paddle: 3→4 (RIGHT)
Step 3: Ball at row 2, col 0 | Paddle: 4→3 (LEFT)
Step 4: Ball at row 3, col 0 | Paddle: 3→2 (LEFT)
Step 5: Ball at row 4, col 0 | Paddle: 2→1 (LEFT)
Step 6: Ball at row 5, col 0 | Paddle: 1→2 (RIGHT)
Step 7: Ball at row 6, col 0 | Paddle: 2→3 (RIGHT)
Step 8: Ball at row 7, col 0 | Paddle: 3→4 (RIGHT)
Step 9: Ball at row 8, col 0 | Paddle: 4→3 (LEFT)
Step 10: Ball at row 9, col 0 | Paddle: 3→4 (RIGHT)
Episode complete!
Ball landed at column: 0
Paddle final column: 4
Reward: 0.0
Result: MISSED! 😢
----------------------------------------------------------------------
This is exactly how the real OpenSpielEnv works,
just running locally instead of via WebSocket to a server.
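If you do have a server running, you can swap the random choice for the greedy policy sketched earlier. A minimal sketch reusing the same client API (greedy_action is the illustrative helper defined above, not part of OpenEnv):

if OPENENV_AVAILABLE and SERVER_AVAILABLE:
    env = OpenSpielEnv(base_url=SERVER_URL)
    with env.sync() as client:
        result = client.reset()
        while not result.done:
            # Track the ball's column instead of moving randomly
            action_id = greedy_action(result.observation.info_state)
            result = client.step(OpenSpielAction(action_id=action_id, game_name="catch"))
        print("CAUGHT!" if result.reward and result.reward > 0 else "MISSED!")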
Part 4: Understanding the Response Types#
OpenEnv uses type-safe models for all interactions. Let's create actual instances and examine their attributes:
print("=" * 70)
print(" OPENENV TYPE SYSTEM - ACTUAL INSTANCES")
print("=" * 70)
# Create example instances that match what you'd get from the Catch game
# These are the actual Pydantic models used by OpenEnv
# 1. OpenSpielObservation - what the agent receives after each step
print("\n๐ฆ OpenSpielObservation (returned in StepResult)")
print("-" * 50)
if OPENENV_AVAILABLE:
# OpenSpielObservation was already imported above via auto-discovery
# Create a sample observation like what Catch game returns
sample_observation = OpenSpielObservation(
info_state=[0.0, 0.0, 1.0, 0.0, 0.0] + [0.0] * 45, # Ball at col 2, row 0
legal_actions=[0, 1, 2], # LEFT, STAY, RIGHT
game_phase="playing",
current_player_id=0,
opponent_last_action=None,
)
print(f" info_state: {sample_observation.info_state[:10]}... (length: {len(sample_observation.info_state)})")
print(f" legal_actions: {sample_observation.legal_actions}")
print(f" game_phase: {sample_observation.game_phase!r}")
print(f" current_player_id: {sample_observation.current_player_id}")
print(f" opponent_last_action: {sample_observation.opponent_last_action}")
else:
# Create without imports to show the structure
from dataclasses import dataclass
from typing import List, Optional
@dataclass
class OpenSpielObservation:
info_state: List[float]
legal_actions: List[int]
game_phase: str = "playing"
current_player_id: int = 0
opponent_last_action: Optional[int] = None
sample_observation = OpenSpielObservation(
info_state=[0.0, 0.0, 1.0, 0.0, 0.0] + [0.0] * 45,
legal_actions=[0, 1, 2],
game_phase="playing",
current_player_id=0,
opponent_last_action=None,
)
print(f" info_state: {sample_observation.info_state[:10]}... (length: {len(sample_observation.info_state)})")
print(f" legal_actions: {sample_observation.legal_actions}")
print(f" game_phase: {sample_observation.game_phase!r}")
print(f" current_player_id: {sample_observation.current_player_id}")
print(f" opponent_last_action: {sample_observation.opponent_last_action}")
# 2. OpenSpielState - the environment's internal state
print("\n๐ OpenSpielState (returned by state())")
print("-" * 50)
if OPENENV_AVAILABLE:
# OpenSpielState was already imported above via auto-discovery
sample_state = OpenSpielState(
game_name="catch",
agent_player=0,
opponent_policy="random",
game_params={"rows": 10, "columns": 5},
num_players=1,
)
print(f" game_name: {sample_state.game_name!r}")
print(f" agent_player: {sample_state.agent_player}")
print(f" opponent_policy: {sample_state.opponent_policy!r}")
print(f" game_params: {sample_state.game_params}")
print(f" num_players: {sample_state.num_players}")
else:
@dataclass
class OpenSpielState:
game_name: str = "catch"
agent_player: int = 0
opponent_policy: str = "random"
game_params: dict = None
num_players: int = 1
sample_state = OpenSpielState(
game_name="catch",
agent_player=0,
opponent_policy="random",
game_params={"rows": 10, "columns": 5},
num_players=1,
)
print(f" game_name: {sample_state.game_name!r}")
print(f" agent_player: {sample_state.agent_player}")
print(f" opponent_policy: {sample_state.opponent_policy!r}")
print(f" game_params: {sample_state.game_params}")
print(f" num_players: {sample_state.num_players}")
# 3. OpenSpielAction - what you send to step()
print("\n๐ฎ OpenSpielAction (what you send to step())")
print("-" * 50)
if OPENENV_AVAILABLE:
# OpenSpielAction was already imported above via auto-discovery
sample_action = OpenSpielAction(
action_id=1, # STAY
game_name="catch",
game_params={"rows": 10, "columns": 5},
)
print(f" action_id: {sample_action.action_id} # 0=LEFT, 1=STAY, 2=RIGHT")
print(f" game_name: {sample_action.game_name!r}")
print(f" game_params: {sample_action.game_params}")
else:
@dataclass
class OpenSpielAction:
action_id: int
game_name: str = "catch"
game_params: dict = None
sample_action = OpenSpielAction(
action_id=1,
game_name="catch",
game_params={"rows": 10, "columns": 5},
)
print(f" action_id: {sample_action.action_id} # 0=LEFT, 1=STAY, 2=RIGHT")
print(f" game_name: {sample_action.game_name!r}")
print(f" game_params: {sample_action.game_params}")
print("\n" + "=" * 70)
print("These are the actual Pydantic/dataclass models used by OpenEnv.")
print("Type safety helps catch errors before they reach the environment!")
print("=" * 70)
======================================================================
OPENENV TYPE SYSTEM - ACTUAL INSTANCES
======================================================================
📦 OpenSpielObservation (returned in StepResult)
--------------------------------------------------
info_state: [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]... (length: 50)
legal_actions: [0, 1, 2]
game_phase: 'playing'
current_player_id: 0
opponent_last_action: None
📊 OpenSpielState (returned by state())
--------------------------------------------------
game_name: 'catch'
agent_player: 0
opponent_policy: 'random'
game_params: {'rows': 10, 'columns': 5}
num_players: 1
🎮 OpenSpielAction (what you send to step())
--------------------------------------------------
action_id: 1 # 0=LEFT, 1=STAY, 2=RIGHT
game_name: 'catch'
game_params: {'rows': 10, 'columns': 5}
======================================================================
These are the actual Pydantic/dataclass models used by OpenEnv.
Type safety helps catch errors before they reach the environment!
======================================================================
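One practical payoff of these typed models: a typo fails loudly at construction time instead of silently reading the wrong array index. A small sketch (assuming the models reject unknown or missing fields, as dataclasses and Pydantic models both do):

if OPENENV_AVAILABLE:
    try:
        # Misspelled field: 'actoin_id' instead of 'action_id'
        OpenSpielAction(actoin_id=1, game_name="catch")
    except (TypeError, ValueError) as e:
        print(f"Caught at construction time: {e}")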
Part 5: The Architecture#
OpenEnv uses a client-server architecture:
┌─────────────────────────────────────────────────────────────┐
│                          YOUR CODE                          │
│                                                             │
│   from openenv import AutoEnv                               │
│   OpenSpielEnv = AutoEnv.get_env_class("openspiel")         │
│   env = OpenSpielEnv(base_url="http://localhost:8000")      │
│   result = env.reset()       # Sends WebSocket message      │
│   result = env.step(action)  # Sends WebSocket message      │
│                                                             │
└────────────────────────┬────────────────────────────────────┘
                         │
                         │  WebSocket (persistent connection)
                         │
┌────────────────────────▼────────────────────────────────────┐
│                      DOCKER CONTAINER                       │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │        FastAPI Server + Environment Logic           │   │
│   │   - /ws (WebSocket endpoint)                        │   │
│   │   - Handles reset(), step(), state()                │   │
│   │   - Runs the actual game simulation                 │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│            Isolated • Reproducible • Scalable               │
└─────────────────────────────────────────────────────────────┘
Key insight: You never deal with HTTP/WebSocket directly. The OpenEnv client handles all the networking!
Summary#
In this notebook, you learned:
What OpenEnv Is:
A unified framework for RL environments
Containerized, type-safe, and shareable
Why Use OpenEnv:
Type safety with IDE autocomplete
Isolated Docker containers
Easy sharing via Hugging Face Hub
How to Use It:
env.reset() - Start a new episode
env.step(action) - Take an action
env.state() - Get current state
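Put together, one full episode against a running server looks like this (a recap sketch assembled from the snippets in this notebook):

import random

env = OpenSpielEnv(base_url="http://localhost:8000")
with env.sync() as client:
    result = client.reset()  # start a new episode
    while not result.done:
        action_id = random.choice(result.observation.legal_actions)
        result = client.step(OpenSpielAction(action_id=action_id, game_name="catch"))
    print(client.state().step_count, result.reward)  # final state and reward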
Next Steps#
Continue to Notebook 2: Using Environments
In the next notebook, you'll:
Explore all available OpenEnv environments
Create different AI policies
Run evaluations and compare performance
Work with multi-player games
Total running time of the script: (0 minutes 0.017 seconds)