OpenSpiel Environment¶
Integration of OpenSpiel games with the OpenEnv framework. OpenSpiel (https://github.com/google-deepmind/open_spiel) is DeepMind's collection of 70+ game environments for RL research.
Supported Games¶
This environment supports 6 games across different categories:
Single-Player Games (No Opponent)¶
- Catch - Move horizontally to catch a falling ball
- Cliff Walking - Navigate grid without falling off cliff (Sutton & Barto benchmark)
- 2048 - Classic tile-merging puzzle game
- Blackjack - Simplified blackjack (HIT/STAND only)
Multi-Player Games (with Bot Opponent)¶
- Tic-Tac-Toe - Classic 3x3 game
- Kuhn Poker - 2-player simplified poker (game theory benchmark)
Architecture¶
┌──────────────────────────────────┐
│ RL Training Code (Client)        │
│   OpenSpielEnv.step(action)      │
└──────────────┬───────────────────┘
               │ HTTP
┌──────────────▼───────────────────┐
│ FastAPI Server (Docker)          │
│   OpenSpielEnvironment           │
│    ├─ Wraps rl_environment.Env   │
│    ├─ Agent controls player 0    │
│    └─ Opponent: Random/Fixed     │
└──────────────────────────────────┘
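In normal use the client never speaks HTTP directly, but seeing the exchange can help when debugging the server. A rough sketch with raw requests, assuming JSON /reset and /step routes; the exact routes and payload shapes are defined by the OpenEnv HTTP server and may differ:
import requests

BASE_URL = "http://localhost:8000"

# Hypothetical sketch only: route names and payload shapes are assumptions.
# In real code, use the OpenSpielEnv client instead of raw HTTP.
reset_resp = requests.post(f"{BASE_URL}/reset").json()
print(reset_resp)

step_resp = requests.post(f"{BASE_URL}/step", json={"action_id": 0}).json()
print(step_resp)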
Installation & Usage¶
Option 1: Local Development (without Docker)¶
Requirements:
- OpenSpiel must be installed (see https://github.com/google-deepmind/open_spiel)
- Python 3.11+
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction
# Start local server manually
# python -m envs.openspiel_env.server.app
# Connect to local server
env = OpenSpielEnv(base_url="http://localhost:8000")
# Reset environment
result = env.reset()
print(f"Initial state: {result.observation.info_state}")
print(f"Legal actions: {result.observation.legal_actions}")
# Take actions
for _ in range(10):
    action_id = result.observation.legal_actions[0]  # Choose first legal action
    result = env.step(OpenSpielAction(action_id=action_id))
    print(f"Reward: {result.reward}, Done: {result.done}")
    if result.done:
        break
# Cleanup
env.close()
Option 2: Docker (Recommended)¶
Build Docker image:
cd OpenEnv
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
Run specific games:
# Catch (default)
docker run -p 8000:8000 openspiel-env:latest
# Tic-Tac-Toe with random opponent
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
# Kuhn Poker
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
# 2048
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
Use with from_docker_image():
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction
# Automatically starts container
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
result = env.reset()
result = env.step(OpenSpielAction(action_id=0))
env.close() # Stops container
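Putting the calls together, a complete random-policy episode looks roughly like this (only the client API shown above is used; the random policy is purely illustrative):
import random

from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Start the container, play one episode with uniform-random actions, then stop it
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
try:
    result = env.reset()
    total_reward = 0.0
    while not result.done:
        # Pick uniformly among the legal actions reported by the server
        action_id = random.choice(result.observation.legal_actions)
        result = env.step(OpenSpielAction(action_id=action_id))
        total_reward += result.reward or 0.0
    print(f"Episode finished, total reward: {total_reward}")
finally:
    env.close()  # Stops the container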
Game-Specific Information¶
1. Catch¶
- Type: Single-player
- Action Space: 3 actions (left, stay, right)
- Observation: 5x5 grid flattened (25 dimensions)
- Reward: +1 for catching ball, 0 otherwise
- Episode Length: ~10 steps
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
# Or set OPENSPIEL_GAME=catch
2. Tic-Tac-Toe¶
- Type: 2-player turn-based, perfect information
- Players: Agent (X) vs Random Bot (O)
- Action Space: 9 positions
- Observation: 27 dimensions (3x3 board + game state)
- Reward: +1 win, -1 loss, 0 draw/mid-game
# Set environment variable or run directly
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
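Because the bot replies inside the same step() call, the observation's opponent_last_action field shows the move it made. A sketch against a running tic_tac_toe server (random agent, for illustration only):
import random

from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Assumes a tic_tac_toe container is already listening on port 8000
env = OpenSpielEnv(base_url="http://localhost:8000")
result = env.reset()

while not result.done:
    # Agent (X) picks a random empty square; the bot (O) answers within the same step
    action_id = random.choice(result.observation.legal_actions)
    result = env.step(OpenSpielAction(action_id=action_id, game_name="tic_tac_toe"))
    print(f"Agent played {action_id}, "
          f"opponent replied {result.observation.opponent_last_action}")

print(f"Final reward: {result.reward}")  # +1 win, -1 loss, 0 draw
env.close()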
3. Kuhn Poker¶
- Type: 2-player turn-based, imperfect information
- Players: Agent vs Random Bot
- Action Space: 2 actions (pass/fold, bet/call)
- Observation: 6 dimensions (card + betting history)
- Reward: Pot winnings (typically -1, 0, +1, +2)
- Notes: A canonical benchmark for imperfect-information RL
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
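A minimal sketch of an always-bet policy, assuming OpenSpiel's usual kuhn_poker action ids (0 = pass/fold, 1 = bet/call):
import random

from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Assumes a kuhn_poker container is listening on port 8000
env = OpenSpielEnv(base_url="http://localhost:8000")
result = env.reset()

while not result.done:
    legal = result.observation.legal_actions
    action_id = 1 if 1 in legal else random.choice(legal)  # bet/call whenever legal
    result = env.step(OpenSpielAction(action_id=action_id, game_name="kuhn_poker"))

print(f"Pot winnings: {result.reward}")  # typically -1, 0, +1, or +2
env.close()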
4. Cliff Walking¶
- Type: Single-player grid world
- Action Space: 4 actions (up, down, left, right)
- Observation: Position encoding
- Reward: -1 per step, -100 for falling off cliff
- Notes: Classic RL benchmark from Sutton & Barto
docker run -p 8000:8000 -e OPENSPIEL_GAME=cliff_walking openspiel-env:latest
5. 2048¶
- Type: Single-player puzzle
- Action Space: 4 actions (up, down, left, right)
- Observation: 4x4 grid with tile values
- Reward: Points from merging tiles
- Notes: Stochastic tile spawning
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
6. Blackjack¶
- Type: Single-player vs dealer
- Action Space: 2 actions (HIT, STAND)
- Observation: Player hand + dealer's visible card
- Reward: +1 win, -1 loss, 0 draw
- Notes: Simplified version, no double/split
docker run -p 8000:8000 -e OPENSPIEL_GAME=blackjack openspiel-env:latest
Configuration¶
Environment Variables¶
- OPENSPIEL_GAME: Game name (default: "catch")
- OPENSPIEL_AGENT_PLAYER: Player ID for the agent (default: 0)
- OPENSPIEL_OPPONENT_POLICY: Opponent policy for multi-player games
  - random: Uniform random (default)
  - first: Always picks the first legal action
  - last: Always picks the last legal action
Example: Tic-Tac-Toe with Fixed Opponent¶
docker run -p 8000:8000 \
-e OPENSPIEL_GAME=tic_tac_toe \
-e OPENSPIEL_OPPONENT_POLICY=first \
openspiel-env:latest
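On the server side, these variables would be read at startup; a hypothetical sketch of that lookup (names and defaults mirror the list above, the actual server implementation may differ):
import os

# Hypothetical sketch of the server's configuration lookup
game_name = os.environ.get("OPENSPIEL_GAME", "catch")
agent_player = int(os.environ.get("OPENSPIEL_AGENT_PLAYER", "0"))
opponent_policy = os.environ.get("OPENSPIEL_OPPONENT_POLICY", "random")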
API Reference¶
OpenSpielAction¶
@dataclass
class OpenSpielAction(Action):
    action_id: int              # Action to take
    game_name: str = "catch"    # Game name
    game_params: Dict[str, Any] = field(default_factory=dict)  # Optional game parameters
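For example, a single move is constructed like this (action id 4 is the centre square under OpenSpiel's row-major tic_tac_toe numbering):
from envs.openspiel_env import OpenSpielAction

# Centre square in tic_tac_toe; id 4 assumes OpenSpiel's row-major cell numbering
action = OpenSpielAction(action_id=4, game_name="tic_tac_toe")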
OpenSpielObservation¶
@dataclass
class OpenSpielObservation(Observation):
    info_state: List[float]              # Agent's information state
    legal_actions: List[int]             # Legal action IDs
    game_phase: str                      # "initial", "playing", "terminal"
    current_player_id: int               # Current player (-1 for simultaneous)
    opponent_last_action: Optional[int]  # Last opponent action (if available)
    done: bool                           # Episode finished
    reward: Optional[float]              # Reward for last action
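A small illustrative helper for summarising an observation (assuming OpenSpielObservation is importable from envs.openspiel_env like the other types):
from envs.openspiel_env import OpenSpielObservation

def describe(obs: OpenSpielObservation) -> str:
    """Summarise an observation returned by reset()/step() (illustrative only)."""
    if obs.done:
        return f"terminal, last reward {obs.reward}"
    return (f"phase={obs.game_phase}, player={obs.current_player_id}, "
            f"{len(obs.legal_actions)} legal actions")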
OpenSpielState¶
@dataclass
class OpenSpielState(State):
    episode_id: str       # Unique episode ID
    step_count: int       # Number of steps
    game_name: str        # Game name
    agent_player: int     # Agent's player ID
    opponent_policy: str  # Opponent policy name
    num_players: int      # Total players
Testing¶
Automated Testing (All 6 Games)¶
Quick test of all games in Docker:
./test_docker_all_games.sh
This automated script will:
- Build and run Docker containers for each game
- Test the reset, step, and state APIs
- Verify episode completion
- Report pass/fail for all 6 games
Expected output:
========================================
OpenSpiel Docker Integration Test
========================================
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Testing: catch
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🐳 Starting Docker container...
⏳ Waiting for server to be ready...
✓ Server ready (2s)
🎮 Running Python client test...
✓ PASSED - Episode completed successfully
[... tests all 6 games ...]
========================================
Test Summary
========================================
✓ catch
✓ tic_tac_toe
✓ kuhn_poker
✓ cliff_walking
✓ 2048
✓ blackjack
Total: 6 passed, 0 failed out of 6 games
========================================
All tests PASSED! 🎉
========================================
Manual Testing¶
# Local (requires OpenSpiel installed)
python -m pytest src/envs/openspiel_env/
# Docker build
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
# Run specific game
docker run -p 8000:8000 openspiel-env:latest
# Test from another terminal
python3 examples/openspiel_simple.py
Development¶
Adding New Games¶
To add support for more OpenSpiel games:
- Verify the game works with rl_environment.Environment (see the sketch after this list)
- Test with different opponent policies if multi-player
- Document game-specific configuration
- Add an example script
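A quick way to verify a candidate game before touching the server is to drive it directly through OpenSpiel's rl_environment ("connect_four" below is only an example game name):
from open_spiel.python import rl_environment

# Sanity-check a candidate game directly against OpenSpiel before wiring it
# into the OpenEnv server
env = rl_environment.Environment("connect_four")
time_step = env.reset()
while not time_step.last():
    player = time_step.observations["current_player"]
    legal = time_step.observations["legal_actions"][player]
    time_step = env.step([legal[0]])  # always play the first legal action
print("Final rewards:", time_step.rewards)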
Limitations¶
- Simultaneous-move games: Only agent_player=0 supported
- Multi-agent training: Single agent only (no self-play yet)
- Opponent policies: Random and fixed only (no MCTS yet)
- Build time: Docker image takes ~5-10 minutes to build (compiles C++)
Future Work¶
- MCTS opponent policies
- Self-play support (multiple agents)
- More games (Chess, Go, Texas Hold'em poker)
- Faster build with pre-built OpenSpiel base image
- Game-specific reward shaping options