# Grid World Environment
This directory contains the implementation of a simple 5x5 Grid World environment, designed to serve two primary purposes within the OpenEnv ecosystem:
1. **A basic Reinforcement Learning (RL) testbed:** Providing a straightforward, deterministic environment for quick prototyping and testing of RL agents.
2. **A detailed "How-To" guide for building new OpenEnv environments:** Demonstrating the architectural patterns, best practices, and core components required to integrate a custom environment into the OpenEnv framework.
## Environment Overview
The Grid World environment features:
- **Grid Size:** A 5x5 square grid.
- **Agent:** Starts at position `(0,0)` (top-left).
- **Goal:** Fixed at `(4,4)` (bottom-right).
- **Actions:** `UP`, `DOWN`, `LEFT`, `RIGHT`.
- **Dynamics:** Deterministic. An action always moves the agent one step in the chosen direction, unless it would move off the grid, in which case the agent stays in its current cell.
- **Reward Function (Sparse):**
  - `-0.1` for every step that does not reach the goal (a "living cost" or "step penalty").
  - `+1.0` for the step that reaches the goal at `(4,4)`. This also terminates the episode.
- **Episode Termination:** The episode ends when the agent reaches the goal.
### Example Gameplay
Imagine the agent trying to find the goal:
- **Reset:** Agent at `(0,0)` → `Obs(x=0, y=0, reward=0.0, done=False)`
- **Step DOWN:** Agent moves to `(1,0)` → `Obs(x=1, y=0, reward=-0.1, done=False)`
- **Step RIGHT:** Agent moves to `(1,1)` → `Obs(x=1, y=1, reward=-0.1, done=False)`
- …
- **Step RIGHT (from 4,3):** Agent moves to `(4,4)` → `Obs(x=4, y=4, reward=1.0, done=True)`
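
As a quick sanity check on the reward scheme, the return of a shortest-path episode can be worked out by hand. The sketch below assumes undiscounted returns and that the goal step yields `1.0` on its own (as in the trace above) rather than `1.0 - 0.1`. The helper name `shortest_path_return` is illustrative and not part of the environment.

```python
STEP_PENALTY = -0.1   # reward for every step that does not reach the goal
GOAL_REWARD = 1.0     # reward for the step that reaches the goal


def shortest_path_return(start=(0, 0), goal=(4, 4)):
    """Undiscounted return of an episode that follows a shortest path."""
    # With only UP/DOWN/LEFT/RIGHT moves, the shortest path length is the
    # Manhattan distance between start and goal.
    n_steps = abs(goal[0] - start[0]) + abs(goal[1] - start[1])
    if n_steps == 0:
        return 0.0  # already at the goal: no step is taken
    # Every step except the last pays the penalty; the last earns the goal reward.
    return (n_steps - 1) * STEP_PENALTY + GOAL_REWARD


# 8 steps from (0,0) to (4,4): 7 * (-0.1) + 1.0, up to floating-point rounding
print(round(shortest_path_return(), 2))
```

Under these assumptions each extra step costs a further `0.1`, so an episode keeps a positive return only if it finishes in 10 steps or fewer.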
## 🛠️ How to Build an OpenEnv Environment: A Detailed Guide
This section explains the structure and key design choices of the Grid World environment.
### 1. Scaffolding and Configuration
This environment supports multi-mode deployment. It uses pyproject.toml for modern local development (via uv) and a Dockerfile for containerized deployment.
#### Directory Structure
    envs/grid_world_env
    ├── server/
    │   ├── __init__.py                 # Package initializer for the server side
    │   ├── app.py                      # The FastAPI application entry point
    │   ├── Dockerfile                  # Container definition (uses requirements.txt)
    │   ├── grid_world_environment.py   # The core environment logic
    │   └── requirements.txt            # Dependencies for the Docker build
    ├── __init__.py                     # Package initializer for the client side
    ├── client.py                       # Python client for interacting with the env server
    ├── models.py                       # Pydantic data structures (Action, Observation)
    ├── openenv.yaml                    # OpenEnv metadata
    ├── pyproject.toml                  # Project configuration for local dev (uv)
    ├── uv.lock                         # Exact dependency versions (generated by uv)
    ├── README.md
    └── test_grid_world.sh              # Integration test script (Docker based)
# Core Components Explained
This section dives into the specific code files that power the **Grid World**, explaining how the **OpenEnv** framework connects the data, logic, and server layers.
---
## 1. `models.py`: *The Data Contract*
This file defines the strict "language" used for communication between the **Client (RL Agent)** and the **Server**. It relies on **Pydantic** to enforce type safety.
### Key Components
- **`MoveAction(str, Enum)`**
  Defines the allowed vocabulary for movement: `UP`, `DOWN`, `LEFT`, `RIGHT`.
  Using an `Enum` prevents *magic string* errors (e.g., sending `"up"` instead of `"UP"`).
- **`GridWorldAction(Action)`**
  Wraps the movement enum in a standardized **OpenEnv** action structure.
  When the server receives a request, **FastAPI** automatically validates that the incoming JSON payload matches this schema.
- **`GridWorldObservation(Observation)`**
  Defines exactly what the agent observes from the environment:
  - `x`, `y`: Integer coordinates representing the agent's position
  - `reward`: Floating-point value (e.g., `-0.1`, `1.0`)
  - `done`: Boolean flag indicating episode termination
> **Note:**
> By inheriting from `pydantic.BaseModel` (via `Observation`), these classes automatically handle JSON serialization and deserialization.
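
The data contract can be illustrated without the framework. The following is a stdlib-only stand-in, not the actual `models.py`: it uses `dataclasses` in place of Pydantic and of OpenEnv's `Action`/`Observation` base classes. The field name `action`, the enum values, and the helper `parse_action` are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class MoveAction(str, Enum):
    """Allowed moves. Values are assumed to equal the member names."""
    UP = "UP"
    DOWN = "DOWN"
    LEFT = "LEFT"
    RIGHT = "RIGHT"


@dataclass
class GridWorldAction:
    # Hypothetical field name; the real class derives from OpenEnv's Action.
    action: MoveAction


@dataclass
class GridWorldObservation:
    # Fields as listed above; the real class derives from OpenEnv's Observation.
    x: int
    y: int
    reward: float = 0.0
    done: bool = False


def parse_action(payload: dict) -> GridWorldAction:
    """Validate a decoded JSON payload; raises ValueError on an unknown move."""
    return GridWorldAction(action=MoveAction(payload["action"]))


obs = GridWorldObservation(x=1, y=0, reward=-0.1, done=False)
wire = json.dumps(asdict(obs))                    # serialize for the HTTP response
restored = GridWorldObservation(**json.loads(wire))
print(restored == obs)                            # True: the round trip is lossless
```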
---
## 2. `server/grid_world_environment.py`: *The Logic*
This file contains the "physics engine" and rules of the environment. It translates abstract actions into concrete state transitions.
### Core Responsibilities
- **Inheritance**
  `GridWorldEnvironment` inherits from `openenv.core.env_server.Environment`, providing the standardized interface required by the OpenEnv server.
- **`__init__` Method**
  - Sets static configuration:
    - Grid size: `5 × 5`
    - Goal location: `[4, 4]`
  - Initializes the persistent state container.
- **State Persistence (`self._state`)**
  - HTTP requests are stateless, so the environment instance must remember the agent's position between calls.
  - `self._state` (an instance of `openenv...State`) tracks:
    - `step_count`
    - `episode_id`
    - `agent_x`, `agent_y`
- **`step()` Logic**
  - **Input:** Receives a validated `GridWorldAction`
  - **Dynamics:** Applies movement rules and clamps coordinates using `max(0, min(..., grid_size - 1))` to prevent the agent from leaving the grid
  - **Feedback:** Computes a sparse reward:
    - `1.0` if `(x, y) == goal`
    - `-0.1` otherwise
  - Returns a `GridWorldObservation`
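
The responsibilities above can be sketched as a plain Python class. This is a framework-free approximation, not the actual `grid_world_environment.py`: the real class derives from OpenEnv's `Environment` and returns `GridWorldObservation` objects, whereas this sketch returns `(x, y, reward, done)` tuples and keeps its state in a dict. The axis convention (`DOWN` increases `x`, `RIGHT` increases `y`) is inferred from the example gameplay, and `SketchGridWorld` and `MOVES` are illustrative names.

```python
import uuid

# (dx, dy) per action; the axis convention is inferred from the gameplay trace.
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}


class SketchGridWorld:
    def __init__(self, grid_size=5, goal=(4, 4)):
        self.grid_size = grid_size
        self.goal = goal
        self._state = {}          # persists between reset()/step() calls
        self.reset()

    def reset(self):
        """Start a new episode at (0, 0) and return (x, y, reward, done)."""
        self._state = {"episode_id": str(uuid.uuid4()), "step_count": 0,
                       "agent_x": 0, "agent_y": 0}
        return (0, 0, 0.0, False)

    def step(self, action):
        """Apply one move, clamped to the grid, and score the resulting cell."""
        dx, dy = MOVES[action]    # KeyError on an unknown action
        hi = self.grid_size - 1
        x = max(0, min(self._state["agent_x"] + dx, hi))
        y = max(0, min(self._state["agent_y"] + dy, hi))
        self._state.update(agent_x=x, agent_y=y,
                           step_count=self._state["step_count"] + 1)
        done = (x, y) == self.goal
        return (x, y, 1.0 if done else -0.1, done)


env = SketchGridWorld()
print(env.step("UP"))      # (0, 0, -0.1, False): blocked by the top edge
print(env.step("DOWN"))    # (1, 0, -0.1, False)
```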
---
## 3. `server/app.py`: *The API*
This file is the "glue" that turns the environment logic into a running web service.
### Key Elements
- **`create_app` Utility**
  Instead of manually defining FastAPI routes, this file uses `openenv.core.env_server.create_app`. It:
  - Binds the environment logic (`GridWorldEnvironment`)
  - Connects the data models (`GridWorldAction`, `GridWorldObservation`)
  - Automatically generates standard endpoints:
    - `/reset`
    - `/step`
    - `/state`
    - `/health`
- **`main()` Entry Point**
  Defines a `main()` function that calls `uvicorn.run`.
  This is what enables the `server = "..."` script in `pyproject.toml` to start the server.
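
A `main()` of the kind described above might look as follows. This is a sketch, not the contents of `app.py`: the import string `grid_world_env.server.app:app` assumes the FastAPI instance is stored in a module-level variable named `app`, and the host and port defaults are assumptions.

```python
def main(host: str = "0.0.0.0", port: int = 8000) -> None:
    """Console-script entry point: start the ASGI server."""
    # Imported lazily so this sketch can be loaded without uvicorn installed.
    import uvicorn

    # Passing an import string (rather than the app object) lets uvicorn
    # locate and load the application itself. The ":app" attribute name is
    # an assumption about how app.py names its FastAPI instance.
    uvicorn.run("grid_world_env.server.app:app", host=host, port=port)
```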
---
## 4. `server/Dockerfile`: *The Container*
This file defines how the environment is packaged for production or remote deployment.
### Container Setup
- **Base Image**
  Builds on `envtorch-base`, ensuring compatible system libraries.
- **Dependencies**
  Copies and installs `server/requirements.txt`.
  This keeps the Docker image lightweight and focused only on server-side requirements.
- **Execution**
  - Exposes port `8000`
  - Defines the `CMD` to launch `uvicorn`

The container is ready to accept HTTP requests immediately upon startup.
---
## 5. `pyproject.toml`: *Local Development*
This file enables a modern local development workflow using **uv**.
### Key Sections
- **Project Metadata**
  - Package name: `grid_world_env`
  - Version information
- **Dependencies**
  Lists libraries required for local execution:
  - `fastapi`
  - `uvicorn`
  - `gymnasium`
  - `numpy`
- **`[project.scripts]`**
  Defines a shortcut command:
  ```toml
  [project.scripts]
  server = "grid_world_env.server.app:main"
  ```
# Getting Started
You can run the environment using **uv** (fastest for development) or **Docker** (best for deployment).
---
## Option 1: Local Development with `uv` (Recommended)
Because this project is configured with `pyproject.toml`, `uv` can set up the environment and start the server with a single command.
### Steps
1. **Navigate to the environment folder and start the server**
   ```bash
   cd envs/grid_world_env
   uv run server
   ```
2. **Visit the live Swagger UI in your browser**
   ```text
   http://localhost:8000/docs
   ```
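
With the server running, the generated endpoints can also be exercised from Python. The helper below is a hedged sketch using only the standard library. The endpoint paths come from the list in the `app.py` section, but the HTTP methods (`GET /health`, `POST /reset`) and the JSON response shape are assumptions; the Swagger UI shows the actual schemas. `call_endpoint` is an illustrative name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def call_endpoint(path, payload=None, base_url=BASE_URL,
                  opener=urllib.request.urlopen):
    """Send GET (no payload) or POST (JSON payload) and decode the JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET" if payload is None else "POST",
    )
    with opener(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server from step 1 running (methods are assumptions):
#   call_endpoint("/health")
#   call_endpoint("/reset", payload={})
```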
---
## Option 2: Docker Integration Test
To build the full container and run the integration test suite (simulating a production deployment):
### Steps
1. **Navigate to the root OpenEnv directory**
2. **Run the test script**
   ```bash
   ./envs/grid_world_env/test_grid_world.sh
   ```
   The script:
   - Builds the Docker image
   - Starts the container
   - Runs a series of curl requests to verify functionality
   - Cleans up containers and images after completion
## Conclusion
This Grid World environment serves as the reference implementation for building environments in OpenEnv. By following this pattern, custom environments remain:
- **Portable** across local and containerized setups
- **Strictly typed** through Pydantic models
- **Deployment-ready** for development, testing, and production workflows
---