Grid World Environment

This directory contains the implementation of a simple 5x5 Grid World environment, designed to serve two primary purposes within the OpenEnv ecosystem:

  1. A basic Reinforcement Learning (RL) testbed: Providing a straightforward, deterministic environment for quick prototyping and testing of RL agents.

  2. A detailed “How-To” guide for building new OpenEnv environments: Demonstrating the architectural patterns, best practices, and core components required to integrate a custom environment into the OpenEnv framework.


🚀 Environment Overview

The Grid World environment features:

  • Grid Size: A 5x5 square grid.

  • Agent: Starts at position (0,0) (top-left).

  • Goal: Fixed at (4,4) (bottom-right).

  • Actions: UP, DOWN, LEFT, RIGHT.

  • Dynamics: Deterministic. An action always moves the agent one step in the chosen direction, unless it would move off the grid, in which case the agent stays in its current cell.

  • Reward Function (Sparse):

    • -0.1 for every step that does not reach the goal (a “living cost” or “step penalty”).

    • +1.0 for reaching the goal at (4,4). This also terminates the episode.

  • Episode Termination: The episode ends when the agent reaches the goal.

Example Gameplay

Imagine the agent trying to find the goal. In the observations below, x increases with each DOWN move and y increases with each RIGHT move:

  1. Reset: Agent at (0,0) → Obs(x=0, y=0, reward=0.0, done=False)

  2. Step DOWN: Agent moves to (1,0) → Obs(x=1, y=0, reward=-0.1, done=False)

  3. Step RIGHT: Agent moves to (1,1) → Obs(x=1, y=1, reward=-0.1, done=False)

  4. …

  5. Step RIGHT (from 4,3): Agent moves to (4,4) → Obs(x=4, y=4, reward=1.0, done=True)
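
The total reward of an episode follows directly from the reward rules above. The snippet below is a minimal sketch, assuming rewards are simply summed without discounting; the function name `episode_return` is hypothetical and not part of the environment.

```python
STEP_PENALTY = -0.1   # reward for every step that does not reach the goal
GOAL_REWARD = 1.0     # reward for the step that reaches the goal


def episode_return(num_steps: int) -> float:
    """Undiscounted return of an episode that reaches the goal on its last step."""
    if num_steps < 1:
        raise ValueError("reaching the goal takes at least one step")
    return (num_steps - 1) * STEP_PENALTY + GOAL_REWARD


# The shortest path from (0,0) to (4,4) takes 4 + 4 = 8 moves:
# 7 step penalties plus one goal reward.
print(round(episode_return(8), 2))  # 0.3
```

Because every extra step costs 0.1, an agent that wanders lowers its return, which is what pushes a learner toward the shortest path.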


🛠️ How to Build an OpenEnv Environment: A Detailed Guide

This section explains the structure and key design choices of the Grid World environment.

1. Scaffolding and Configuration

This environment supports multi-mode deployment. It uses pyproject.toml for modern local development (via uv) and a Dockerfile for containerized deployment.

Directory Structure

envs/grid_world_env
├── server/
│   ├── __init__.py           # Package initializer for the server side
│   ├── app.py                # The FastAPI application entry point
│   ├── Dockerfile            # Container definition (uses requirements.txt)
│   ├── grid_world_environment.py # The core environment logic
│   └── requirements.txt      # Dependencies for the Docker build
├── __init__.py               # Package initializer for the client side
├── client.py                 # Python client for interacting with the env server
├── models.py                 # Pydantic data structures (Action, Observation)
├── openenv.yaml              # OpenEnv metadata
├── pyproject.toml            # Project configuration for local dev (uv)
├── uv.lock                   # Exact dependency versions (Generated by uv)
├── README.md
└── test_grid_world.sh        # Integration test script (Docker based)


# Core Components Explained

This section dives into the specific code files that power the **Grid World**, explaining how the **OpenEnv** framework connects the data, logic, and server layers.

---

## 1. `models.py`: *The Data Contract*

This file defines the strict “language” used for communication between the **Client (RL Agent)** and the **Server**. It relies on **Pydantic** to enforce type safety.

### Key Components

- **`MoveAction(str, Enum)`**  
  Defines the allowed vocabulary for movement: `UP`, `DOWN`, `LEFT`, `RIGHT`.  
  Using an `Enum` prevents *magic string* errors (e.g., sending `"up"` instead of `"UP"`).

- **`GridWorldAction(Action)`**  
  Wraps the movement enum in a standardized **OpenEnv** action structure.  
  When the server receives a request, **FastAPI** automatically validates that the incoming JSON payload matches this schema.

- **`GridWorldObservation(Observation)`**  
  Defines exactly what the agent observes from the environment:
  - `x`, `y`: Integer coordinates representing the agent’s position
  - `reward`: Floating-point value (e.g., `-0.1`, `1.0`)
  - `done`: Boolean flag indicating episode termination

> **Note:**  
> By inheriting from `pydantic.BaseModel` (via `Action` and `Observation`), these classes automatically handle JSON serialization and deserialization.
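
The enum behavior described above can be tried with the standard library alone. This is a minimal sketch, not the contents of `models.py`: the member values (`"UP"`, `"DOWN"`, ...) are assumed from the description, and `parse_move` is a hypothetical helper added only for illustration.

```python
from enum import Enum


class MoveAction(str, Enum):
    """Allowed moves; mixing in `str` makes each member compare equal to its string value."""
    UP = "UP"
    DOWN = "DOWN"
    LEFT = "LEFT"
    RIGHT = "RIGHT"


def parse_move(raw: str) -> MoveAction:
    """Hypothetical helper: look a move up by value, raising ValueError if it is unknown."""
    return MoveAction(raw)


print(parse_move("UP") is MoveAction.UP)  # True
try:
    parse_move("up")  # lookup by value is case-sensitive
except ValueError as exc:
    print(f"rejected: {exc}")
```

A Pydantic model with a field of this enum type applies the same membership check during validation, which is how a malformed action gets rejected before it reaches the environment logic.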

---

## 2. `server/grid_world_environment.py`: *The Logic*

This file contains the “physics engine” and rules of the environment. It translates abstract actions into concrete state transitions.

### Core Responsibilities

- **Inheritance**  
  `GridWorldEnvironment` inherits from `openenv.core.env_server.Environment`, providing the standardized interface required by the OpenEnv server.

- **`__init__` Method**  
  - Sets static configuration:
    - Grid size: `5 × 5`
    - Goal location: `[4, 4]`
  - Initializes the persistent state container.

- **State Persistence (`self._state`)**  
  - HTTP requests are stateless, so the environment instance must remember the agent’s position between calls.
  - `self._state` (an instance of `openenv...State`) tracks:
    - `step_count`
    - `episode_id`
    - `agent_x`, `agent_y`

- **`step()` Logic**
  - **Input:** Receives a validated `GridWorldAction`
  - **Dynamics:** Applies movement rules and clamps coordinates using  
    `max(0, min(..., grid_size - 1))` to prevent the agent from leaving the grid
  - **Feedback:** Computes a sparse reward:
    - `1.0` if `(x, y) == goal`
    - `-0.1` otherwise  
  - Returns a `GridWorldObservation`
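
The clamping and reward rules can be sketched without any OpenEnv imports. The code below is an illustrative stand-in, not the actual `GridWorldEnvironment`: `GridState`, `MOVES`, and the free function `step` are hypothetical names, and the direction offsets are inferred from the example gameplay, where DOWN increases `x` and RIGHT increases `y`.

```python
from dataclasses import dataclass

GRID_SIZE = 5
GOAL = (4, 4)

# (dx, dy) per action, assumed from the example gameplay.
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}


@dataclass
class GridState:
    """Stand-in for the persistent state container."""
    agent_x: int = 0
    agent_y: int = 0
    step_count: int = 0


def step(state: GridState, action: str) -> tuple[int, int, float, bool]:
    """Apply one move to `state` in place and return (x, y, reward, done)."""
    dx, dy = MOVES[action]
    # Clamp each coordinate to [0, GRID_SIZE - 1] so the agent stays on the grid.
    state.agent_x = max(0, min(state.agent_x + dx, GRID_SIZE - 1))
    state.agent_y = max(0, min(state.agent_y + dy, GRID_SIZE - 1))
    state.step_count += 1
    done = (state.agent_x, state.agent_y) == GOAL
    reward = 1.0 if done else -0.1
    return state.agent_x, state.agent_y, reward, done


state = GridState()
print(step(state, "UP"))    # (0, 0, -0.1, False): clamped at the top edge
print(step(state, "DOWN"))  # (1, 0, -0.1, False)
```

Note that a move blocked by the wall still counts as a step and still incurs the `-0.1` penalty.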

---

## 3. `server/app.py`: *The API*

This file is the “glue” that turns the environment logic into a running web service.

### Key Elements

- **`create_app` Utility**  
  Instead of manually defining FastAPI routes, this file uses  
  `openenv.core.env_server.create_app`.

  It:
  - Binds the environment logic (`GridWorldEnvironment`)
  - Connects the data models (`GridWorldAction`, `GridWorldObservation`)
  - Automatically generates standard endpoints:
    - `/reset`
    - `/step`
    - `/state`
    - `/health`

- **`main()` Entry Point**  
  Defines a `main()` function that calls `uvicorn.run`.  
  This is what enables the `server = "..."` script in `pyproject.toml` to start the server.
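
Once the server is running, the generated endpoints can be called from any HTTP client. The sketch below only builds requests with the standard library and does not send them. The base URL, the helper name `build_post`, the use of POST, and especially the example payload are assumptions; check the schema shown at `/docs` for the real request bodies.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # assumed address of a locally running server


def build_post(path: str, payload: dict) -> request.Request:
    """Hypothetical helper: build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        f"{BASE_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


reset_req = build_post("/reset", {})
# The payload shape below is a guess for illustration, not the documented schema.
step_req = build_post("/step", {"action": {"action": "RIGHT"}})

print(step_req.get_method(), step_req.full_url)  # POST http://localhost:8000/step
# To send a request against a live server: request.urlopen(step_req)
```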

---

## 4. `server/Dockerfile`: *The Container*

This file defines how the environment is packaged for production or remote deployment.

### Container Setup

- **Base Image**  
  Builds on `envtorch-base`, ensuring compatible system libraries.

- **Dependencies**  
  Copies and installs `server/requirements.txt`.  
  This keeps the Docker image lightweight and focused only on server-side requirements.

- **Execution**  
  - Exposes port `8000`
  - Defines the `CMD` to launch `uvicorn`  
  The container is ready to accept HTTP requests immediately upon startup.

---

## 5. `pyproject.toml`: *Local Development*

This file enables a modern local development workflow using **uv**.

### Key Sections

- **Project Metadata**
  - Package name: `grid_world_env`
  - Version information

- **Dependencies**
  Lists libraries required for local execution:
  - `fastapi`
  - `uvicorn`
  - `gymnasium`
  - `numpy`

- **`[project.scripts]`**
  Defines a shortcut command:

  ```toml
  server = "grid_world_env.server.app:main"
  ```


# 🚀 Getting Started

You can run the environment using **uv** (fastest for development) or **Docker** (best for deployment).

---

## Option 1: Local Development with `uv` (Recommended)

Since this project is configured with `pyproject.toml`, you can start the server with a single `uv` command.

### Steps

1. **Navigate to the environment folder and start the server**
   ```bash
   cd envs/grid_world_env
   uv run server
   ```

2. **Visit the live Swagger UI in your browser**
   ```
   http://localhost:8000/docs
   ```


---

## Option 2: Docker Integration Test

To build the full container and run the integration test suite (simulating a production deployment):

### Steps

1. **Navigate to the root OpenEnv directory**

2. **Run the test script**
   ```bash
   ./envs/grid_world_env/test_grid_world.sh
   ```

The script:

- Builds the Docker image
- Starts the container
- Runs a series of `curl` requests to verify functionality
- Cleans up containers and images after completion

## Conclusion

This Grid World environment serves as the reference implementation for building environments in OpenEnv. By following this pattern, custom environments remain:

- Portable across local and containerized setups
- Strictly typed through Pydantic models
- Deployment-ready for development, testing, and production workflows