
FinRL Environment

A wrapper around FinRL stock trading environments that conforms to the OpenEnv specification.

Overview

This environment enables reinforcement learning for stock trading tasks using FinRL's StockTradingEnv, exposed through OpenEnv's simple HTTP API. It supports:

  • Stock Trading: Buy/sell actions across multiple stocks
  • Portfolio Management: Track balance, holdings, and portfolio value
  • Technical Indicators: MACD, RSI, CCI, DX, and more
  • Flexible Configuration: Custom data sources and trading parameters

Quick Start

1. Build the Docker Image

First, build the base image (from OpenEnv root):

cd OpenEnv
docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .

Then build the FinRL environment image:

docker build -t finrl-env:latest -f src/envs/finrl_env/server/Dockerfile .

2. Run the Server

Option A: With Default Sample Data

docker run -p 8000:8000 finrl-env:latest

This starts the server with synthetic sample data for testing.

Option B: With Custom Configuration

Create a configuration file config.json:

{
  "data_path": "/data/stock_data.csv",
  "stock_dim": 3,
  "hmax": 100,
  "initial_amount": 100000,
  "num_stock_shares": [0, 0, 0],
  "buy_cost_pct": [0.001, 0.001, 0.001],
  "sell_cost_pct": [0.001, 0.001, 0.001],
  "reward_scaling": 0.0001,
  "state_space": 25,
  "action_space": 3,
  "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"]
}

Run with configuration:

docker run -p 8000:8000 \
  -v $(pwd)/config.json:/config/config.json \
  -v $(pwd)/data:/data \
  -e FINRL_CONFIG_PATH=/config/config.json \
  finrl-env:latest

3. Use the Client

from envs.finrl_env import FinRLEnv, FinRLAction
import numpy as np

# Connect to server
client = FinRLEnv(base_url="http://localhost:8000")

# Get configuration
config = client.get_config()
print(f"Trading {config['stock_dim']} stocks")
print(f"Initial capital: ${config['initial_amount']:,.0f}")

# Reset environment
result = client.reset()
print(f"Initial portfolio value: ${result.observation.portfolio_value:,.2f}")

# Trading loop
for step in range(100):
    # Get current state
    state = result.observation.state

    # Your RL policy here (example: random actions)
    num_stocks = config['stock_dim']
    actions = np.random.uniform(-1, 1, size=num_stocks).tolist()

    # Execute action
    result = client.step(FinRLAction(actions=actions))

    print(f"Step {step}: Portfolio=${result.observation.portfolio_value:,.2f}, "
          f"Reward={result.reward:.2f}")

    if result.done:
        print("Episode finished!")
        break

client.close()
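
Once the loop finishes, the episode can be summarized by comparing the final portfolio value to the starting capital reported by get_config(); a small illustrative snippet:

# Summarize the episode: final portfolio value vs. starting capital
final_value = result.observation.portfolio_value
total_return = final_value / config['initial_amount'] - 1
print(f"Final value: ${final_value:,.2f} ({total_return:+.2%})")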

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    RL Training Framework                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │ Policy Net   │  │ Value Net    │  │ Replay       │      │
│  │ (PyTorch)    │  │ (PyTorch)    │  │ Buffer       │      │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘      │
│         └──────────────────┴──────────────────┘              │
│                            │                                 │
│                   ┌────────▼────────┐                        │
│                   │ FinRLEnv        │ ← HTTP Client          │
│                   │ (HTTPEnvClient) │                        │
│                   └────────┬────────┘                        │
└────────────────────────────┼─────────────────────────────────┘
                             │ HTTP (JSON)
                    ┌────────▼────────┐
                    │ Docker Container│
                    │  Port: 8000     │
                    │                 │
                    │ ┌─────────────┐ │
                    │ │FastAPI      │ │
                    │ │Server       │ │
                    │ └──────┬──────┘ │
                    │        │        │
                    │ ┌──────▼──────┐ │
                    │ │ FinRL       │ │
                    │ │ Environment │ │
                    │ └──────┬──────┘ │
                    │        │        │
                    │ ┌──────▼──────┐ │
                    │ │ FinRL       │ │
                    │ │ StockTrading│ │
                    │ │ Env         │ │
                    │ └─────────────┘ │
                    └─────────────────┘

API Reference

FinRLAction

Trading action for the environment.

Attributes:

  • actions: list[float] - Array of normalized action values (-1 to 1) for each stock
      - Positive values: buy
      - Negative values: sell
      - Magnitude: relative trade size

Example:

# Buy stock 0, sell stock 1, hold stock 2
action = FinRLAction(actions=[0.5, -0.3, 0.0])
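
Policy networks often emit unbounded values; since the environment expects actions in [-1, 1], clipping before constructing the action keeps requests valid. A minimal sketch (the raw_output values here are hypothetical):

import numpy as np

# Clip raw policy outputs into the expected [-1, 1] range before sending them
raw_output = np.array([1.7, -0.3, 0.05])  # hypothetical policy output
action = FinRLAction(actions=np.clip(raw_output, -1.0, 1.0).tolist())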

FinRLObservation

Observation returned by the environment.

Attributes:

  • state: list[float] - Flattened state vector with structure [balance, prices..., holdings..., indicators...]
  • portfolio_value: float - Total portfolio value (cash + holdings)
  • date: str - Current trading date
  • done: bool - Whether the episode has ended
  • reward: float - Reward for the last action
  • metadata: dict - Additional information

Example:

obs = result.observation
print(f"Portfolio: ${obs.portfolio_value:,.2f}")
print(f"Date: {obs.date}")
print(f"State dimension: {len(obs.state)}")

Client Methods

reset() -> StepResult[FinRLObservation]

Reset the environment to start a new episode.

result = client.reset()

step(action: FinRLAction) -> StepResult[FinRLObservation]

Execute a trading action.

action = FinRLAction(actions=[0.5, -0.3])
result = client.step(action)

state() -> State

Get episode metadata (episode_id, step_count).

state = client.state()
print(f"Episode: {state.episode_id}, Step: {state.step_count}")

get_config() -> dict

Get environment configuration.

config = client.get_config()
print(config['stock_dim'])
print(config['initial_amount'])

Data Format

The environment expects stock data in the following CSV format:

date        tic     close    high     low      open     volume    macd   rsi_30  cci_30  dx_30
2020-01-01  AAPL    100.0    102.0    98.0     99.0     1000000   0.5    55.0    10.0    15.0
2020-01-01  GOOGL   1500.0   1520.0   1480.0   1490.0   500000    -0.3   48.0    -5.0    20.0
Required columns:

  • date - Trading date
  • tic - Stock ticker symbol
  • close, high, low, open - Price data
  • volume - Trading volume
  • Technical indicators (as specified in tech_indicator_list)
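
A small pandas sketch of building a file in this shape (values taken from the sample rows above; the indicator columns must match your tech_indicator_list):

import pandas as pd

# Build a minimal dataset with the required columns (sample values from above)
df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01"],
    "tic": ["AAPL", "GOOGL"],
    "close": [100.0, 1500.0],
    "high": [102.0, 1520.0],
    "low": [98.0, 1480.0],
    "open": [99.0, 1490.0],
    "volume": [1_000_000, 500_000],
    "macd": [0.5, -0.3],
    "rsi_30": [55.0, 48.0],
    "cci_30": [10.0, -5.0],
    "dx_30": [15.0, 20.0],
})
df.to_csv("stock_data.csv", index=False)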

Configuration Parameters

Parameter            Type         Description
data_path            str          Path to CSV file with stock data
stock_dim            int          Number of stocks to trade
hmax                 int          Maximum shares per trade
initial_amount       int          Starting cash balance
num_stock_shares     list[int]    Initial holdings for each stock
buy_cost_pct         list[float]  Transaction cost for buying (per stock)
sell_cost_pct        list[float]  Transaction cost for selling (per stock)
reward_scaling       float        Scaling factor for rewards
state_space          int          Dimension of the state vector
action_space         int          Dimension of the action space
tech_indicator_list  list[str]    Technical indicators to include
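
The per-stock lists (num_stock_shares, buy_cost_pct, sell_cost_pct) need one entry per stock, so a quick sanity check against stock_dim can catch mismatches before launching the server. An illustrative sketch for the config.json shown earlier:

import json

# Sanity-check that per-stock lists have one entry per stock
with open("config.json") as f:
    cfg = json.load(f)

n = cfg["stock_dim"]
for key in ("num_stock_shares", "buy_cost_pct", "sell_cost_pct"):
    assert len(cfg[key]) == n, f"{key} must have {n} entries, got {len(cfg[key])}"
# In the sample configuration, action_space equals stock_dim (one action per stock)
assert cfg["action_space"] == n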

Integration with RL Frameworks

Stable Baselines 3

import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

from envs.finrl_env import FinRLEnv, FinRLAction

# Custom Gymnasium wrapper so SB3 can drive the HTTP environment
class SB3FinRLWrapper(gym.Env):
    def __init__(self, base_url):
        super().__init__()
        self.env = FinRLEnv(base_url=base_url)
        config = self.env.get_config()
        self.action_space = spaces.Box(
            low=-1, high=1,
            shape=(config['action_space'],),
            dtype=np.float32
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(config['state_space'],),
            dtype=np.float32
        )

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        result = self.env.reset()
        return np.array(result.observation.state, dtype=np.float32), {}

    def step(self, action):
        result = self.env.step(FinRLAction(actions=action.tolist()))
        obs = np.array(result.observation.state, dtype=np.float32)
        reward = result.reward or 0.0
        terminated = result.done
        truncated = False
        info = result.observation.metadata or {}
        return obs, reward, terminated, truncated, info

# Train
env = SB3FinRLWrapper("http://localhost:8000")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
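
After training, the same wrapper can be reused for a quick deterministic rollout of the learned policy; a minimal sketch, assuming the Gymnasium-style wrapper above:

# Roll out the trained policy for one episode and accumulate the reward
obs, _ = env.reset()
total_reward, done = 0.0, False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    total_reward += reward
    done = terminated or truncated
print(f"Episode reward: {total_reward:.2f}")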

Troubleshooting

Server won't start

  1. Check if base image exists:

    docker images | grep envtorch-base
    

  2. Build base image if missing:

    docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
    

Import errors

Make sure you're in the src directory:

cd OpenEnv/src
python -c "from envs.finrl_env import FinRLEnv"

Configuration errors

Verify your data file has all required columns:

import pandas as pd
df = pd.read_csv('your_data.csv')
print(df.columns.tolist())
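
To go one step further, the check can compare against the full set of required columns from the Data Format section (illustrative sketch; extend the set with your tech_indicator_list entries):

import pandas as pd

# Report which required columns are missing from the data file
required = {"date", "tic", "close", "high", "low", "open", "volume"}
df = pd.read_csv('your_data.csv')
missing = required - set(df.columns)
print("Missing columns:", sorted(missing) or "none")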

Examples

See the examples/ directory for complete examples:

  • examples/finrl_simple.py - Basic usage
  • examples/finrl_training.py - Full training loop with PPO
  • examples/finrl_backtesting.py - Backtesting a trained agent

License

BSD 3-Clause License (see LICENSE file in repository root)

References