# FinRL Environment
A wrapper around FinRL stock trading environments that conforms to the OpenEnv specification.
## Overview
This environment enables reinforcement learning for stock trading tasks using FinRL's `StockTradingEnv`, exposed through OpenEnv's simple HTTP API. It supports:
- Stock Trading: Buy/sell actions across multiple stocks
- Portfolio Management: Track balance, holdings, and portfolio value
- Technical Indicators: MACD, RSI, CCI, DX, and more
- Flexible Configuration: Custom data sources and trading parameters
## Quick Start
### 1. Build the Docker Image
First, build the base image (from OpenEnv root):
```bash
cd OpenEnv
docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
```
Then build the FinRL environment image:
```bash
docker build -t finrl-env:latest -f src/envs/finrl_env/server/Dockerfile .
```
### 2. Run the Server
#### Option A: With Default Sample Data

```bash
docker run -p 8000:8000 finrl-env:latest
```
This starts the server with synthetic sample data for testing.
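Once the container is running, a quick smoke test from Python confirms the server is reachable (a minimal sketch using the client described below):

```python
# Smoke test: fetch the environment configuration from the running server.
from envs.finrl_env import FinRLEnv

client = FinRLEnv(base_url="http://localhost:8000")
print(client.get_config())  # should show the sample-data configuration
client.close()
```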
#### Option B: With Custom Configuration
Create a configuration file `config.json`:
```json
{
  "data_path": "/data/stock_data.csv",
  "stock_dim": 3,
  "hmax": 100,
  "initial_amount": 100000,
  "num_stock_shares": [0, 0, 0],
  "buy_cost_pct": [0.001, 0.001, 0.001],
  "sell_cost_pct": [0.001, 0.001, 0.001],
  "reward_scaling": 0.0001,
  "state_space": 25,
  "action_space": 3,
  "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"]
}
```
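Note that the per-stock lists (`num_stock_shares`, `buy_cost_pct`, `sell_cost_pct`) need one entry per stock, i.e. length `stock_dim`. A minimal sketch for generating such a file programmatically (the `make_config` helper is illustrative, not part of the package; `state_space` is copied from the example above):

```python
# Sketch: write a config.json for N stocks with uniform transaction costs.
import json

def make_config(stock_dim: int, data_path: str) -> dict:
    # Hypothetical helper, not part of the package.
    return {
        "data_path": data_path,
        "stock_dim": stock_dim,
        "hmax": 100,
        "initial_amount": 100_000,
        "num_stock_shares": [0] * stock_dim,    # one entry per stock
        "buy_cost_pct": [0.001] * stock_dim,    # one entry per stock
        "sell_cost_pct": [0.001] * stock_dim,   # one entry per stock
        "reward_scaling": 1e-4,
        "state_space": 25,                      # from the example; depends on your data
        "action_space": stock_dim,
        "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"],
    }

with open("config.json", "w") as f:
    json.dump(make_config(3, "/data/stock_data.csv"), f, indent=2)
```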
Run with configuration:
```bash
docker run -p 8000:8000 \
  -v $(pwd)/config.json:/config/config.json \
  -v $(pwd)/data:/data \
  -e FINRL_CONFIG_PATH=/config/config.json \
  finrl-env:latest
```
### 3. Use the Client
```python
from envs.finrl_env import FinRLEnv, FinRLAction
import numpy as np

# Connect to server
client = FinRLEnv(base_url="http://localhost:8000")

# Get configuration
config = client.get_config()
print(f"Trading {config['stock_dim']} stocks")
print(f"Initial capital: ${config['initial_amount']:,.0f}")

# Reset environment
result = client.reset()
print(f"Initial portfolio value: ${result.observation.portfolio_value:,.2f}")

# Trading loop
for step in range(100):
    # Get current state
    state = result.observation.state

    # Your RL policy here (example: random actions)
    num_stocks = config['stock_dim']
    actions = np.random.uniform(-1, 1, size=num_stocks).tolist()

    # Execute action
    result = client.step(FinRLAction(actions=actions))

    print(f"Step {step}: Portfolio=${result.observation.portfolio_value:,.2f}, "
          f"Reward={result.reward:.2f}")

    if result.done:
        print("Episode finished!")
        break

client.close()
```
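The loop above extends naturally into a simple episode-level evaluation using only the documented client calls; a minimal sketch (random actions stand in for a trained policy):

```python
# Sketch: run one full episode and report total reward and net portfolio change.
import numpy as np
from envs.finrl_env import FinRLEnv, FinRLAction

client = FinRLEnv(base_url="http://localhost:8000")
config = client.get_config()

result = client.reset()
initial_value = result.observation.portfolio_value
total_reward = 0.0

while not result.done:
    actions = np.random.uniform(-1, 1, size=config['stock_dim']).tolist()
    result = client.step(FinRLAction(actions=actions))
    total_reward += result.reward or 0.0

print(f"Total reward: {total_reward:.2f}")
print(f"Net change:   ${result.observation.portfolio_value - initial_value:,.2f}")

client.close()
```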
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                    RL Training Framework                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │  Policy Net  │  │  Value Net   │  │    Replay    │       │
│  │  (PyTorch)   │  │  (PyTorch)   │  │    Buffer    │       │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│         └─────────────────┴─────────────────┘               │
│                           │                                 │
│                  ┌────────▼────────┐                        │
│                  │    FinRLEnv     │  ← HTTP Client         │
│                  │ (HTTPEnvClient) │                        │
│                  └────────┬────────┘                        │
└───────────────────────────┼─────────────────────────────────┘
                            │ HTTP (JSON)
                   ┌────────▼────────┐
                   │ Docker Container│
                   │   Port: 8000    │
                   │                 │
                   │ ┌─────────────┐ │
                   │ │   FastAPI   │ │
                   │ │   Server    │ │
                   │ └──────┬──────┘ │
                   │        │        │
                   │ ┌──────▼──────┐ │
                   │ │    FinRL    │ │
                   │ │ Environment │ │
                   │ └──────┬──────┘ │
                   │        │        │
                   │ ┌──────▼──────┐ │
                   │ │    FinRL    │ │
                   │ │ StockTrading│ │
                   │ │     Env     │ │
                   │ └─────────────┘ │
                   └─────────────────┘
```
## API Reference
### FinRLAction
Trading action for the environment.
Attributes:
- `actions: list[float]` - Array of normalized action values (-1 to 1) for each stock
    - Positive values: Buy
    - Negative values: Sell
    - Magnitude: Relative trade size
Example:
```python
# Buy stock 0, sell stock 1, hold stock 2
action = FinRLAction(actions=[0.5, -0.3, 0.0])
```
### FinRLObservation
Observation returned by the environment.
Attributes:
- `state: list[float]` - Flattened state vector
    - Structure: `[balance, prices..., holdings..., indicators...]`
- `portfolio_value: float` - Total portfolio value (cash + holdings)
- `date: str` - Current trading date
- `done: bool` - Whether the episode has ended
- `reward: float` - Reward for the last action
- `metadata: dict` - Additional information
Example:
```python
obs = result.observation
print(f"Portfolio: ${obs.portfolio_value:,.2f}")
print(f"Date: {obs.date}")
print(f"State dimension: {len(obs.state)}")
```
### Client Methods
#### `reset() -> StepResult[FinRLObservation]`
Reset the environment to start a new episode.
```python
result = client.reset()
```
#### `step(action: FinRLAction) -> StepResult[FinRLObservation]`
Execute a trading action.
```python
action = FinRLAction(actions=[0.5, -0.3])
result = client.step(action)
```
#### `state() -> State`
Get episode metadata (`episode_id`, `step_count`).
```python
state = client.state()
print(f"Episode: {state.episode_id}, Step: {state.step_count}")
```
#### `get_config() -> dict`
Get environment configuration.
```python
config = client.get_config()
print(config['stock_dim'])
print(config['initial_amount'])
```
## Data Format
The environment expects stock data in the following CSV format:
| date | tic | close | high | low | open | volume | macd | rsi_30 | cci_30 | dx_30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-01-01 | AAPL | 100.0 | 102.0 | 98.0 | 99.0 | 1000000 | 0.5 | 55.0 | 10.0 | 15.0 |
| 2020-01-01 | GOOGL | 1500.0 | 1520.0 | 1480.0 | 1490.0 | 500000 | -0.3 | 48.0 | -5.0 | 20.0 |
Required columns:
- `date`: Trading date
- `tic`: Stock ticker symbol
- `close`, `high`, `low`, `open`: Price data
- `volume`: Trading volume
- Technical indicators (as specified in `tech_indicator_list`)
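Before pointing `data_path` at a file, it can be worth validating the frame programmatically; a minimal sketch (the indicator set mirrors the example configuration above):

```python
# Sketch: check that a CSV has the required long-format structure.
import pandas as pd

REQUIRED = {"date", "tic", "close", "high", "low", "open", "volume"}
INDICATORS = {"macd", "rsi_30", "cci_30", "dx_30"}  # match your tech_indicator_list

df = pd.read_csv("stock_data.csv")

missing = (REQUIRED | INDICATORS) - set(df.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")

# Long format: exactly one row per (date, tic) pair.
dupes = df.duplicated(["date", "tic"])
if dupes.any():
    raise ValueError(f"{dupes.sum()} duplicate (date, tic) rows")
```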
## Configuration Parameters
| Parameter | Type | Description |
|---|---|---|
| `data_path` | `str` | Path to CSV file with stock data |
| `stock_dim` | `int` | Number of stocks to trade |
| `hmax` | `int` | Maximum shares per trade |
| `initial_amount` | `int` | Starting cash balance |
| `num_stock_shares` | `list[int]` | Initial holdings for each stock |
| `buy_cost_pct` | `list[float]` | Transaction cost for buying (per stock) |
| `sell_cost_pct` | `list[float]` | Transaction cost for selling (per stock) |
| `reward_scaling` | `float` | Scaling factor for rewards |
| `state_space` | `int` | Dimension of state vector |
| `action_space` | `int` | Dimension of action space |
| `tech_indicator_list` | `list[str]` | Technical indicators to include |
## Integration with RL Frameworks
### Stable Baselines 3
```python
from stable_baselines3 import PPO
import gymnasium as gym
import numpy as np
from gymnasium import spaces

from envs.finrl_env import FinRLEnv, FinRLAction

# Custom wrapper so SB3 (which expects a Gymnasium env) can drive the HTTP client
class SB3FinRLWrapper(gym.Env):
    def __init__(self, base_url):
        super().__init__()
        self.env = FinRLEnv(base_url=base_url)
        config = self.env.get_config()
        self.action_space = spaces.Box(
            low=-1, high=1,
            shape=(config['action_space'],),
            dtype=np.float32,
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(config['state_space'],),
            dtype=np.float32,
        )

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        result = self.env.reset()
        return np.array(result.observation.state, dtype=np.float32), {}

    def step(self, action):
        result = self.env.step(FinRLAction(actions=action.tolist()))
        obs = np.array(result.observation.state, dtype=np.float32)
        # Gymnasium step API: (obs, reward, terminated, truncated, info)
        return obs, result.reward or 0.0, result.done, False, result.observation.metadata

# Train
env = SB3FinRLWrapper("http://localhost:8000")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
```
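After training, the same wrapper can be reused for a quick deterministic rollout; a minimal sketch using SB3's `model.predict`:

```python
# Sketch: roll out the trained policy for one episode.
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```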
## Troubleshooting
### Server won't start
1. Check if the base image exists:

   ```bash
   docker images | grep envtorch-base
   ```

2. Build the base image if missing:

   ```bash
   docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
   ```
### Import errors
Make sure you're in the `src` directory:
```bash
cd OpenEnv/src
python -c "from envs.finrl_env import FinRLEnv"
```
### Configuration errors
Verify your data file has all required columns:
```python
import pandas as pd

df = pd.read_csv('your_data.csv')
print(df.columns.tolist())
```
## Examples
See the `examples/` directory for complete examples:
- `examples/finrl_simple.py` - Basic usage
- `examples/finrl_training.py` - Full training loop with PPO
- `examples/finrl_backtesting.py` - Backtesting a trained agent
## License
BSD 3-Clause License (see LICENSE file in repository root)