FinRL Environment#

A wrapper around FinRL stock trading environments that conforms to the OpenEnv specification.

Overview#

This environment enables reinforcement learning for stock trading tasks using FinRL's powerful StockTradingEnv, exposed through OpenEnv's simple HTTP API. It supports:

  • Stock Trading: Buy/sell actions across multiple stocks

  • Portfolio Management: Track balance, holdings, and portfolio value

  • Technical Indicators: MACD, RSI, CCI, DX, and more

  • Flexible Configuration: Custom data sources and trading parameters

Quick Start#

1. Build the Docker Image#

First, build the base image (from OpenEnv root):

cd OpenEnv
docker build -t envtorch-base:latest -f src/openenv/core/containers/images/Dockerfile .

Then build the FinRL environment image:

docker build -t finrl-env:latest -f envs/finrl_env/server/Dockerfile .

2. Run the Server#

Option A: With Default Sample Data#

docker run -p 8000:8000 finrl-env:latest

This starts the server with synthetic sample data for testing.
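
Before training anything, it helps to confirm the container is actually reachable. A minimal check in Python (this assumes the FastAPI server keeps its default interactive docs at /docs; if they are disabled, calling the client's reset() is an equally good probe):

import urllib.request

# Expect HTTP 200 if the container is up and listening on port 8000
response = urllib.request.urlopen("http://localhost:8000/docs")
print(response.status)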

Option B: With Custom Configuration#

Create a configuration file config.json:

{
  "data_path": "/data/stock_data.csv",
  "stock_dim": 3,
  "hmax": 100,
  "initial_amount": 100000,
  "num_stock_shares": [0, 0, 0],
  "buy_cost_pct": [0.001, 0.001, 0.001],
  "sell_cost_pct": [0.001, 0.001, 0.001],
  "reward_scaling": 0.0001,
  "state_space": 25,
  "action_space": 3,
  "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"]
}

Run with configuration:

docker run -p 8000:8000 \
  -v $(pwd)/config.json:/config/config.json \
  -v $(pwd)/data:/data \
  -e FINRL_CONFIG_PATH=/config/config.json \
  finrl-env:latest

3. Use the Client#

from envs.finrl_env import FinRLEnv, FinRLAction
import numpy as np

# Connect to server
client = FinRLEnv(base_url="http://localhost:8000")

# Get configuration
config = client.get_config()
print(f"Trading {config['stock_dim']} stocks")
print(f"Initial capital: ${config['initial_amount']:,.0f}")

# Reset environment
result = client.reset()
print(f"Initial portfolio value: ${result.observation.portfolio_value:,.2f}")

# Trading loop
for step in range(100):
    # Get current state
    state = result.observation.state

    # Your RL policy here (example: random actions)
    num_stocks = config['stock_dim']
    actions = np.random.uniform(-1, 1, size=num_stocks).tolist()

    # Execute action
    result = client.step(FinRLAction(actions=actions))

    print(f"Step {step}: Portfolio=${result.observation.portfolio_value:,.2f}, "
          f"Reward={result.reward:.2f}")

    if result.done:
        print("Episode finished!")
        break

client.close()

Architecture#

┌──────────────────────────────────────────────────────────────┐
│                    RL Training Framework                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │ Policy Net   │  │ Value Net    │  │ Replay       │        │
│  │ (PyTorch)    │  │ (PyTorch)    │  │ Buffer       │        │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘        │
│         └─────────────────┴─────────────────┘                │
│                           │                                  │
│                  ┌────────▼────────┐                         │
│                  │ FinRLEnv        │ ← HTTP Client           │
│                  │ (HTTPEnvClient) │                         │
│                  └────────┬────────┘                         │
└───────────────────────────┼──────────────────────────────────┘
                            │ HTTP (JSON)
                   ┌────────▼────────┐
                   │ Docker Container│
                   │  Port: 8000     │
                   │                 │
                   │ ┌─────────────┐ │
                   │ │ FastAPI     │ │
                   │ │ Server      │ │
                   │ └──────┬──────┘ │
                   │        │        │
                   │ ┌──────▼──────┐ │
                   │ │ FinRL       │ │
                   │ │ Environment │ │
                   │ └──────┬──────┘ │
                   │        │        │
                   │ ┌──────▼──────┐ │
                   │ │ FinRL       │ │
                   │ │ StockTrading│ │
                   │ │ Env         │ │
                   │ └─────────────┘ │
                   └─────────────────┘

API Reference#

FinRLAction#

Trading action for the environment.

Attributes:

  • actions: list[float] - Array of normalized action values (-1 to 1) for each stock

    • Positive values: Buy

    • Negative values: Sell

    • Magnitude: Relative trade size

Example:

# Buy stock 0, sell stock 1, hold stock 2
action = FinRLAction(actions=[0.5, -0.3, 0.0])
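
Policy networks often emit values outside the normalized range, so it is common to clip raw outputs before building the action. A small sketch (the clipping step is a usage convention, not part of the FinRLAction API):

import numpy as np
from envs.finrl_env import FinRLAction

# Clip raw policy outputs into the normalized [-1, 1] range
raw_output = np.array([1.7, -0.2, 0.4])
action = FinRLAction(actions=np.clip(raw_output, -1.0, 1.0).tolist())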

FinRLObservation#

Observation returned by the environment.

Attributes:

  • state: list[float] - Flattened state vector

    • Structure: [balance, prices..., holdings..., indicators...]

  • portfolio_value: float - Total portfolio value (cash + holdings)

  • date: str - Current trading date

  • done: bool - Whether episode has ended

  • reward: float - Reward for the last action

  • metadata: dict - Additional information

Example:

obs = result.observation
print(f"Portfolio: ${obs.portfolio_value:,.2f}")
print(f"Date: {obs.date}")
print(f"State dimension: {len(obs.state)}")

Client Methods#

reset() -> StepResult[FinRLObservation]#

Reset the environment to start a new episode.

result = client.reset()

step(action: FinRLAction) -> StepResult[FinRLObservation]#

Execute a trading action.

action = FinRLAction(actions=[0.5, -0.3])
result = client.step(action)

state() -> State#

Get episode metadata (episode_id, step_count).

state = client.state()
print(f"Episode: {state.episode_id}, Step: {state.step_count}")

get_config() -> dict#

Get environment configuration.

config = client.get_config()
print(config['stock_dim'])
print(config['initial_amount'])

Data Format#

The environment expects stock data in the following CSV format:

| date | tic | close | high | low | open | volume | macd | rsi_30 | cci_30 | dx_30 |
|------|-----|-------|------|-----|------|--------|------|--------|--------|-------|
| 2020-01-01 | AAPL | 100.0 | 102.0 | 98.0 | 99.0 | 1000000 | 0.5 | 55.0 | 10.0 | 15.0 |
| 2020-01-01 | GOOGL | 1500.0 | 1520.0 | 1480.0 | 1490.0 | 500000 | -0.3 | 48.0 | -5.0 | 20.0 |

Required columns:

  • date: Trading date

  • tic: Stock ticker symbol

  • close, high, low, open: Price data

  • volume: Trading volume

  • Technical indicators (as specified in tech_indicator_list)
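
For a quick smoke test you can synthesize a minimal conforming file with pandas, using the sample rows from the table above (values are illustrative only):

import pandas as pd

# Two sample rows covering every required column
rows = [
    {"date": "2020-01-01", "tic": "AAPL", "close": 100.0, "high": 102.0,
     "low": 98.0, "open": 99.0, "volume": 1000000,
     "macd": 0.5, "rsi_30": 55.0, "cci_30": 10.0, "dx_30": 15.0},
    {"date": "2020-01-01", "tic": "GOOGL", "close": 1500.0, "high": 1520.0,
     "low": 1480.0, "open": 1490.0, "volume": 500000,
     "macd": -0.3, "rsi_30": 48.0, "cci_30": -5.0, "dx_30": 20.0},
]
pd.DataFrame(rows).to_csv("stock_data.csv", index=False)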

Configuration Parameters#

| Parameter | Type | Description |
|-----------|------|-------------|
| data_path | str | Path to CSV file with stock data |
| stock_dim | int | Number of stocks to trade |
| hmax | int | Maximum shares per trade |
| initial_amount | int | Starting cash balance |
| num_stock_shares | list[int] | Initial holdings for each stock |
| buy_cost_pct | list[float] | Transaction cost for buying (per stock) |
| sell_cost_pct | list[float] | Transaction cost for selling (per stock) |
| reward_scaling | float | Scaling factor for rewards |
| state_space | int | Dimension of state vector |
| action_space | int | Dimension of action space |
| tech_indicator_list | list[str] | Technical indicators to include |
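
Several of these parameters have to agree with one another. A hedged sanity check (the consistency rules follow from the descriptions above, not from any documented validator):

import json

with open("config.json") as f:
    cfg = json.load(f)

n = cfg["stock_dim"]
# Each per-stock list needs one entry per traded stock
for key in ("num_stock_shares", "buy_cost_pct", "sell_cost_pct"):
    assert len(cfg[key]) == n, f"{key} must have {n} entries"
# One normalized action value per stock
assert cfg["action_space"] == n, "action_space should equal stock_dim"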

Integration with RL Frameworks#

Stable Baselines 3#

import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

from envs.finrl_env import FinRLEnv, FinRLAction

# Gymnasium-compatible wrapper so SB3 can drive the remote environment
class SB3FinRLWrapper(gym.Env):
    def __init__(self, base_url):
        super().__init__()
        self.env = FinRLEnv(base_url=base_url)
        config = self.env.get_config()
        self.action_space = spaces.Box(
            low=-1, high=1,
            shape=(config['action_space'],),
            dtype=np.float32
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(config['state_space'],),
            dtype=np.float32
        )

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        result = self.env.reset()
        obs = np.array(result.observation.state, dtype=np.float32)
        return obs, result.observation.metadata or {}

    def step(self, action):
        result = self.env.step(FinRLAction(actions=action.tolist()))
        obs = np.array(result.observation.state, dtype=np.float32)
        # Gymnasium API: (obs, reward, terminated, truncated, info)
        return (
            obs,
            result.reward or 0.0,
            result.done,  # terminated
            False,        # truncated (not reported separately by the server)
            result.observation.metadata or {}
        )

# Train
env = SB3FinRLWrapper("http://localhost:8000")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
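
Because the wrapper subclasses gym.Env, recent Stable Baselines 3 releases (which expect the Gymnasium API) can consume it directly; when learn() is called, SB3 wraps the single environment in a DummyVecEnv on its own.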

Troubleshooting#

Server won't start#

  1. Check if base image exists:

    docker images | grep envtorch-base
    
  2. Build base image if missing:

    docker build -t envtorch-base:latest -f src/openenv/core/containers/images/Dockerfile .
    

Import errors#

Make sure you're in the src directory:

cd OpenEnv/src
python -c "from envs.finrl_env import FinRLEnv"

Configuration errors#

Verify your data file has all required columns:

import pandas as pd
df = pd.read_csv('your_data.csv')
print(df.columns.tolist())
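
To go a step further, compare the file against the columns the environment actually needs (the required set here is assembled from the list in the Data Format section plus your configured indicators):

import pandas as pd

df = pd.read_csv('your_data.csv')
required = {"date", "tic", "close", "high", "low", "open", "volume"}
required |= set(config["tech_indicator_list"])  # e.g. from client.get_config()

missing = required - set(df.columns)
print("Missing columns:", sorted(missing) if missing else "none")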

Examples#

See the examples/ directory for complete examples:

  • examples/finrl_simple.py - Basic usage

  • examples/finrl_training.py - Full training loop with PPO

  • examples/finrl_backtesting.py - Backtesting a trained agent

License#

BSD 3-Clause License (see LICENSE file in repository root)
