Jupyter Environment#
jupyter_env exposes a stateful Jupyter-style Python notebook through MCP
tools. Each episode creates a fresh E2B Code Interpreter sandbox, so Python
variables, imports, files, and generated plots persist across tool calls until
the next reset.
Reset can also receive setup and verify scripts. Setup commands run immediately
after the sandbox is created. Verify commands are stored for the episode and run
when the agent calls final_answer.
Tools#
add_and_execute_code_cell(code): execute a new Python cell.edit_and_execute_current_cell(code): replace and re-run the latest code cell.execute_shell_command(command): run shell commands inside the sandbox.get_notebook_state(include_images=False): summarize recent notebook cells.final_answer(answer): record a final answer and run any configured verify commands.
Quick Start#
from jupyter_env import JupyterEnv
with JupyterEnv(base_url="http://localhost:8000").sync() as env:
env.reset(
setup=["pip install -q pandas"],
verify=[
"python - <<'PY'\nfrom pathlib import Path\nassert Path('/home/user/work/answer.txt').exists()\nPY"
],
)
print(env.call_tool("add_and_execute_code_cell", code="x = 21 * 2\nx"))
print(env.call_tool("final_answer", answer="done"))
Local Server#
cd envs/jupyter_env
E2B_API_KEY=e2b_... uv run --project . server
The API and custom web UI are served on port 8000. The notebook UI is mounted
at /web.
Docker#
cd envs/jupyter_env
openenv build -t jupyter-env
docker run -p 8000:8000 -e E2B_API_KEY=e2b_... jupyter-env
Configuration#
E2B_API_KEY: required when resetting an episode.MAX_CONCURRENT_ENVS: maximum concurrent WebSocket sessions. Defaults to4.KAGGLE_DATA_DIR: optional root directory for reset-time Kaggle file staging.
Setup and Verify Commands#
reset() accepts either setup / verify or setup_scripts /
verify_scripts.
env.reset(
setup=[
"mkdir -p /home/user/work",
"printf 'seed data' > /home/user/work/input.txt",
],
verify=[
"test -f /home/user/work/answer.txt",
"python -m pytest /home/user/work/tests",
],
)
Setup failure ends the reset response with done=True and returns the captured
setup results. Verify commands run when final_answer(answer) is called. Reward
defaults to passed_verify_commands / total_verify_commands. A verify command
can override this by writing a float to:
/home/user/logs/verifier/reward.txt
Notes#
This first version intentionally keeps sandbox provider selection local to the environment and uses E2B as the concrete backend. A shared OpenEnv sandbox API can be introduced later once the provider contract is RFC-backed.