Overview
OpenEnv is an open-source framework from Meta’s PyTorch team for defining, deploying, and interacting with environments in RL and agentic workflows. It gives you Gym-style APIs (`reset()`, `step()`, `state()`) wrapped in HTTP clients (for example `BrowserGymEnv`, `EchoEnv`, `TextArenaEnv`), and lets you run those environments:
- As local Python processes.
- Inside Docker containers.
- As hosted Hugging Face Spaces.
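To make that Gym-style surface concrete, here is a minimal sketch of the client shape those APIs imply. `FakeEchoEnv` and `Observation` are illustrative stand-ins, not the real OpenEnv classes:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for an OpenEnv client's types, for illustration only.
@dataclass
class Observation:
    message: str
    done: bool = False
    reward: float = 0.0

class FakeEchoEnv:
    """Mimics the reset()/step()/state() surface described above."""
    def reset(self):
        return Observation(message="say something")

    def step(self, action):
        # An echo-style environment sends the action text back as the observation.
        return Observation(message=action, done=True, reward=1.0)

    def state(self):
        return {"episode_done": True}

env = FakeEchoEnv()
obs = env.reset()
result = env.step("hello")
print(result.message, result.reward)  # hello 1.0
```

The real clients expose the same three calls over HTTP, so the loop you write against a stub like this carries over unchanged.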
`OpenEnvRolloutProcessor` is the component that runs the OpenEnv loop for you:
- It calls `env.reset()` to start an episode for each `EvaluationRow`.
- For each step, it builds a user message from the observation, calls your model, parses the model’s response into an action, and calls `env.step(action)`.
- It appends a sentinel system message with per-step rewards so your `@evaluation_test` can compute a final score in a single place.

You control:
- Which OpenEnv client you pass (`BrowserGymEnv`, `EchoEnv`, `TextArenaEnv`, …).
- How you build prompts (`prompt_builder`).
- How you parse actions (`action_parser`).
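In simplified form, the loop the processor runs for each `EvaluationRow` can be sketched as follows. This is an assumption-laden sketch (the real processor also handles retries, step limits, and richer message bookkeeping), and the sentinel format shown is illustrative:

```python
import json

def run_rollout(env, model, prompt_builder, action_parser, max_steps=8):
    # Simplified sketch of the rollout loop; not the real implementation.
    messages, rewards = [], []
    obs = env.reset()                      # start the episode
    for step in range(max_steps):
        prompt = prompt_builder(obs, step, messages)
        messages.append({"role": "user", "content": prompt})
        response = model(messages)         # call your model
        messages.append({"role": "assistant", "content": response})
        action = action_parser(response)   # model text -> environment action
        obs = env.step(action)
        rewards.append(obs.reward)
        if obs.done:
            break
    # Sentinel system message carrying per-step rewards for the test body.
    messages.append({"role": "system",
                     "content": "step_rewards: " + json.dumps(rewards)})
    return messages, rewards
```

The point of the sentinel is that the test body never touches the environment; it only inspects the transcript.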
How to use OpenEnvRolloutProcessor
At a high level:
- Pick an OpenEnv client for your environment (see the OpenEnv environments for a full list):
  - BrowserGym: `from envs.browsergym_env import BrowserGymEnv, BrowserGymAction`
  - Echo: `from envs.echo_env import EchoEnv, EchoAction`
  - TextArena: `from envs.textarena_env import TextArenaEnv, TextArenaAction`
- Write a `prompt_builder(observation, step, history)` that turns the current observation into a user-facing prompt string (or chat messages).
- Write an `action_parser(response_text)` that converts model output into the environment’s `Action` type.
- Instantiate `OpenEnvRolloutProcessor` with the right constructor kwargs:
  - `env_client_cls` or `env_factory` (how to construct the client).
  - `prompt_builder` and `action_parser`.
  - Environment wiring:
    - `docker_image` and `env_vars` for Docker-based envs (BrowserGym, TextArena).
    - `hub_repo_id` to launch from the Hugging Face Hub (for example `"openenv/echo-env"`).
    - `env_base_url` when connecting to an already running server or remote Space.
  - Optional task routing: `tasks` and `task_var` if you want to rotate across multiple tasks (for example multiple MiniWoB levels).
- Use it in an `@evaluation_test`:
  - Set `rollout_processor=OpenEnvRolloutProcessor(...)`.
  - In the test body, read the step-rewards sentinel from `row.messages` and set `row.evaluation_result` based on whatever scoring you want.
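For a concrete shape, a minimal `prompt_builder` / `action_parser` pair for an echo-style environment might look like the sketch below. The `ACTION:` reply convention and the observation’s `message` field are assumptions for illustration; a real parser would return the environment’s action type (for example `EchoAction(message=...)`) rather than a plain string:

```python
import re

def prompt_builder(observation, step, history):
    # Turn the current observation into this step's user prompt.
    return (
        f"Step {step}: the environment says: {observation.message}\n"
        "Reply with exactly one line: ACTION: <text to send>"
    )

def action_parser(response_text):
    # Pull out the action payload; fall back to the raw text if the
    # model ignored the requested format. A real implementation would
    # wrap this in the environment's action type (e.g. EchoAction).
    match = re.search(r"ACTION:\s*(.+)", response_text)
    return match.group(1).strip() if match else response_text.strip()

print(action_parser("Sure thing. ACTION: hello world"))  # hello world
```

Keeping the two functions side by side makes it easy to verify that whatever format the prompt requests is exactly what the parser expects.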
Example `prompt_builder` and `action_parser` implementations can be found in the Eval Protocol Python SDK:
- BrowserGym: `tests.pytest.test_openenv_browsergym_eval`
- Echo: `tests.pytest.test_openenv_echo_hub`
- TextArena: `tests.pytest.test_openenv_textarena_docker`
BrowserGym example (MiniWoB via Docker)
openenv_browsergym_eval.py
- Swap `BrowserGymEnv`/`BrowserGymAction` for `EchoEnv`/`EchoAction`, `TextArenaEnv`/`TextArenaAction`, or your own environment class.
- Keep `prompt_builder` and `action_parser` aligned with the environment’s observation and action types.
- Reuse the same `@evaluation_test` file across offline evals, dashboards, and RL integrations that call Eval Protocol.
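Inside the test body, reading the step-rewards sentinel from `row.messages` and turning it into a score can be sketched like this. The `step_rewards:` sentinel format is an assumption here; match whatever your processor version actually emits:

```python
import json

def score_from_sentinel(messages):
    # Find the last system message carrying per-step rewards and
    # aggregate them into a single score (here: the sum).
    for msg in reversed(messages):
        content = msg.get("content", "")
        if msg.get("role") == "system" and content.startswith("step_rewards:"):
            rewards = json.loads(content.split(":", 1)[1])
            return sum(rewards)
    return 0.0

messages = [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "system", "content": "step_rewards: [0.0, 1.0]"},
]
print(score_from_sentinel(messages))  # 1.0
```

Because scoring only reads the transcript, the same function works whether the rollout came from an offline eval, the dashboard, or an RL trainer.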
Echo / TextArena and connection modes
`OpenEnvRolloutProcessor` can construct environments in three main ways, all driven by `env_client_cls`:
- From the Hugging Face Hub (recommended) — `from_hub`: when you use `EchoEnv.from_hub("openenv/echo-env")`, OpenEnv pulls and starts the container for you locally by running a `docker` command under the hood. You typically do not need to run that command yourself, but knowing it happens makes it easier to debug or start the container manually if needed.
- Local / Docker image (TextArena, BrowserGym, custom) — `from_docker_image`: pass `docker_image` (and `env_vars` as needed) and the processor launches the container for you.
- Existing HTTP server / remote Space — `base_url`: pass `env_base_url` to connect to an already running server.

With `OpenEnvRolloutProcessor`, you can also pass a factory (`env_factory`) instead of `env_client_cls`.
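A factory is just a zero-argument callable that returns a ready-to-use client, which is handy when construction needs custom wiring (ports, auth, a pre-started server). The client class below is an illustrative stand-in for a real OpenEnv client such as `EchoEnv`:

```python
class StubEnvClient:
    """Illustrative stand-in for a real OpenEnv client (not a real class)."""
    def __init__(self, base_url):
        self.base_url = base_url

def make_env():
    # Connect to an already-running server instead of letting the
    # processor launch Docker or pull from the Hub.
    return StubEnvClient(base_url="http://localhost:8000")

# Then pass it as: OpenEnvRolloutProcessor(env_factory=make_env, ...)
env = make_env()
print(env.base_url)  # http://localhost:8000
```

Since the processor calls the factory itself, each rollout can get a fresh client without you managing lifecycles in the test body.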
Because the rollout loop is encapsulated in `OpenEnvRolloutProcessor`, all Eval Protocol tooling (evaluation tests, the logs UI, and integrations like TRL/rLLM) can reuse the same environment + reward logic by simply pointing at your `@evaluation_test` function via its module path.
