What is a rollout processor?
A rollout processor is how Eval Protocol turns input rows into trajectories:- Takes a batch of
EvaluationRows - Calls a model, agent, or environment as needed
- Returns updated rows with new messages attached
@evaluation_test, and Eval Protocol handles:
- Concurrency and retries
- Logging and cost tracking
- Cleanup of external resources (e.g., MCP servers)
Quick decision guide
- Already have model outputs? →
NoOpRolloutProcessor - Single chat completion per row? →
SingleTurnRolloutProcessor - Tools / function calling via MCP? →
AgentRolloutProcessor - Interactive MCP “gym” environment? →
MCPGymRolloutProcessor - Already have an in‑production agent/service you want to eval or train? →
RemoteRolloutProcessor(see the next page: Remote Rollout Processor)
@evaluation_test.
Single-turn model calls
UseSingleTurnRolloutProcessor for classic “prompt → answer” tasks:
tutorial_single_turn.py
- Good for QA, grading, and static benchmarks
- For more knobs (e.g.,
extra_body.reasoning_effort), see the full reference.
No-op processor for offline evaluation
If you have pre-generated model outputs, useNoOpRolloutProcessor:
tutorial_noop.py
Agents and tools via MCP
UseAgentRolloutProcessor when your eval requires tools or function calling:
tutorial_agent.py
- The agent will:
- Call the model with available tools
- Execute any returned tool calls
- Loop until there are no more tools to call or
stepsis reached
MCP gym environments
UseMCPGymRolloutProcessor for interactive environments exposed via MCP:
tutorial_gym.py
When to read the full reference
Stay on this page until you need:- Fine‑grained
RolloutProcessorConfigusage - Pydantic AI–specific integrations
- Detailed concurrency and retry behavior
- CLI flags (pytest plugin) for CI tuning

