eval-protocol
’s RemoteRolloutProcessor.
RemoteRolloutProcessor
delegates rollout execution to a remote HTTP service
that you control. It’s useful for implementing rollouts with your existing agent
codebase by wrapping it in an HTTP service.
Setup
Sign up for a free account at Langfuse and get your API keys. Set the following environment variables:API Contract
We expect the remote service to implement the following API contract:POST /init
POST /init
Request Body
- model: The model to use for the rollout (e.g., “openai/gpt-4o”)
- messages: Array of conversation messages
- tools: Optional array of available tools for the model
- model_base_url: Optional base URL for the remote server to make LLM calls. This is useful for configuring different endpoints during development/training.
- metadata: Object containing rollout execution metadata
request_body.json
GET /status?rollout_id={rollout_id}
GET /status?rollout_id={rollout_id}
Response Body
status_response.json
Metadata Piggyback
To push trajectories back toeval-protocol
, you must include the following
metadata along with your traces so that they can correlated with the
EvaluationRow
s.
invocation_id
experiment_id
rollout_id
run_id
row_id
Example Server Implementations
See the following example server implementations for reference:Example @evaluation_test
Usage
See the example code for using RemoteRolloutProcessor
in the
@evaluation_test
decorator here.
Usage
- Run one of the server examples
- Run the example
@evaluation_test
usage