If you already have an agent, you can integrate it with Eval Protocol through the RemoteRolloutProcessor, which delegates rollout execution to a remote HTTP service that you control. This lets you reuse your existing agent codebase for rollouts by wrapping it in an HTTP service.

High Level Flow

Figure: Remote Rollout Processor flow
  1. /init triggers one rollout: Eval Protocol calls your service’s POST /init with the row payload and correlation metadata.
  2. Send logs via FireworksTracingHttpHandler: Your service emits structured logs tagged with the rollout’s correlation fields.
  3. Send chat completions and store as trace: Your agent’s calls are recorded as traces in Fireworks.
  4. Once the rollout finishes, pull the full trace and evaluate: Eval Protocol polls Fireworks for a completion signal, then loads the trace and scores it.
Everything inside the dotted box is handled by Eval Protocol; you only need to implement the remote server, described below.

API Contract

POST /init: We expect the remote service to implement a single /init endpoint that accepts an InitRequest with the following fields:
  • completion_params (object, required): Dictionary containing model and optional parameters like temperature, max_tokens, etc.
  • messages (array): Array of conversation messages
  • tools (array): Array of available tools for the model
  • model_base_url (string): Base URL for the remote server to make LLM calls
  • metadata (object, required): Rollout execution metadata for correlation
  • api_key (string): API key to be used by the remote server
init_request.json
{
    "completion_params": {
        "model": "accounts/fireworks/models/gpt-oss-120b",
        "temperature": 0.7,
        "max_tokens": 2048
    },
    "messages": [
        { "role": "user", "content": "What is the weather in San Francisco?" }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": { "type": "string" }
                    }
                }
            }
        }
    ],
    "model_base_url": "https://tracing.fireworks.ai/rollout_id/brave-night-42/invocation_id/wise-ocean-15/experiment_id/calm-forest-28/run_id/quick-river-07/row_id/bright-star-91",
    "metadata": {
        "invocation_id": "wise-ocean-15",
        "experiment_id": "calm-forest-28",
        "rollout_id": "brave-night-42",
        "run_id": "quick-river-07",
        "row_id": "bright-star-91"
    },
    "api_key": "fw_your_api_key"
}
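
As a rough illustration of the contract above, the sketch below parses an /init payload with the standard library only and checks the two fields marked required. InitPayload, parse_init_payload, and build_chat_kwargs are hypothetical names introduced here; they are not part of eval_protocol, whose own InitRequest model should be preferred in a real server.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Fields the contract above marks as required.
REQUIRED_FIELDS = ("completion_params", "metadata")


@dataclass
class InitPayload:
    """Hypothetical stand-in for the library's InitRequest (illustration only)."""
    completion_params: Dict[str, Any]
    metadata: Dict[str, str]
    messages: List[Dict[str, Any]] = field(default_factory=list)
    tools: List[Dict[str, Any]] = field(default_factory=list)
    model_base_url: Optional[str] = None
    api_key: Optional[str] = None


def parse_init_payload(raw: str) -> InitPayload:
    """Parse a JSON body and reject it if a required field is absent."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return InitPayload(
        completion_params=data["completion_params"],
        metadata=data["metadata"],
        messages=data.get("messages") or [],
        tools=data.get("tools") or [],
        model_base_url=data.get("model_base_url"),
        api_key=data.get("api_key"),
    )


def build_chat_kwargs(payload: InitPayload) -> Dict[str, Any]:
    """Merge completion_params with messages and tools for a chat call."""
    kwargs = dict(payload.completion_params)
    kwargs["messages"] = payload.messages
    if payload.tools:  # omit the key entirely when no tools were sent
        kwargs["tools"] = payload.tools
    return kwargs
```

The resulting keyword arguments could then be passed to whichever chat-completions client your agent already uses, pointed at model_base_url and authenticated with api_key.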

Metadata Correlation

When making model calls in your remote server, include the following metadata in your traces and logs so that Eval Protocol can correlate them with the corresponding EvaluationRows during result collection. RemoteRolloutProcessor generates these values automatically and sends them in the metadata field of the /init request, so your server only needs to pass them through.
  • invocation_id
  • experiment_id
  • rollout_id
  • run_id
  • row_id
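
In the example request, the same five identifiers also appear in model_base_url as alternating key/value path segments. The sketch below reads them back out and checks a metadata dictionary for gaps. The path layout is inferred from that one example and is an assumption, not a documented guarantee; both function names are hypothetical.

```python
from urllib.parse import urlparse

CORRELATION_KEYS = (
    "invocation_id",
    "experiment_id",
    "rollout_id",
    "run_id",
    "row_id",
)


def correlation_ids_from_url(model_base_url: str) -> dict:
    """Extract correlation IDs from a /key/value/key/value/... URL path.

    Assumes the alternating layout seen in the example request.
    """
    segments = [s for s in urlparse(model_base_url).path.split("/") if s]
    pairs = dict(zip(segments[0::2], segments[1::2]))
    return {key: pairs[key] for key in CORRELATION_KEYS if key in pairs}


def missing_correlation_keys(metadata: dict) -> list:
    """Return the correlation keys that are absent or empty in metadata."""
    return [key for key in CORRELATION_KEYS if not metadata.get(key)]
```

A check like missing_correlation_keys can be useful at the top of /init to fail fast before any model call is made without correlation data.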

Fireworks Tracing

The RemoteRolloutProcessor detects rollout completion by polling structured logs sent to Fireworks Tracing. Your remote server should attach FireworksTracingHttpHandler as a logging handler, add a RolloutIdFilter to the rollout's logger, and log completion status using structured Status objects:
remote_server.py
import logging

from fastapi import FastAPI  # assumed framework; the @app.post decorator suggests FastAPI
from eval_protocol import Status, InitRequest, FireworksTracingHttpHandler, RolloutIdFilter

app = FastAPI()

# Configure Fireworks tracing handler
fireworks_handler = FireworksTracingHttpHandler()
logging.getLogger().addHandler(fireworks_handler)

@app.post("/init")
def init(request: InitRequest):
    # Create rollout-specific logger with filter
    rollout_logger = logging.getLogger(f"eval_server.{request.metadata.rollout_id}")
    rollout_logger.addFilter(RolloutIdFilter(request.metadata.rollout_id))
    
    try:
        # Execute your rollout here
        
        # Then log successful completion with structured status
        rollout_logger.info(
            f"Rollout {request.metadata.rollout_id} completed",
            extra={"status": Status.rollout_finished()}
        )
            
    except Exception as e:
        # Log errors with structured status
        rollout_logger.error(
            f"Rollout {request.metadata.rollout_id} failed: {e}",
            extra={"status": Status.rollout_error(str(e))}
        )
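
To make the filter pattern concrete without depending on eval_protocol, here is a minimal standard-library sketch of a logging filter that stamps every record with a rollout ID, plus an in-memory handler for inspecting the result. TagRolloutFilter and ListHandler are hypothetical stand-ins; how the library's RolloutIdFilter and FireworksTracingHttpHandler work internally is not described in this document and may differ.

```python
import logging


class TagRolloutFilter(logging.Filter):
    """Hypothetical filter: attaches a rollout_id attribute to each record."""

    def __init__(self, rollout_id: str):
        super().__init__()
        self.rollout_id = rollout_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.rollout_id = self.rollout_id
        return True  # never drop the record, only annotate it


class ListHandler(logging.Handler):
    """Collects records in memory; stands in for a remote tracing handler."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def make_rollout_logger(rollout_id: str, handler: logging.Handler) -> logging.Logger:
    """Build a per-rollout logger whose records carry the rollout ID."""
    logger = logging.getLogger(f"eval_server.{rollout_id}")
    logger.setLevel(logging.INFO)
    logger.addFilter(TagRolloutFilter(rollout_id))
    logger.addHandler(handler)
    logger.propagate = False  # keep this sketch isolated from root handlers
    return logger
```

Values passed through extra= (such as a status) become attributes on the same record, so a handler can read the rollout ID and the status together.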

Alternative: Environment Variable Approach

For the following setups, you can use the EP_ROLLOUT_ID environment variable instead of manual filters:
  1. One rollout is processed per server instance
remote_server.py
import os
import logging

from fastapi import FastAPI  # assumed framework; the @app.post decorator suggests FastAPI
from eval_protocol import Status, InitRequest, FireworksTracingHttpHandler

app = FastAPI()
logger = logging.getLogger(__name__)

@app.post("/init")
def init(request: InitRequest):
    # The rollout ID is only known once the request arrives, so set it here,
    # before configuring the Fireworks tracing handler
    os.environ["EP_ROLLOUT_ID"] = request.metadata.rollout_id
    logging.getLogger().addHandler(FireworksTracingHttpHandler())
    ...
  2. /init spawns separate Python processes
remote_server.py
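import os
import logging
import multiprocessing

from fastapi import FastAPI  # assumed framework; the @app.post decorator suggests FastAPI
from eval_protocol import FireworksTracingHttpHandler, InitRequest

app = FastAPI()

def execute_rollout_step_sync(request):
    # Set in the CHILD process
    os.environ["EP_ROLLOUT_ID"] = request.metadata.rollout_id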
    logging.getLogger().addHandler(FireworksTracingHttpHandler())
    
    # Execute your rollout here

@app.post("/init")
async def init(request: InitRequest):
    # Do NOT set EP_ROLLOUT_ID here; set it in the child
    p = multiprocessing.Process(
        target=execute_rollout_step_sync,
        args=(request,),  # trailing comma makes this a one-element tuple
    )
    p.start()

How RemoteRolloutProcessor uses Fireworks Tracing

  1. Remote server logs completion: Uses Status.rollout_finished() or Status.rollout_error()
  2. RemoteRolloutProcessor polls: Searches logs by rollout_id tag until completion found
  3. Status extraction: Reads structured status fields (code, message, details)
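
The polling step can be pictured with the sketch below, which repeatedly calls a log-fetching function until an entry carries a status or a timeout expires. This is not RemoteRolloutProcessor's actual code: fetch_logs is a hypothetical callable standing in for the Fireworks log search, and the assumption that entries are dictionaries with an optional "status" key (holding code, message, and details) is made for illustration.

```python
import time
from typing import Callable, Iterable, Optional


def wait_for_completion(
    fetch_logs: Callable[[str], Iterable[dict]],
    rollout_id: str,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll logs for a rollout until one has a status, or give up.

    Returns the first status found, or None if the timeout elapses.
    """
    deadline = clock() + timeout_s
    while True:
        for entry in fetch_logs(rollout_id):
            status = entry.get("status")
            if status:
                return status  # finished or errored: either ends the wait
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

Injecting sleep and clock keeps the loop easy to test; a caller would typically inspect the returned code to decide whether to load the trace or report the error.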

Example

See the following repo for a simple end-to-end example: