If you already have an agent, you can integrate it with Eval Protocol by using the RemoteRolloutProcessor. RemoteRolloutProcessor delegates rollout execution to a remote HTTP service that you control, which makes it useful for running rollouts with your existing agent codebase by wrapping it in an HTTP service.
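As a rough sketch of where this fits, an evaluation simply points its rollout_processor at your deployed service. The dataset path, model, and URL below are placeholders; the full pattern appears in the examples later on this page:

```python
from eval_protocol import EvaluationRow, evaluation_test
from eval_protocol.pytest import RemoteRolloutProcessor

@evaluation_test(
    input_dataset=["dataset.jsonl"],  # placeholder dataset
    completion_params=[{"model": "accounts/fireworks/models/gpt-oss-120b"}],
    # Rollouts are executed by the remote HTTP service you deploy
    rollout_processor=RemoteRolloutProcessor(
        remote_base_url="https://your-server.vercel.app",  # placeholder URL
    ),
)
def test_my_evaluation(row: EvaluationRow) -> EvaluationRow:
    # Your evaluation logic still runs locally on the rows produced by the remote rollouts
    ...
    return row
```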
When making model calls in your remote server, include the rollout metadata in your traces and logs so that eval-protocol can correlate them with the corresponding EvaluationRows during result collection. RemoteRolloutProcessor generates this metadata automatically and sends it to the server with each request, so you don't need to worry about wrangling it yourself.
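For a minimal sketch of what that correlation key looks like on the server side (assuming the FastAPI setup used throughout this page), the rollout_id arrives on InitRequest.metadata and can be attached to anything you log:

```python
import logging

from fastapi import FastAPI
from eval_protocol import InitRequest

app = FastAPI()

@app.post("/init")
def init(req: InitRequest):
    # RemoteRolloutProcessor generates the rollout_id and sends it in the InitRequest;
    # logging it alongside your model calls lets eval-protocol tie those logs back to
    # the corresponding EvaluationRow.
    rollout_id = req.metadata.rollout_id
    logging.getLogger(__name__).info("Starting rollout %s", rollout_id)
    # ... run your agent and make model calls here ...
```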
This section shows how to parse the InitRequest fields and call your model. Note: the model_base_url is a tracing.fireworks.ai URL that proxies your model calls so Fireworks can capture full traces for each rollout. Below is a minimal FastAPI server showing how to wire this together.
remote_server.py
@app.post("/init")def init(req: InitRequest): if not req.messages: raise ValueError("messages is required") model = req.completion_params.get("model") if not model: raise ValueError("model is required in completion_params") # Spread all completion_params (model, temperature, max_tokens, etc.) completion_kwargs = {"messages": req.messages, **req.completion_params} if req.tools: completion_kwargs["tools"] = req.tools # Build OpenAI client from InitRequest # You can also use req.api_key instead of an environment variable if preferred. client = OpenAI( base_url=req.model_base_url, api_key=os.environ.get("FIREWORKS_API_KEY"), ) completion = client.chat.completions.create(**completion_kwargs)
The RemoteRolloutProcessor detects rollout completion by polling structured logs sent to Fireworks Tracing. Your remote server should use the appropriate SDK for your language to emit structured completion statuses.
TypeScript / Node (Vercel)
Python
The eval-protocol JS/TS SDK provides equivalent helpers:
- withFireworksLogging: wraps your handler to automatically send structured logs to Fireworks Tracing.
- createRolloutLogger: creates a rollout-scoped logger tagged with the current rollout_id.
- Status / mapOpenAIErrorToStatus: helpers for emitting structured completion and error statuses.
api/init.ts
```typescript
import type { VercelRequest, VercelResponse } from '@vercel/node';
import {
  initRequestSchema,
  type InitRequest,
  Status,
  createRolloutLogger,
  withFireworksLogging,
} from 'eval-protocol';

async function handler(req: VercelRequest, res: VercelResponse) {
  const initRequest: InitRequest = initRequestSchema.parse(req.body);

  // Extract rollout id from InitRequest payload and create rollout-specific logger
  const rolloutId = initRequest.metadata.rollout_id;
  const logger = createRolloutLogger(rolloutId);

  try {
    // Execute your rollout here

    const status = Status.rolloutFinished();
    logger.info(`Rollout ${rolloutId} completed`, { status });
  } catch (error: any) {
    const status = Status.rolloutInternalError(error.message);
    logger.error(`Rollout ${rolloutId} failed: ${error.message}`, { status });
  }
}

// Export wrapped handler so all logs go to Fireworks Tracing
export default withFireworksLogging(handler);
```
Add FireworksTracingHttpHandler as a logging handler, attach a RolloutIdFilter to the rollout-specific logger, and log completion status using structured Status objects:
remote_server.py
```python
import logging

from eval_protocol import Status, InitRequest, FireworksTracingHttpHandler, RolloutIdFilter

# Configure Fireworks tracing handler
fireworks_handler = FireworksTracingHttpHandler()
logging.getLogger().addHandler(fireworks_handler)

@app.post("/init")
def init(request: InitRequest):
    # Create rollout-specific logger with filter
    rollout_logger = logging.getLogger(f"eval_server.{request.metadata.rollout_id}")
    rollout_logger.addFilter(RolloutIdFilter(request.metadata.rollout_id))

    try:
        # Execute your rollout here

        # Then log successful completion with structured status
        rollout_logger.info(
            f"Rollout {request.metadata.rollout_id} completed",
            extra={"status": Status.rollout_finished()},
        )
    except Exception as e:
        # Log errors with structured status
        rollout_logger.error(
            f"Rollout {request.metadata.rollout_id} failed: {e}",
            extra={"status": Status.rollout_error(str(e))},
        )
```
If you execute the rollout in a separate process, set EP_ROLLOUT_ID and attach the tracing handler inside the child process, not in the request handler:

```python
import os
import logging
import multiprocessing

from eval_protocol import FireworksTracingHttpHandler, InitRequest

def execute_rollout_step_sync(request: InitRequest):
    # Set in the CHILD process
    os.environ["EP_ROLLOUT_ID"] = request.metadata.rollout_id
    logging.getLogger().addHandler(FireworksTracingHttpHandler())

    # Execute your rollout here

@app.post("/init")
async def init(request: InitRequest):
    # Do NOT set EP_ROLLOUT_ID here; set it in the child
    p = multiprocessing.Process(
        target=execute_rollout_step_sync,
        args=(request,),
    )
    p.start()
```
One other use case the RemoteRolloutProcessor enables is fine-tuning on multi-agent setups by storing artifacts. For example, say you have a Deep Research multi-agent setup and you want to fine-tune the first subagent, but the evaluation is on the artifact produced at the end of the entire multi-agent pipeline, e.g. the final deep research output. To accomplish this, you must store the artifact and correlate it with the subagent's output. Pass custom data in the extras field when logging completion status:
Only log Status.rolloutFinished() after the entire pipeline completes and the final artifact is ready, not immediately when the subagent finishes. The completion log signals that evaluation can begin, so it must include all artifacts needed for scoring.
TypeScript / Node (Vercel)
Python
api/init.ts
```typescript
// Wait for full pipeline to complete, then log with extras
const status = Status.rolloutFinished();
logger.info(`Rollout ${rolloutId} completed`, {
  status,
  extras: {
    messages: result.messages,
    research_report: result.report,
    sources: result.citations,
  },
});
```
remote_server.py
```python
# Wait for full pipeline to complete, then log with extras
rollout_logger.info(
    f"Rollout {request.metadata.rollout_id} completed",
    extra={
        "status": Status.rollout_finished(),
        "extras": {
            "messages": result.messages,
            "research_report": result.report,
            "sources": result.citations,
        },
    },
)
```
The RemoteRolloutProcessor automatically extracts extras from the completion log and stores them in row.execution_metadata.extra. Access them in your evaluation function:
test_deep_research.py
```python
from pathlib import Path

from eval_protocol import EvaluationRow, evaluation_test
from eval_protocol.pytest import RemoteRolloutProcessor

@evaluation_test(
    input_dataset=[str(Path(__file__).parent / "research_dataset.jsonl")],
    completion_params=[{"model": "accounts/fireworks/models/gpt-oss-120b"}],
    rollout_processor=RemoteRolloutProcessor(
        remote_base_url="https://your-server.vercel.app",
    ),
)
async def deep_research_evaluation(row: EvaluationRow) -> EvaluationRow:
    # Access artifacts stored by your remote server
    research_report = row.execution_metadata.extra["research_report"]
    sources = row.execution_metadata.extra["sources"]

    # Evaluate the final artifact (evaluate_report_quality is your own scoring helper)
    row.evaluation_result = evaluate_report_quality(research_report, sources)
    return row
```
Rollouts can fail for various reasons: your remote server might crash, tracing might fail, or the model might not produce an assistant response. The RemoteRolloutProcessor automatically detects these failures and sets row.rollout_status accordingly.

The most common failure is when the rollout produces no assistant response. The SDK detects this and sets row.rollout_status to Internal (13) with the message “Rollout finished with the same number of messages as the original row”.
Even when a rollout fails, your evaluation function is still called—giving you control over how to handle errors.
Best practice: Check row.rollout_status.is_error() at the start of your evaluation function to catch failed rollouts. This method returns True when the status code is INTERNAL:
test_my_eval.py
```python
from eval_protocol import EvaluateResult, EvaluationRow, evaluation_test
from eval_protocol.pytest import RemoteRolloutProcessor

@evaluation_test(
    input_dataset=["dataset.jsonl"],
    rollout_processor=RemoteRolloutProcessor(
        remote_base_url="https://your-server.vercel.app",
    ),
    completion_params=[{"model": "accounts/fireworks/models/gpt-oss-120b"}],
)
def test_my_evaluation(row: EvaluationRow) -> EvaluationRow:
    # Check if rollout failed with internal error (e.g., no assistant response)
    if row.rollout_status.is_error():
        row.evaluation_result = EvaluateResult(
            score=0.0,
            reason=f"Rollout failed: {row.rollout_status.message}",
            is_score_valid=False,
        )
        return row

    # Proceed with normal evaluation logic
    ...
```
This catches the most common failure mode: when your remote server fails to produce an assistant response or encounters an internal error.

Alternative: check the messages directly instead of relying on the SDK’s error detection:
test_my_eval.py
```python
def test_my_evaluation(row: EvaluationRow) -> EvaluationRow:
    # Check if no assistant response was produced
    if not row.messages or row.messages[-1].role != "assistant":
        row.evaluation_result = EvaluateResult(
            score=0.0,
            reason="No assistant response - rollout may have failed",
            is_score_valid=False,
        )
        return row

    # Proceed with normal evaluation logic
    ...
```
This approach doesn’t depend on the SDK’s status detection and directly validates that an assistant response exists.