# Eval Protocol

## Docs

- [Setup for Eval Protocol Development](https://evalprotocol.io/authoring-with-ai-agents.md): Configuring MCP servers for AI coding agents
- [Common Errors](https://evalprotocol.io/common-errors.md): Quick fixes for the most frequent evaluation issues
- [GEPA Prompt Optimizer](https://evalprotocol.io/integrations/gepa-trainer.md): Automatically optimize prompts using your existing evaluations
- [Klavis MCP Environments](https://evalprotocol.io/integrations/klavis-mcp.md): How to use Klavis with Eval Protocol
- [OpenAI RFT Trainer](https://evalprotocol.io/integrations/openai-rft-trainer.md): Reuse Eval Protocol evaluation tests as Python graders for OpenAI Reinforcement Fine-Tuning (RFT)
- [OpenEnv Environments](https://evalprotocol.io/integrations/openenv-rollout-processor.md): Use any OpenEnv HTTP environment with Eval Protocol via a single rollout processor
- [rLLM Trainer](https://evalprotocol.io/integrations/rllm-trainer.md): Reuse Eval Protocol environments and evaluation tests as workflows inside the rLLM reinforcement learning framework
- [Training with TRL](https://evalprotocol.io/integrations/trl-trainer.md): Connect environments to TRL to train language models
- [Introduction to Eval Protocol (EP)](https://evalprotocol.io/introduction.md)
- [MCP Control/Data Planes](https://evalprotocol.io/mcp-extensions.md)
- [Fine Tuning an SVGAgent with Eval Protocol](https://evalprotocol.io/quickstart.md): Train and improve an SVG generation agent using reinforcement fine tuning with Eval Protocol
- [CLI](https://evalprotocol.io/reference/cli.md)
- [Data Loader](https://evalprotocol.io/reference/data-loader.md): Load evaluation data using DynamicDataLoader and InlineDataLoader for reusable, parameterized inputs
- [@evaluation_test](https://evalprotocol.io/reference/evaluation-test.md): Create pytest-based evaluation tests for AI model evaluation with support for pointwise, groupwise, and all modes
- [Rollout Processors](https://evalprotocol.io/reference/rollout-processors.md): Overview of built-in rollout processors, their configs, and when to use each
- [Simulated Users](https://evalprotocol.io/simulated-users.md)
- [Specification](https://evalprotocol.io/specification.md)
- [Evaluation Tests (Getting Started)](https://evalprotocol.io/tutorial/evaluation-tests-getting-started.md): Write your first @evaluation_test in a few minutes, then scale up to real benchmarks
- [Running Rollouts with GitHub Actions](https://evalprotocol.io/tutorial/github-actions-rollout.md)
- [GSM8K Fine-tuning Quickstart (Small Model)](https://evalprotocol.io/tutorial/gsm8k-finetuning-quickstart.md): Run pytest to materialize the evaluator and dataset, then launch a local Reinforcement Fine-Tuning job on a small model
- [Running Rollouts with a Remote Server](https://evalprotocol.io/tutorial/remote-rollout-processor.md)
- [Rollout Processors (Getting Started)](https://evalprotocol.io/tutorial/rollout-processors-getting-started.md): Pick the right rollout processor for your eval and wire it up with minimal boilerplate
- [Starting the UI](https://evalprotocol.io/tutorial/ui/getting-started.md)
- [Pivot View](https://evalprotocol.io/tutorial/ui/pivot.md)
- [Table View](https://evalprotocol.io/tutorial/ui/table.md)

## OpenAPI Specs

- [openapi](https://evalprotocol.io/openapi.yml)