# CLI

The `ep` command-line interface can inspect evaluation runs locally, upload evaluators, and create reinforcement fine-tuning jobs on Fireworks.

```bash theme={null}
ep [global options] <command> [command options]
```

## Global Options

These options can be used with any command:

<ParamField path="--verbose" type="boolean" default="false">
  Enable verbose logging (Aliases: `-v`)
</ParamField>

<ParamField path="--server">
  Fireworks API server hostname or URL (e.g., `dev.api.fireworks.ai` or `https://dev.api.fireworks.ai`)
</ParamField>
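
As a minimal sketch of the synopsis above, global options sit between `ep` and the subcommand, and command options follow the subcommand. The `run` helper and its `RUN` guard variable are illustrative assumptions, not part of `ep`: the helper prints the command and executes it only when `RUN=1` is set.

```bash
# Hypothetical helper (not part of ep): print the command, and execute it
# only when RUN=1 is set in the environment.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Global options (--verbose, --server) come before the subcommand;
# command options (--path) come after it.
run ep --verbose --server dev.api.fireworks.ai upload --path .
```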

## Commands

### `ep logs`

Serve the logs UI with file watching and real-time updates.

<ParamField path="--port" type="number" default="8000">
  Port to bind to (default: 8000)
</ParamField>

<ParamField path="--debug" type="boolean" default="false">
  Enable debug mode
</ParamField>

<ParamField path="--disable-elasticsearch-setup" type="boolean" default="false">
  Disable Elasticsearch setup
</ParamField>

<ParamField path="--use-env-elasticsearch-config" type="boolean" default="false">
  Use environment variables for Elasticsearch config (requires `ELASTICSEARCH_URL`, `ELASTICSEARCH_API_KEY`, and `ELASTICSEARCH_INDEX_NAME`)
</ParamField>

<ParamField path="--use-fireworks" type="boolean" default="false">
  Force Fireworks tracing backend for logs UI (overrides env auto-detection)
</ParamField>

<ParamField path="--use-elasticsearch" type="boolean" default="false">
  Force Elasticsearch backend for logs UI (overrides env auto-detection)
</ParamField>
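
A possible invocation, sketched under assumptions: the three environment variable values below are placeholders, and the `run` helper (with its `RUN` guard variable) is hypothetical and only prints the command unless `RUN=1` is set.

```bash
# Hypothetical helper (not part of ep): print the command, and execute it
# only when RUN=1 is set in the environment.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Placeholder connection settings; replace them with your own values.
export ELASTICSEARCH_URL="https://localhost:9200"
export ELASTICSEARCH_API_KEY="replace-me"
export ELASTICSEARCH_INDEX_NAME="ep-logs"

# Serve the logs UI on port 8080 and read the Elasticsearch config
# from the environment variables exported above.
run ep logs --port 8080 --use-env-elasticsearch-config
```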

### `ep upload`

Scan for evaluation tests, select one or more, and upload them as Fireworks evaluators.

<ParamField path="--path" type="string" default=".">
  Path to search for evaluation tests (default: current directory)
</ParamField>

<ParamField path="--entry">
  Entrypoint of the evaluation test to upload (`module:function` or `path::function`). Separate multiple entrypoints with commas.
</ParamField>

<ParamField path="--yes" type="boolean" default="false">
  Non-interactive: upload all discovered evaluation tests (Aliases: `-y`)
</ParamField>

<ParamField path="--env-file">
  Path to .env file containing secrets to upload (default: .env in current directory)
</ParamField>

<ParamField path="--force" type="boolean" default="false">
  Overwrite existing evaluator with the same ID
</ParamField>

<ParamField path="--evaluator-default-dataset" type="string">
  Default dataset to use with this evaluator (Aliases: `--default-dataset`)
</ParamField>

<ParamField path="--evaluator-description" type="string">
  Description for evaluator (Aliases: `--description`)
</ParamField>

<ParamField path="--evaluator-display-name" type="string">
  Display name for evaluator (defaults to ID) (Aliases: `--name`, `--display-name`)
</ParamField>

<ParamField path="--evaluator-entry-point" type="string">
  Pytest-style entrypoint (e.g., `test_file.py::test_func`). Auto-detected if not provided. (Aliases: `--entry-point`)
</ParamField>

<ParamField path="--evaluator-requirements" type="string">
  Requirements for evaluator (auto-detected from requirements.txt if not provided) (Aliases: `--requirements`)
</ParamField>

<ParamField path="--evaluator-id" type="string">
  Evaluator ID to use (if multiple selections, a numeric suffix is appended) (Aliases: `--id`)
</ParamField>
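
Two ways the options above might be combined, as a sketch. The evaluator ID `my-evaluator` is a placeholder, and the `run` helper (with its `RUN` guard variable) is hypothetical: it prints each command and executes it only when `RUN=1` is set.

```bash
# Hypothetical helper (not part of ep): print the command, and execute it
# only when RUN=1 is set in the environment.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Upload one specific test under a chosen evaluator ID, overwriting any
# existing evaluator with that ID.
run ep upload --path . --entry test_file.py::test_func --id my-evaluator --force

# Non-interactive: upload every evaluation test discovered under the path.
run ep upload --path . --yes
```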

### `ep create rft`

Create a reinforcement fine-tuning job on Fireworks.

<ParamField path="--yes" type="boolean" default="false">
  Non-interactive mode (Aliases: `-y`)
</ParamField>

<ParamField path="--dry-run" type="boolean" default="false">
  Print planned SDK call without sending
</ParamField>

<ParamField path="--force" type="boolean" default="false">
  Overwrite existing evaluator with the same ID
</ParamField>

<ParamField path="--skip-validation" type="boolean" default="false">
  Skip local dataset/evaluator validation
</ParamField>

<ParamField path="--ignore-docker" type="boolean" default="false">
  Ignore Dockerfile even if present; run pytest on host during evaluator validation
</ParamField>

<ParamField path="--docker-build-extra" type="string" default="">
  Extra flags to pass to 'docker build' when validating evaluator (quoted string, e.g. "--no-cache --pull --progress=plain")
</ParamField>

<ParamField path="--docker-run-extra" type="string" default="">
  Extra flags to pass to 'docker run' when validating evaluator (quoted string, e.g. "--env-file .env --memory=8g")
</ParamField>

<ParamField path="--env-file">
  Path to .env file containing secrets to upload to Fireworks (default: .env in project root)
</ParamField>

<ParamField path="--source-job">
  The source reinforcement fine-tuning job to copy configuration from. If other flags are set, they will override the source job's configuration.
</ParamField>

<ParamField path="--quiet" type="boolean" default="false">
  If set, only errors will be printed.
</ParamField>

<ParamField path="--dataset" type="string">
  The name of the dataset used for training.
</ParamField>

<ParamField path="--evaluator" type="string">
  The evaluator resource name to use for the RLOR fine-tuning job.
</ParamField>

<ParamField path="--reinforcement-fine-tuning-job-id" type="string">
  ID of the reinforcement fine-tuning job. If not specified, a random UUID is generated. (Aliases: `--job-id`)
</ParamField>

<ParamField path="--chunk-size" type="number">
  Chunk size for rollout data. Defaults to 200. Chunking is enabled when the dataset size exceeds 300. Valid range: 1-10,000.
</ParamField>

<ParamField path="--eval-auto-carveout" type="boolean" default="false">
  Whether to auto-carve the dataset for eval.
</ParamField>

<ParamField path="--evaluation-dataset" type="string">
  The name of a separate dataset to use for evaluation.
</ParamField>

<ParamField path="--inference-parameters-extra-body" type="string">
  Additional parameters for the inference request as a JSON string. For example:
  `{"stop": ["\n"]}`. (Aliases: `--extra-body`)
</ParamField>

<ParamField path="--inference-parameters-max-output-tokens" type="number">
  Maximum number of tokens to generate per response. (Aliases: `--max-output-tokens`)
</ParamField>

<ParamField path="--inference-parameters-response-candidates-count" type="number">
  Number of response candidates to generate per input. (Aliases: `--response-candidates-count`)
</ParamField>

<ParamField path="--inference-parameters-temperature" type="number">
  Sampling temperature, typically between 0 and 2. (Aliases: `--temperature`)
</ParamField>

<ParamField path="--inference-parameters-top-k" type="number">
  Top-k sampling parameter, limits the token selection to the top k tokens. (Aliases: `--top-k`)
</ParamField>

<ParamField path="--inference-parameters-top-p" type="number">
  Top-p sampling parameter, typically between 0 and 1. (Aliases: `--top-p`)
</ParamField>

<ParamField path="--loss-config-kl-beta" type="number">
  KL coefficient (beta) override for GRPO-like methods. If unset, the trainer
  default is used. (Aliases: `--rl-kl-beta`, `--kl-beta`)
</ParamField>

<ParamField path="--loss-config-method" type="Literal">
  RL loss method for underlying trainers. One of `grpo` or `dapo`. (Aliases: `--rl-loss-method`, `--method`)
</ParamField>

<ParamField path="--mcp-server" type="string">
  The MCP server resource name to use for the reinforcement fine-tuning job. (Optional)
</ParamField>

<ParamField path="--node-count" type="number">
  The number of nodes to use for the fine-tuning job. If not specified, the default is 1. (Aliases: `--nodes`)
</ParamField>

<ParamField path="--training-config-base-model" type="string">
  The name of the base model to be fine-tuned. Only one of `base_model` or
  `warm_start_from` should be specified. (Aliases: `--base-model`)
</ParamField>

<ParamField path="--training-config-batch-size" type="number">
  The maximum packed number of tokens per batch for training in sequence packing. (Aliases: `--batch-size`)
</ParamField>

<ParamField path="--training-config-epochs" type="number">
  The number of epochs to train for. (Aliases: `--epochs`)
</ParamField>

<ParamField path="--training-config-gradient-accumulation-steps" type="number">
  The number of batches to accumulate gradients before updating the model parameters. The effective batch size will be batch-size multiplied by this value. (Aliases: `--gradient-accumulation-steps`)
</ParamField>

<ParamField path="--training-config-learning-rate" type="number">
  The learning rate used for training. (Aliases: `--learning-rate`)
</ParamField>

<ParamField path="--training-config-learning-rate-warmup-steps" type="number">
  The number of learning rate warmup steps for the reinforcement fine-tuning job. (Aliases: `--learning-rate-warmup-steps`)
</ParamField>

<ParamField path="--training-config-lora-rank" type="number">
  The rank of the LoRA layers. (Aliases: `--lora-rank`)
</ParamField>

<ParamField path="--training-config-max-context-length" type="number">
  The maximum context length to use with the model. (Aliases: `--max-context-length`)
</ParamField>

<ParamField path="--training-config-output-model" type="string">
  The model ID to be assigned to the resulting fine-tuned model.

  If not specified, the job ID will be used. (Aliases: `--output-model`)
</ParamField>

<ParamField path="--training-config-warm-start-from" type="string">
  The PEFT addon model in Fireworks format to be fine-tuned from. Only one of
  `base_model` or `warm_start_from` should be specified. (Aliases: `--warm-start-from`)
</ParamField>

<ParamField path="--wandb-config-api-key" type="string">
  The API key for the wandb service. (Aliases: `--wandb-api-key`, `--api-key`)
</ParamField>

<ParamField path="--wandb-config-enabled" type="boolean" default="false">
  Whether to enable wandb logging. (Aliases: `--wandb`, `--enabled`)
</ParamField>

<ParamField path="--wandb-config-entity" type="string">
  The entity name for the wandb service. (Aliases: `--wandb-entity`, `--entity`)
</ParamField>

<ParamField path="--wandb-config-project" type="string">
  The project name for the wandb service. (Aliases: `--wandb-project`, `--project`)
</ParamField>
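
The following sketch shows how a dry run might be assembled from the options above. The dataset, evaluator, and base model names are placeholders (real resource names will differ), the numeric values are illustrative only, and the `run` helper with its `RUN` guard variable is hypothetical. It also works through the effective batch size described under `--gradient-accumulation-steps`.

```bash
# Hypothetical helper (not part of ep): print the command, and execute it
# only when RUN=1 is set in the environment. The printed line is for review
# only; it does not preserve shell quoting.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Single quotes keep the JSON, including its \n escape, intact for the shell.
extra_body='{"stop": ["\n"]}'

# Effective batch size = batch size x gradient accumulation steps.
batch_size=32768      # illustrative value
grad_accum_steps=4    # illustrative value
echo "effective batch size: $((batch_size * grad_accum_steps))"

# --dry-run prints the planned SDK call without sending it.
# Only --base-model is given here, not --warm-start-from.
run ep create rft --dry-run \
  --dataset my-dataset \
  --evaluator my-evaluator \
  --base-model my-base-model \
  --epochs 2 \
  --batch-size "$batch_size" \
  --gradient-accumulation-steps "$grad_accum_steps" \
  --extra-body "$extra_body"
```

Dropping `--dry-run` (and setting `RUN=1` in this sketch) would submit the job instead of printing the planned call.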

### `ep local-test`

Select an evaluation test and run it locally. If a Dockerfile exists, build and run via Docker; otherwise run on host.

<ParamField path="--entry">
  Entrypoint to run (path::function or path). If not provided, a selector will be shown (unless --yes).
</ParamField>

<ParamField path="--ignore-docker" type="boolean" default="false">
  Ignore Dockerfile even if present; run pytest on host
</ParamField>

<ParamField path="--yes" type="boolean" default="false">
  Non-interactive: if multiple tests exist and no --entry, fails with guidance (Aliases: `-y`)
</ParamField>

<ParamField path="--docker-build-extra" type="string" default="">
  Extra flags to pass to 'docker build' (quoted string, e.g. "--no-cache --pull --progress=plain")
</ParamField>

<ParamField path="--docker-run-extra" type="string" default="">
  Extra flags to pass to 'docker run' (quoted string, e.g. "--env-file .env --memory=8g")
</ParamField>
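
A sketch of passing extra Docker flags, reusing the example flag strings from the descriptions above. Each set of extra flags is passed as one quoted string. The `run` helper and its `RUN` guard variable are hypothetical; the helper prints the command and executes it only when `RUN=1` is set.

```bash
# Hypothetical helper (not part of ep): print the command, and execute it
# only when RUN=1 is set in the environment. The printed line is for review
# only; it does not preserve shell quoting.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Each variable holds a full flag string; quoting "$var" below keeps it
# as a single argument to ep.
build_flags='--no-cache --pull --progress=plain'
run_flags='--env-file .env --memory=8g'

# Run one test through Docker with the extra build and run flags.
run ep local-test --entry test_file.py::test_func \
  --docker-build-extra "$build_flags" \
  --docker-run-extra "$run_flags"

# Run the same test on the host, ignoring any Dockerfile.
run ep local-test --entry test_file.py::test_func --ignore-docker
```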
