Table tab is selected in the top left corner of the UI.
Make sure you are in the Table View by checking the top left corner of the UI for the Table tab.
running, finished, stopped, or error
evaluation_result.score
List of evaluation rows.
Click on a row to inspect the evaluation.
On the left side of an expanded row, you can see the chat interface.
On the right side of an expanded row, you can see the metadata.
The filter section above the table.
Click on the funnel icon next to the invocation ID to filter the table by invocation ID.
+ Add Filter Group button above the table. Then you can choose to filter by AND or OR and add filters to the group by clicking on the + Add Filter to Group button.
Click on the + Add Filter Group button above the table to create a custom filter. In this example, we filter for scores equal to 0, models with gpt in the name, and a specific run_id.
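To make the AND/OR grouping concrete, here is a minimal Python sketch of how a filter group like the one in this example could be evaluated. It is an illustration only, not the UI's actual implementation: the `filter_group` helper, the row dictionaries, and the field names `score`, `model`, and `run_id` as flat keys are all assumptions made for this example.

```python
from typing import Any, Callable

Row = dict[str, Any]
Predicate = Callable[[Row], bool]


def filter_group(mode: str, predicates: list[Predicate]) -> Predicate:
    """Combine predicates with AND or OR (hypothetical helper, not a UI API)."""
    if mode not in ("AND", "OR"):
        raise ValueError(f"mode must be 'AND' or 'OR', got {mode!r}")
    combine = all if mode == "AND" else any
    return lambda row: combine(p(row) for p in predicates)


# Hypothetical evaluation rows; real rows carry more fields than this.
rows: list[Row] = [
    {"score": 0.0, "model": "gpt-4o", "run_id": "run-a"},
    {"score": 1.0, "model": "gpt-4o", "run_id": "run-a"},
    {"score": 0.0, "model": "llama-3", "run_id": "run-a"},
    {"score": 0.0, "model": "gpt-4o-mini", "run_id": "run-b"},
]

# AND group: score equal to 0, "gpt" in the model name, and a specific run_id.
failing_gpt_in_run = filter_group(
    "AND",
    [
        lambda r: r["score"] == 0,
        lambda r: "gpt" in r["model"],
        lambda r: r["run_id"] == "run-a",
    ],
)
and_matches = [r for r in rows if failing_gpt_in_run(r)]

# OR group: a row matches if it satisfies at least one of the filters.
zero_or_run_b = filter_group(
    "OR",
    [
        lambda r: r["score"] == 0,
        lambda r: r["run_id"] == "run-b",
    ],
)
or_matches = [r for r in rows if zero_or_run_b(r)]
```

With AND, every filter in the group must hold for a row to stay in the table; with OR, a single matching filter is enough, so OR groups typically keep more rows.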
@evaluation_test, the UI automatically shows running tests, and you can watch rollouts live in the chat interface. When a test finishes, detailed evaluation results appear to the right of the chat.
Check out this example of a test running in VSCode and the UI updating with the rollout.
Expand running rows to see the chat interface update with the rollout.