Quick start

Get a graded pass/fail report in about 60 seconds — in the browser, or from your terminal / CI / agent.

Option A — the dashboard

1. Open the dashboard.
2. Click a sample from the ShopBot journey, or drop in your own CSV, JSON, or YAML file.
3. Read the overall score and the per-case pass/fail results. Past runs are saved on the right.

Option B — the CLI (zero tokens)

Grade locally — great for CI gates and AI agents. No model calls, just assertions.

npx evaldog run cases.csv            # graded report + score
npx evaldog run cases.csv --min 80   # exit 1 if score < 80 (CI gate)
npx evaldog run cases.csv --json     # machine-readable (for agents)
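
If a script or agent drives the CLI, the --min gate can be wrapped in a few lines. This is a minimal sketch, not part of evaldog: passes_gate and its runner parameter are hypothetical names, and it assumes the command exits 0 when the score meets the threshold (the commands above only document exit 1 below it).

```python
import subprocess


def passes_gate(cases_path, min_score, runner=subprocess.run):
    """Return True when `evaldog run --min` exits with status 0.

    Assumption: a zero exit status means the score met the threshold;
    the quick start only states that the command exits 1 below it.
    `runner` is injectable so the function can be exercised without npx.
    """
    cmd = ["npx", "evaldog", "run", cases_path, "--min", str(min_score)]
    result = runner(cmd)
    return result.returncode == 0
```

Calling passes_gate("cases.csv", 80) runs the same command as the CI gate line above and turns its exit status into a boolean.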

The file format

One row (CSV) or one object (JSON / YAML) per case: the model output you already have, an optional expected value, and an assertion type.

CSV

name,output,expected,assert
Password reset,Click the reset link.,reset link,contains
Refund window,Refunds within 30 days.,30 days,contains
JSON shape,"{""ok"":true}",,is-json
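
The last row above works because the JSON output is wrapped in quotes and its inner quotes are doubled. If you generate the file from code, a standard CSV writer handles that escaping for you. A minimal sketch using Python's csv module; cases_to_csv is a hypothetical helper, not part of evaldog.

```python
import csv
import io
import json

FIELDS = ["name", "output", "expected", "assert"]


def cases_to_csv(cases):
    """Serialize case dicts to CSV text using the header shown above."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for case in cases:
        writer.writerow(case)
    return buf.getvalue()


cases = [
    {"name": "Password reset", "output": "Click the reset link.",
     "expected": "reset link", "assert": "contains"},
    # A JSON string contains double quotes, so the writer quotes the field
    # and doubles each inner quote.
    {"name": "JSON shape", "output": json.dumps({"ok": True}, separators=(",", ":")),
     "expected": "", "assert": "is-json"},
]

csv_text = cases_to_csv(cases)
print(csv_text)
```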

JSON / YAML

cases:
  - name: greeting
    output: "Sure! Happy to help."
    assert:
      - { type: not-empty }
  - name: refund policy
    output: "Refunds are available within 30 days."
    assert:
      - { type: contains, value: "30 days" }
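
The sample above is YAML. As a sketch of the JSON form, the same suite can be built as a plain dictionary and dumped. This assumes the JSON file mirrors the YAML keys (cases, name, output, assert, type, value) one to one, which the quick start implies but does not show.

```python
import json

# Same two cases as the YAML sample; key names are assumed to carry over unchanged.
suite = {
    "cases": [
        {
            "name": "greeting",
            "output": "Sure! Happy to help.",
            "assert": [{"type": "not-empty"}],
        },
        {
            "name": "refund policy",
            "output": "Refunds are available within 30 days.",
            "assert": [{"type": "contains", "value": "30 days"}],
        },
    ]
}

json_text = json.dumps(suite, indent=2)
print(json_text)
```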

Assertions: contains · icontains · equals · regex · is-json · not-empty. evaldog also reads promptfoo-style { tests: [...] } files.
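
To make the assertion names concrete, here is a rough sketch of what each check could look like. The semantics are assumptions read off the names (substring, case-insensitive substring, exact match, regex search, parses as JSON, non-blank), and the score is assumed to be the percentage of passing cases. check and score are hypothetical helpers, not evaldog's implementation.

```python
import json
import re


def check(assert_type, output, value=""):
    """Return True if `output` satisfies the assertion (assumed semantics)."""
    if assert_type == "contains":
        return value in output
    if assert_type == "icontains":
        return value.lower() in output.lower()
    if assert_type == "equals":
        return output == value
    if assert_type == "regex":
        return re.search(value, output) is not None
    if assert_type == "is-json":
        try:
            json.loads(output)
            return True
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            return False
    if assert_type == "not-empty":
        return output.strip() != ""
    raise ValueError(f"unknown assert type: {assert_type}")


def score(cases):
    """Percentage of cases that pass (assumed definition of the score)."""
    if not cases:
        return 0.0
    passed = sum(check(c["assert"], c["output"], c.get("expected", "")) for c in cases)
    return 100.0 * passed / len(cases)


# The three rows from the CSV sample.
sample = [
    {"name": "Password reset", "output": "Click the reset link.",
     "expected": "reset link", "assert": "contains"},
    {"name": "Refund window", "output": "Refunds within 30 days.",
     "expected": "30 days", "assert": "contains"},
    {"name": "JSON shape", "output": '{"ok":true}', "expected": "", "assert": "is-json"},
]
print(score(sample))
```

No model is called anywhere in these checks, which is why grading this way costs zero tokens.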
