Local-first prompt evaluation

Prompt Tournament Runner

Compare 3–5 prompt variants on the same task, score the outputs 1–5, pick a winner, and keep a local record so you never have to re-litigate the same prompt choice. No account, no API keys, no telemetry: it runs entirely on your machine.

Quick start

npm ci
npm run serve        # builds, then serves the UI on http://localhost:8790

# reachable over LAN / Tailscale so you can score from your phone:
npm run serve -- --host 0.0.0.0 --port 8790

The loop

Capture the task (and optional context)
Enter 3–5 prompt variants
Run them yourself in whatever model or tool you use, then paste the outputs back in
Score each 1–5 and pick a winner
The run is saved locally for later review
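
To make the loop concrete, here is a minimal sketch of what one run could look like as data. The type names, field names, and the tie-breaking rule are assumptions made for illustration; the app's actual schema is not described on this page, and in the app you pick the winner yourself rather than having it computed.

```typescript
// Hypothetical shapes for illustration only; not the app's documented schema.
interface Variant {
  label: string;   // e.g. "A", "B", "C"
  prompt: string;  // the prompt variant text
  output: string;  // the output you pasted back in
  score: number;   // integer from 1 to 5
}

interface TournamentRun {
  task: string;
  context?: string;  // optional context captured with the task
  variants: Variant[];
  winner: string;    // label of the winning variant
  savedAt: string;   // ISO 8601 timestamp
}

// Enforce the limits stated in the loop: 3-5 variants, scores 1-5.
function validateVariants(variants: Variant[]): void {
  if (variants.length < 3 || variants.length > 5) {
    throw new Error(`expected 3-5 variants, got ${variants.length}`);
  }
  for (const v of variants) {
    if (!Number.isInteger(v.score) || v.score < 1 || v.score > 5) {
      throw new Error(`score for "${v.label}" must be an integer from 1 to 5`);
    }
  }
}

// Suggest a winner: the highest score, with the earliest variant winning ties.
// (Assumed rule; a human may reasonably overrule it.)
function suggestWinner(variants: Variant[]): Variant {
  validateVariants(variants);
  return variants.reduce((best, v) => (v.score > best.score ? v : best));
}

// Assemble a record that could be saved for later review.
function buildRun(task: string, variants: Variant[], context?: string): TournamentRun {
  const winner = suggestWinner(variants);
  return {
    task,
    ...(context !== undefined ? { context } : {}),
    variants,
    winner: winner.label,
    savedAt: new Date().toISOString(),
  };
}
```

Keeping the pasted output and the score on the same record as the prompt means a saved run can be reviewed later without the original chat session.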

Use it for

choosing a system prompt with evidence, not vibes
comparing outputs across models or temperatures
prompt regression checks before you ship
scoring a session from your phone over Tailscale

Privacy

100% local. The app never calls an external API and has no analytics. Runs are written to .prompt-tournament-runs.json in your working directory — yours to keep, diff, or delete.
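
Because the runs file is plain JSON on disk, it can be inspected with a few lines of your own code. The sketch below checks whether a task has been run before. It assumes the file holds a JSON array of objects that each have a string `task` field; that layout is a guess for illustration, so check your own file before relying on it. `StoredRun`, `isStoredRun`, and `findPriorRuns` are hypothetical names, not part of the app.

```typescript
// Assumed layout: a JSON array of run objects, each with a string `task` field.
interface StoredRun {
  task: string;
  [key: string]: unknown; // any other fields are passed through untouched
}

// Type guard: true only for non-null objects carrying a string `task`.
function isStoredRun(value: unknown): value is StoredRun {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { task?: unknown }).task === "string"
  );
}

// Return earlier runs whose task matches, ignoring case and surrounding spaces.
// Unparseable text or a non-array top level yields an empty result.
function findPriorRuns(fileText: string, task: string): StoredRun[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(fileText);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  const wanted = task.trim().toLowerCase();
  return parsed
    .filter(isStoredRun)
    .filter((run) => run.task.trim().toLowerCase() === wanted);
}
```

In Node, the text argument could come from `readFileSync(".prompt-tournament-runs.json", "utf8")` (from `node:fs`), run in the same working directory as the app.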