Eval Recipes Runner

Name: Eval Recipes Runner
Author: rysweet

Run standardized evaluation recipes against LLM prompts, agents, and tool chains to catch regressions, compare model versions, and gate releases with repeatable quality checks.

npx skills add https://github.com/rysweet/amplihack --skill eval-recipes-runner
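
The skill's own recipe format and commands are not shown on this page. As a generic illustration of the regression gating described above, here is a minimal Python sketch. Every name in it (`EvalCase`, `run_recipe`, `gate_release`, and the two stand-in models) is hypothetical and is not taken from the amplihack repository.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One hypothetical recipe case: a prompt plus a check on the output."""
    prompt: str
    check: Callable[[str], bool]


def run_recipe(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose output passes its check."""
    if not cases:
        raise ValueError("recipe has no cases")
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def gate_release(candidate_rate: float, baseline_rate: float,
                 tolerance: float = 0.0) -> bool:
    """Allow a release only if the candidate stays within tolerance of the baseline."""
    return candidate_rate >= baseline_rate - tolerance


# Stand-in "models": plain functions, so the sketch runs without any LLM API.
def baseline_model(prompt: str) -> str:
    return prompt.upper()


def candidate_model(prompt: str) -> str:
    # Deliberately fails on one input to show a regression being caught.
    return "" if "beta" in prompt else prompt.upper()


cases = [
    EvalCase("alpha", lambda out: out == "ALPHA"),
    EvalCase("beta", lambda out: out == "BETA"),
]

baseline_rate = run_recipe(baseline_model, cases)
candidate_rate = run_recipe(candidate_model, cases)
release_ok = gate_release(candidate_rate, baseline_rate)
```

Running the same cases against both versions is what makes the check repeatable: the gate compares pass rates rather than individual outputs, so a nonzero `tolerance` can absorb small, expected variation.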

Installs	119
Repository	rysweet/amplihack

Related skills

Agent Browser: Drive the agent-browser CLI to open local or remote pages, click, fill forms, screenshot, and sanity-check dev servers after npm run dev. (404k, 187)

Tdd: Run a red-green-refactor loop while building features or fixing bugs so tests describe behavior through public APIs instead of brittle implementation checks. (260k, 121k)

Use My Browser: Use the user's live browser session for logged-in flows, DevTools context, and rendered DOM inspection. (219k, 61)

Test Driven Development: Make your coding agent follow a strict test-first red-green-refactor loop before writing production code for any feature, bugfix, or refactor. (132k, 221k)

Verification Before Completion: Install this when you want your coding agent to prove tests, lint, and builds actually pass before it says a task is done or opens a PR. (113k, 221k)

Webapp Testing: Verify local web apps with Python Playwright (UI behavior, screenshots, and browser logs), with optional managed dev-server startup. (98.2k, 148k)

Testing & QA, agents, llm, automation
