
Agent Eval
Benchmark coding agents against real repos, compare tool-use accuracy, and score regressions before shipping autonomous dev workflows.
npx skills add https://github.com/colbymchenry/codegraph --skill agent-eval

| Installs | 821 |
|---|---|
| Repository | colbymchenry/codegraph |