
Advanced Evaluation
Run rigorous multi-criteria evaluation of agent outputs, tool chains, or candidate solutions before release, scoring quality, safety, and task completion rather than relying on smoke tests alone.
npx skills add https://github.com/shipshitdev/library --skill advanced-evaluation

| Installs | 122 |
|---|---|
| Repository | shipshitdev/library |