
Behavioral Evals
Run behavioral evaluation suites on CLI agents to catch regressions in tool use, safety, and task completion before promoting model or prompt changes.
npx skills add https://github.com/google-gemini/gemini-cli --skill behavioral-evals

| Installs | 196 |
|---|---|
| Repository | google-gemini/gemini-cli |