
Golden Dataset
Create curated evaluation datasets to benchmark LLM outputs, regression-test prompts, and validate agent behavior before shipping AI features to production users.
npx skills add https://github.com/yonatangross/orchestkit --skill golden-dataset

| Installs | 136 |
|---|---|
| Repository | yonatangross/orchestkit |
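
To make the idea of regression-testing prompts against a curated dataset concrete, here is a minimal sketch in Python. It is not the skill's actual format or API: `GoldenCase`, `run_regression`, `normalize`, and `fake_model` are hypothetical names chosen for illustration, and exact match after normalization is only one possible scoring rule.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class GoldenCase:
    """One curated example: a prompt and the answer we expect (hypothetical schema)."""
    case_id: str
    prompt: str
    expected: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not fail a case."""
    return " ".join(text.lower().split())


def run_regression(cases: Iterable[GoldenCase], generate: Callable[[str], str]) -> list[str]:
    """Return the ids of cases whose generated output differs from the expected answer."""
    failures = []
    for case in cases:
        if normalize(generate(case.prompt)) != normalize(case.expected):
            failures.append(case.case_id)
    return failures


# A tiny illustrative dataset; real golden datasets are curated and reviewed by hand.
GOLDEN_CASES = [
    GoldenCase("capital-fr", "What is the capital of France?", "Paris"),
    GoldenCase("sum-2-2", "What is 2 + 2?", "4"),
]


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: swap in your own client here)."""
    answers = {
        "What is the capital of France?": "  paris ",
        "What is 2 + 2?": "5",  # deliberately wrong, to show a regression being caught
    }
    return answers.get(prompt, "")


failed = run_regression(GOLDEN_CASES, fake_model)
pass_rate = 1 - len(failed) / len(GOLDEN_CASES)
print(f"failed cases: {failed}, pass rate: {pass_rate:.0%}")
```

In practice the list of failing ids would gate a release: a prompt or model change ships only if the pass rate stays at or above the previous baseline.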