
Guideline Generation
Score confidence on each section of AI-generated brand guidelines so solo builders know what to trust before publishing voice, tone, and positioning docs.
Overview
Guideline-generation is an agent skill most often used in Build (also Validate, Launch). It labels each section of generated brand guidelines as High, Medium, or Low confidence based on how well the sources corroborate it.
Install
npx skills add https://github.com/anthropics/knowledge-work-plugins --skill guideline-generation
What is this skill?
- Three-tier confidence model (High, Medium, Low) with explicit criteria per tier
- Requires corroborating sources, authority weighting, and conflict resolution before marking a section actionable
- Flags single-source and inferred sections for mandatory team review
- Applies to voice, tone, social, and competitive positioning blocks in generated guidelines
- 3 confidence levels (High, Medium, Low)
- High tier requires meeting at least 3 of 5 criteria
- Low tier requires at least 2 of 5 risk criteria
Adoption & trust: 1.6k installs on skills.sh; 19.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have AI-drafted brand guidelines but no structured way to tell which voice, tone, or positioning rules are evidence-backed versus guesswork.
Who is it for?
Indie founders or one-person marketing teams who generate brand guides from scattered internal docs and need a consistent rubric before sharing with contractors or agents.
Skip if: Your team already has a signed-off enterprise brand portal with legal approval. In that case, skip scoring and enforce the official PDF instead.
When should I use this skill?
After an agent drafts brand guideline sections from documents and conversation analysis, before publishing or handing off to downstream copy tasks.
What do I get? / Deliverables
Each guideline section carries a defensible confidence label and review guidance so you publish only well-supported rules and queue weak sections for team confirmation.
- Per-section confidence labels (High/Medium/Low)
- Rationale tied to source count and conflicts
- Explicit team-review flags for Low confidence blocks
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Build → docs because the output is structured brand-guideline artifacts teams store alongside product docs. Guideline sections (voice, tone, positioning) are documentation deliverables even when sourced from sales calls and Slack—not a one-off landing test.
Where it fits
Rank positioning statements before committing to a landing narrative.
Attach confidence tiers to each section of an internal brand kit for contractors.
Decide which tone rules are safe for social posts versus needing founder review.
Re-score guidelines after new customer interviews change implied voice patterns.
How it compares
Use as a quality gate on generated copy rules, not as a logo generator or visual design system tool.
Common Questions / FAQ
Who is guideline-generation for?
Solo and indie builders who use knowledge-work plugins to draft brand guidelines from documents and conversations and want explicit confidence labels before acting on them.
When should I use guideline-generation?
Use it in Validate when scoping positioning, in Build while writing brand and voice docs, and at Launch when aligning site copy—any time an agent outputs guideline sections from heterogeneous sources.
Is guideline-generation safe to install?
It is methodology-only scoring logic with no shell or network requirements by itself; review the Security Audits panel on this Prism page for the parent plugin bundle before installing.
SKILL.md - Guideline Generation
# Confidence Scoring Methodology

How to assign and interpret confidence scores for generated brand guidelines.

## Scoring Levels

### High Confidence

The guideline section is well-supported and actionable.

**Criteria (must meet at least 3):**
- 3+ corroborating sources
- Explicit guidance found in at least one AUTHORITATIVE source
- Consistent across document and conversation analysis
- Specific, actionable instructions (not just vague principles)
- No unresolved conflicts

**Example:** Voice attribute "Confident but not arrogant" appears in the official style guide, is demonstrated in email templates, and matches patterns in top performer calls.

### Medium Confidence

The section is reasonable but could benefit from more data or team confirmation.

**Criteria (must meet at least 2):**
- 1-2 corroborating sources
- Inferred from patterns rather than explicit instruction
- Minor inconsistencies resolved via recency or authority
- Actionable but some interpretation was required
- May have one unresolved conflict

**Example:** Tone for social media inferred from email templates and one Slack thread, but no official social media guidelines exist.

### Low Confidence

The section is a best-effort recommendation. Team review strongly recommended.

**Criteria (must meet at least 2):**
- Single source only
- Primarily inferred from indirect evidence
- Significant interpretation required
- Unresolved conflicts between sources
- Limited specificity

**Example:** Competitive positioning derived from a single sales call where a competitor was discussed, with no supporting documentation.

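The "must meet at least N" thresholds above can be sketched as a small tier-assignment function. This is an illustrative sketch, not part of the skill: the names `THRESHOLDS`, `TIER_ORDER`, and `assign_tier` are hypothetical. The rubric does not say which tier wins when a section qualifies for more than one, or what to do when it qualifies for none, so the High-first order and the Low fallback are both assumptions.

```python
# Minimum number of criteria a section must meet per tier (from the rubric above).
THRESHOLDS = {"High": 3, "Medium": 2, "Low": 2}

# Assumption: check the strongest tier first. The rubric does not define
# precedence when a section satisfies the thresholds of several tiers.
TIER_ORDER = ("High", "Medium", "Low")


def assign_tier(criteria_met: dict[str, int]) -> str:
    """Return a confidence tier given how many criteria were met per tier.

    `criteria_met` maps a tier name to the count of that tier's five
    criteria the section satisfies, e.g. {"High": 2, "Medium": 3}.
    """
    for tier in TIER_ORDER:
        if criteria_met.get(tier, 0) >= THRESHOLDS[tier]:
            return tier
    # Assumption: fall back to the most cautious label when no threshold is met.
    return "Low"


if __name__ == "__main__":
    # A section backed by a style guide, templates, and calls might meet
    # four High criteria.
    print(assign_tier({"High": 4, "Medium": 0, "Low": 0}))
    # A section inferred from two sources might meet only Medium criteria.
    print(assign_tier({"High": 1, "Medium": 3, "Low": 1}))
```

Counting criteria per tier keeps the judgment about whether each criterion holds with the reviewer or agent; the function only applies the thresholds.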
## Section-Level Scoring Guide

### Voice Attributes
- **High**: Attributes appear in official brand guide AND are demonstrated in templates or calls
- **Medium**: Attributes appear in one document type only, or are inferred from multiple conversations
- **Low**: Attributes inferred from a single source or from indirect evidence

### Messaging Framework
- **High**: Value propositions documented in official materials AND used consistently in sales conversations
- **Medium**: Documented but not observed in practice, OR observed but not documented
- **Low**: Extracted from a single pitch deck or single call

### Tone Matrix
- **High**: Explicit tone guidance exists for the context AND matches observed behavior
- **Medium**: Tone inferred from 3+ examples of content in that context
- **Low**: Tone inferred from 1-2 examples, or extrapolated from similar contexts

### Terminology
- **High**: Terms explicitly listed in a style guide or glossary
- **Medium**: Terms consistently used in templates and calls (pattern-based)
- **Low**: Terms observed in a single document or inferred from brand personality

### Language Patterns (from transcripts)
- **High**: Pattern observed in 5+ calls across multiple speakers
- **Medium**: Pattern observed in 3-4 calls or from a single top performer
- **Low**: Pattern observed in 1-2 calls only

### Transcript-Primary Scenarios

When guidelines are generated primarily from conversational sources (no AUTHORITATIVE documents available):
- Voice Attributes derived from 5+ transcripts = **Medium** (not Low)
- Messaging Framework from consistent patterns across 5+ calls = **Medium**
- Language Patterns weight increases from 10% to 20% in aggregate calculation (subtract 10% from Voice Attributes)

Note this in the guideline metadata: "Guidelines generated primarily from conversational sources — team review recommended to formalize."

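The transcript-primary weight shift can be sketched as follows. The base percentages are the ones listed in the Aggregate Confidence table; the names `BASE_WEIGHTS` and `section_weights` are hypothetical and used only for illustration. Weights are kept as integer percentages so the totals stay exact.

```python
# Base section weights in percent, as listed in the Aggregate Confidence table.
BASE_WEIGHTS = {
    "Voice Attributes": 30,
    "Messaging Framework": 25,
    "Tone Matrix": 20,
    "Terminology": 15,
    "Language Patterns": 10,
}


def section_weights(transcript_primary: bool) -> dict[str, int]:
    """Return section weights, shifted for transcript-primary scenarios.

    When no authoritative documents are available, Language Patterns
    gains 10 points and Voice Attributes gives up the same 10 points.
    """
    weights = dict(BASE_WEIGHTS)  # copy so the base table is never mutated
    if transcript_primary:
        weights["Language Patterns"] += 10
        weights["Voice Attributes"] -= 10
    return weights


if __name__ == "__main__":
    shifted = section_weights(transcript_primary=True)
    print(shifted["Voice Attributes"], shifted["Language Patterns"])
    print(sum(shifted.values()))  # the weights still total 100
```

Because the 10 points move from one section to another, the weights sum to 100% in both scenarios, so the aggregate remains a true weighted average.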
## Aggregate Confidence

Calculate overall guideline confidence as the weighted average of section scores:

| Section | Weight |
|---------|--------|
| Voice Attributes | 30% |
| Messaging Framework | 25% |
| Tone Matrix | 20% |
| Terminology | 15% |
| Language Patterns | 10% |

Convert scores: High = 1.0, Medium = 0.6, Low = 0.3

**Example:**
- Voice Attributes: High (1.0 x 0.30 = 0.30)
- Messaging: Medium (0.6 x 0.25 = 0.15)
- Tone: Medium (0.6 x 0.20 = 0.12)
- Terminology: High (1.0 x 0.15 = 0.15)
- La