PoL Probe

Validate is the phase where solo builders must prove narrow bets cheaply, and PoL Probe is the canonical shelf for running those mini-experiments before committing to a full build. The Prototype subphase covers spike tests, usability checks, and feasibility trials: exactly the disposable probes this skill formats, rather than long-term roadmap scoping alone.

Where it fits

Example use

Test whether prospects correctly interpret your positioning line before you lock landing copy.

Run a Task-Focused usability probe on destructive vs reversible actions before you polish the confirmation flow.

Use a Feasibility Check on transcript summarization error rates to decide whether the meeting-notes feature belongs in v1.

Spike a third-party webhook or Zoom pipeline with ten real samples and a rewrite threshold before wiring production jobs.

Re-probe a critical onboarding label change after a failed first test showed confusion below your pass bar.

How it compares

Use it instead of open-ended chat debate or a full feature build when you only need a narrow, killable learning loop.

Common Questions / FAQ

Who is pol-probe for?

PoL Probe is for solo builders, indie PMs, and technical founders who own discovery and delivery and need disciplined, small-bet validation before coding or redesigning.

When should I use pol-probe?

Use it in Validate to run prototype or usability probes; in Idea when competitor or audience assumptions need a quick task test; and in Build when a feasibility spike (API, GenAI, integrations) must hit explicit error thresholds before you schedule the real sprint.

Is pol-probe safe to install?

It is procedural PM guidance, but you should still review the Security Audits panel on this Prism page and treat any third-party test or AI spike tools you attach to probes as your own integration risk.

SKILL.md

# PoL Probe Examples

### ✅ Good: Task-Focused PoL Probe

**Hypothesis:** "Users can distinguish between 'archive' and 'delete' without a confirmation modal."

**Probe Type:** Task-Focused Test
**Method:** UsabilityHub 5-second test with 20 users
**Timeline:** 2 days (build task, recruit, analyze)
**Success Criteria:**
- **Pass:** 80%+ users correctly identify action consequences
- **Fail:** <60% correct, or 3+ users express confusion

**Harsh Truth Delivered:** Only 45% understood the difference. Added explicit labels ("Delete forever" vs "Archive for 30 days"). Re-tested at 92%. **Probe deleted.**

**Why This Works:**
- Narrow hypothesis (one UI decision)
- Fast execution (2 days)
- Brutal honesty (failed first attempt)
- Disposable (deleted after learning)
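
The pass/fail gate above can be written down before the test runs, which makes it harder to reinterpret afterwards. The following is a minimal sketch, not part of the skill itself: the function name `evaluate_task_probe`, the `"inconclusive"` outcome for results between the two thresholds, and the choice to let the confusion rule override a passing rate are all assumptions made for illustration.

```python
def evaluate_task_probe(correct_rate, confused_users=0,
                        pass_rate=0.80, fail_rate=0.60, confusion_limit=3):
    """Classify a Task-Focused probe result (hypothetical helper).

    correct_rate:   fraction of users who identified the action consequences
    confused_users: users who explicitly expressed confusion
    """
    if not 0.0 <= correct_rate <= 1.0:
        raise ValueError("correct_rate must be between 0 and 1")

    # Fail: <60% correct, or 3+ users express confusion.
    # Assumption: the confusion rule wins even if the rate would pass.
    if correct_rate < fail_rate or confused_users >= confusion_limit:
        return "fail"

    # Pass: 80%+ users correct.
    if correct_rate >= pass_rate:
        return "pass"

    # Assumption: the example does not name the 60-80% band,
    # so this sketch treats it as inconclusive.
    return "inconclusive"


print(evaluate_task_probe(0.45))  # first run from the example: fail
print(evaluate_task_probe(0.92))  # re-test with explicit labels: pass
```

Fixing the thresholds as defaults, rather than passing them in after seeing the data, mirrors the point of the example: the gate is set before the harsh truth arrives.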

---

### ✅ Good: Feasibility Check PoL Probe

**Hypothesis:** "We can auto-generate meeting summaries from Zoom transcripts using GPT-4 with <2% error rate."

**Probe Type:** Feasibility Check
**Method:** 1-day spike with 10 real transcripts, GenAI prompt chain
**Timeline:** 1 day
**Success Criteria:**
- **Pass:** 9/10 summaries accurate, <5 manual edits per summary
- **Fail:** >3 summaries require full rewrites

**Harsh Truth Delivered:** Error rate was 18%. Discovered that medical jargon and crosstalk broke the model. Decided NOT to build feature. **Probe deleted.**

**Why This Works:**
- Eliminated technical risk before building
- Cheap (1 day)
- Falsifiable (clear error threshold)
- Saved months of wasted development
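
The success criteria for this spike can also be sketched as a small check over the reviewed samples. This is an illustrative sketch only: `SummaryReview`, `evaluate_feasibility`, and the `"inconclusive"` outcome are hypothetical names and choices, the default of 9 accurate summaries assumes a 10-transcript batch, and the sketch applies the edit limit to every summary.

```python
from dataclasses import dataclass


@dataclass
class SummaryReview:
    """One human review of a generated summary (hypothetical record)."""
    accurate: bool
    manual_edits: int
    needs_full_rewrite: bool = False


def evaluate_feasibility(reviews, min_accurate=9, max_edits=5, max_rewrites=3):
    """Apply the example's gate to a batch of reviewed summaries."""
    if not reviews:
        raise ValueError("at least one reviewed summary is required")

    # Fail: more than 3 summaries require full rewrites.
    rewrites = sum(r.needs_full_rewrite for r in reviews)
    if rewrites > max_rewrites:
        return "fail"

    # Pass: 9/10 summaries accurate and fewer than 5 manual edits per summary.
    accurate = sum(r.accurate for r in reviews)
    edits_ok = all(r.manual_edits < max_edits for r in reviews)
    if accurate >= min_accurate and edits_ok:
        return "pass"

    # Assumption: anything between the two gates is left undecided.
    return "inconclusive"
```

A batch where four of ten summaries need full rewrites returns `"fail"` under this sketch, which is the kind of result that supports a decision not to build.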

---

### ❌ Bad: "Prototype Theater" (Not a PoL Probe)

**Hypothesis:** "Executives will approve budget if we show a polished demo."

**Probe Type:** *(None—this isn't a probe)*
**Method:** 3-week Figma design + coded prototype with animations
**Timeline:** 3 weeks
**Success Criteria:** "Get exec buy-in"

**Why This Fails:**
- Not testing a user hypothesis (testing internal politics)
- Too polished (3 weeks = not disposable)
- No harsh truth (vanity metrics: "Execs liked it!")
- No clear disposal plan (became "the prototype we have to maintain")

**What Should Have Been Done:**
- Skip the prototype entirely
- Use a **Narrative Prototype** (Loom walkthrough in 1 day)
- Test with 5 target users, not executives
- Measure task completion, not stakeholder applause

---

### ❌ Bad: MVP Disguised as PoL Probe

**Hypothesis:** "Users will subscribe to our AI writing assistant."

**Probe Type:** Vibe-Coded PoL Probe *(claimed)*
**Method:** Fully functional React app with Stripe integration
**Timeline:** 4 weeks
**Success Criteria:** "10 paying customers"

**Why This Fails:**
- Not disposable (4 weeks of work, payment processing—too invested)
- Too broad ("will subscribe" tests price, value prop, UX, and onboarding simultaneously)
- This is an **MVP**, not a probe
- No disposal plan (team will resist deleting working code)

**What Should Have Been Done:**
- Test value prop with a **Narrative Prototype** (explainer video, measure interest)
- Test pricing with a **landing page + waitlist** (ConvertKit, Carrd)
- Test core AI quality with a **Feasibility Check** (1-day spike)
- Build MVP only after all three probes pass

---

## Common Pitfalls

### 1. **Prototype Theater**
**Failure Mode:** Building impressive demos that don't test hypotheses.

**Consequence:** Waste weeks impressing stakeholders while learning nothing about users.

**Fix:** Before building, write down: "What harsh truth am I seeking? What would prove me wrong?"

---

### 2. **Confusing PoL with MVP**
**Failure Mode:** Treating probes as "version 1.0" instead of disposable experiments.

**Consequence:** Scope creep, technical debt, resistance to disposal.

**Fix:** Set a disposal date before you start. If you can't commit to deleting it, you're building an MVP, not a probe.
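
One way to enforce this fix is to refuse to start a probe until its brief is complete, disposal date included. The sketch below is a hypothetical structure, not something the skill prescribes: `ProbeBrief` and `missing_fields` are illustrative names, and the fields simply mirror the ones used in the examples above plus the disposal date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ProbeBrief:
    """Hypothetical probe brief; fields mirror the examples above."""
    hypothesis: str
    method: str
    timeline_days: int
    pass_criteria: str
    fail_criteria: str
    disposal_date: Optional[date] = None  # when the probe gets deleted

    def missing_fields(self) -> List[str]:
        """Return the names of fields that still block execution."""
        missing = []
        for name in ("hypothesis", "method", "pass_criteria", "fail_criteria"):
            if not getattr(self, name).strip():
                missing.append(name)
        if self.timeline_days <= 0:
            missing.append("timeline_days")
        if self.disposal_date is None:
            missing.append("disposal_date")
        return missing


brief = ProbeBrief(
    hypothesis="Users can distinguish between 'archive' and 'delete' "
               "without a confirmation modal.",
    method="UsabilityHub 5-second test with 20 users",
    timeline_days=2,
    pass_criteria="80%+ users correctly identify action consequences",
    fail_criteria="<60% correct, or 3+ users express confusion",
)
print(brief.missing_fields())  # ['disposal_date']
```

If the list is not empty, the probe is not ready to run; a brief that never gets a disposal date is a hint that the work is drifting toward an MVP.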

---

### 3. **Choosing Based on Tooling Comfort**
**Failure Mode:** "I know Figma, so I'll design a prototype" (regardless of whether design is the risk).

**Consequence:** Validate the wrong thing; miss the actual risk.

**Fix:*

What is this skill?

Frames Task-Focused probes (e.g. short usability tasks with % pass thresholds) and Feasibility Check probes (e.g. one-da

Requires a single falsifiable hypothesis, method, timeline, and brutal pass/fail success criteria before execution

Optimizes for harsh truth: failed probes trigger redesign or kill decisions, then probe artifacts are retired after lear

Examples show 2-day usability cycles and 1-day GenAI feasibility spikes with concrete error-rate gates

Keeps scope narrow—one UI decision or one technical risk—so indie builders can run probes without a research team

Readme documents two probe archetypes: Task-Focused Test and Feasibility Check

Example timelines include 2-day usability cycles and 1-day technical spikes

Example success gates include 80%+ user clarity pass and 9/10 accurate summaries pass

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1.1k installs on skills.sh; 5k GitHub stars; 3/3 security scanners passed (skills.sh audits).

Who is it for?

Indie founders and solo PMs who have one sharp hypothesis (UI wording, feature feasibility, model accuracy) and want UsabilityHub-scale or one-day spike validation without a research org.

Skip if: Teams with frozen specs, no open hypotheses, or mature UX research programs where every decision is already user-tested and signed off.

What do I get? / Deliverables

You leave with a scoped probe brief—hypothesis, method, timeline, and success gates—so you can run the test, act on harsh results, and retire the probe once the learning is captured.

PoL probe brief with hypothesis, method, timeline, and success criteria

Pass/fail outcome with harsh-truth interpretation

Probe retirement note after learning is applied to the roadmap or kill decision

Journey fit

Spans multiple journey phases: the primary shelf plus the alternate fits listed under "Where it fits" above.