
neolabhq/context-engineering-kit
68 skills · 36.2k installs · 73.6k stars · GitHub
Install
npx skills add https://github.com/neolabhq/context-engineering-kit

Skills in this repo
1. Prompt Engineering
prompt-engineering is a journey-wide agent skill from the context-engineering kit that teaches advanced prompting patterns (few-shot exemplars, chain-of-thought, structured outputs, and production template design) whenever you author or refine LLM-facing text. Solo builders wear every hat: one day you draft a skill.md, the next you tune a hook or a sub-agent brief before shipping. The skill says to invoke it while writing commands, hooks, skills, sub-agent prompts, or any interaction where outputs must be consistent, verifiable, or format-bound. It balances example count against token cost, shows concrete support-ticket extraction demos, and emphasizes step-by-step reasoning for multi-hop tasks. It is procedural knowledge, not a single API call; use it before committing brittle prompts into repos your agent will replay for months.
615 installs

2. Context Engineering
Context Engineering Fundamentals is a journey-wide agent skill for solo builders who ship with Claude Code, Cursor, Codex, and similar stacks. It explains what actually sits in the model window at inference time (system instructions, tool definitions, retrieved documents, conversation history, and prior tool results) and why treating that pile as a curated set rather than a dump changes outcomes. The skill is for the moment you are writing or refactoring a SKILL.md, tightening a slash command, or splitting work across sub-agents and need a shared vocabulary for tradeoffs. It emphasizes progressive disclosure, signal density, and calibrating system-prompt specificity so agents stay maintainable without becoming brittle. You reach for it before large prompt rewrites, when context feels bloated or contradictory, or when onboarding a new skill into a multi-step workflow.
It does not replace domain skills; it makes every other skill and integration easier to compose safely.
601 installs

3. Thought Based Reasoning
thought-based-reasoning is a journey-wide agent skill that teaches Chain-of-Thought prompting and its major variants so solo builders get reliable answers from Claude, Codex, or Cursor on hard problems. Instead of guessing prompt wording, you match the task (quick reasoning without examples, high-stakes decisions needing consensus, exploratory search, decomposed subproblems, tool use, or code execution) to a documented technique with templates and a complexity versus accuracy tradeoff table. It fits early Idea research when you compare options step by step, Validate when you stress-test assumptions, Build when you design agent flows, Ship when you review logic-heavy changes, and Grow when you interpret ambiguous metrics. The skill does not replace domain expertise; it structures how the model thinks aloud so you can audit intermediate steps and reduce silent reasoning errors.
590 installs

4. Kaizen
Kaizen is a journey-wide agent skill from the context-engineering kit that instills continuous improvement: small verified steps, error-proof designs, respect for established patterns, and explicit guardrails against over-engineering. It is meant to be applied whenever you implement features, refactor, architect systems, tune processes, or harden error handling, not only at one milestone. Solo builders benefit because agents otherwise tend toward large diffs and speculative abstractions; Kaizen steers toward the smallest viable improvement, fixing minor issues in scope, and compounding quality over sessions. Use it across Idea through Operate whenever you want procedural restraint paired with steady progress.
It pairs naturally with planning and review skills but does not replace security audits or formal test suites.
573 installs

5. Test Driven Development
Test-Driven Development is an agent skill that enforces true test-first implementation: write the test, watch it fail for the right reason, then write minimal code to pass. Solo and indie builders use it whenever an agent might ship plausible but untested code (features, bugfixes, refactors, or behavior tweaks) because passing tests only matter if you saw them fail first. The skill encodes a hard workflow (delete premature implementation, implement fresh from tests) aligned with red-green-refactor and clear carve-outs for prototypes, generated artifacts, and config when a human explicitly agrees. It fits Claude Code, Cursor, Codex, and similar agents during active coding in Build and pre-release hardening in Ship, turning chat-driven implementation into behavior you can regression-check.
564 installs

6. Reflect
Reflect is an agent skill from the context-engineering kit that makes the model re-read its last answer under a strict self-refinement framework. It opens with complexity triage, then applies a critical perfectionist stance meant to surface gaps, security issues, and unearned assumptions before you merge or deploy. Solo builders use it when agent-generated code looks fine at a glance but you want a second pass that defaults to disapproval. Optional hints narrow the lens to security or trigger deeper reflection when confidence is below a threshold you set. Pair it after implementation or debugging skills and before shipping checkpoints.
It does not replace automated tests or human review, but it reduces the chance you bless shallow work because the first draft sounded confident.
563 installs

7. Multi Agent Patterns
Multi-agent-patterns is an agent skill from the Context Engineering Kit that teaches solo and indie builders how to design Claude Code–style multi-agent architectures when one agent’s context is too small, tasks naturally split into subtasks, or specialization improves output quality. It stresses that sub-agents exist mainly to isolate context, not to mimic org charts, and walks through supervisor, swarm, and hierarchical layouts plus coordination and failure handling. Use it while scoping agent features, refactoring monolithic agent loops, or planning production workflows that chain research, coding, and review. The skill fits builders shipping agent products, internal coding assistants, or automation that exceeds a single invocation’s reasoning and tool budget. Expect architecture prose and pattern selection criteria rather than a one-click integration; pair it with concrete planning and observability skills once the topology is chosen.
560 installs

8. Critique
Critique is a journey-wide agent command that coordinates comprehensive multi-perspective review of work you have already done. It implements Multi-Agent Debate with LLM-as-a-Judge structuring, Chain-of-Verification so each judge checks its own critique, and a consensus phase so recommendations reflect argued agreement rather than a single skim. Solo builders invoke it when they want structured quality assessment on specific files, commits, or recent session output without the agent silently rewriting artifacts. Findings are report-only: you decide what to fix. That makes it useful after implementation spikes, before launch checklists, when validating research summaries, or when auditing agent-generated plans. Optional argument-hint paths narrow scope; otherwise the coordinator uses recent changes and conversation context.
Pair it with fix-oriented skills separately if you want automated remediation after the report lands.
559 installs

9. Update Docs
update-docs is an agent skill for solo builders who change code locally and need living documentation to stay accurate without drowning in stale pages. It coordinates tech-writer-style agents to update docs/, README files, inline JSDoc, and API references after feature work or refactors, with optional arguments to narrow to a directory or documentation type. The workflow assumes you may still have uncommitted changes and defaults to those; if the tree is clean, it focuses on the latest commit. It leans on documentation quality principles and Context7 MCP to ground updates in real project facts rather than generic filler. Use it when implementation moved faster than your docs, when onboarding readers depend on README accuracy, or before review and ship when API contracts changed. It is built to produce clear, maintainable prose while cutting unnecessary maintenance overhead.
558 installs

10. Subagent Driven Development
Subagent-driven development is an agent workflow for solo builders who already have a plan or a pile of independent fixes and want speed without turning one long chat into a tangled context window. The skill prescribes dispatching a fresh subagent for each task or issue, then reviewing code or output before moving on: sequentially when work is coupled, in parallel when investigations do not share state. It fits right after planning skills in a context-engineering stack: you keep the parent session focused on orchestration and quality gates while subagents do the narrow implementation or debugging passes. Compared with doing everything in one agent thread, you trade a bit of dispatch overhead for cleaner reasoning and earlier review.
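The dispatch rule described for subagent-driven development (sequential when work is coupled, parallel when investigations share no state) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `dispatch`, `run_subagent`, and `review` are hypothetical names, and threads stand in for whatever mechanism really launches subagents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def dispatch(
    tasks: Sequence[str],
    run_subagent: Callable[[str], str],   # hypothetical: runs one fresh subagent
    review: Callable[[str, str], bool],   # hypothetical: parent-session quality gate
    independent: bool,
) -> list[tuple[str, str, bool]]:
    """Return (task, output, approved) for every task that was run."""
    if independent:
        # Investigations share no state, so subagents can run concurrently;
        # pool.map keeps results in the same order as the input tasks.
        with ThreadPoolExecutor() as pool:
            outputs = list(pool.map(run_subagent, tasks))
        return [(t, o, review(t, o)) for t, o in zip(tasks, outputs)]

    results = []
    for task in tasks:
        output = run_subagent(task)
        approved = review(task, output)
        results.append((task, output, approved))
        if not approved:
            break  # coupled work: do not build on a step that failed review
    return results
```

In the sequential branch the loop stops at the first rejected step, which mirrors the "review before moving on" rule; the parallel branch reviews everything after all subagents return.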
Use it when SKILL.md triggers match (implementation plans with separable tasks, or three or more independent issues you can investigate without dependencies), not for single-line tweaks that do not need isolation.
557 installs

11. Brainstorm
Brainstorm is a journey-wide agent skill from NeoLab’s context-engineering kit that helps solo builders convert fuzzy intent into collaborative, reviewable designs. Instead of jumping straight to tickets or codegen, the agent inspects the current repository and conversation context, then refines purpose, constraints, and success criteria through disciplined questioning, never stacking multiple questions in one turn when a topic needs depth. After understanding, it surfaces six differentiated approaches with stated probabilities and trade-offs, then drafts the emerging design in digestible sections so you can approve or correct course early. Use it whenever you are creating or developing something non-mechanical and have not locked an implementation plan. The output is meant to feed downstream planning skills or your own spec doc, reducing rework when agents otherwise hallucinate scope.
555 installs

12. Judge With Debate
judge-with-debate is a context-engineering workflow skill that evaluates one or more solution paths through multi-round debate among independent judges, orchestrated after a meta-judge defines criteria everyone shares. Solo builders use it when agent-generated plans, specs, or implementations need rigorous scoring, not a quick gut check, and when you want structured argumentation that forces evidence from the solution and the evaluation spec. The pattern dispatches a meta-judge first, then runs parallel judges that refine assessments across up to three rounds until consensus or round limits. It fits agent stacks where quality gates matter before merge or release.
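The round structure described for judge-with-debate (shared criteria first, then judges refining scores for up to three rounds until consensus) can be sketched as a small loop. Everything below is an assumption made for illustration: the `Judge` signature, the 0 to 10 scale, and the definition of consensus as "all scores within a tolerance" are not taken from the skill itself.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical judge signature: (solution, criteria, peer_scores_from_last_round) -> score.
Judge = Callable[[str, str, Sequence[float]], float]

MAX_ROUNDS = 3  # the description mentions "up to three rounds"


def run_debate(
    solution: str,
    criteria: str,            # assumed to come from the meta-judge
    judges: Sequence[Judge],
    tolerance: float = 1.0,   # assumption: consensus means scores within this spread
) -> dict:
    """Run all judges each round; stop early once their scores agree."""
    scores: list[float] = []
    for round_no in range(1, MAX_ROUNDS + 1):
        previous = tuple(scores)  # judges see last round's scores, not this round's
        scores = [judge(solution, criteria, previous) for judge in judges]
        if max(scores) - min(scores) <= tolerance:
            return {"rounds": round_no, "consensus": True, "score": mean(scores)}
    # Round limit reached without agreement; report the final average anyway.
    return {"rounds": MAX_ROUNDS, "consensus": False, "score": mean(scores)}
```

Passing the previous round's scores to every judge is one simple way to model "refining assessments"; a real debate would exchange written arguments rather than bare numbers.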
Complexity is advanced: you need clear solution inputs, evaluation intent, and tolerance for multi-step agent orchestration and report artifacts on disk.
555 installs

13. Memorize
Memorize is an agent skill from the Context Engineering Kit that closes the loop on Agentic Context Engineering (ACE) by consolidating what your agent learned during reflection and debate into CLAUDE.md. Solo and indie builders who run long agent sessions need durable memory outside the chat window; this skill harvests critiques, verification outcomes, and execution feedback, then curates them into structured guidance the next invocation can apply immediately. It is meant to run after reflexion workflows (not as a substitute for planning or implementation), and it favors additive, specific bullets over collapsing everything into a short summary. Use it whenever you want your Claude Code or Cursor setup to accumulate institutional knowledge the way a team wiki would, without manually copying notes after every session.
553 installs

14. Sdd:Plan
Sdd:plan is a Context Engineering Kit workflow command that refines a draft task specification (typically created by `/add-task`) into an implementation-ready plan. It orchestrates parallel research, codebase, and business analysis, synthesizes architecture, decomposes work with risks, parallelizes steps for faster solo or small-team execution, adds LLM-as-Judge verification sections, and moves the file from `.specs/tasks/draft/` to `.specs/tasks/todo/`. Use it when you already have a rough feature markdown spec and need ordered, verifiable tasks before your coding agent touches the repo. The skill is intermediate overhead suited to spec-driven development stacks, not ad-hoc one-file tweaks. Supports `--continue` to resume from a named stage.
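The final step in that description, moving the task file from `.specs/tasks/draft/` to `.specs/tasks/todo/`, amounts to a guarded rename. The sketch below only illustrates that one step under stated assumptions: `promote_task` is a hypothetical helper, and the refusal to overwrite an existing todo file is a design choice of this example, not documented behavior of the command.

```python
from pathlib import Path


def promote_task(repo_root: Path, task_file: str) -> Path:
    """Move a planned task from .specs/tasks/draft/ to .specs/tasks/todo/."""
    draft = repo_root / ".specs" / "tasks" / "draft" / task_file
    todo_dir = repo_root / ".specs" / "tasks" / "todo"
    if not draft.is_file():
        raise FileNotFoundError(f"no draft task at {draft}")
    todo_dir.mkdir(parents=True, exist_ok=True)
    target = todo_dir / task_file
    if target.exists():
        # Assumption of this sketch: never clobber a task already queued as todo.
        raise FileExistsError(f"{target} already exists")
    return draft.rename(target)  # returns the new path (Python 3.8+)
```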
Pair with drafting skills upstream and implementation skills downstream once the todo task is approved.
550 installs

15. Root Cause Tracing
Root-cause-tracing is a systematic debugging skill for agents and developers when errors manifest far from their source: wrong working directory, bad paths, corrupt state passed through many layers. It encodes a backward walk through the call stack, adding instrumentation when static reading is insufficient, until you identify who introduced invalid data or incorrect behavior. Solo builders benefit during Ship when tests fail mysteriously or integrations throw nested exceptions, and again during Operate when production logs show the same pattern. The skill deliberately discourages patching at the throw site because that leaves the trigger intact. It pairs with structured reproduction steps and optional guardrails after the true fix lands.
548 installs

16. Write Concisely
Write Concisely is an agent skill that applies William Strunk Jr.’s classic Elements of Style rules to documentation and other prose humans will read. Solo and indie builders use it when drafts feel wordy, passive, or vague; READMEs, API docs, landing copy, changelogs, and support articles all benefit. The skill encodes elementary usage and composition principles (active voice, positive statements, concrete language, tight paragraphs) plus a misused-words reference, so your agent edits systematically instead of guessing tone. It fits anywhere you produce reader-facing text: while documenting features in build, tightening launch messaging, or improving growth content. It is methodology, not a linter; pair it with your doc generator or PR description workflow when you want professional, scannable writing without hiring an editor.
548 installs

17. Agent Evaluation
Agent-evaluation is a meta skill from the context-engineering kit for solo builders who ship with Claude Code and need evidence, not vibes, that prompt and context changes work.
It frames agent testing differently from unit tests: runs are non-deterministic, routes vary, and success is judged on outcomes plus reasonable process. You apply structured rubrics across accuracy, completeness, citation fidelity, source quality, and tool use, mixing automated LLM judging with targeted human passes on brittle cases. Use it while authoring or refactoring skills and commands, after refactors that might regress behavior, or when benchmarking browsing-style agents against hard retrieval tasks. The BrowseComp performance discussion reinforces that evaluation design should track the few drivers that dominate variance so you do not over-fit noisy single metrics. Intermediate-to-advanced builders maintaining agent libraries get the most value; casual one-shot scripts rarely need this depth.
546 installs

18. Analyse Problem
Analyse Problem is a journey-wide agent skill that applies the A3 problem-solving format (Background, Current Condition, Goal, Root Cause Analysis, Countermeasures, Implementation Plan, and Follow-up) in a single concise document. Solo builders invoke it whenever an issue is too fuzzy for a ticket title but too urgent to ignore: churn spikes, flaky deploys, scope creep on a solo SaaS, or support themes that hint at product debt. The skill defaults to markdown output and walks through measurable targets and verification, pushing you past symptom-chasing. It is methodology, not an integration: no APIs, just disciplined thinking you can run in Validate when scoping risk, Ship when reviewing incidents, or Operate when iterating on systemic fixes. Beginner-friendly structure with intermediate discipline if you supply real metrics in Current Condition.
542 installs

19. Plan Task
Plan-task is a specification discipline skill for solo builders using the context-engineering-kit specs layout.
It takes a task file path such as .specs/tasks/task-{name}.md and forces business analysis through a scratchpad workspace before anything lands in the canonical task doc. The agent dumps discoveries across phased sections (requirements discovery, concept extraction, deeper analysis, and draft material), then copies only verified, relevant content forward. That separation reduces the classic failure mode where half-baked bullets become “the spec” and implementation thrashes. The tone is strict on testability and completeness because vague criteria waste agent and human time alike. Use it when a ticket exists but engineers cannot implement confidently, or when acceptance criteria are missing entirely. It pairs naturally with downstream implementation skills once the task file holds crisp, measurable outcomes.
542 installs

20. Analyse
Analyse is a smart routing skill from the context-engineering kit that reads what you want to improve and applies the most fitting Kaizen technique instead of forcing one playbook. Gemba Walk suits authentication implementations, architecture gaps, and places where documentation diverges from running code. Value Stream Mapping fits deployment workflows, pipelines, and time trapped between teams or systems. Muda targets inefficiencies, debt, and resource waste when you ask to analyse a codebase for problems. Solo builders use it when they feel slow or bloated but are unsure whether to explore code, map a process, or hunt waste. It is multi-phase because the same command works during Build when learning a feature, Ship when tuning release flow, and Operate when iterating on quality.
Prism’s canonical shelf is Operate → iterate because the deliverable is structured findings that drive the next improvement loop, not a shipped feature by itself.
541 installs

21. Review Local Changes
Review Local Changes is an agent skill for solo builders who want a disciplined pass over what is sitting in the working tree: not a full PR review on a remote branch, but a pre-commit sweep with structured findings. You can steer the pass with free-text review aspects and tighten noise with --min-impact so only issues at or above your chosen severity tier surface. Default behavior emphasizes high-impact and above, which fits indie teams that cannot fix fifty nits before every commit. Output can stay human-readable markdown or switch to --json when you want to pipe results into scripts or dashboards. The workflow intentionally ignores spec/ and reports/ churn unless you override that, keeping reviews focused on product code. It fits the Ship phase as a review gate, but the same ritual is useful late in Build when you are about to checkpoint a feature slice.
538 installs

22. Commit
Commit is an agent skill for solo and indie builders who want git history that reads like a changelog instead of a pile of “fix stuff” messages. It walks the agent through branch policy on main or master, optional pre-commit lint for common JS package managers, staging when the index is empty, and a full diff review before anything is recorded. When the diff mixes unrelated concerns, it nudges you toward smaller, reviewable commits rather than one kitchen-sink blob. Output is conventional-commit titles with emoji so release notes and semantic versioning stay honest. It fits day-to-day build work but catalogs under Ship/review because that is when message quality and commit boundaries affect PRs, rollbacks, and CI.
Pair it with your normal code-review flow; it does not replace human judgment on what should ship together.
537 installs

23. Add Task
Add-task turns a vague chat request into a concrete draft task under .specs/tasks/draft/ so solo builders running context-engineering or spec-driven agents do not lose intent between sessions. It ensures folder layout and gitignore-friendly scratchpad exist, parses the user argument for verb-led titles, infers task type, and wires dependency references when you list prerequisite task files. The output is a single markdown task artifact ready for later promotion to todo, pairing naturally with plan execution skills downstream. It is most visible during Build PM, but the same capture helps in Validate when you are scoping what to prototype next. The skill is procedural, not a ticket integration; it does not sync to Jira or Linear unless you add that yourself. Use it the moment you decide “this is real work” but have not yet written a full implementation plan.
536 installs

24. Implement Task
Implement Task is an agent skill from the context-engineering kit that turns markdown task specifications into executed implementation work supervised by LLM-as-Judge verification. Solo builders shipping with Claude Code or similar agents use it when a feature is already decomposed into steps in a task file and they want the agent to keep going: launching implementation sub-agents, running judges on critical artifacts, and looping on fixes instead of stopping at the first draft. Command-line style arguments support continuing after interruption, refining only git-changed steps, and pausing after named steps for human approval. The skill explicitly discourages unnecessary user questions and treats incomplete verification as a blocker to advancing. It pairs naturally with upstream planning or spec skills and extends into Ship-like quality gates because judges act as automated reviewers.
Expect intermediate-to-advanced agent workflow fluency, a repo with git history for refine mode, and tolerance for longer autonomous runs.
536 installs

25. Do In Steps
do-in-steps is an agent skill that turns one overwhelming task into a supervised pipeline of smaller steps, each executed by a dedicated sub-agent with a clean context window. The orchestrator analyzes dependencies, orders work, and for every step launches an implementation agent alongside a meta-judge that defines evaluation criteria in parallel. Completed outputs are summarized and passed forward so later steps stay aligned with earlier decisions, while verification gates block progress until the judge accepts the step. Solo builders use it when a single chat thread would lose track of refactors, cross-module updates, or multi-phase automation. It fits teams of one who already run Claude Code-style agents and want institutional-style decomposition without writing a custom orchestration framework.
534 installs

26. Launch Sub Agent
launch-sub-agent is a context-engineering command skill for builders who outgrow single-threaded coding sessions in Claude Code and similar agents. It formalizes how an orchestrator should analyze a natural-language task (implementation, research, documentation, or review), pick an appropriate model and optional named agent profile, and spawn an isolated worker that starts with explicit step-by-step reasoning and must finish with self-critique rather than hand-waving. The design goal is context hygiene: subsidiary work stays in a clean window while the parent keeps strategic state. Optional flags documented in the skill (--model, --agent, --output) let you override defaults when you already know cost or latency constraints.
Use it during build when a ticket is too large for one reply but too small to warrant a fully manual second session, or when you want repeatable dispatch wording across teammates.
534 installs

27. Why
why is an agent skill that implements iterative Five Whys root cause analysis: you state a problem clearly, ask why repeatedly (default five times), document each layer, validate by tracing from root cause back to the symptom, branch when several causes compete, and recommend fixes that address fundamentals, not band-aids. Solo and indie builders install it when a bug, flaky test, or operational incident keeps returning after superficial fixes, or when migrations and process gaps (missing indexes, weak PR checklists) need to be surfaced. It pairs naturally with debugging and ship workflows but applies anywhere a decision should be grounded in cause rather than guesswork. Invoke with a concrete issue description or let the agent prompt for ISSUE; adjust DEPTH when the problem is shallow or deeply systemic.
533 installs

28. Create Ideas
Create Ideas is a context-engineering-kit agent skill that runs one-shot creative sampling on a topic or problem you supply. Instead of a single obvious answer, it asks the model for six distinct list items, each with explicit probability: three from the high-confidence region and three deliberately from the long tail of the solution space. Solo builders use it at the earliest Idea phase when they want optionality before validation work: landing angles, feature bundles, positioning hooks, or problem framings without prematurely collapsing on one plan. Optional argument hints let you pass the topic and desired idea count context through the kit. It is lightweight procedural prompting, not a research scraper or market validator.
Pair outputs with separate skills for competitor scans, scoping, or writing formal plans once you pick a direction.
532 installs

29. Test Skill
test-skill is an agent skill for solo builders and skill authors who treat SKILL.md like code that must be verified under stress. Instead of assuming a new rule works because it reads well, you run subagents through realistic scenarios without the skill (RED), capture how they shortcut or rationalize, then write or tighten the skill until behavior complies (GREEN), and finally close loopholes (REFACTOR). It explicitly targets skills that impose compliance costs (TDD gates, testing mandates, review rituals) where agents might trade quality for speed. Pure reference skills and syntax guides are out of scope. You should already understand the superpowers test-driven-development skill because this package only adds skill-specific test formats. The outcome is higher-confidence skills before you publish to a repo or catalog: documented failure modes, iterated wording, and evidence that agents resist “just this once” excuses. Intermediate to advanced because it demands subagent orchestration and honest observation of failures.
532 installs

30. Do And Judge
do-and-judge is an agent orchestration skill for solo builders who want implementation and verification separated: you stay the conductor while a sub-agent does the work and an independent judge scores it against criteria a meta-judge generated in parallel. It targets one concrete task at a time (such as refactoring a service to dependency injection) so you avoid mixing planning, coding, and self-review in one bloated context window. The flow runs meta-judge and implementer together, then judges against the rubric; failures trigger up to two retries with explicit feedback rather than vague “try again.” That makes it especially useful when agent output looks plausible but misses edge cases your own session would rationalize away.
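The implement, judge, retry flow just described can be sketched as a bounded loop. This is only an illustration under assumptions: `Verdict`, `do_and_judge`, and the `implement` and `judge` callables are hypothetical stand-ins for the sub-agent and judge dispatches, and the rubric is assumed to arrive from the meta-judge as plain text.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_RETRIES = 2  # the description mentions "up to two retries"


@dataclass
class Verdict:
    passed: bool
    feedback: str  # explicit, actionable feedback for the next attempt


def do_and_judge(
    task: str,
    rubric: str,                                       # assumed meta-judge output
    implement: Callable[[str, Optional[str]], str],    # hypothetical implementer
    judge: Callable[[str, str], Verdict],              # hypothetical independent judge
) -> tuple[str, Verdict, int]:
    """Return (last output, last verdict, number of attempts made)."""
    feedback: Optional[str] = None
    for attempt in range(1, MAX_RETRIES + 2):  # one initial attempt plus retries
        output = implement(task, feedback)
        verdict = judge(output, rubric)
        if verdict.passed:
            break
        feedback = verdict.feedback  # carry the judge's specifics into the retry
    return output, verdict, attempt
```

Feeding the judge's feedback string back into the implementer is what distinguishes this from a blind "try again" loop; if all three attempts fail, the caller receives the failing verdict and decides what to do next.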
It fits indie teams treating agent runs like a mini CI gate before they trust a diff.
531 installs

31. Do In Parallel
Do-in-parallel is a context-engineering command skill that turns one high-level task into many concurrent sub-agent runs across files or targets. It analyzes work for requirement grouping so related items share meta-judges and evaluation specs, picks an appropriate model tier, and generates prompts that demand reasoning plus self-critique before implementation. After dispatch, meta-judges run in parallel, implementors execute per task, and LLM-as-a-judge checks outputs against grouping-aware specs. Solo builders use it when sequential agent loops would take too long on refactors, test fixes, or multi-module updates inside Claude Code–style environments. The primary win is wall-clock time: independent units finish together while verification stays structured instead of ad hoc.
531 installs

32. Fix Tests
Fix Tests is a workflow skill for solo builders whose suite went red after business logic changes, refactors, or dependency bumps. It starts from project docs to learn how tests and coverage run, then applies a disciplined goal: restore passing tests while preserving what each test was meant to prove. An explicit complexity rule keeps you from hand-editing large blast-radius failures: multi-file or intricate logic changes should be handled by orchestrated sub-agents, while trivial single-file drift can be fixed inline. The skill pairs with optional sadd guidance when available. It is not a substitute for deciding correct product behavior; it assumes behavior is already right and tests need to catch up, except when a minimal logic fix is truly required.
531 installs

33. Write Tests
write-tests is a workflow skill for solo builders and small teams who just shipped feature work locally and need regression safety before merge. It scopes work from the current git diff (uncommitted changes including untracked files) or, when everything is committed, the latest commit.
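That scoping rule (uncommitted changes including untracked files first, otherwise the latest commit) can be approximated with two standard git commands. The sketch below is an assumption about how one might implement it, not the skill's own code; `changed_files` and `run_git` are hypothetical helpers, and the git runner is injectable so the logic can be exercised without a repository.

```python
import subprocess
from typing import Callable


def run_git(args: list[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def changed_files(git: Callable[[list[str]], str] = run_git) -> tuple[str, list[str]]:
    """Return (scope, paths): the working tree if dirty, else the last commit."""
    # Porcelain v1 lines look like "XY path": two status characters and a space.
    # Renames ("old -> new") and quoted paths are not handled in this sketch.
    status = git(["status", "--porcelain", "--untracked-files=all"])
    pending = [line[3:] for line in status.splitlines() if line.strip()]
    if pending:
        return "working-tree", pending
    # Clean tree: list files touched by HEAD (--root also covers a first commit).
    committed = git(
        ["diff-tree", "--no-commit-id", "--name-only", "-r", "--root", "HEAD"]
    )
    return "last-commit", [path for path in committed.splitlines() if path]
```

Calling `changed_files()` inside a repository returns the scope label plus the file list that a test-writing pass would then focus on.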
The stated goal is comprehensive coverage of critical business logic introduced or altered by those changes, not blanket 100% line coverage. A deliberate constraint keeps the main agent in orchestrator mode when two or more files change or when a single file carries complex logic; only trivial single-file edits may be implemented inline. The flow references reading a sadd skill when available during preparation, then uses coverage analysis and delegated agents to propose new tests while leaving existing suites intact. It fits backend APIs, CLIs, and SaaS services where fast iteration often outpaces manual test writing. Use it at the end of a build slice or immediately pre-PR when you want tests aligned to what actually changed.
531 installs

34. Judge
Judge is a coordinator skill from the context-engineering kit that turns ad-hoc “does this look good?” chat into a disciplined evaluation pipeline. Phase one extracts what was built in the current conversation; a meta-judge designs rubrics matched to the artifact type and your optional evaluation focus. A judge sub-agent then scores work in fresh context, citing specific evidence, and returns structured feedback without mutating files. Solo builders use it when agent sessions run long and self-approval creeps in, when you need a second opinion before merge or release, or when you want repeatable critique of plans, code, or docs. It implements the meta-judge to LLM-as-judge pattern with documented self-verification adjustments. Treat findings as input to your own fixes rather than an automated linter replacement.
530 installs

35. Tree Of Thoughts
tree-of-thoughts is a context-engineering-kit skill that runs hard reasoning jobs through a full Tree of Thoughts loop instead of jumping to the first plausible answer. Solo builders and small teams use it when a feature, architecture, or agent plan has several viable paths and the cost of picking wrong is high.
The skill walks exploration, structured pruning, expansion of promising branches, meta-judge specification of criteria, multi-perspective scoring, and synthesis into a final recommendation or artifact. That pattern fits validation debates, implementation design forks, and operational trade-off analysis alike. It is methodology-heavy: you supply the task and optional output criteria; the skill supplies the systematic exploration discipline. Pair it with planning or prompt-testing skills when the winning branch becomes a spec or a deployable prompt.
530 installs

36. Test Prompt
test-prompt treats LLM instructions like code under test: you observe how an agent behaves without your prompt (RED), add instructions that fix the failures (GREEN), then tighten edge cases (REFACTOR). It covers commands, hooks, skills, subagent task text, and user-facing production prompts whenever clarity, consistency, or failure cost matters. The skill explicitly depends on understanding test-driven development and pairs naturally with prompt-engineering techniques. Solo builders use it while crafting new agent workflows in build, before release checks in ship, and when iterating prompts in operate. The core discipline is empirical: you do not know what a prompt must fix until you have watched a subagent miss the bar without it. That makes it a quality gate for anyone maintaining a growing skills library or Factory-style plugin set.
529 installs

37. Create Skill
create-skill is the Context Engineering Kit command for authoring effective agent skills. It frames writing SKILL.md as Test-Driven Development applied to procedural documentation: you design pressure scenarios with subagents, observe baseline non-compliance, write the skill, rerun until behavior matches intent, then refactor to close loopholes. The overview insists you must already understand the TDD skill’s RED-GREEN-REFACTOR cycle because this command adapts that discipline to docs rather than code.
Triggers cover net-new skills, edits to existing skills, and pre-deployment verification. Personal skills target agent-specific directories on disk. Solo builders who ship custom capabilities for Claude Code or Codex use it whenever generic prompting is not enough and they need durable, test-backed process knowledge. Official Anthropic best practices are referenced via /apply-anthropic-skill-best-practices as an enhancement layer after core TDD authoring.
528 installs

38. Actualize
Actualize is an agent skill from the Context Engineering Kit that reconciles your project’s FPF (structured assurance) state with recent repository changes. Solo and indie builders who treat `.fpf/` as the source of truth for decisions and evidence run it when the codebase moves faster than the knowledge base—after merges, dependency bumps, or infra edits. The skill walks through git baselines, surfaces files that imply context drift, flags stale evidence, and highlights decisions that may no longer hold. It is not a one-time scaffold; it is maintenance for agent-ready context engineering. Use it when you want Observe-phase hygiene without manually re-reading every config file. Outputs are an audit-oriented report and optional updates to context artifacts so downstream agent work stays trustworthy.
527 installs

39. Analyze Issue
analyze-issue is a planning skill for solo builders and small teams who manage work in GitHub Issues but want implementation-grade specs before an agent touches code. You pass an issue number; the skill checks ./specs/issues/ for an existing markdown capture, and if missing, follows your project’s load-issues command pattern to pull title, body, and labels from GitHub. It then reviews related code structure and emits a technical specification with issue summary, priority inference, problem statement, technical approach, stepwise implementation plan, unit/component/integration test scenarios, and explicit file modify/create lists.
That output is ideal when you are stacking agent sessions: the spec becomes the contract for a later build or ship pass, reducing vague “fix issue 42” prompts. It assumes GitHub CLI authentication and a repo that already uses the context-engineering-kit specs layout. It does not replace code review, security review, or automatic PR creation—it front-loads clarity so you spend fewer tokens rediscovering requirements mid-implementation. Pair it with issue-loading skills in the same kit when your specs folder is empty.
525 installs

40. Plan Do Check Act
plan-do-check-act is a journey-wide agent skill from the context-engineering kit that teaches your coding agent to improve work through Deming’s PDCA loop instead of one-off fixes. You supply an improvement goal; the skill walks Plan (baseline, hypothesis, measurable targets), Do (small-scale change with notes), Check (before/after analysis), and Act (standardize or start the next cycle). Solo builders use it when shipping features, tuning prompts, fixing recurring bugs, or tightening onboarding—anywhere you need disciplined experimentation. It pairs naturally with causal-analysis commands referenced in the skill text. Complexity is intermediate because you must define metrics and tolerate partial failures as learning. It is procedural knowledge packaged as slash-style steps, optimized for agents that already follow structured commands in Claude Code or similar environments.
525 installs

41. Sdd:Implement
sdd:implement is an agent skill for spec-driven delivery: you point it at a task file (for example a feature markdown spec) and it drives implementation through dedicated implementation agents while judges automatically review critical outputs. The workflow is built for solo and indie builders who want agent coding with guardrails rather than trusting a single long chat turn. Arguments cover continuing interrupted runs, refining only steps touched by recent git changes, and optional human-in-the-loop pauses on named steps.
It pairs naturally with upstream planning or SDD task authoring skills in the same kit. Expect iterative fix cycles until judges pass or human gates approve, which makes it suited to feature work where quality bars matter on intermediate artifacts, not only the final diff.
525 installs

42. Apply Anthropic Skill Best Practices
Apply Anthropic Skill Best Practices is a meta authoring guide for builders packaging procedural knowledge as agent skills. It walks through how Claude discovers skills at runtime—only name and description are pre-loaded until relevance triggers a full read—and why every line in SKILL.md still matters once loaded. You use it when a skill is complex enough to need explicit structure: progressive disclosure, testing across the models you actually deploy, and calibration so Haiku gets enough guidance while Opus is not over-explained. The skill is aimed at solo and indie developers maintaining SKILL.md repos for Claude Code, Cursor, Codex, or similar hosts, especially when you are reviewing an existing skill or starting a new one that must pass real tasks, not just read well in chat.
524 installs

43. Do Competitively
do-competitively is a context-engineering-kit command that executes hard tasks through parallel competitive generation, structured meta-judge rubrics, multi-judge scoring, and evidence-based synthesis so you keep the best parts of several implementations. Solo builders reach for it when quality beats latency—specs, architecture choices, sensitive refactors, or deliverables where a single agent pass is risky. The orchestrator role is explicit: you coordinate sub-agents and judges without loading their full file dumps or report noise into your own context. Adaptive strategy picks whether to polish a clear winner, merge split decisions, or redesign after failure, with documented average savings from smarter synthesis.
It pairs naturally with upfront planning skills when criteria and output paths are known, and it is overkill for tiny edits or tasks with no evaluation rubric.
524 installs

44. Create Rule
Create Rule is a context-engineering-kit skill for solo builders who want agent behavior to improve permanently, not just in one chat. When you or an implementation agent repeat the same mistake, this skill walks you through authoring a `.claude/rules` module with contrastive examples that remove ambiguity about what is wrong versus correct. Rules load every session as behavioral guardrails, while skills stay on-demand—so this is for cross-cutting standards, not one-off task playbooks. It covers constraints, formatting and architecture choices, gates before proceeding, and project-specific terminology. Use it after you notice a pattern worth enforcing repository-wide, especially when task-specific guidance would be better as a skill instead. The outcome is durable standing orders that every Claude Code–style session inherits automatically.
523 installs

45. Create Pr
create-pr is a ship-phase workflow skill that walks you through creating GitHub pull requests with GitHub CLI instead of guessing flags or skipping repo conventions. It is built for solo and indie builders who already use git daily but want an agent to enforce pre-flight hygiene—check git status, commit uncommitted work via the commit skill, then draft a PR that follows the project template. The guide covers installing and authenticating gh on macOS, Windows, and Linux, emphasizes English PR copy, and keeps automation inside explicit Bash allowances for gh and git. You get a repeatable path from “branch is ready” to “draft PR exists on GitHub,” which reduces back-and-forth from missing commits, wrong base branches, or empty descriptions.
Pair it with your existing commit discipline; it does not replace code review or CI—it gets you to the review queue faster.
522 installs

46. Setup Serena Mcp
setup-serena-mcp is a guided agent skill for installing and configuring the Serena MCP server, which exposes semantic code retrieval and editing tools to Claude Code and compatible clients. Solo builders who want agents to navigate large codebases by symbols rather than blind grep install it when onboarding a new repo or standardizing team agent setup. The workflow starts by choosing whether configuration lives in shared CLAUDE.md, local CLAUDE.local.md, or global user settings, then detects if Serena is already reachable, then loads upstream Serena documentation for capabilities and run instructions. It is procedural setup documentation rather than a runtime integration, so success means correct MCP wiring and documented triggers in the right markdown file. Use early in a project’s agent stack and revisit when switching machines or clients.
522 installs

47. Cause And Effect
cause-and-effect is a context-engineering-kit skill that walks you through Fishbone (Ishikawa) diagram thinking so you do not stop at the first plausible explanation for a failure. Solo builders and small teams invoke it when latency spikes, flaky releases, support escalations, or pre-launch risks need causes mapped across People, Process, Technology, Environment, Methods, and Materials. You state the problem as the fish head, brainstorm within each category, ask why repeatedly, label contributing versus root causes, rank by impact and likelihood, then propose fixes for the highest-priority branches. It is methodology rather than a single integration: use it during Operate incident reviews, Ship quality retros, Validate scope debates, or Build architecture decisions whenever symptoms could hide multiple upstream factors.
The output is a structured cause map and prioritized remediation ideas your agent or team can turn into tickets, not an automated metric dashboard.
521 installs

48. Review Pr
Review-pr is an agent skill for solo and indie builders who want PR feedback that lands on the diff instead of a noisy summary thread. You invoke it when a pull request is ready for human-or-agent review, optionally narrowing scope with free-text aspects such as security or performance and filtering noise with `--min-impact` from critical through low. The workflow is strict about signal: each finding must be an inline comment tied to code, meaningful, and above your impact floor—overall review reports are forbidden. That design suits small teams using Claude Code, Cursor, or Codex on GitHub-style flows where maintainers want fewer, sharper comments. Default minimum impact is high, so day-to-day runs emphasize merge-blocking issues unless you deliberately widen the net for a deeper pass.
518 installs

49. Attach Review To Pr
attach-review-to-pr teaches an agent how to post line-specific pull request review comments the same way a human would in the GitHub UI, instead of dumping a single blob comment at the bottom of the PR. Solo and indie builders who ship through GitHub can keep review discipline inside Claude Code or similar agents: approve paths, request changes on risky hunks, and document nitpicks on exact lines without context switching. The skill documents a preferred MCP GitHub inline-comment integration and a practical fallback through authenticated gh api calls, including patterns for one comment versus a coordinated multi-comment review. It fits teams-of-one who still want PR hygiene, external collaborators, or audit trails before merge.
You need gh auth and an existing PR; the output is durable inline review data attached to the diff for authors to address in follow-up commits.
516 installs

50. Create Agent
Create Agent is a procedural skill for solo builders and small teams who ship with Claude Code and need repeatable autonomous helpers—not one-off chat personas. It walks you from a name and purpose through the full `agents/*.md` contract: when Claude should delegate via Task, how descriptions must encode triggering examples, and how the system prompt should scope multi-step work in an isolated context window. The skill explicitly separates agents from slash commands so you do not misuse `commands/*.md` for long-running subprocess work. You get Anthropic-aligned structure checks, mkdir-safe file creation, and patterns that make agents discoverable when the parent session hits matching contexts. Use it when you are adding a reviewer, researcher, or implementer agent to a repo, refactoring brittle prompts into maintained agent files, or onboarding contributors who need a single canonical template. It does not replace product architecture or MCP wiring; it standardizes how delegated work is packaged for Claude Code’s Task tool.
515 installs

51. Load Issues
load-issues is an agent skill that turns a GitHub repository’s open issue queue into a versioned markdown backlog under `./specs/issues/`. It uses the GitHub CLI to list open issues, pulls rich metadata per number, and writes consistently named files so context-engineering and planning workflows can cite stable local paths instead of scraping the web UI. Solo builders shipping with Claude Code or Cursor benefit when they want specs, agents, and humans to share one source of truth for what is in flight—especially before writing plans or splitting work across sessions. The skill is intentionally procedural: defined bash allowances, a single markdown template, and a post-run inventory of created files.
It does not triage, close, or edit issues; it loads and documents them for downstream PM and build steps in the same kit.
513 installs

52. Setup Codemap Cli
Setup Codemap CLI is a guided integration skill for solo developers who onboard Claude Code onto codebases that are hard to scan in plain text. It starts by asking where configuration should live—team-shared git paths, personal local overrides, or global user defaults—so you do not accidentally commit machine-specific hooks. The workflow checks whether the `codemap` binary is present, pulls current upstream documentation, and then patches the right CLAUDE.md and settings files for intelligent map-based navigation. That matters when you inherit a large monorepo, switch contracts weekly, or run agents that otherwise grep blindly. The skill is opinionated about path choice and gitignore hygiene, which prevents the common failure mode of broken teammate setups. After setup, Codemap becomes part of your agent context stack alongside other context-engineering-kit skills. It is not a replacement for reading architecture docs; it reduces time-to-orientation so Build and later Operate debugging sessions start with a spatial mental model of the tree.
513 installs

53. Customaize Agent:Prompt Engineering
customaize-agent:prompt-engineering is a journey-wide agent skill for solo builders who repeatedly draft or refine the language that drives Claude Code, Cursor, Codex, and similar systems. Its frontmatter trigger covers writing commands, hooks, skills, sub-agent prompts, and any LLM interaction where vague instructions produce flaky JSON, skipped steps, or inconsistent tone. The body emphasizes practical patterns such as few-shot exemplars for format lock-in and chain-of-thought scaffolding for harder reasoning tasks, framed for people shipping custom agent packs rather than one-off chat questions.
You can invoke it while validating a product idea’s research prompts, scoping landing copy, implementing new skills, or tightening review agents before ship. Because prompt quality compounds across every phase, treating this as agent-tooling infrastructure pays off more than polishing a single feature prompt in isolation.
512 installs

54. Git Worktrees
Git Worktrees is an agent skill that teaches Git’s linked working-directory model so you can work on multiple branches at once without stashing or second clones. It explains main vs linked worktrees, shared object storage, branch locking, and metadata under `.git/worktrees/`, then gives copy-paste command recipes for creating worktrees from existing or new branches, listing and pruning stale entries, and keeping one branch per active context. Solo and indie builders shipping with Claude Code, Cursor, or Codex use it when they need to review a pull request, spike an alternative implementation, or run isolated tests while leaving uncommitted work untouched on the primary tree. The skill emphasizes a simple rule: change directories to change branches. That reduces merge surprises and context-switch friction compared with aggressive stash stacks or duplicate checkouts on disk.
512 installs

55. Propose Hypotheses
Propose Hypotheses is an agent workflow skill that executes the First Principles Framework cycle for a stated problem: initialize context, generate competing L0 hypotheses, verify reasoning, validate evidence, audit trust, and land a decision with artifacts on disk. It creates a `.fpf/` directory tree so decisions are reproducible and tiered knowledge (L0, L1, L2, invalid) stays organized across sessions. Solo builders use it when chat brainstorming is too vague—they want competing explanations, explicit evidence files, and a recorded outcome before writing plans or code.
The skill is orchestration-heavy: the main agent mkdirs, then launches specialized fpf-agent passes that read plugin task files and write summaries and hypothesis lists to known paths. It pairs well with later planning skills once a hypothesis set is accepted or rejected.
512 installs

56. Status
status is an agent skill from the neolabhq context-engineering-kit that prints a structured health report for your FPF knowledge base stored under `.fpf/`. Solo builders using formal context engineering need a quick read on whether hypotheses are stuck in L0, evidence has expired, or decisions are piling up without closure—this skill automates that inventory at run time. It walks directory structure checks, per-layer file counts, evidence freshness against `.fpf/evidence/`, and lists decision records so you can see if you should verify, validate, decide, or run decay. The inferred phase label ties the counts to the methodology’s reasoning stages, which helps agents and humans align the next command in the kit. It is lightweight PM instrumentation for agentic workflows rather than a deploy or SEO tool, and pairs naturally with other FPF kit skills when warnings suggest `/fpf:decay` or similar follow-ups.
512 installs

57. Build Mcp
Build-mcp is a structured development guide for solo builders shipping Model Context Protocol servers that agents can actually use on real tasks. It walks through research and planning first—workflow-oriented tool design, consolidating related API steps, and respecting tight context budgets—then implementation choices in FastMCP or the MCP SDK, plus evaluation-minded iteration. You reach for it when you are turning a product API, internal admin action, or third-party SaaS into durable agent capabilities, not when you only need a one-off curl snippet. The skill emphasizes high-impact tools agents can chain into complete outcomes rather than mirroring every HTTP route. It pairs naturally with context-engineering practices elsewhere in the same kit.
Expect intermediate familiarity with your target language and the service you are wrapping; deliverables are a running MCP server and a coherent tool catalog your Claude Code or Cursor session can invoke.
511 installs

58. Code Review:Review Local Changes
code-review:review-local-changes is an agent skill from the context-engineering-kit that runs a comprehensive, structured review of everything you have changed locally but not yet committed. Solo builders use it when they want a second opinion on a WIP diff—security, performance, correctness, or custom focus areas passed as free-text review-aspects—without opening a pull request. The workflow emphasizes systematic evaluation and concrete fixes, filtering noise with an impact threshold so only issues at or above your chosen level appear. You can keep human-readable markdown for day-to-day use or emit JSON when another script or agent needs to consume findings. It fits the ship/review moment: you have working tree changes, you are about to commit, and you want confidence that obvious regressions and high-impact problems are caught first. Pair it with your normal git workflow; treat it as a pre-commit reviewer that respects your repo layout by ignoring spec/ and reports/ unless you override that default.
511 installs

59. Create Command
Create-command is a meta assistant skill that helps solo builders design and file new Claude commands that fit a disciplined command library. Instead of pasting ad-hoc prompts, you describe purpose and the skill maps it to planning, implementation-with-modes, or analysis patterns, picks project versus user scope, and emits markdown that matches Scopecraft-style structure—including task framing, context pointers, and category examples like brainstorm-feature or implement. It explicitly directs the agent to read the command creation guide first, reducing drift in naming, stages, and MCP tool references.
The workflow covers requirement gathering, pattern selection, file generation, ancillary assets, and doc updates so commands stay discoverable for future sessions. Use it when you are investing in a reusable command system rather than one-off chats. It pairs naturally with other context-engineering kit skills once hooks and commands define how work enters the repo.
511 installs

60. Create Hook
Create-hook is an interactive agent skill from the context-engineering kit that designs, implements, and validates git hooks tailored to what your project actually runs. Solo builders use it when they want consistent guardrails—type-check after edits, auto-format with Prettier, ESLint fixes, tests or builds before commits—without hand-writing hook scripts from scratch. The skill starts by detecting tooling (TypeScript, Prettier, ESLint, npm scripts) and whether git is present, then proposes practical PostToolUse and PreToolUse patterns aligned with agent workflows, not only classical pre-commit files. It walks through configuration questions, generates the hook script, and emphasizes testing so hooks fail loudly instead of silently skipping. The result is repo-local automation that pairs well with Claude Code-style tool hooks. It is less about product features and more about tightening the loop between agent edits and verifiable quality, which pays off across Build and Ship.
511 installs

61. Decay
Decay is an agent skill for managing evidence freshness in structured decision workflows inspired by FPF B.3.4. Solo builders and small teams use it when important choices—performance, security, architecture, or launch readiness—still cite benchmarks, audits, or test runs whose valid_until dates have passed. The skill explains what stale evidence means, when waiving is appropriate versus negligent, and how to choose among Refresh (re-obtain proof), Deprecate (retire or downgrade the decision), and Waive (acknowledge risk until a set date).
It is aimed at builders who keep decision logs or knowledge bases in repo markdown rather than ad-hoc chat memory, especially alongside context-engineering or FPF-style kits. Use it during iteration and pre-release checks so you do not ship on expired assumptions, and pair waivers with scheduled refresh rather than treating them as permanent exceptions.
511 installs

62. Reset
Reset is an agent skill from the context-engineering kit that ends an FPF reasoning cycle cleanly and starts a new one. Solo builders use it when a brainstorming or verification thread has gone stale, hypotheses were rejected, or they want to abandon a line of inquiry without losing an audit trail. The documented soft reset writes a session archive markdown file under .fpf/sessions/, records reset reason, hypothesis inventory, and decision status, and optionally relocates active knowledge files from L0, L1, and L2 into a dated archive folder. It is for agents and humans co-maintaining .fpf knowledge trees, not for wiping git history or production databases. Pair it at the start of a new problem statement after you have captured enough context in the archive for later reference.
511 installs

63. Sdd:Brainstorm
SDD Brainstorm is a context-engineering-kit skill that turns fuzzy product or feature concepts into designs an agent can implement later. It follows a disciplined ritual: inspect the current repo and docs, ask one refining question at a time (preferring multiple choice), explore six competing approaches with honest trade-offs, then deliver the design in short sections while checking alignment after each chunk. Solo and indie builders shipping with Claude Code, Cursor, or Codex should invoke it whenever they might otherwise jump straight into code or a task list—especially for net-new capabilities, ambiguous requirements, or cross-cutting changes.
The skill is journey-wide because the same questioning pattern applies during idea exploration, validation scoping, build planning, ship readiness reviews, launch positioning, and growth experiments; only skip it when the next steps are already fully specified mechanical work. Outcomes are an agreed design narrative and criteria that downstream planning or implementation skills can consume without rework.
509 installs

64. Setup Arxiv Mcp
Setup arXiv MCP is an agent skill that walks through enabling an arXiv-oriented paper search MCP server using Docker MCP. Solo builders engineering context for Claude Code use it when they want programmatic literature search inside the agent instead of manual arXiv browsing. The flow starts by choosing where configuration lives—team-shared project CLAUDE.md, gitignored local preferences, or user-global settings—so research tooling matches how you collaborate. It then gates on Docker MCP availability with `mcp-find`, and if Docker is missing, defers to official Docker Desktop and Claude MCP connection guides before continuing. Subsequent steps cover searching the catalog and adding the paper server, with optional user arguments for research topics. This matters for AEO-heavy and R&D-heavy indie products because agents only cite what they can retrieve; a correctly scoped MCP setup reduces broken tools and duplicate config across repos.
509 installs

65. Query
query is a runtime agent skill from the context-engineering-kit that searches a First Principles Framework (FPF) workspace stored under `.fpf/`. Solo builders and small teams use it when architectural choices are captured as hypotheses with assurance metadata instead of scattered notes. At run time the skill scans knowledge and decision directories, ranks matches to the user’s query, and renders structured markdown including layers (L0/L1/L2), kinds, scopes, R_eff from audit sections, weakest-link notes, and dependency trees.
It fits agentic workflows where you want retrieval grounded in project-specific evidence rather than generic web search. Place it early in planning loops and again during ship and operate when validating that a change still satisfies recorded assumptions. Intermediate complexity: you need an initialized FPF tree and consistent hypothesis filenames. It pairs naturally with skills that author or promote hypotheses after experiments complete.
508 installs

66. Sdd:Add Task
sdd:add-task is an agent skill from the context-engineering-kit spec-driven development (SDD) workflow. It turns a short natural-language request—passed as the skill argument, optionally with a list of task files it depends on—into a draft task file under `.specs/tasks/draft/`. Before writing, it invokes the plugin’s folder bootstrap script so draft, todo, in-progress, done, and scratchpad paths exist and scratchpad stays gitignored. The agent parses the user’s wording for objective, implied work type, and dependencies, then emits a file whose name and title follow verb-first conventions while keeping the user’s intent in the description. Solo and indie builders shipping with Claude Code (the skill references `${CLAUDE_PLUGIN_ROOT}`) use it to avoid losing backlog items in chat logs and to keep work aligned with the same `.specs/` tree the rest of the kit expects. It is a lightweight intake step: drafts stay in draft until other SDD skills analyze and move them toward todo and implementation.
503 installs

67. Sdd:Create Ideas
Sdd:create-ideas is a lightweight generator skill for the earliest solo-builder moment: you have a fuzzy topic or problem and need structured options before validation work begins. The agent returns six distinct ideas as separate list items, each labeled with a numeric probability. The first half anchors on conventional, high-likelihood directions; the second half deliberately samples the tails so you see unusual angles you might otherwise skip.
The skill stresses non-overlapping responses so the list reads like a real brainstorm wall, not six rephrasings of the same concept. Use it when you want fast ideation inside SDD or context-engineering workflows, not when you already have a scored backlog or approved spec. It is beginner-friendly—no repo or CLI—and output is immediately usable for prioritization, landing copy experiments, or feeding a later scope or prototype skill.
502 installs

68. Tdd:Test Driven Development
tdd:test-driven-development is a journey-wide agent discipline from the context-engineering kit: write the test first, watch it fail, then write minimal code to pass. Solo builders install it so coding agents do not ship plausible implementations that were never proven against a failing spec. The SKILL.md treats violating the letter of the rules as violating the spirit—code authored before tests must be deleted, not kept as reference. It applies across the solo journey whenever you implement or change behavior: scoping prototypes in Validate, feature work in Build, hardening before Ship, and fixes in Operate. The workflow encodes Red-Green-Refactor with explicit verification that failure is for the right reason before going green. Confidence is high because triggers are enumerated (“Always” plus narrow exceptions). Pair with human judgment only for prototypes, generated output, or config-only edits.
501 installs