
Understand Domain
Scan an existing repo and extract structured domain context so your agent can explain business flows, entry points, and boundaries before you refactor or onboard.
Overview
understand-domain is an agent skill, most often used in the Build phase (also Idea and Ship), that scans a codebase and produces structured domain-context JSON for business-flow analysis.
Install
npx skills add https://github.com/lum1104/understand-anything --skill understand-domain
What is this skill?
- Runs extract-domain-context.py to emit .understand-anything/intermediate/domain-context.json
- Lightweight scanner with caps (5000 files, 40 sampled files, 512 KB output) for agent context limits
- Builds file tree, entry-point hints, and sampled source slices across many language extensions
- Skips noise dirs (node_modules, .git, vendor) so scans stay fast on real monorepos
- Feeds a domain-analyzer agent path for identifying business domains, flows, and steps
- Samples up to 80 lines from each sampled file
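The skip-and-cap behavior described in the list above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function name `list_source_files` is hypothetical, and the extension and skip sets are small subsets of the ones the script defines.

```python
import os
from pathlib import Path

# Cap taken from the description above; the sets below are illustrative subsets.
MAX_FILES_TOTAL = 5000
SKIP_DIRS = {"node_modules", ".git", "vendor"}
SOURCE_EXTENSIONS = {".py", ".ts", ".js", ".go"}


def list_source_files(project_root: str) -> list[str]:
    """Return relative paths of source files, skipping noise dirs and honoring the cap.

    Hypothetical helper for illustration only.
    """
    root = Path(project_root)
    found: list[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if Path(name).suffix in SOURCE_EXTENSIONS:
                found.append((Path(dirpath) / name).relative_to(root).as_posix())
                if len(found) >= MAX_FILES_TOTAL:
                    return found  # stop early once the cap is reached
    return found
```

Pruning `dirnames` in place is what keeps the walk cheap on large monorepos: skipped directories are never opened, rather than being filtered out after the fact.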
Adoption & trust: 685 installs on skills.sh; 54.9k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are staring at an unfamiliar repository and cannot confidently name the business domains, main flows, or where logic lives.
Who is it for?
Onboarding to a brownfield SaaS or API repo when you need agent-ready context without manually reading every package.
Skip if: you have a greenfield idea with no code yet, or you only need API reference docs generated from an OpenAPI file.
When should I use this skill?
Before planning refactors, onboarding to a large repo, or when an agent needs grounded domain context from source.
What do I get? / Deliverables
You get a bounded domain-context artifact under .understand-anything that your agent can use to map domains and steps before you plan refactors or write specs.
- domain-context.json under .understand-anything/intermediate/
- Agent-ready summary of file tree, samples, and entry-point hints
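To check what a scan produced before handing it to an agent, a short script like the one below may help. It assumes only the output path stated above and makes no assumption about the JSON schema: it reports each top-level key and the type of its value. The function name `summarize_context` is hypothetical.

```python
import json
from pathlib import Path


def summarize_context(project_root: str) -> dict[str, str]:
    """Map each top-level key of domain-context.json to the type name of its value.

    Hypothetical helper; it does not assume any particular key names.
    """
    path = (
        Path(project_root)
        / ".understand-anything"
        / "intermediate"
        / "domain-context.json"
    )
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Unexpected shape: report the root type instead of failing.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}
```

Calling `summarize_context(".")` from a scanned project root gives a quick view of which sections exist, without dumping up to 512 KB of JSON into a terminal.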
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf in Build because the primary output is codebase-derived domain documentation that supports implementation and maintenance. Fits docs: it materializes intermediate domain-context JSON and file-tree signals for human- and agent-readable domain maps.
Where it fits
Evaluate an acquired GitHub repo by extracting domain context before committing to a rewrite.
Produce a structured domain map to align agent tasks with real modules and entry points.
Break an epic into domain-scoped stories after the scanner highlights major flows.
Scope a refactor PR by confirming which directories own payment versus auth flows.
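The last scenario above, confirming which directories own which flows, can be approximated by grouping detected routes by directory. The sketch below reuses the Express/Koa route pattern from the script's `ENTRY_POINT_PATTERNS`; the `routes_by_directory` helper and the sample file contents are hypothetical and exist only to exercise the pattern.

```python
import re
from collections import defaultdict

# Express/Koa route pattern as it appears in the script's ENTRY_POINT_PATTERNS.
EXPRESS_ROUTE = re.compile(
    r"""(?:app|router|server)\s*\.\s*(?:get|post|put|patch|delete|all|use)\s*\(\s*['"](/[^'"]*?)['"]""",
    re.IGNORECASE,
)


def routes_by_directory(sources: dict[str, str]) -> dict[str, list[str]]:
    """Group matched route paths by the top-level directory of their file.

    Hypothetical helper: `sources` maps a relative path to that file's text.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for rel_path, text in sources.items():
        top_dir = rel_path.split("/", 1)[0]
        for match in EXPRESS_ROUTE.finditer(text):
            grouped[top_dir].append(match.group(1))
    return dict(grouped)


# Hypothetical file contents, not taken from any real repository.
samples = {
    "payments/routes.js": "router.post('/charge', chargeHandler);",
    "auth/routes.js": "app.get('/login', showLogin);\napp.post('/login', doLogin);",
}
print(routes_by_directory(samples))
```

A grouping like this is only a hint: regex matching finds declared routes, not the business logic behind them, so the resulting map still needs a human or agent review.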
How it compares
Use instead of ad-hoc “read the whole repo” chat when you need capped, reproducible extraction output.
Common Questions / FAQ
Who is understand-domain for?
Solo and indie builders shipping with coding agents who inherit complex repos and need structured domain signals before planning work.
When should I use understand-domain?
During Build docs work when mapping an existing app; in Idea research when evaluating a codebase you might acquire; and in Ship review when scoping risky refactors—always from a real project root.
Is understand-domain safe to install?
It runs a local filesystem scan only; review the Security Audits panel on this page and inspect the script before pointing it at repos with secrets on disk.
SKILL.md
#!/usr/bin/env python3
"""
extract-domain-context.py — Lightweight codebase scanner for domain knowledge extraction.

Scans a project directory and produces a structured JSON context file
that the domain-analyzer agent uses to identify business domains, flows, and steps.

Usage:
    python extract-domain-context.py <project-root>

Output:
    <project-root>/.understand-anything/intermediate/domain-context.json
"""

import json
import os
import re
import sys
from pathlib import Path
from typing import Any

# ── Configuration ──────────────────────────────────────────────────────────

MAX_FILE_TREE_DEPTH = 6
MAX_FILES_PER_DIR = 50
MAX_FILES_TOTAL = 5000
MAX_SAMPLED_FILES = 40
MAX_LINES_PER_FILE = 80
MAX_ENTRY_POINTS = 200
MAX_OUTPUT_BYTES = 512 * 1024  # 512 KB — keeps output within agent context limits

# File extensions we care about for domain analysis
SOURCE_EXTENSIONS = {
    ".ts", ".tsx", ".js", ".jsx", ".mjs", ".cjs",
    ".py", ".pyi",
    ".go",
    ".rs",
    ".java", ".kt", ".scala",
    ".rb",
    ".cs",
    ".php",
    ".swift",
    ".c", ".cpp", ".h", ".hpp",
    ".ex", ".exs",
    ".hs",
    ".lua",
    ".r", ".R",
}

# Directories to always skip
SKIP_DIRS = {
    "node_modules", ".git", ".svn", ".hg", "__pycache__", ".tox",
    "venv", ".venv", "env", ".env", "dist", "build", "out",
    ".next", ".nuxt", "target", "vendor", ".idea", ".vscode",
    "coverage", ".understand-anything", ".pytest_cache", ".mypy_cache",
    "Pods", "DerivedData", ".gradle", "bin", "obj",
}

# Files that reveal project metadata
METADATA_FILES = [
    "package.json", "Cargo.toml", "go.mod", "pyproject.toml",
    "setup.py", "setup.cfg", "pom.xml", "build.gradle",
    "Gemfile", "composer.json", "mix.exs", "Makefile",
    "docker-compose.yml", "docker-compose.yaml",
    "README.md", "README.rst", "README.txt", "README",
]

# ── Entry point detection patterns ─────────────────────────────────────────

ENTRY_POINT_PATTERNS: list[tuple[str, str, re.Pattern[str]]] = [
    # HTTP routes
    ("http", "Express/Koa route", re.compile(
        r"""(?:app|router|server)\s*\.\s*(?:get|post|put|patch|delete|all|use)\s*\(\s*['"](/[^'"]*?)['"]""",
        re.IGNORECASE,
    )),
    ("http", "Decorator route (Flask/FastAPI/NestJS)", re.compile(
        r"""@(?:app\.)?(?:route|get|post|put|patch|delete|api_view|RequestMapping|GetMapping|PostMapping)\s*\(\s*['"](/[^'"]*?)['"]""",
        re.IGNORECASE,
    )),
    ("http", "Next.js/Remix route handler", re.compile(
        r"""export\s+(?:async\s+)?function\s+(GET|POST|PUT|PATCH|DELETE|HEAD|OPTIONS)\b""",
    )),
    # CLI
    ("cli", "CLI command", re.compile(
        r"""\.command\s*\(\s*['"]([\w\-:]+)['"]""",
    )),
    ("cli", "argparse subparser", re.compile(
        r"""add_parser\s*\(\s*['"]([\w\-]+)['"]""",
    )),
    # Event handlers
    ("event", "Event listener", re.compile(
        r"""\.on\s*\(\s*['"]([\w\-:.]+)['"]""",
    )),
    ("event", "Event subscriber decorator", re.compile(
        r"""@(?:EventHandler|Subscribe|Listener|on_event)\s*\(\s*['"]([\w\-:.]+)['"]""",
    )),
    # Cron / scheduled
    ("cron", "Cron schedule", re.compile(
        r"""@?(?:Cron|Schedule|Scheduled|crontab)\s*\(\s*['"]([^'"]+)['"]""",
        re.IGNORECASE,
    )),
    # GraphQL
    ("http", "GraphQL resolver", re.compile(
        r"""@(?:Query|Mutation|Subscription|Resolver)\s*\(""",
    )),
    # gRPC (only in .proto files — handled by file extension check below)
    ("http", "gRPC service", re.compile(
        r"""^service\s+(\w+)\s*\{""",
        re.MULTILINE,
    )),
    # Exported handlers (generic)
    ("manual", "Exported handler", re.compile(
        r"""export\s+(?:async\s+)?function\s+(handle\w+|process\w+|on\w+)\b""",
    )),
]

# ── Gitignore support ──────────────────────────────────────────────────────

def parse_gitignore(project_root: Path) -> list[re.Pattern[str]]:
    """Parse .gitignore into a list of compiled regex patterns."""
    gitignore = project_root / ".gitignore"
    patterns: list[re