
Understand Knowledge
Merge Karpathy-style wiki scan manifests and LLM analysis batches into one deduplicated knowledge graph with layers and tours.
Overview
Understand-knowledge is an agent skill for the Build phase that merges scan manifests and LLM analysis batches into a final Karpathy-pattern knowledge graph.
Install
npx skills add https://github.com/lum1104/understand-anything --skill understand-knowledge
What is this skill?
- Merges scan-manifest.json with analysis-batch-*.json into assembled-graph.json
- Entity deduplication, edge normalization, and canonical node/edge type validation
- Builds layers from index.md categories and tours from section ordering
- Writes output under .understand-anything/intermediate/ in the target wiki directory
- Python CLI: merge-knowledge-graph.py <wiki-directory>
- Canonical VALID_NODE_TYPES and VALID_EDGE_TYPES aligned with core/src/types.ts
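As a rough illustration of the entity deduplication step listed above, the sketch below collapses entities whose names differ only in case or whitespace. The `normalize_entity_name` helper mirrors the one in the merge script. The `dedupe_entities` function, the `name` field, and the sample records are hypothetical and are not taken from the script itself.

```python
import re


def normalize_entity_name(name: str) -> str:
    """Lowercase, trim, and collapse whitespace so name variants compare equal."""
    return re.sub(r"\s+", " ", name.strip().lower())


def dedupe_entities(entities: list[dict]) -> tuple[list[dict], dict[str, str]]:
    """Keep the first entity per normalized name (hypothetical helper).

    Returns the surviving entities plus a map from dropped IDs to kept IDs,
    which a later step could use to re-point edges.
    """
    kept: dict[str, dict] = {}
    remap: dict[str, str] = {}
    for entity in entities:
        key = normalize_entity_name(entity["name"])  # "name" field is an assumption
        if key in kept:
            remap[entity["id"]] = kept[key]["id"]
        else:
            kept[key] = entity
    return list(kept.values()), remap


# Made-up sample data for illustration only.
entities = [
    {"id": "e1", "name": "Andrej Karpathy"},
    {"id": "e2", "name": "  andrej   karpathy "},
    {"id": "e3", "name": "Knowledge Graph"},
]
unique, remap = dedupe_entities(entities)
```

Here `unique` holds two entities and `remap` sends `e2` to `e1`. Keeping the first occurrence is one possible policy; the real script may resolve duplicates differently.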
Adoption & trust: 653 installs on skills.sh; 54.9k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have scan manifests and separate LLM analysis batches but no single deduplicated graph with valid types, layers, and tours.
Who is it for?
Solo builders finishing an understand-anything wiki run who need deterministic merge logic after batch analysis.
Skip if: You have a greenfield project with no scan-manifest.json or analysis-batch-*.json yet. Run the upstream scan and analysis first.
When should I use this skill?
Scan manifest and analysis-batch-*.json files exist under a wiki directory and you need assembled-graph.json.
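One way to confirm that precondition before invoking the merge is a small pre-flight check, sketched below. The `check_merge_inputs` function is hypothetical. The manifest path follows the layout used by the merge script, while looking for the batch files in the same intermediate directory is an assumption.

```python
from pathlib import Path


def check_merge_inputs(wiki_dir: str) -> dict:
    """Report whether the merge inputs appear to be present (hypothetical helper)."""
    intermediate = Path(wiki_dir) / ".understand-anything" / "intermediate"
    manifest = intermediate / "scan-manifest.json"
    # Assumption: analysis batches live next to the scan manifest.
    batches = sorted(intermediate.glob("analysis-batch-*.json")) if intermediate.is_dir() else []
    manifest_found = manifest.is_file()
    return {
        "manifest_found": manifest_found,
        "batch_count": len(batches),
        "ready": manifest_found and len(batches) > 0,
    }


if __name__ == "__main__":
    import sys

    # Usage: python check_inputs.py <wiki-directory>
    print(check_merge_inputs(sys.argv[1] if len(sys.argv) > 1 else "."))
```

If `ready` is false, running the upstream scan and analysis steps first is likely the right move.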
What do I get? / Deliverables
You get assembled-graph.json with normalized entities, edges, layers, and tours ready for wiki or agent consumption.
- assembled-graph.json under .understand-anything/intermediate/
- Deduplicated nodes and normalized edges with layer and tour metadata
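To make "normalized edges" concrete, the sketch below maps alias edge types to canonical ones and drops edges whose type is still not recognized. `normalize_edge_type` mirrors the helper in the merge script, and the type sets here are small subsets of the script's tables. The `filter_edges` function and the `source`, `target`, and `type` edge fields are assumptions made for illustration.

```python
# Small subsets of the script's canonical tables, for illustration only.
VALID_EDGE_TYPES = {"cites", "contradicts", "builds_on", "related"}
EDGE_TYPE_ALIASES = {
    "references": "cites",
    "conflicts_with": "contradicts",
    "refines": "builds_on",
    "relates_to": "related",
}


def normalize_edge_type(t: str) -> str:
    """Lowercase and trim the type, then map known aliases to canonical names."""
    t = t.lower().strip()
    return EDGE_TYPE_ALIASES.get(t, t)


def filter_edges(edges: list[dict]) -> tuple[list[dict], int]:
    """Normalize edge types and drop edges with unknown types (hypothetical helper)."""
    kept = []
    dropped = 0
    for edge in edges:
        edge_type = normalize_edge_type(edge["type"])  # "type" field is an assumption
        if edge_type in VALID_EDGE_TYPES:
            kept.append({**edge, "type": edge_type})
        else:
            dropped += 1
    return kept, dropped


# Made-up sample edges.
edges = [
    {"source": "a1", "target": "s1", "type": " References "},
    {"source": "c1", "target": "c2", "type": "conflicts_with"},
    {"source": "a1", "target": "a2", "type": "teleports_to"},
]
kept, dropped = filter_edges(edges)
```

With this sample, two edges survive with types `cites` and `contradicts`, and one is dropped. Dropping is one possible response to an unknown type; the script's report includes a `dropped_edges` counter, but its exact policy is not shown here.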
Journey fit
Graph assembly happens after you have scanned content and run batch analyses, so it is core build-time documentation work rather than idea or launch work. It produces structured docs and knowledge artifacts (entities, edges, and tours derived from index.md), which places it in the docs phase.
How it compares
Use as the assembly step after batch analysis, not as a substitute for scanning or interactive chat summarization.
Common Questions / FAQ
Who is understand-knowledge for?
Indie builders and agent workflows using the understand-anything Karpathy wiki pattern who need to consolidate intermediate JSON into one graph.
When should I use understand-knowledge?
During Build/docs work after analysis batches exist, when you are ready to publish or query a merged knowledge graph for a wiki directory.
Is understand-knowledge safe to install?
Review the Security Audits panel on this Prism page and inspect the Python merge script before running it on sensitive wiki trees.
SKILL.md - Understand Knowledge
#!/usr/bin/env python3
"""
Merge script for Karpathy-pattern knowledge graphs.

Combines the deterministic scan-manifest.json with LLM analysis batches
(analysis-batch-*.json) into a final assembled knowledge graph.

Handles: entity deduplication, edge normalization, layer building from
index.md categories, tour generation from index.md section ordering.

Usage:
    python merge-knowledge-graph.py <wiki-directory>

Output:
    Writes assembled-graph.json to <wiki-directory>/.understand-anything/intermediate/
"""

import json
import os
import re
import sys
from datetime import datetime, timezone
from pathlib import Path

# ---------------------------------------------------------------------------
# Canonical type sets (must match core/src/types.ts)
# ---------------------------------------------------------------------------

VALID_NODE_TYPES = {
    "article", "entity", "topic", "claim", "source",
    # Codebase types (for cross-compatibility)
    "file", "function", "class", "module", "concept", "config", "document",
    "service", "table", "endpoint", "pipeline", "schema", "resource",
    "domain", "flow", "step",
}

VALID_EDGE_TYPES = {
    "cites", "contradicts", "builds_on", "exemplifies", "categorized_under",
    "authored_by", "related", "similar_to",
    # Codebase types
    "imports", "exports", "contains", "inherits", "implements", "calls",
    "subscribes", "publishes", "middleware", "reads_from", "writes_to",
    "transforms", "validates", "depends_on", "tested_by", "configures",
    "deploys", "serves", "provisions", "triggers", "migrates", "documents",
    "routes", "defines_schema", "contains_flow", "flow_step", "cross_domain",
}

NODE_TYPE_ALIASES = {
    "note": "article",
    "page": "article",
    "wiki_page": "article",
    "person": "entity",
    "actor": "entity",
    "organization": "entity",
    "tag": "topic",
    "category": "topic",
    "theme": "topic",
    "assertion": "claim",
    "decision": "claim",
    "thesis": "claim",
    "reference": "source",
    "raw": "source",
    "paper": "source",
}

EDGE_TYPE_ALIASES = {
    "references": "cites",
    "cites_source": "cites",
    "conflicts_with": "contradicts",
    "disagrees_with": "contradicts",
    "refines": "builds_on",
    "elaborates": "builds_on",
    "illustrates": "exemplifies",
    "instance_of": "exemplifies",
    "example_of": "exemplifies",
    "belongs_to": "categorized_under",
    "tagged_with": "categorized_under",
    "written_by": "authored_by",
    "created_by": "authored_by",
    "relates_to": "related",
    "related_to": "related",
}

# ---------------------------------------------------------------------------
# Normalization
# ---------------------------------------------------------------------------

def normalize_node_type(t: str) -> str:
    t = t.lower().strip()
    return NODE_TYPE_ALIASES.get(t, t)


def normalize_edge_type(t: str) -> str:
    t = t.lower().strip()
    return EDGE_TYPE_ALIASES.get(t, t)


def normalize_entity_name(name: str) -> str:
    """Normalize entity names for deduplication."""
    return re.sub(r'\s+', ' ', name.strip().lower())


# ---------------------------------------------------------------------------
# Merge pipeline
# ---------------------------------------------------------------------------

def merge(root: Path) -> dict:
    intermediate = root / ".understand-anything" / "intermediate"
    manifest_path = intermediate / "scan-manifest.json"

    if not manifest_path.is_file():
        print(f"Error: {manifest_path} not found. Run parse-knowledge-base.py first.", file=sys.stderr)
        sys.exit(1)

    # Load scan manifest (deterministic base)
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    nodes = {n["id"]: n for n in manifest["nodes"]}
    edges = list(manifest["edges"])

    report = {
        "base_nodes": len(nodes),
        "base_edges": len(edges),
        "batches": 0,
        "new_entities": 0,
        "new_claims": 0,
        "new_edges": 0,
        "deduped_entities": 0,
        "dropped_edges": 0,
    }

    # Load analysis batches