
Ara Compiler
Structure an AI research repo with ARA-style manifests, claims, evidence tables, and trace DAGs so agents and collaborators can navigate your work reproducibly.
Overview
ARA Compiler is an agent skill, most often used in the Idea phase (also Validate and Build), that defines and applies the ARA research-directory schema so AI projects stay organized around claims, evidence, and traceable exploration.
Install
npx skills add https://github.com/orchestra-research/ai-research-skills --skill ara-compiler
What is this skill?
- Complete ARA directory schema: PAPER.md root, logic/, src/, trace/, evidence/, rubric/
- Typed research artifacts: claims.md, experiments.md, exploration_tree.yaml, related_work RDO graph
- Evidence index discipline mapping tables and figures to falsifiable claims
- Minimal execution stubs plus environment.md for deps, hardware, and seeds
Adoption & trust: 1 install on skills.sh; 9.4k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your AI research folder mixes notes, scripts, and results with no shared schema, so agents cannot reliably find claims, evidence, or experiment lineage.
Who is it for?
Indie ML or agent researchers who want a reproducible repo layout before scaling experiments or publishing artifacts.
Skip if: you only need a quick README for an app repo, with no claims/evidence discipline or formal research trace.
When should I use this skill?
You are starting or refactoring an AI research repository and need the full ARA field-level directory schema applied consistently.
What do I get? / Deliverables
You get a standardized ARA tree—manifest, logic layer, stubs, trace YAML, and evidence index—ready for agents to populate and for you to extend with real runs and figures.
- ARA-aligned directory tree specification
- Populated manifest and logic layer files
- Evidence index mapping tables and figures to claims
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Research structuring starts when you are framing problems and evidence before implementation, which maps to the Idea phase shelf for discovery-oriented AI work. The skill is a field-level schema compiler for research directories—logic, experiments, evidence—so the canonical home is idea/research, not a single build task.
Where it fits
Define logic/problem.md and claims.md before committing to a model architecture.
Lay out experiments.md and rubric expectations to bound what you will prove in a prototype phase.
Align evidence/README.md and tables/ with what you ship in src/execution stubs.
Cross-check published figures against indexed evidence files before external release.
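As one way to run the cross-check in the last item above, a minimal Python sketch can list evidence files that are never mentioned in evidence/README.md. It assumes the index refers to each file by its filename, which this page does not spell out, and `unindexed_evidence` is a hypothetical helper name rather than part of the skill.

```python
from __future__ import annotations

from pathlib import Path


def unindexed_evidence(ara_root: Path) -> list[str]:
    """Return evidence files whose names never appear in evidence/README.md.

    Assumption: the index mentions every evidence file by filename.
    """
    evidence = ara_root / "evidence"
    index_text = (evidence / "README.md").read_text(encoding="utf-8")
    missing = []
    for sub in ("tables", "figures"):
        folder = evidence / sub
        if not folder.is_dir():
            continue  # a layer may legitimately have no tables or figures yet
        for path in sorted(folder.iterdir()):
            if path.is_file() and path.name not in index_text:
                missing.append(f"{sub}/{path.name}")
    return missing
```

An empty result means every file under tables/ and figures/ is at least named in the index; it does not verify that the claim mapping itself is correct.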
How it compares
Use as a structural template for research repos, not as a one-shot code generator or generic project scaffold.
Common Questions / FAQ
Who is ara-compiler for?
Solo and small-team AI researchers and agent builders who document hypotheses, experiments, and evidence in a repo agents can navigate.
When should I use ara-compiler?
During Idea when framing a study, during Validate when locking experiment plans, and during Build when aligning docs, stubs, and evidence folders before you ship analysis or write-ups.
Is ara-compiler safe to install?
It is primarily schema and documentation guidance; review the Security Audits panel on this page and treat any bundled stubs like normal code before executing.
SKILL.md
# ARA Directory Schema — Complete Field-Level Reference

## Directory Structure

```
PAPER.md                      # Level 1: Root manifest + layer index
logic/
  problem.md                  # Why: observations → gaps → key insight
  claims.md                   # Falsifiable assertions
  concepts.md                 # All key technical terms (one ## per term)
  experiments.md              # Declarative experiment plans (NOT scripts)
  solution/
    architecture.md           # System design + component graph
    algorithm.md              # Math formulation + pseudocode
    constraints.md            # Boundary conditions + limitations
    heuristics.md             # Convergence tricks + rationale
  related_work.md             # Typed dependency graph (RDO)
src/
  configs/
    training.md               # Training hyperparameters with rationale
    model.md                  # Architecture/model configs
  execution/
    {module}.py               # Minimal code stubs (core algorithm only)
  environment.md              # Dependencies, hardware, seeds
trace/
  exploration_tree.yaml       # Research DAG: nested YAML tree with typed nodes
evidence/
  README.md                   # Index mapping every evidence file to claims
  tables/                     # Raw result tables (exact cell values)
  figures/                    # Raw figure data (extracted data points)
rubric/                       # (Only if rubric provided)
  requirements.md             # Leaf-level rubric requirements mapped to ARA files
```

Additional files or subdirectories may be created on demand when the source contains content that does not fit the standard layers (for example, appendix-sourced worked examples, prompt templates, or enumerated taxonomies). Place such content in the ARA layer where it best belongs.

## Progressive Disclosure (3 Levels)

- **Level 1 — PAPER.md** (~200 tokens): Frontmatter + layer index. Agent reads ONLY this to decide relevance.
- **Level 2 — Layer files** (problem.md, claims.md, experiments.md, evidence/README.md): Loaded on demand.
- **Level 3 — Detail files** (algorithm.md, code stubs, individual evidence tables): Loaded when drilling in.
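
The directory structure above can be scaffolded with a few lines of Python. This is a minimal sketch, not the skill's own tooling: `scaffold_ara` and `missing_ara_files` are hypothetical helpers, `example_module.py` stands in for the `{module}.py` placeholder, the optional `rubric/` layer is left out, and the nesting (for instance `related_work.md` under `logic/` and `environment.md` under `src/`) is an assumption read from the tree.

```python
from __future__ import annotations

from pathlib import Path

# Files named in the directory structure. The nesting is an assumption, and
# "example_module.py" is a hypothetical stand-in for {module}.py.
ARA_FILES = [
    "PAPER.md",
    "logic/problem.md",
    "logic/claims.md",
    "logic/concepts.md",
    "logic/experiments.md",
    "logic/solution/architecture.md",
    "logic/solution/algorithm.md",
    "logic/solution/constraints.md",
    "logic/solution/heuristics.md",
    "logic/related_work.md",
    "src/configs/training.md",
    "src/configs/model.md",
    "src/execution/example_module.py",
    "src/environment.md",
    "trace/exploration_tree.yaml",
    "evidence/README.md",
]
ARA_DIRS = ["evidence/tables", "evidence/figures"]


def scaffold_ara(root: Path) -> list[Path]:
    """Create missing ARA files as empty placeholders; return the files created."""
    created = []
    for rel in ARA_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for rel in ARA_FILES:
        path = root / rel
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()  # existing files are never overwritten
            created.append(path)
    return created


def missing_ara_files(root: Path) -> list[str]:
    """List schema files that are absent under root."""
    return [rel for rel in ARA_FILES if not (root / rel).is_file()]
```

Calling `scaffold_ara(Path("my_paper_ara"))` on a fresh directory creates the placeholders, and `missing_ara_files` can then serve as a quick completeness check before an agent starts populating the layers.
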

---

## PAPER.md

YAML frontmatter MUST include:

```yaml
---
title: "{full paper title}"
authors: [{author list}]
year: {year}
venue: "{venue}"
doi: "{DOI or arXiv ID}"
ara_version: "1.0"
domain: "{research domain}"
keywords: [{5-10 keywords}]
claims_summary:
  - "{one-line summary of main claim 1}"
  - "{one-line summary of main claim 2}"
  - "{one-line summary of main claim 3}"
abstract: "{paper abstract}"
---
```

Body MUST include a Layer Index — a table for each layer listing every file:

```markdown
# {Paper Title}

## Overview
{1-2 paragraph summary of the contribution}

## Layer Index

### Cognitive Layer (`/logic`)
| File | Description |
|------|-------------|
| [problem.md](logic/problem.md) | Observations → gaps → key insight |
| [claims.md](logic/claims.md) | {N} falsifiable claims (C01–C{NN}) |
| ...

### Physical Layer (`/src`)
| File | Description | Claims |
|------|-------------|--------|
| [execution/{module}.py](src/execution/{module}.py) | {what} | C{NN} |
| ...

### Exploration Graph (`/trace`)
| File | Description |
|------|-------------|
| [exploration_tree.yaml](trace/exploration_tree.yaml) | {N}-node research DAG |

### Evidence (`/evidence`)
| File | Description |
|------|-------------|
| [README.md](evidence/README.md) | Full index of {N} tables + {N} figures |
```

---

## Evidence Naming and Fidelity

The evidence layer has two different object types:

1. **Raw source evidence**
   - Faithful transcription of one source table or figure
   - Must preserve the original source identifier and caption
   - Example: `evidence/tables/table3_imagenet_validation.md`

2. **Derived subset evidence**
   - Filtered or recomposed view created for a specific claim
   -