Repo Intake And Plan

Name: Repo Intake And Plan
Author: lllllllama

lllllllama/rigorpilot-skills

176k installs
512 repo stars
Updated July 26, 2026

This is a copy of repo-intake-and-plan by lllllllama - installs and ranking accrue to the original listing.

repo-intake-and-plan is a RigorPilot intake skill that scans unfamiliar deep learning repositories for developers who need the smallest reliable reproduction or agentic modification target identified from README and conf

About

repo-intake-and-plan is the Rigor Intake helper in rigorpilot-skills that scans a repository, reads README and common project files, extracts documented commands, and classifies inference, evaluation, and training paths. It checks nine primary files when present—including README.md, requirements.txt, pyproject.toml, setup.py, and Dockerfile—and inspects high-signal directories such as configs, scripts, and experiments for command clues. Developers reach for repo-intake-and-plan when onboarding to an unfamiliar deep learning codebase and deciding the smallest trustworthy reproduction or modification target before setup or execution. The output is a conservative recommendation rather than code changes, making it the usual first step before env-and-assets-bootstrap or ai-research-reproduction.

Scans primary files including README.md, requirements.txt, pyproject.toml, setup.py, and Dockerfile first
Inspects high-signal directories such as configs/, scripts/, notebooks/, and checkpoints/
Extracts documented commands with 6-tier extraction priority from explicit README evidence
Classifies paths into inference, evaluation, training, or other with conservative inferred labeling
Recommends the smallest trustworthy reproduction target while recording ambiguity

Repo Intake And Plan by the numbers

175,920 all-time installs (skills.sh)
+25,328 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

npx skills add https://github.com/lllllllama/rigorpilot-skills --skill repo-intake-and-plan

Installs: 176k
Repo stars: 512
Security audit: 3 / 3 scanners passed
Last updated: July 26, 2026
Repository: lllllllama/rigorpilot-skills

How do you pick a minimal ML reproduction target?

Quickly understand an unfamiliar repository and identify the smallest reliable target for reproduction or agentic modification.

Who is it for?

ML engineers landing in an unfamiliar research repository who need README-driven target selection before running anything.

Skip if: you already have a chosen training command ready to execute, or you need conda environment creation rather than repository scoping.

When should I use this skill?

Use it when an unfamiliar deep learning repository needs to be scanned for documented commands and you want a recommendation for the smallest trustworthy reproduction target.

What you get

Classified command paths and a recommended smallest trustworthy reproduction target

Documented command extraction
Path classification
Smallest trustworthy target recommendation
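
The last item can be pictured as a ranking over the classified commands. The sketch below is an assumption, not the skill's actual selection logic: it takes records shaped like the `commands` list emitted by the README extraction script (`command`, `category`, `kind`) and prefers inference over evaluation over training, on the assumption that this order roughly tracks cost. `pick_smallest_target` and `DEFAULT_PRIORITY` are hypothetical names.

```python
from typing import Dict, List, Optional, Sequence

# Assumed cost ordering (cheapest first). The skill's real priority rules
# live in references/repo-scan-rules.md and may differ.
DEFAULT_PRIORITY = ("inference", "evaluation", "training")


def pick_smallest_target(
    commands: List[Dict[str, str]],
    priority: Sequence[str] = DEFAULT_PRIORITY,
) -> Optional[Dict[str, str]]:
    """Return the first runnable command in the cheapest available category."""
    # Setup and asset commands prepare a run; they are not targets themselves.
    runnable = [c for c in commands if c.get("kind") == "run"]
    for category in priority:
        for item in runnable:
            if item.get("category") == category:
                return item
    # Nothing matched a known category: report no recommendation, not a guess.
    return None


if __name__ == "__main__":
    sample = [
        {"command": "pip install -r requirements.txt", "category": "other", "kind": "setup"},
        {"command": "python train.py --config configs/base.yaml", "category": "training", "kind": "run"},
        {"command": "python demo.py --image cat.png", "category": "inference", "kind": "run"},
    ]
    target = pick_smallest_target(sample)
    print(target["command"] if target else "no trustworthy target found")
```

Passing a different `priority` tuple, for example `("evaluation", "inference", "training")`, would model a user hint such as evaluation-first.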

By the numbers

Checks 9 primary project files when present during repository intake

Files

SKILL.md

repo-intake-and-plan

Use this as the Rigor Intake helper. The installed slug remains repo-intake-and-plan for compatibility.

When to apply

At the beginning of README-first reproduction work.
When the main skill needs a fast map of repo structure and documented commands.
When inference, evaluation, and training candidates must be classified conservatively.
When the user explicitly wants to inspect the repo first and not run anything yet.

When not to apply

When execution has already started and the task is now about running commands or writing outputs.
When the target is not a repository-backed reproduction task.
When the user only wants paper interpretation without repo inspection.
When the user already has a selected documented command and only needs setup or execution.

Clear boundaries

This skill scans and plans.
This skill is helper-tier and should usually be orchestrator-invoked.
It does not install environments.
It does not prepare large assets.
It does not execute substantive reproduction commands.
It does not decide high-risk patching.

Input expectations

Target repository path.
Access to README and common project files if present.
Optional user hints about desired priority, such as inference-first.

Output expectations

concise repo structure summary
documented command inventory
inferred candidate categories: inference, evaluation, training, other
minimum trustworthy reproduction recommendation
notable ambiguity or risk list
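
To make the five outputs concrete, here is a minimal sketch of a container that could hold them. `IntakeReport`, `build_report`, and every field name are hypothetical; the skill does not document an output schema, so treat this as one possible shape rather than the format it actually emits.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class IntakeReport:
    """Hypothetical container mirroring the five documented outputs."""

    structure_summary: str
    command_inventory: List[Dict[str, str]] = field(default_factory=list)
    # Candidate commands grouped by inferred category.
    candidates: Dict[str, List[str]] = field(
        default_factory=lambda: {"inference": [], "evaluation": [], "training": [], "other": []}
    )
    recommended_target: Optional[str] = None
    ambiguities: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


def build_report(summary: str, commands: List[Dict[str, str]]) -> IntakeReport:
    """Group commands by category; missing or unknown categories go to 'other'."""
    report = IntakeReport(structure_summary=summary, command_inventory=commands)
    for item in commands:
        bucket = item.get("category", "other")
        if bucket not in report.candidates:
            bucket = "other"
        report.candidates[bucket].append(item["command"])
    if not commands:
        # Record the gap instead of silently returning an empty plan.
        report.ambiguities.append("README documents no commands.")
    return report


if __name__ == "__main__":
    demo = build_report(
        "toy repo: README.md, configs/, scripts/",
        [{"command": "python demo.py --image cat.png", "category": "inference"}],
    )
    print(demo.to_json())
```

Leaving `recommended_target` as `None` until a target is chosen keeps "no recommendation yet" distinct from an empty string.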

Notes

Use references/repo-scan-rules.md and helper scripts under scripts/.

display_name: Rigor Intake
short_description: Rigor Intake helper for scanning a repo and recommending the smallest trustworthy reproduction target.
default_prompt: Scan this repository, read the README and common project files, extract documented commands, classify inference evaluation and training paths, and recommend the smallest trustworthy reproduction target.

#!/usr/bin/env python3
"""Extract shell-like commands from README content and classify them."""

from __future__ import annotations

import argparse
import json
import re
from pathlib import Path
from typing import Dict, List, Optional


CODE_BLOCK_RE = re.compile(r"```(?P<lang>[^\n`]*)\n(?P<body>.*?)```", re.DOTALL | re.IGNORECASE)
INLINE_CMD_RE = re.compile(r"^\s*(?:\$|>|PS> )\s*(.+)$")
HEADING_RE = re.compile(r"^(?P<marks>#{1,6})\s+(?P<title>.+?)\s*$")
COMMAND_PREFIXES = (
    "python ",
    "python3 ",
    "pip ",
    "pip3 ",
    "conda ",
    "bash ",
    "sh ",
    "chmod ",
    "export ",
    "set ",
    "CUDA_VISIBLE_DEVICES=",
    "./",
    "accelerate ",
    "torchrun ",
    "deepspeed ",
    "make ",
    "docker ",
)


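def collect_headings(readme_text: str) -> List[Dict[str, object]]:
    """Record each Markdown heading with its character offset in the README."""
    headings: List[Dict[str, object]] = []
    offset = 0
    in_fence = False
    for line in readme_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("```"):
            # Shell comments inside fenced blocks look like "# headings"; skip them.
            in_fence = not in_fence
        elif not in_fence:
            matched = HEADING_RE.match(stripped)
            if matched:
                headings.append(
                    {
                        "offset": offset,
                        "level": len(matched.group("marks")),
                        "title": matched.group("title").strip(),
                    }
                )
        offset += len(line)
    return headings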


def nearest_heading(headings: List[Dict[str, object]], offset: int) -> Optional[str]:
    current: Optional[str] = None
    for heading in headings:
        if int(heading["offset"]) > offset:
            break
        current = str(heading["title"])
    return current


def infer_section_category(section: Optional[str]) -> Optional[str]:
    if not section:
        return None
    lowered = section.lower()
    if any(word in lowered for word in ["inference", "usage", "demo", "example", "text-to-image", "image-to-image", "transcribe"]):
        return "inference"
    if any(word in lowered for word in ["evaluation", "evaluate", "benchmark", "metrics", "validation"]):
        return "evaluation"
    if any(word in lowered for word in ["training", "train", "finetune", "fine-tune", "pretrain"]):
        return "training"
    return None


def infer_section_kind(section: Optional[str]) -> Optional[str]:
    if not section:
        return None
    lowered = section.lower()
    if any(word in lowered for word in ["install", "installation", "setup", "environment", "requirements"]):
        return "setup"
    if any(word in lowered for word in ["download", "checkpoint", "weights", "dataset", "data preparation"]):
        return "asset"
    if any(word in lowered for word in ["usage", "demo", "example", "inference", "evaluation", "training", "text-to-image", "image-to-image"]):
        return "run"
    return None


def classify(command: str, section: Optional[str] = None) -> str:
    section_category = infer_section_category(section)
    if section_category:
        return section_category

    lowered = command.lower()
    if any(
        word in lowered
        for word in [
            "infer",
            "inference",
            "predict",
            "generate",
            "sample",
            "demo",
            "txt2img",
            "img2img",
            "transcribe",
            "whisper ",
            "amg.py",
        ]
    ):
        return "inference"
    if any(word in lowered for word in ["eval", "evaluate", "validation", "validate", "benchmark", "score"]):
        return "evaluation"
    if any(word in lowered for word in ["train", "training", "finetune", "fine-tune", "pretrain", "pre-train"]):
        return "training"
    return "other"


def command_kind(command: str, section: Optional[str] = None) -> str:
    section_kind = infer_section_kind(section)
    if section_kind:
        return section_kind

    lowered = command.lower().strip()
    setup_prefixes = (
        "pip install",
        "pip3 install",
        "conda install",
        "conda env create",
        "conda create",
        "conda activate",
        "python -m pip install",
        "git clone",
        "cd ",
    )
    asset_prefixes = ("wget ", "curl ", "mkdir ", "tar ", "unzip ", "7z ", "aria2c ")
    if lowered.startswith(setup_prefixes):
        return "setup"
    if lowered.startswith(asset_prefixes):
        return "asset"
    if "--help" in lowered or " -h" in lowered:
        return "smoke"
    return "run"


def looks_like_command(line: str) -> bool:
    candidate = re.sub(r"^(?:\$|PS> )\s*", "", line.strip())
    if not candidate or candidate.startswith("#"):
        return False
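    # Match the tool name as a whole word so output such as "shape: ..." or
    # "makes ..." is not mistaken for an "sh" or "make" invocation.
    if re.match(r"(?:python[\d.]*|pip[\d.]*|conda|bash|sh|make|docker)\b", candidate):
        return True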
    if candidate.startswith(COMMAND_PREFIXES):
        return True
    if re.search(r"\s--[A-Za-z0-9_-]+", candidate):
        return True
    if re.search(r"\b(?:python|pip|conda|torchrun|deepspeed|accelerate|bash|sh)\b", candidate):
        return True
    if re.search(r"[\\/].+\.(?:py|sh|bat)", candidate):
        return True
    if candidate.startswith(("cd ", "ls ", "mkdir ", "wget ", "curl ", "git ")):
        return True
    return False


def clean_lines(block: str) -> List[str]:
    commands: List[str] = []
    for raw_line in block.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if not looks_like_command(line):
            continue
        line = re.sub(r"^(?:\$|PS> )\s*", "", line)
        commands.append(line)
    return commands


def extract_commands(readme_text: str) -> Dict[str, object]:
    commands: List[Dict[str, str]] = []
    warnings: List[str] = []
    seen = set()
    headings = collect_headings(readme_text)

    for match in CODE_BLOCK_RE.finditer(readme_text):
        lang = (match.group("lang") or "").strip().lower()
        if lang and lang not in {"bash", "shell", "sh", "zsh", "powershell", "cmd"}:
            continue

        section = nearest_heading(headings, match.start())
        lines = clean_lines(match.group("body"))
        if not lines:
            continue

        for line in lines:
            if line not in seen:
                commands.append(
                    {
                        "command": line,
                        "category": classify(line, section),
                        "kind": command_kind(line, section),
                        "section": section,
                        "source": "code_block",
                    }
                )
                seen.add(line)

    running_offset = 0
    for line in readme_text.splitlines(keepends=True):
        matched = INLINE_CMD_RE.match(line)
        if not matched:
            running_offset += len(line)
            continue
        command = matched.group(1).strip()
        if not looks_like_command(command):
            running_offset += len(line)
            continue
        section = nearest_heading(headings, running_offset)
        if command and command not in seen:
            commands.append(
                {
                    "command": command,
                    "category": classify(command, section),
                    "kind": command_kind(command, section),
                    "section": section,
                    "source": "inline",
                }
            )
            seen.add(command)
        running_offset += len(line)

    if not commands:
        warnings.append("No shell-like commands were extracted from the README.")

    counts: Dict[str, int] = {}
    for item in commands:
        category = item["category"]
        counts[category] = counts.get(category, 0) + 1

    return {
        "commands": commands,
        "counts": counts,
        "warnings": warnings,
    }


def main() -> int:
    parser = argparse.ArgumentParser(description="Extract shell-like commands from a README.")
    parser.add_argument("--readme", required=True, help="Path to the README file.")
    parser.add_argument("--json", action="store_true", help="Emit JSON output.")
    args = parser.parse_args()

    readme_path = Path(args.readme)
    text = readme_path.read_text(encoding="utf-8", errors="replace")
    data = extract_commands(text)

    if args.json:
        print(json.dumps(data, indent=2, ensure_ascii=False))
    else:
        for item in data["commands"]:
            print(f"[{item['category']}] {item['command']}")
        if data["warnings"]:
            print("Warnings:")
            for warning in data["warnings"]:
                print(f"- {warning}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
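
To see what the fenced-block pass of the script above yields, the following self-contained sketch applies an equivalent of its `CODE_BLOCK_RE` pattern to a made-up README. The sample README and the `shell_blocks` helper are illustrative assumptions. The language filter mirrors the set used in `extract_commands`, but unlike `clean_lines` this sketch does not strip the `$` prompt or filter non-command lines.

```python
import re
from typing import List, Tuple

FENCE = "`" * 3  # three backticks, built at runtime to keep this example fence-safe

# Equivalent to CODE_BLOCK_RE in the script above.
CODE_BLOCK_RE = re.compile(
    FENCE + r"(?P<lang>[^\n`]*)\n(?P<body>.*?)" + FENCE, re.DOTALL
)
# Untagged blocks are kept; tagged blocks must name a shell.
SHELL_LANGS = {"", "bash", "shell", "sh", "zsh", "powershell", "cmd"}

# Made-up README used only for illustration.
SAMPLE_README = "\n".join(
    [
        "# Demo",
        FENCE + "bash",
        "$ python demo.py --image cat.png",
        FENCE,
        "# Notes",
        FENCE + "python",
        "import torch",
        FENCE,
    ]
)


def shell_blocks(readme_text: str) -> List[Tuple[str, List[str]]]:
    """Return (lang, non-empty lines) for each shell-like fenced block."""
    blocks = []
    for match in CODE_BLOCK_RE.finditer(readme_text):
        lang = match.group("lang").strip().lower()
        if lang not in SHELL_LANGS:
            continue  # e.g. python or yaml blocks are skipped
        lines = [ln.strip() for ln in match.group("body").splitlines() if ln.strip()]
        blocks.append((lang, lines))
    return blocks


if __name__ == "__main__":
    for lang, lines in shell_blocks(SAMPLE_README):
        print(lang or "(untagged)", lines)
```

Only the `bash` block survives the filter here; the `python` block is dropped before any line-level checks run.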

#!/usr/bin/env python3
"""Scan a repository for README-first reproduction signals."""

from __future__ import annotations

import argparse
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional


KEY_FILES = [
    "README.md",
    "README",
    "requirements.txt",
    "environment.yml",
    "environment.yaml",
    "pyproject.toml",
    "setup.py",
    "setup.cfg",
    "Dockerfile",
]

SIGNAL_DIRS = [
    "configs",
    "config",
    "scripts",
    "tools",
    "examples",
    "notebooks",
    "checkpoints",
]


def first_existing(root: Path, names: List[str]) -> Optional[Path]:
    for name in names:
        candidate = root / name
        if candidate.exists():
            return candidate
    return None


def scan_repo(root: Path) -> Dict[str, object]:
    if not root.exists():
        raise FileNotFoundError(f"Repository path does not exist: {root}")
    if not root.is_dir():
        raise NotADirectoryError(f"Repository path is not a directory: {root}")

    top_level = sorted(item.name for item in root.iterdir())
    detected_files = [name for name in KEY_FILES if (root / name).exists()]
    detected_dirs = [name for name in SIGNAL_DIRS if (root / name).exists()]
    readme = first_existing(root, ["README.md", "README"])

    warnings: List[str] = []
    if readme is None:
        warnings.append("No README file was found at the repository root.")
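    # README files are listed in KEY_FILES too, so ignore them when judging
    # whether any environment or packaging file is present.
    if all(name.startswith("README") for name in detected_files):
        warnings.append("No common environment or packaging files were detected.")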

    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "repo_path": str(root.resolve()),
        "readme_path": str(readme.resolve()) if readme else None,
        "detected_files": detected_files,
        "detected_dirs": detected_dirs,
        "structure": {
            "top_level": top_level,
            "top_level_file_count": sum(1 for item in root.iterdir() if item.is_file()),
            "top_level_dir_count": sum(1 for item in root.iterdir() if item.is_dir()),
        },
        "warnings": warnings,
    }


def main() -> int:
    parser = argparse.ArgumentParser(description="Scan a repository for key reproduction signals.")
    parser.add_argument("--repo", required=True, help="Path to the target repository.")
    parser.add_argument("--json", action="store_true", help="Emit JSON instead of a human summary.")
    args = parser.parse_args()

    data = scan_repo(Path(args.repo))
    if args.json:
        print(json.dumps(data, indent=2, ensure_ascii=False))
    else:
        print(f"Repository: {data['repo_path']}")
        print(f"README: {data['readme_path'] or 'not found'}")
        print("Detected files:", ", ".join(data["detected_files"]) or "none")
        print("Detected dirs:", ", ".join(data["detected_dirs"]) or "none")
        if data["warnings"]:
            print("Warnings:")
            for item in data["warnings"]:
                print(f"- {item}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
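
One quick way to sanity-check this kind of detection is to run it against a throwaway directory. The sketch below is a hedged, self-contained illustration: it re-declares small subsets of `KEY_FILES` and `SIGNAL_DIRS` rather than importing the script (whose file name is not given here), and `detect_signals` and `make_toy_repo` are hypothetical helpers.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Subsets of KEY_FILES and SIGNAL_DIRS from the script above.
KEY_FILES = ["README.md", "requirements.txt", "pyproject.toml", "Dockerfile"]
SIGNAL_DIRS = ["configs", "scripts", "checkpoints"]


def detect_signals(root: Path) -> Dict[str, List[str]]:
    """List which known files and directories exist directly under root."""
    return {
        "files": [name for name in KEY_FILES if (root / name).is_file()],
        "dirs": [name for name in SIGNAL_DIRS if (root / name).is_dir()],
    }


def make_toy_repo(root: Path) -> None:
    """Create a tiny fake repository layout for the demonstration."""
    (root / "README.md").write_text("# Toy\n", encoding="utf-8")
    (root / "requirements.txt").write_text("torch\n", encoding="utf-8")
    (root / "configs").mkdir()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_toy_repo(root)
        print(detect_signals(root))
```

Checking `is_file()` and `is_dir()` separately, rather than `exists()` as the script does, is a deliberate variation so that a directory named like a key file is not counted as one.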

Related skills

Setup Matt Pocock Skills: Scaffold the per-repo configuration that Matt Pocock’s engineering agent skills rely on so they understand the issue tracker, triage labels, and domain documentation la (462k · 185k)

Lark Skill Maker: Quickly turn any Lark/Feishu OpenAPI call or multi-step workflow into a reusable agent skill with its own SKILL.md. (379k · 15.8k)

Caveman: Slash token usage by roughly 75% while keeping every technical detail intact when working with Claude Code, Cursor or similar agents. (378k · 92.5k)

Lark Apps: Connect Claude, Cursor or custom agents directly to Lark (Feishu) for messaging, document automation, approval workflows and enterprise data access. (375k)

Running Claude Code Via Litellm Copilot: Run Claude Code at a fraction of the cost by routing requests through LiteLLM to the GitHub Copilot Chat API. (270k · 72)

Codex Pet: Generate a complete Codex Pet spritesheet and metadata from one reference image without needing an OpenAI key or Codex Pro. (246k · 8)

How it compares

Use repo-intake-and-plan to choose a target; use env-and-assets-bootstrap once the reproduction path is selected and dependencies must be prepared.

FAQ

Which files does repo-intake-and-plan check first?

repo-intake-and-plan checks nine primary files when present, including README.md, requirements.txt, environment.yml, pyproject.toml, setup.py, setup.cfg, and Dockerfile. It also inspects high-signal directories like configs and scripts.

What does repo-intake-and-plan recommend?

repo-intake-and-plan recommends the smallest trustworthy reproduction target after classifying inference, evaluation, and training paths from documented commands. Rigor Intake scoping precedes environment setup and full reproduction runs.

Is Repo Intake And Plan safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

