Run Train

Name: Run Train
Author: lllllllama

lllllllama/rigorpilot-skills

176k installs
512 repo stars
Updated July 26, 2026

run-train is a Claude Code skill for conservative deep learning training execution with config, seed, checkpoint, log, and metric preservation.

About

A training execution skill for deep learning research. Use it when you have a selected training command and want conservative execution with structured status, checkpoint, and metric reporting.

Executes a documented training command conservatively with status tracking
Records configs, seeds, checkpoints, logs, and metrics for reproducibility
Normalizes evidence for startup, short-run, full kickoff, or resume scenarios

Run Train by the numbers

175,913 all-time installs (skills.sh)
+25,323 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #3 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

run-train capabilities & compatibility

Capabilities: command execution · monitoring · evidence collection · checkpoint preservation
Use cases: testing · research · debugging

npx skills add https://github.com/lllllllama/rigorpilot-skills --skill run-train

Add your badge

Show developers this skill is listed on Skillselion. Paste this into your README.

[![Listed on Skillselion](https://skillselion.com/badge/skills/lllllllama/rigorpilot-skills/run-train.svg)](https://skillselion.com/skills/lllllllama/rigorpilot-skills/run-train)

Installs: 176k
Repo stars: 512
Security audit: 2 / 3 scanners passed
Last updated: July 26, 2026
Repository: lllllllama/rigorpilot-skills

What it does

Execute deep learning training conservatively with reproducibility context and normalized evidence.

Who is it for?

Startup verification, short-run verification, full training kickoff, resume handling

Skip if: environment setup, inference-only execution, exploratory sweeps, autonomous idea implementation

When should I use this skill?

The training command has been selected and should be executed conservatively with structured status and metric reporting.

What you get

train_outputs/ bundle with SUMMARY.md, COMMANDS.md, LOG.md, SCIENTIFIC_CHANGELOG.md, COMPARABILITY_REPORT.md, and status.json.

By the numbers

6 standardized training output files tracking execution, configs, checkpoints, and metrics

Files

SKILL.md (Markdown)

run-train

Use this as the Rigor Train skill. The installed slug remains run-train for compatibility.

Use the shared operating principles in ../../references/agent-operating-principles.md; this skill should keep training evidence bounded while leaving repository-specific monitoring details to the model.

When to apply

When the training command has already been selected and should be executed conservatively.
When the researcher wants startup verification, short-run verification, full training kickoff, or resume handling.
When the run needs structured training status, checkpoint, and metric reporting.

When not to apply

When the main task is environment setup or asset download.
When the researcher wants inference-only or evaluation-only execution.
When the task is speculative exploration, multi-variant sweeps, or autonomous idea implementation.
When the user still needs repository intake or paper gap resolution.

Clear boundaries

This skill executes a selected training command and normalizes the resulting evidence.
It does not choose the overall research goal on its own.
It does not own exploratory branching or speculative code adaptation.
It should record partial, blocked, resumed, and kicked-off states clearly.
It should preserve reproducibility context such as configs, seeds, checkpoints, logs, metrics, and runtime assumptions when available.
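
The state names above line up with how scripts/run_training.py, reproduced later on this page, assigns a status. A minimal sketch of that rule, condensed from its decide_outcome function; the helper name classify_run is hypothetical, and the real function returns more fields (stop reason, blocker text, monitoring scope).

```python
from typing import Optional


def classify_run(returncode: Optional[int], timed_out: bool, launch_error: Optional[str]) -> str:
    """Hypothetical helper: condensed status rules, not the skill's actual API."""
    if launch_error:
        return "blocked"   # the executable could not be started at all
    if timed_out:
        return "partial"   # the monitoring window elapsed while the run was still going
    if returncode == 0:
        return "success"   # the process finished cleanly
    return "partial"       # nonzero exit: captured evidence is kept, but the run is incomplete


print(classify_run(returncode=None, timed_out=False, launch_error="No such file"))
print(classify_run(returncode=None, timed_out=True, launch_error=None))
print(classify_run(returncode=0, timed_out=False, launch_error=None))
print(classify_run(returncode=1, timed_out=False, launch_error=None))
```

A timeout is reported as partial rather than failed, which fits a startup check that is expected to outlive the monitoring window.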

Input expectations

selected training goal
runnable training command
environment and asset assumptions
run mode such as startup verification, short-run verification, full kickoff, or resume
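
These inputs correspond to command-line flags of scripts/run_training.py, reproduced later on this page. The sketch below assembles such an invocation. The flag names come from that script's argument parser; the helper name build_run_training_argv, the repository path, and the training command are placeholders assumed for illustration.

```python
import shlex
from typing import List

# Run modes accepted by the script's --run-mode flag.
RUN_MODES = ("startup_verification", "short_run_verification", "full_kickoff", "resume")


def build_run_training_argv(
    repo: str,
    command: str,
    run_mode: str = "startup_verification",
    timeout: int = 120,
    resume_from: str = "",
) -> List[str]:
    """Hypothetical helper that assembles the argument list for scripts/run_training.py."""
    if run_mode not in RUN_MODES:
        raise ValueError(f"unsupported run mode: {run_mode}")
    argv = [
        "python", "scripts/run_training.py",
        "--repo", repo,
        "--command", command,
        "--run-mode", run_mode,
        "--timeout", str(timeout),
    ]
    if resume_from:
        argv += ["--resume-from", resume_from]
    return argv


argv = build_run_training_argv(
    repo="/path/to/repo",                                   # placeholder path
    command="python train.py --config configs/base.yaml",  # placeholder training command
)
print(shlex.join(argv))
```

Passing the training command as a single string matches the script, which splits it with shlex before launching it.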

Output expectations

train_outputs/SUMMARY.md
train_outputs/COMMANDS.md
train_outputs/LOG.md
train_outputs/SCIENTIFIC_CHANGELOG.md
train_outputs/COMPARABILITY_REPORT.md
train_outputs/status.json
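
A quick way to confirm that a run produced the whole bundle is to check for each file by name. This is a minimal sketch, assuming the bundle sits in a train_outputs directory under the current working directory; check_bundle is a hypothetical helper and not part of the skill.

```python
from pathlib import Path
from typing import Dict

# File names taken from the list above.
EXPECTED_FILES = (
    "SUMMARY.md",
    "COMMANDS.md",
    "LOG.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "status.json",
)


def check_bundle(output_dir: str = "train_outputs") -> Dict[str, bool]:
    """Hypothetical helper: map each expected file name to whether it exists."""
    root = Path(output_dir)
    return {name: (root / name).is_file() for name in EXPECTED_FILES}


missing = [name for name, present in check_bundle().items() if not present]
print("missing:", missing or "none")
```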

Notes

Use references/training-policy.md, ../../references/deep-learning-experiment-principles.md, scripts/run_training.py, and scripts/write_outputs.py.

display_name: Rigor Train
short_description: Rigor Train mode for conservative selected training command execution and train_outputs evidence.
default_prompt: Run the selected training command conservatively, capture startup, resume, and status evidence, and write SUMMARY.md COMMANDS.md LOG.md and status.json into train_outputs.

#!/usr/bin/env python3
"""Execute a selected training command and normalize conservative training evidence."""

from __future__ import annotations

import argparse
import json
import re
import shlex
import subprocess
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional, Tuple


EPOCH_RE = re.compile(r"(?:epoch)\s*[:=\[/ ]+\s*(\d+)", flags=re.IGNORECASE)
STEP_RE = re.compile(r"(?:step|iter|iteration)\s*[:=\[/ ]+\s*(\d+)", flags=re.IGNORECASE)
CHECKPOINT_RE = re.compile(r"([\w./\\-]+\.(?:ckpt|pth|pt|bin|safetensors))", flags=re.IGNORECASE)
METRIC_RE = re.compile(
    r"\b([A-Za-z][A-Za-z0-9_.-]{1,31})\s*[:=]\s*(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


def combine_logs(parts: Iterable[str]) -> str:
    return "\n".join(part for part in parts if part).strip()


def parse_progress(text: str) -> Dict[str, Any]:
    """Extract the last epoch and step, checkpoint paths, and metric values from log text."""
    last_epoch: Optional[int] = None
    last_step: Optional[int] = None
    checkpoint_candidates: List[str] = []
    observed_metrics: Dict[str, float] = {}
    best_metric: Optional[Dict[str, Any]] = None

    for match in EPOCH_RE.finditer(text):
        last_epoch = int(match.group(1))
    for match in STEP_RE.finditer(text):
        last_step = int(match.group(1))
    for match in CHECKPOINT_RE.finditer(text):
        candidate = match.group(1).replace("\\", "/")
        if candidate not in checkpoint_candidates:
            checkpoint_candidates.append(candidate)
    for match in METRIC_RE.finditer(text):
        name = match.group(1)
        value = float(match.group(2))
        observed_metrics[name] = value

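    # Pick a headline metric, preferring names that are not loss/lr/time/mem.
    # Despite the key name, "best_metric" is the most recently logged candidate,
    # not the best value seen during the run.
    priority_names = [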
        name for name in observed_metrics
        if not any(token in name.lower() for token in {"loss", "lr", "time", "mem"})
    ]
    if priority_names:
        chosen = priority_names[-1]
        best_metric = {"name": chosen, "value": observed_metrics[chosen]}
    elif observed_metrics:
        chosen = list(observed_metrics)[-1]
        best_metric = {"name": chosen, "value": observed_metrics[chosen]}

    return {
        "last_epoch": last_epoch,
        "last_step": last_step,
        "checkpoint_candidates": checkpoint_candidates,
        "observed_metrics": observed_metrics,
        "best_metric": best_metric,
    }


def split_command(command: str) -> List[str]:
    return shlex.split(command, posix=True)


def run_git(repo: Path, args: List[str]) -> subprocess.CompletedProcess[str]:
    return subprocess.run(
        ["git", *args],
        cwd=repo,
        capture_output=True,
        text=True,
        timeout=15,
        check=False,
    )


def git_status_snapshot(repo: Path) -> Tuple[Optional[Dict[str, str]], Dict[str, Any]]:
    """Return a path -> porcelain status map plus capture metadata, or (None, reason) without git."""
    probe = run_git(repo, ["rev-parse", "--is-inside-work-tree"])
    if probe.returncode != 0 or probe.stdout.strip() != "true":
        return None, {
            "collection_method": "git-status-diff",
            "available": False,
            "reason": "git-unavailable-or-not-a-worktree",
        }

    result = run_git(repo, ["status", "--porcelain=v1", "--untracked-files=all"])
    if result.returncode != 0:
        return None, {
            "collection_method": "git-status-diff",
            "available": False,
            "reason": "git-status-failed",
            "stderr": result.stderr.strip(),
        }

    snapshot: Dict[str, str] = {}
    for raw_line in result.stdout.splitlines():
        line = raw_line.rstrip()
        if len(line) < 4:
            continue
        status = line[:2]
        path = line[3:]
        if " -> " in path:
            _old, _arrow, path = path.partition(" -> ")
        normalized = path.replace("\\", "/").strip()
        if normalized:
            snapshot[normalized] = status
    return snapshot, {
        "collection_method": "git-status-diff",
        "available": True,
        "status_entries": len(snapshot),
    }


def diff_status_snapshots(
    before: Optional[Dict[str, str]],
    after: Optional[Dict[str, str]],
) -> Dict[str, List[str]]:
    if before is None or after is None:
        return {
            "changed_files": [],
            "new_files": [],
            "deleted_files": [],
            "touched_paths": [],
            "touched_symbols": [],
        }

    changed_files: List[str] = []
    new_files: List[str] = []
    deleted_files: List[str] = []
    for path, status in after.items():
        previous_status = before.get(path)
        if previous_status == status:
            continue
        normalized_status = status.replace(" ", "")
        if "D" in normalized_status:
            deleted_files.append(path)
            continue
        if "?" in normalized_status or "A" in normalized_status:
            new_files.append(path)
            continue
        changed_files.append(path)

    touched_paths = []
    for path in [*changed_files, *new_files, *deleted_files]:
        if path not in touched_paths:
            touched_paths.append(path)
    return {
        "changed_files": changed_files,
        "new_files": new_files,
        "deleted_files": deleted_files,
        "touched_paths": touched_paths,
        "touched_symbols": [],
    }


def execute_command(repo: Path, command: str, timeout: int) -> Tuple[Dict[str, Any], str]:
    """Run the command inside repo and return (execution record, combined stdout/stderr log)."""
    before_status, before_capture = git_status_snapshot(repo)
    try:
        result = subprocess.run(
            split_command(command),
            cwd=repo,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
        combined = combine_logs(
            [
                f"STDOUT:\n{result.stdout.strip()}" if result.stdout.strip() else "",
                f"STDERR:\n{result.stderr.strip()}" if result.stderr.strip() else "",
            ]
        )
        execution = {
            "returncode": result.returncode,
            "timed_out": False,
            "stdout": result.stdout or "",
            "stderr": result.stderr or "",
        }
        after_status, after_capture = git_status_snapshot(repo)
        execution.update(diff_status_snapshots(before_status, after_status))
        execution["evidence_capture"] = {
            **after_capture,
            "before_status_entries": before_capture.get("status_entries"),
        }
        return execution, combined
    except FileNotFoundError as exc:
        return {
            "returncode": None,
            "timed_out": False,
            "launch_error": str(exc),
            "stdout": "",
            "stderr": "",
            "changed_files": [],
            "new_files": [],
            "deleted_files": [],
            "touched_paths": [],
            "touched_symbols": [],
            "evidence_capture": before_capture,
        }, f"Command failed before launch: {exc}"
    except subprocess.TimeoutExpired as exc:
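        # TimeoutExpired carries captured output as bytes even when text=True,
        # so decode before treating it as text.
        raw_stdout = exc.stdout or ""
        raw_stderr = exc.stderr or ""
        stdout = raw_stdout.decode(errors="replace") if isinstance(raw_stdout, bytes) else raw_stdout
        stderr = raw_stderr.decode(errors="replace") if isinstance(raw_stderr, bytes) else raw_stderr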
        combined = combine_logs(
            [
                f"STDOUT:\n{stdout.strip()}" if stdout.strip() else "",
                f"STDERR:\n{stderr.strip()}" if stderr.strip() else "",
                f"TIMEOUT: Command exceeded the {timeout}-second monitoring window.",
            ]
        )
        execution = {
            "returncode": None,
            "timed_out": True,
            "stdout": stdout,
            "stderr": stderr,
        }
        after_status, after_capture = git_status_snapshot(repo)
        execution.update(diff_status_snapshots(before_status, after_status))
        execution["evidence_capture"] = {
            **after_capture,
            "before_status_entries": before_capture.get("status_entries"),
        }
        return execution, combined


def decide_outcome(
    *,
    command: str,
    run_mode: str,
    lane: str,
    timeout: int,
    execution: Dict[str, Any],
    progress: Dict[str, Any],
) -> Dict[str, Any]:
    combined_text = combine_logs([execution.get("stdout", ""), execution.get("stderr", "")])
    last_step = progress.get("last_step")
    completed_steps = last_step if last_step is not None else 0
    checkpoint_candidates = progress.get("checkpoint_candidates", [])
    best_checkpoint = checkpoint_candidates[-1] if checkpoint_candidates else None

    if execution.get("launch_error"):
        return {
            "status": "blocked",
            "documented_command_status": "blocked",
            "main_blocker": f"Executable not found for training command: {execution['launch_error']}",
            "stop_reason": "launch_failed",
            "completed_steps": completed_steps,
            "best_checkpoint": best_checkpoint,
            "best_metric": progress.get("best_metric"),
            "execution_log": [f"Command failed before launch: {execution['launch_error']}"],
            "monitoring_scope": "no_run",
        }

    if execution.get("timed_out"):
        if run_mode == "startup_verification" and completed_steps > 0:
            return {
                "status": "partial",
                "documented_command_status": "partial",
                "main_blocker": "The run stopped after the planned startup verification window.",
                "stop_reason": "startup_verification_window_elapsed",
                "completed_steps": completed_steps,
                "best_checkpoint": best_checkpoint,
                "best_metric": progress.get("best_metric"),
                "execution_log": [combined_text],
                "monitoring_scope": f"timeout:{timeout}s",
            }
        return {
            "status": "partial",
            "documented_command_status": "partial",
            "main_blocker": f"The run exceeded the {timeout}-second monitoring window.",
            "stop_reason": "monitoring_window_elapsed",
            "completed_steps": completed_steps,
            "best_checkpoint": best_checkpoint,
            "best_metric": progress.get("best_metric"),
            "execution_log": [combined_text],
            "monitoring_scope": f"timeout:{timeout}s",
        }

    if execution.get("returncode") == 0:
        stop_reason = "completed"
        if run_mode == "startup_verification":
            stop_reason = "startup_verified"
        elif run_mode == "short_run_verification":
            stop_reason = "short_run_verified"
        elif run_mode == "resume":
            stop_reason = "resume_checkpoint_verified"
        elif run_mode == "full_kickoff":
            stop_reason = "full_training_command_completed"

        return {
            "status": "success",
            "documented_command_status": "success",
            "main_blocker": "None.",
            "stop_reason": stop_reason,
            "completed_steps": completed_steps,
            "best_checkpoint": best_checkpoint,
            "best_metric": progress.get("best_metric"),
            "execution_log": [combined_text] if combined_text else [],
            "monitoring_scope": "process_completion",
        }

    main_blocker = f"Training command exited with code {execution.get('returncode')}."
    if not combined_text:
        combined_text = main_blocker
    return {
        "status": "partial",
        "documented_command_status": "partial",
        "main_blocker": main_blocker,
        "stop_reason": "nonzero_exit",
        "completed_steps": completed_steps,
        "best_checkpoint": best_checkpoint,
        "best_metric": progress.get("best_metric"),
        "execution_log": [combined_text],
        "monitoring_scope": "process_completion",
    }


def main() -> int:
    parser = argparse.ArgumentParser(description="Run a conservative training command and summarize evidence.")
    parser.add_argument("--repo", required=True, help="Path to the target repository.")
    parser.add_argument("--command", required=True, help="Selected training command.")
    parser.add_argument("--timeout", type=int, default=120, help="Monitoring timeout in seconds.")
    parser.add_argument("--lane", choices=["trusted", "explore"], default="trusted")
    parser.add_argument(
        "--run-mode",
        choices=["startup_verification", "short_run_verification", "full_kickoff", "resume"],
        default="startup_verification",
    )
    parser.add_argument("--dataset", default="unknown")
    parser.add_argument("--checkpoint-source", default="none")
    parser.add_argument("--resume-from", default="")
    parser.add_argument("--max-steps", type=int, default=0)
    args = parser.parse_args()

    repo = Path(args.repo).resolve()
    execution, combined = execute_command(repo, args.command, args.timeout)
    progress = parse_progress(combine_logs([execution.get("stdout", ""), execution.get("stderr", "")]))
    outcome = decide_outcome(
        command=args.command,
        run_mode=args.run_mode,
        lane=args.lane,
        timeout=args.timeout,
        execution=execution,
        progress=progress,
    )

    payload = {
        "lane": args.lane,
        "run_mode": args.run_mode,
        "resume_from": args.resume_from or None,
        "dataset": args.dataset,
        "checkpoint_source": args.checkpoint_source,
        "max_steps": args.max_steps,
        "completed_steps": outcome["completed_steps"],
        "best_metric": outcome["best_metric"],
        "best_checkpoint": outcome["best_checkpoint"],
        "stop_reason": outcome["stop_reason"],
        "status": outcome["status"],
        "documented_command_status": outcome["documented_command_status"],
        "main_blocker": outcome["main_blocker"],
        "execution_log": outcome["execution_log"],
        "last_epoch": progress.get("last_epoch"),
        "last_step": progress.get("last_step"),
        "observed_metrics": progress.get("observed_metrics", {}),
        "checkpoint_candidates": progress.get("checkpoint_candidates", []),
        "monitoring_scope": outcome["monitoring_scope"],
        "changed_files": execution.get("changed_files", []),
        "new_files": execution.get("new_files", []),
        "deleted_files": execution.get("deleted_files", []),
        "touched_paths": execution.get("touched_paths", []),
        "touched_symbols": execution.get("touched_symbols", []),
        "evidence_capture": execution.get("evidence_capture", {}),
    }
    print(json.dumps(payload, indent=2, ensure_ascii=False))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
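
To see what the log-parsing patterns in the script above pick up, the sketch below applies them to a short log excerpt. The four patterns are copied from the script; the log lines are hypothetical, and real training logs vary by framework.

```python
import re

# Patterns copied from the script above.
EPOCH_RE = re.compile(r"(?:epoch)\s*[:=\[/ ]+\s*(\d+)", flags=re.IGNORECASE)
STEP_RE = re.compile(r"(?:step|iter|iteration)\s*[:=\[/ ]+\s*(\d+)", flags=re.IGNORECASE)
CHECKPOINT_RE = re.compile(r"([\w./\\-]+\.(?:ckpt|pth|pt|bin|safetensors))", flags=re.IGNORECASE)
METRIC_RE = re.compile(
    r"\b([A-Za-z][A-Za-z0-9_.-]{1,31})\s*[:=]\s*(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)

# Hypothetical log excerpt, made up for illustration.
sample_log = (
    "Epoch: 3 step=1200 loss=0.4321 acc=0.8712\n"
    "Saved checkpoint to runs/exp1/model_step1200.pt\n"
)

epochs = [int(m.group(1)) for m in EPOCH_RE.finditer(sample_log)]
steps = [int(m.group(1)) for m in STEP_RE.finditer(sample_log)]
checkpoints = [m.group(1) for m in CHECKPOINT_RE.finditer(sample_log)]
metrics = {m.group(1): float(m.group(2)) for m in METRIC_RE.finditer(sample_log)}

print(epochs[-1], steps[-1])  # last epoch and last step seen in the log
print(checkpoints)            # paths ending in a known checkpoint extension
print(metrics)                # "Epoch" and "step" also match the generic name=value pattern
```

Because the metric pattern accepts any name followed by a number, counters such as Epoch and step end up in observed_metrics alongside loss and accuracy, so the order of fields on a log line can influence which entry is reported as best_metric.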

#!/usr/bin/env python3
"""Compatibility wrapper for trusted training output bundles."""

from __future__ import annotations

import importlib.util
from pathlib import Path


def load_shared_module():
    module_path = Path(__file__).resolve().parents[3] / "shared" / "scripts" / "write_run_bundle.py"
    spec = importlib.util.spec_from_file_location("write_run_bundle", module_path)
    if spec is None or spec.loader is None:
        raise RuntimeError(f"Unable to load shared writer module from {module_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def main() -> int:
    module = load_shared_module()
    return module.main(default_mode="train", default_output_dir="train_outputs")


if __name__ == "__main__":
    raise SystemExit(main())
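
The wrapper above loads the shared writer by file path instead of importing it as a package. The same importlib pattern can be tried in isolation, as sketched below. The temporary stub module and its main function are made up for the demonstration and do not reflect the contents of the real write_run_bundle.py.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(name: str, module_path: Path):
    """Load a Python file as a module without requiring it to be on sys.path."""
    spec = importlib.util.spec_from_file_location(name, module_path)
    if spec is None or spec.loader is None:
        raise RuntimeError(f"Unable to load module from {module_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the shared writer; the function body is invented for this demo.
    stub = Path(tmp) / "write_run_bundle.py"
    stub.write_text(
        "def main(default_mode, default_output_dir):\n"
        "    return f'{default_mode}:{default_output_dir}'\n"
    )
    module = load_module_from_path("write_run_bundle", stub)
    result = module.main(default_mode="train", default_output_dir="train_outputs")

print(result)  # train:train_outputs
```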

Related skills

Microsoft Foundry: Deploy, evaluate, and continuously improve Microsoft Foundry agents from a single agent interface. (478k · 1.3k)

Ai Research Reproduction: Orchestrate trustworthy, auditable reproduction of deep learning repositories directly from their READMEs. (164k · 507)

Explore Run: Safely run isolated exploratory experiments with clear recording and conservative selection before committing changes. (164k · 507)

Paper Context Resolver: Fetch precise reproduction-critical details like dataset splits, preprocessing steps, or evaluation protocols from the original academic paper when the repo README leav… (141k · 507)

Repo Intake And Plan: Scan unfamiliar AI research repositories and receive a minimal, trustworthy reproduction target before investing significant time. (140k · 507)

Env And Assets Bootstrap: Create a reproducible, conservative conda environment plus required checkpoints, datasets and caches before attempting to run any AI research paper reproduction. (140k · 507)

Forks & variants (3)

Run Train has 3 known copies in the catalog totaling 434 installs. They canonicalize to this original listing.

lllllllama - 395 installs
lllllllama - 29 installs
lllllllama - 10 installs

How it compares

Use run-train for a bounded training execution with train_outputs; use ai-research-reproduction when the goal is full README-first reproduction orchestration.

FAQ

What does this skill not own?

It does not choose the research goal, own exploratory branching, or implement code changes - it executes a selected command.

What run modes are supported?

Startup verification, short-run verification, full kickoff, and resume handling.
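
Each run mode also determines the stop reason recorded when the command exits cleanly. The mapping below is condensed from decide_outcome in scripts/run_training.py; the dictionary and the helper name stop_reason_for are a hypothetical repackaging of that branch logic.

```python
# Stop reasons recorded on a zero exit code, keyed by the --run-mode value.
SUCCESS_STOP_REASONS = {
    "startup_verification": "startup_verified",
    "short_run_verification": "short_run_verified",
    "resume": "resume_checkpoint_verified",
    "full_kickoff": "full_training_command_completed",
}


def stop_reason_for(run_mode: str) -> str:
    """Hypothetical helper: fall back to "completed" for an unrecognized mode."""
    return SUCCESS_STOP_REASONS.get(run_mode, "completed")


for mode in SUCCESS_STOP_REASONS:
    print(f"{mode} -> {stop_reason_for(mode)}")
```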

Is Run Train safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

Data Science & ML · pipelines · analytics
