Symbolic Equation

Name: Symbolic Equation
Author: lingzhi227

lingzhi227/agent-research-skills

Guide an agent through LLM-SR-style symbolic regression: multi-island evolution, prompt buffers, and scored equation discovery for scientific systems.

Overview

symbolic-equation is an agent skill for the Idea phase that documents LLM-SR multi-island symbolic regression patterns for discovering scientific equation structure.

Install

npx skills add https://github.com/lingzhi227/agent-research-skills --skill symbolic-equation

What is this skill?

Documents the LLM-SR pipeline: a multi-island ExperienceBuffer (Island 0…N, each holding clusters of programs) plus parallel samplers and evaluators
Main sampler loop: get_prompt → draw_samples → analyse, passing island_id and version_generated
Prompt construction formats each island's programs as a versioned, improving function sequence (buffer.py pattern)
The core LLM instruction asks the model to consider the physical meaning of the inputs when proposing function structure
Config knobs include max_sample_nums and global sample limits for controlling the evolutionary search

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 666 installs on skills.sh; 114 GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

You want LLM-guided symbolic regression but lack a procedural map of islands, prompts, samplers, and evaluators from a real LLM-SR codebase.

Who is it for?

Research agents and indie ML experimenters implementing or extending LLM-SR-style symbolic equation search in Python.

Skip if: You are a casual app builder who needs charts or A/B analytics and has no use for scientific modeling or custom evaluator infrastructure.

When should I use this skill?

You are implementing or extending LLM-SR-style symbolic equation discovery with multi-island buffers, LLM sampling, and parallel evaluators.

What do I get? / Deliverables

Your agent follows a structured evolutionary loop (buffered prompts, sampled candidates, parallel analysis, and scored registration) aligned with the LLM-SR architecture.

Agent-aligned LLM-SR loop design (prompt → sample → analyse → register)
Island-aware prompt and versioning conventions
Documented integration points across pipeline, sampler, buffer, and evaluator modules

Recommended Skills

Paper Context Resolver (lllllllama/ai-paper-reproduction-skill)

Optional helper-tier skill that supplements README-guided deep learning reproduction by resolving specific paper details… (140k installs · 412 stars)

Repo Intake And Plan (lllllllama/ai-paper-reproduction-skill)

Rigor Intake scans repository docs and layout to classify documented commands and propose a minimal reproduction plan fo… (140k installs · 412 stars)

Env And Assets Bootstrap (lllllllama/ai-paper-reproduction-skill)

Rigor Setup establishes conservative environment and asset assumptions aligned with README and config evidence before ex… (140k installs · 412 stars)

Minimal Run And Audit (lllllllama/ai-paper-reproduction-skill)

RigorPilot executes the selected minimal reproduction command and produces normalized, auditable run evidence for paper … (140k installs · 412 stars)

Analyze Project (lllllllama/rigorpilot-skills)

analyze-project is a read-only agent skill from the RigorPilot family aimed at solo builders and small teams inheriting … (32.3k installs · 412 stars)

Ai Research Reproduction (lllllllama/rigorpilot-skills)

ai-research-reproduction is the RigorPilot Reproduce orchestrator for solo builders and small teams who need to rerun a … (32.3k installs · 412 stars)

Journey fit

Primary fit

Idea: Opportunity & market research

The Idea/research stage is the primary fit because the skill documents how to discover mathematical structure before you commit to a production modeling pipeline. Equation discovery, evaluator loops, and LLM-sampled program search belong to research rather than to shipping or growth work.

How it compares

This skill provides procedural research knowledge for symbolic regression pipelines; it is not a drop-in chart MCP or a general Python debugging skill.

Common Questions / FAQ

Who is symbolic-equation for?

Builders automating scientific ML or agent-research workflows who need LLM-SR patterns for multi-island equation discovery, not general frontend or DevOps tasks.

When should I use symbolic-equation?

During Idea research when exploring interpretable functional forms for a dataset, before you freeze a model architecture or ship inference APIs.

Is symbolic-equation safe to install?

Treat it as research tooling that may drive code execution and LLM API usage; review the Security Audits panel on this page and sandbox evaluator runs.

SKILL.md

# LLM-SR Patterns

Extracted from LLM-SR codebase (llmsr/pipeline.py, sampler.py, buffer.py, evaluator.py, config.py).

## LLM Instruction Prompt

```
You are a helpful assistant tasked with discovering mathematical function structures for scientific systems. Complete the 'equation' function below, considering the physical meaning and relationships of inputs.
```

## Multi-Island Evolutionary Algorithm

### Architecture Overview
```
Pipeline:
  ├── ExperienceBuffer (multi-island)
  │   ├── Island 0 → Clusters → Programs
  │   ├── Island 1 → Clusters → Programs
  │   ├── ...
  │   └── Island N → Clusters → Programs
  ├── Samplers (parallel LLM callers)
  │   └── get_prompt() → LLM → draw_samples()
  └── Evaluators (parallel execution)
      └── analyse(sample) → score → register()
```
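
The nesting in the diagram can be pictured with a few small containers. This is a minimal sketch, not the LLM-SR classes: the class bodies, the `register` signatures, and the choice of using the score itself as the cluster signature are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Programs that share one signature (assumed here: the same score)."""
    score: float
    programs: List[str] = field(default_factory=list)


@dataclass
class Island:
    """One independent sub-population, keyed by cluster signature."""
    clusters: Dict[float, Cluster] = field(default_factory=dict)

    def register(self, program: str, score: float) -> None:
        # Hypothetical rule: programs with an identical score share a cluster.
        cluster = self.clusters.setdefault(score, Cluster(score))
        cluster.programs.append(program)


@dataclass
class ExperienceBuffer:
    """Top-level store that routes a scored program to its island."""
    islands: List[Island]

    def register(self, program: str, island_id: int, score: float) -> None:
        self.islands[island_id].register(program, score)


buffer = ExperienceBuffer(islands=[Island() for _ in range(3)])
buffer.register("return params[0] * x", island_id=0, score=-0.5)
buffer.register("return params[1] * x", island_id=0, score=-0.5)
buffer.register("return params[0] * x + params[1] * v", island_id=0, score=-0.1)
```

After these three calls, island 0 holds two clusters (one with two programs) and the other islands stay empty, which is the Island → Clusters → Programs shape shown in the diagram.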

### Main Loop (sampler.py)
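```python
import numpy as np


def sample(self):
    """Continuously gets prompts, samples programs, sends them for analysis."""
    while True:
        # Stop once the global sample budget is used up (a falsy cap means no limit).
        if self._max_sample_nums and self._global_samples_nums >= self._max_sample_nums:
            break

        prompt = self._database.get_prompt()
        samples = self._llm.draw_samples(prompt.code, self.config)

        for sample in samples:
            # Assumed bookkeeping: count every dispatched sample so the cap
            # above can trigger (the excerpt does not show where this happens).
            self._global_samples_nums += 1
            # Hand each sample to a randomly chosen evaluator.
            chosen_evaluator = np.random.choice(self._evaluators)
            chosen_evaluator.analyse(
                sample, prompt.island_id, prompt.version_generated)
```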
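
A toy, self-contained version of this loop can make the stopping rule easier to see. `StubLLM`, `StubEvaluator`, and `run_search` are hypothetical stand-ins, not LLM-SR classes, and the sketch assumes the global counter is incremented once per dispatched sample.

```python
import random


class StubLLM:
    """Hypothetical stand-in for the LLM client; returns canned bodies."""

    def draw_samples(self, prompt_code, samples_per_prompt=2):
        return [f"return params[0] * x  # candidate {i}"
                for i in range(samples_per_prompt)]


class StubEvaluator:
    """Hypothetical evaluator that only records what it was asked to analyse."""

    def __init__(self):
        self.seen = []

    def analyse(self, sample, island_id, version_generated):
        self.seen.append((sample, island_id, version_generated))


def run_search(llm, evaluators, max_sample_nums, rng):
    """Mirror of the sampler loop with a local sample counter."""
    global_samples = 0
    while True:
        if max_sample_nums and global_samples >= max_sample_nums:
            break
        # Stand-in for get_prompt(): pick an island and a version number.
        island_id, version_generated = rng.randrange(3), 3
        for sample in llm.draw_samples("def equation_v3(x, v, params): ..."):
            global_samples += 1
            rng.choice(evaluators).analyse(sample, island_id, version_generated)
    return global_samples


evaluators = [StubEvaluator(), StubEvaluator()]
total = run_search(StubLLM(), evaluators, max_sample_nums=5, rng=random.Random(0))
```

Because the cap is checked once per prompt rather than once per sample, a cap of 5 with two samples per prompt dispatches 6 samples in this sketch. A loop with this shape can overshoot the cap by up to one prompt's worth of samples minus one.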

## Prompt Construction (buffer.py)

### Versioned Function Sequence
Programs from an island are formatted as an improving sequence:

```python
def equation_v0(x, v, params):
    """Describe the acceleration of a damped oscillator."""
    return params[0] * x

def equation_v1(x, v, params):
    """Improved version of equation_v0."""
    return params[0] * x + params[1] * v

def equation_v2(x, v, params):
    """Improved version of equation_v1."""
    return params[0] * x + params[1] * v + params[2] * x**2

def equation_v3(x, v, params):
    """Improved version of equation_v2."""
    # LLM completes this
```
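
One way such a sequence could be assembled from a list of function bodies sorted worst to best is sketched below. `render_versioned_prompt` is a hypothetical helper written for this illustration; it is not the `_generate_prompt` method from buffer.py, whose details are not shown here.

```python
import textwrap


def render_versioned_prompt(signature, first_docstring, bodies):
    """Render bodies (worst to best) as equation_v0..v{n-1} plus an empty v{n}."""
    parts = []
    for i in range(len(bodies) + 1):
        # v0 keeps the task description; later versions point at their predecessor.
        doc = first_docstring if i == 0 else f"Improved version of equation_v{i - 1}."
        # The final function is left open for the LLM to complete.
        body = bodies[i] if i < len(bodies) else "# LLM completes this"
        parts.append(
            f"def equation_v{i}({signature}):\n"
            f'    """{doc}"""\n'
            f"{textwrap.indent(body, '    ')}\n"
        )
    return "\n".join(parts)


prompt = render_versioned_prompt(
    "x, v, params",
    "Describe the acceleration of a damped oscillator.",
    ["return params[0] * x", "return params[0] * x + params[1] * v"],
)
print(prompt)
```

With two bodies the helper emits `equation_v0`, `equation_v1`, and an open `equation_v2`, matching the layout of the example above.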

### Cluster-Based Selection
```python
def get_prompt(self):
    """Constructs prompt from island clusters."""
    signatures = list(self._clusters.keys())
    cluster_scores = np.array(
        [self._clusters[sig].score for sig in signatures])

    # Temperature-scheduled softmax
    period = self._cluster_sampling_temperature_period
    temperature = self._cluster_sampling_temperature_init * (
        1 - (self._num_programs % period) / period)
    probabilities = _softmax(cluster_scores, temperature)

    # Sample clusters weighted by score
    functions_per_prompt = min(len(self._clusters), self._functions_per_prompt)
    idx = np.random.choice(len(signatures), size=functions_per_prompt, p=probabilities)

    # Sort by score ascending (worst to best)
    implementations = [self._clusters[signatures[i]].sample_program() for i in idx]
    indices = np.argsort([self._clusters[signatures[i]].score for i in idx])
    sorted_implementations = [implementations[i] for i in indices]

    return self._generate_prompt(sorted_implementations)
```
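
The temperature schedule in `get_prompt` is easiest to read with numbers plugged in. The sketch below isolates that one formula; the function name `cluster_temperature` and the values `temperature_init=1.0` and `period=4` are illustrative assumptions, not LLM-SR defaults.

```python
def cluster_temperature(num_programs, temperature_init, period):
    """Linear decay from temperature_init toward 0, restarting every `period` programs."""
    return temperature_init * (1 - (num_programs % period) / period)


# Temperature as the island accumulates programs 0..8.
temps = [cluster_temperature(n, temperature_init=1.0, period=4) for n in range(9)]
print(temps)
```

The temperature falls linearly within each period and then jumps back to its initial value, so cluster selection alternates between exploration (high temperature) and exploitation (low temperature). Because `num_programs % period` is always strictly less than `period`, the temperature never reaches zero and the division inside `_softmax` stays well defined. Note also that `np.random.choice` samples with replacement by default, so the same cluster can be drawn more than once for a single prompt.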

## Softmax Sampling (buffer.py)

```python
import numpy as np
import scipy.special


def _softmax(logits, temperature):
    """Returns tempered softmax of 1D finite logits."""
    # Accept lists and integer arrays; the division below needs floats.
    logits = np.asarray(logits, dtype=float)
    if not np.all(np.isfinite(logits)):
        raise ValueError('logits contains non-finite values')
    result = scipy.special.softmax(logits / temperature, axis=-1)
    # Fix numerical precision: ensure probabilities sum to 1
    index = np.argmax(result)
    result[index] = 1 - np.sum(result[0:index]) - np.sum(result[index + 1:])
    return result
```
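
To see what the temperature does to cluster probabilities, here is a pure-Python stand-in for the tempered softmax (swapped in for `scipy.special.softmax` so the example has no dependencies). The function name `tempered_softmax` and the three scores are made up for illustration.

```python
import math


def tempered_softmax(logits, temperature):
    """Softmax of logits / temperature, computed with the usual max-shift for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


scores = [-1.0, -0.5, -0.1]  # hypothetical cluster scores, higher is better
hot = tempered_softmax(scores, temperature=1.0)
cold = tempered_softmax(scores, temperature=0.1)
print(hot)
print(cold)
```

At temperature 1.0 the best cluster receives a little under half of the probability mass; at 0.1 it receives almost all of it. Lower temperatures therefore concentrate sampling on the highest-scoring clusters.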

### Within-Cluster Sampling (Shorter Programs Preferred)
```python
def sample_program(self):
    """Samples a program, giving higher probability to shorter programs."""
    normalized_lengths = (np.array(self._lengths) - min(self._lengths)) / (
        max(self._lengths) + 1e-6)
    probabilities = _softmax(-normalized_lengths, temperature=1.0)
    return np.random.choice(self._programs, p=probabilities)
```
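
The strength of the length preference follows directly from the formula. The sketch below reproduces the same arithmetic in pure Python; `length_probabilities` is a hypothetical name and the three lengths are made-up values.

```python
import math


def length_probabilities(lengths):
    """Softmax over negated, normalized lengths at temperature 1.0."""
    shortest, longest = min(lengths), max(lengths)
    # Same normalization as sample_program: shift by the minimum, divide by the maximum.
    normalized = [(n - shortest) / (longest + 1e-6) for n in lengths]
    weights = [math.exp(-x) for x in normalized]
    total = sum(weights)
    return [w / total for w in weights]


probs = length_probabilities([10, 20, 40])
print(probs)
```

The normalized lengths always lie in [0, 1), so at temperature 1.0 the shortest program can be at most e (about 2.72) times as likely as the longest one. The bias toward short programs is therefore mild, and programs of equal length are sampled uniformly.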

## Island Reset Mechanism (buffer.py)

…
