
Symbolic Equation
Guide an agent through LLM-SR-style symbolic regression: multi-island evolution, prompt buffers, and scored equation discovery for scientific systems.
Overview
symbolic-equation is an agent skill for the Idea phase that documents LLM-SR multi-island symbolic regression patterns for discovering scientific equation structure.
Install
npx skills add https://github.com/lingzhi227/agent-research-skills --skill symbolic-equation
What is this skill?
- Documents LLM-SR pipeline: ExperienceBuffer multi-island clusters plus parallel samplers and evaluators
- Main sampler loop: get_prompt → draw_samples → analyse with island_id and version_generated
- Prompt construction uses versioned improving function sequences per island (buffer.py pattern)
- Core LLM instruction frames inputs with physical meaning for scientific function structure
- Config knobs include max_sample_nums and global sample limits for evolutionary search control
- Multi-island ExperienceBuffer architecture (Island 0…N with clusters and programs)
- Parallel samplers and evaluators in documented main loop
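The Island → Clusters → Programs hierarchy listed above can be pictured with a small data-structure sketch. Everything in it is a hypothetical illustration, not the LLM-SR buffer.py API: the `Cluster` and `Island` classes, the `register_program` method, and the choice to key clusters by a tuple of scores are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Programs sharing one signature (assumed here: a tuple of test scores)."""
    score: float
    programs: list = field(default_factory=list)


@dataclass
class Island:
    """One independent sub-population, stored as signature -> Cluster."""
    clusters: dict = field(default_factory=dict)
    version: int = 0  # illustrative counter, bumped on every registration

    def register_program(self, program: str, signature: tuple, score: float) -> None:
        # Programs with an identical signature land in the same cluster.
        if signature not in self.clusters:
            self.clusters[signature] = Cluster(score=score)
        self.clusters[signature].programs.append(program)
        self.version += 1

    def best_score(self) -> float:
        return max(c.score for c in self.clusters.values())


# Three islands evolve separately; only island 0 receives programs here.
islands = [Island() for _ in range(3)]
islands[0].register_program("return params[0] * x", (-0.5,), -0.5)
islands[0].register_program("return params[0] * x + params[1] * v", (-0.1,), -0.1)
islands[0].register_program("return params[1] * v + params[0] * x", (-0.1,), -0.1)
```

Grouping by signature means two syntactically different programs that behave identically share a cluster, so a prompt can draw from distinct behaviours rather than from near-duplicates.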
Adoption & trust: 666 installs on skills.sh; 114 GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You want LLM-guided symbolic regression but lack a procedural map of islands, prompts, samplers, and evaluators from a real LLM-SR codebase.
Who is it for?
Research agents and indie ML experimenters implementing or extending LLM-SR-style symbolic equation search in Python.
Skip if: you are a casual app builder who needs charts or A/B analytics without scientific modeling or custom evaluator infrastructure.
When should I use this skill?
You are implementing or extending LLM-SR-style symbolic equation discovery with multi-island buffers, LLM sampling, and parallel evaluators.
What do I get? / Deliverables
Your agent follows a structured evolutionary loop—buffered prompts, sampled candidates, parallel analysis, and scored registration—aligned with LLM-SR architecture.
- Agent-aligned LLM-SR loop design (prompt → sample → analyse → register)
- Island-aware prompt and versioning conventions
- Documented integration points across pipeline, sampler, buffer, and evaluator modules
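The "analyse → register" step needs a score for each candidate equation. The sketch below shows one plausible way to compute it, assuming the score is the negative mean squared error after fitting `params` to data. The names `fit_params` and `score`, the Gauss-Newton optimizer, and the synthetic data are all assumptions made for illustration; this page does not state which optimizer or metric the LLM-SR evaluator uses.

```python
import numpy as np


def fit_params(equation, inputs, target, n_params, steps=20, eps=1e-6):
    """Fit params by Gauss-Newton with a finite-difference Jacobian (illustrative)."""
    params = np.ones(n_params)
    for _ in range(steps):
        pred = equation(*inputs, params)
        residual = pred - target
        jac = np.empty((target.size, n_params))
        for j in range(n_params):
            bumped = params.copy()
            bumped[j] += eps
            jac[:, j] = (equation(*inputs, bumped) - pred) / eps
        # Least-squares step; lstsq tolerates unused (all-zero) parameter columns.
        step, *_ = np.linalg.lstsq(jac, -residual, rcond=None)
        params = params + step
    return params


def score(equation, inputs, target, n_params):
    """Higher is better: negative mean squared error after fitting params."""
    params = fit_params(equation, inputs, target, n_params)
    return -float(np.mean((equation(*inputs, params) - target) ** 2))


# Synthetic damped-oscillator data (assumed ground truth): a = -2.0 * x - 0.5 * v
rng = np.random.default_rng(0)
x, v = rng.normal(size=200), rng.normal(size=200)
a = -2.0 * x - 0.5 * v


def equation_v0(x, v, params):
    return params[0] * x


def equation_v1(x, v, params):
    return params[0] * x + params[1] * v


score_v0 = score(equation_v0, (x, v), a, n_params=2)
score_v1 = score(equation_v1, (x, v), a, n_params=2)
```

Here `equation_v1` matches the generating form, so it should score close to zero, while `equation_v0` omits the damping term and scores clearly lower. That ordering is the signal an evolutionary loop would use when deciding which programs to keep.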
Journey fit
Idea/research is the canonical shelf because the skill documents discovery of mathematical structure before you commit to a production modeling pipeline. Research fits equation discovery, evaluator loops, and LLM-sampled program search rather than shipping or growth work.
How it compares
Research procedural knowledge for symbolic regression pipelines—not a drop-in chart MCP or a general Python debugging skill.
Common Questions / FAQ
Who is symbolic-equation for?
Builders automating scientific ML or agent-research workflows who need LLM-SR patterns for multi-island equation discovery, not general frontend or DevOps tasks.
When should I use symbolic-equation?
During Idea research when exploring interpretable functional forms for a dataset, before you freeze a model architecture or ship inference APIs.
Is symbolic-equation safe to install?
Treat it as research tooling that may drive code execution and LLM API usage; review the Security Audits panel on this page and sandbox evaluator runs.
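One way to isolate evaluator runs is to execute each candidate in a separate interpreter process with a timeout. This is a minimal sketch, not the LLM-SR evaluator: the `run_sandboxed` helper is hypothetical. A child process with a time limit only contains hangs and crashes; it does not restrict file or network access, so stronger isolation such as a container would still be needed for untrusted code.

```python
import subprocess
import sys


def run_sandboxed(code: str, timeout_s: float = 5.0):
    """Run candidate code in a child interpreter; return its stdout, or None on failure."""
    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None  # candidate hung; treat as a failed evaluation
    if result.returncode != 0:
        return None  # candidate raised or exited abnormally
    return result.stdout.strip()


ok = run_sandboxed("print(1 + 1)")
crashed = run_sandboxed("raise ValueError('bad candidate')")
hung = run_sandboxed("while True: pass", timeout_s=1.0)
```

Returning `None` for every failure mode keeps the calling loop simple: a candidate either produces output to score or is discarded.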
SKILL.md
SKILL.md - Symbolic Equation
# LLM-SR Patterns

Extracted from LLM-SR codebase (llmsr/pipeline.py, sampler.py, buffer.py, evaluator.py, config.py).

## LLM Instruction Prompt

```
You are a helpful assistant tasked with discovering mathematical function structures for scientific systems.
Complete the 'equation' function below, considering the physical meaning and relationships of inputs.
```

## Multi-Island Evolutionary Algorithm

### Architecture Overview

```
Pipeline:
├── ExperienceBuffer (multi-island)
│   ├── Island 0 → Clusters → Programs
│   ├── Island 1 → Clusters → Programs
│   ├── ...
│   └── Island N → Clusters → Programs
├── Samplers (parallel LLM callers)
│   └── get_prompt() → LLM → draw_samples()
└── Evaluators (parallel execution)
    └── analyse(sample) → score → register()
```

### Main Loop (sampler.py)

```python
def sample(self):
    """Continuously gets prompts, samples programs, sends them for analysis."""
    while True:
        if self._max_sample_nums and self._global_samples_nums >= self._max_sample_nums:
            break
        prompt = self._database.get_prompt()
        samples = self._llm.draw_samples(prompt.code, self.config)
        for sample in samples:
            chosen_evaluator = np.random.choice(self._evaluators)
            chosen_evaluator.analyse(
                sample, prompt.island_id, prompt.version_generated)
```

## Prompt Construction (buffer.py)

### Versioned Function Sequence

Programs from an island are formatted as an improving sequence:

```python
def equation_v0(x, v, params):
    """Describe the acceleration of a damped oscillator."""
    return params[0] * x

def equation_v1(x, v, params):
    """Improved version of equation_v0."""
    return params[0] * x + params[1] * v

def equation_v2(x, v, params):
    """Improved version of equation_v1."""
    return params[0] * x + params[1] * v + params[2] * x**2

def equation_v3(x, v, params):
    """Improved version of equation_v2."""
    # LLM completes this
```

### Cluster-Based Selection

```python
def get_prompt(self):
    """Constructs prompt from island clusters."""
    signatures = list(self._clusters.keys())
    cluster_scores = np.array(
        [self._clusters[sig].score for sig in signatures])

    # Temperature-scheduled softmax
    period = self._cluster_sampling_temperature_period
    temperature = self._cluster_sampling_temperature_init * (
        1 - (self._num_programs % period) / period)
    probabilities = _softmax(cluster_scores, temperature)

    # Sample clusters weighted by score
    functions_per_prompt = min(len(self._clusters), self._functions_per_prompt)
    idx = np.random.choice(len(signatures), size=functions_per_prompt, p=probabilities)

    # Sort by score ascending (worst to best)
    implementations = [self._clusters[signatures[i]].sample_program() for i in idx]
    indices = np.argsort([self._clusters[signatures[i]].score for i in idx])
    sorted_implementations = [implementations[i] for i in indices]
    return self._generate_prompt(sorted_implementations)
```

## Softmax Sampling (buffer.py)

```python
def _softmax(logits, temperature):
    """Returns tempered softmax of 1D finite logits."""
    if not np.all(np.isfinite(logits)):
        raise ValueError(f'logits contains non-finite values')
    result = scipy.special.softmax(logits / temperature, axis=-1)
    # Fix numerical precision: ensure probabilities sum to 1
    index = np.argmax(result)
    result[index] = 1 - np.sum(result[0:index]) - np.sum(result[index + 1:])
    return result
```

### Within-Cluster Sampling (Shorter Programs Preferred)

```python
def sample_program(self):
    """Samples a program, giving higher probability to shorter programs."""
    normalized_lengths = (np.array(self._lengths) - min(self._lengths)) / (
        max(self._lengths) + 1e-6)
    probabilities = _softmax(-normalized_lengths, temperature=1.0)
    return np.random.choice(self._programs, p=probabilities)
```

## Island Reset Mechanism (buffer.py)

```pyth