
Hypogenic
Configure automated hypothesis generation and testing pipelines over labeled text datasets for research-style ML experiments.
Overview
HypoGeniC is an agent skill most often used in the Idea phase (also Validate and Build). It supplies a complete YAML configuration for hypothesis generation and testing over structured text datasets.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill hypogenic
What is this skill?
- YAML template for HypoGeniC, HypoRefine, and Union generation methods
- Configurable model provider (GPT-4, Claude-3) with optional Redis API caching
- Batched hypothesis generation with literature PDF ingestion for HypoRefine
- Structured prompts for observations and batched generation from labeled text features
- Example defaults include num_hypotheses: 20 and max_iterations: 10 in the generation block
Adoption & trust: 542 installs on skills.sh; 27.6k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have labeled text data but no standardized config to generate, refine, and union testable hypotheses with an LLM pipeline.
Who is it for?
Indie ML or research agents who already have JSON train/val/test splits and want HypoGeniC-style batch hypothesis workflows.
Skip if: you need a turnkey CLI with no dataset prep, or your product team only needs a single static prompt without iterative hypothesis testing.
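The config template's comments describe each split as a JSON object holding text_features_1 ... text_features_n (lists of strings) plus label (a list of strings). A minimal sketch of a shape check under that assumption follows. The helper name validate_split and the sample reviews are hypothetical and are not part of HypoGeniC.

```python
import json


def validate_split(dataset: dict) -> int:
    """Return the number of examples, or raise ValueError if the shape is off.

    Assumes the layout described in the template comments:
    text_features_* and label are parallel lists of strings.
    """
    labels = dataset.get("label")
    if not isinstance(labels, list):
        raise ValueError("'label' must be a list of strings")
    feature_keys = [k for k in dataset if k.startswith("text_features_")]
    if not feature_keys:
        raise ValueError("no text_features_* keys found")
    n = len(labels)
    for key in feature_keys + ["label"]:
        column = dataset[key]
        if not isinstance(column, list) or len(column) != n:
            raise ValueError(f"{key!r} must be a list of length {n}")
        if not all(isinstance(value, str) for value in column):
            raise ValueError(f"{key!r} must contain only strings")
    return n


# Hypothetical two-example split, written inline instead of read from disk.
example = json.loads("""
{
  "text_features_1": ["great stay, friendly staff", "room was dirty"],
  "text_features_2": ["hotel review", "hotel review"],
  "label": ["1", "0"]
}
""")
print(validate_split(example))
```

Running a check like this on train, validation, and test before a generation run surfaces mismatched list lengths early, before any API calls are spent.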
When should I use this skill?
You are standing up or extending a scientific-agent HypoGeniC pipeline and need the full example configuration for data, model, cache, and prompts.
What do I get? / Deliverables
You get a copy-paste HypoGeniC config with data paths, model params, caching, and prompt slots ready to plug into your scientific agent runner.
- Completed hypogenic-style YAML config
- Prompt templates for observations and batched generation
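The prompt templates use curly-brace slots such as {observations} and {num_hypotheses}. A minimal sketch of filling them, assuming plain Python str.format substitution; the actual runner may substitute slots differently, and fill_prompt is a hypothetical helper.

```python
# Shortened copy of the batched_generation template from the config.
BATCHED_GENERATION = (
    "Based on these observations about the data:\n"
    "{observations}\n\n"
    "Generate {num_hypotheses} distinct, testable hypotheses that could explain "
    "the differences between classes."
)


def fill_prompt(template: str, **slots: object) -> str:
    """Substitute named slots; str.format raises KeyError if a slot is missing."""
    return template.format(**slots)


prompt = fill_prompt(
    BATCHED_GENERATION,
    observations="- Positive reviews mention staff by name",  # made-up observation
    num_hypotheses=20,
)
print(prompt)
```

Failing loudly on a missing slot is deliberate here: a prompt sent with an unfilled placeholder wastes a model call and is hard to spot afterwards.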
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Hypothesis generation from data belongs on the research shelf before you commit to a product or model direction. The template drives pattern discovery and testable claims from train/val/test splits—core research work, not shipping code.
Where it fits
Draft a 20-hypothesis generation run from early labeled samples before choosing a product angle.
Compare union vs hyporefine configs to see which claims survive on a validation split.
Drop the YAML into an agent runner with Redis cache enabled to limit repeat API calls during tuning.
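The idea behind caching to limit repeat API calls can be sketched without Redis. The example below is an in-memory stand-in, not the Redis-backed cache the config enables: PromptCache and fake_model are hypothetical names, and keying on a hash of model name plus prompt is an assumption about how identical requests would be recognized.

```python
import hashlib
import json
from typing import Callable, Dict


class PromptCache:
    """In-memory stand-in for a response cache keyed by (model, prompt)."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.misses = 0  # counts how many times the model was actually called

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._call_model(model, prompt)
        return self._store[key]


def fake_model(model: str, prompt: str) -> str:
    # Placeholder for a real API call.
    return f"[{model}] reply to: {prompt}"


cache = PromptCache(fake_model)
cache.complete("gpt-4", "Generate 20 hypotheses")
cache.complete("gpt-4", "Generate 20 hypotheses")  # served from the cache
print(cache.misses)  # prints 1
```

One consequence worth noting: with a nonzero temperature, a cached reply hides the sampling variation a fresh call would produce, which is usually acceptable during tuning but changes what repeated runs measure.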
How it compares
A structured experiment config template, not an MCP server or a single-shot brainstorming skill.
Common Questions / FAQ
Who is hypogenic for?
Solo builders and small teams running text-classification or pattern-discovery experiments who want HypoGeniC-style generation configs instead of improvising YAML each run.
When should I use hypogenic?
Use it in Idea when mining patterns from samples, in Validate when stress-testing label hypotheses before build, and in Build when wiring agent pipelines that batch-generate and test claims against held-out data.
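Testing claims against held-out data can be sketched with the template's own settings: a label_extraction pattern of "PREDICTION: {label}", valid labels "0" and "1", and an inference block with method "voting" and confidence_threshold 0.7. The example assumes the threshold means the winning label's share of valid votes; the template does not define it, and extract_label and vote are hypothetical helpers.

```python
import re
from collections import Counter
from typing import List, Optional

# Regex form of the template's "PREDICTION: {label}" pattern (an assumption).
PREDICTION_RE = re.compile(r"PREDICTION:\s*(\S+)")
VALID_LABELS = {"0", "1"}


def extract_label(response: str) -> Optional[str]:
    """Pull a valid label out of one model response, or return None."""
    match = PREDICTION_RE.search(response)
    if match and match.group(1) in VALID_LABELS:
        return match.group(1)
    return None


def vote(responses: List[str], confidence_threshold: float = 0.7) -> Optional[str]:
    """Majority vote over parsed labels; abstain (None) below the threshold."""
    labels = [lab for lab in map(extract_label, responses) if lab is not None]
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= confidence_threshold else None


# Made-up responses, one per hypothesis applied to the same sample.
responses = [
    "PREDICTION: 1",
    "PREDICTION: 1",
    "PREDICTION: 0",
    "no verdict",
    "PREDICTION: 1",
]
print(vote(responses))  # prints 1 (3 of 4 valid votes = 0.75)
```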
Is hypogenic safe to install?
Treat it as config that references API keys and local data paths; review the Security Audits panel on this page and never commit secrets into the YAML.
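The template stores only the name of an environment variable (api_key_env: "OPENAI_API_KEY"), not the key itself. A minimal sketch of resolving it at runtime; resolve_api_key is a hypothetical helper, and model_config stands for the already-parsed model block of the YAML.

```python
import os


def resolve_api_key(model_config: dict) -> str:
    """Read the key from the environment variable named in the config."""
    env_name = model_config["api_key_env"]
    key = os.environ.get(env_name)
    if not key:
        raise RuntimeError(f"environment variable {env_name} is not set")
    return key


# Example call (requires OPENAI_API_KEY to be exported in the shell first):
# api_key = resolve_api_key({"name": "gpt-4", "api_key_env": "OPENAI_API_KEY"})
```

Because the YAML carries only the variable name, the file stays safe to commit while the secret lives in the shell environment or a secrets manager.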
SKILL.md
# HypoGeniC Configuration Template
# Complete example configuration for hypothesis generation and testing

# Dataset paths
data:
  train: "data/train.json"
  validation: "data/val.json"
  test: "data/test.json"

  # Dataset should contain:
  # - text_features_1, text_features_2, ... text_features_n (lists of strings)
  # - label (list of strings)

# Model configuration
model:
  name: "gpt-4"  # or "gpt-3.5-turbo", "claude-3", etc.
  api_key_env: "OPENAI_API_KEY"  # Environment variable for API key
  temperature: 0.7
  max_tokens: 2048

# Redis caching (optional - reduces API costs)
cache:
  enabled: true
  host: "localhost"
  port: 6832

# Hypothesis generation parameters
generation:
  method: "hypogenic"  # Options: "hypogenic", "hyporefine", "union"
  num_hypotheses: 20
  batch_size: 5
  max_iterations: 10

  # For HypoRefine method
  literature:
    papers_directory: "papers/"  # Directory containing PDF files
    num_papers: 10

  # For Union methods
  union:
    literature_hypotheses: "literature_hypotheses.json"
    deduplicate: true

# Prompt templates
prompts:
  # Observations prompt - generates initial observations from data
  observations: |
    Analyze the following data samples and identify patterns:

    {data_samples}

    Generate 5 distinct observations about patterns that distinguish between the two classes.
    Focus on specific, testable characteristics.

  # Batched generation prompt - creates hypotheses from observations
  batched_generation: |
    Based on these observations about the data:

    {observations}

    Generate {num_hypotheses} distinct, testable hypotheses that could explain the differences between classes.
    Each hypothesis should:
    1. Be specific and measurable
    2. Focus on a single characteristic or pattern
    3. Be falsifiable through empirical testing

    Format each hypothesis as: "Hypothesis X: [clear statement]"

  # Inference prompt - tests hypotheses against data
  inference: |
    Hypothesis: {hypothesis}

    Data sample:
    {sample_text}

    Does this sample support or contradict the hypothesis?
    Respond with: SUPPORT, CONTRADICT, or NEUTRAL
    Explanation: [brief reasoning]

  # Relevance checking prompt - filters hypotheses
  relevance_check: |
    Hypothesis: {hypothesis}
    Task: {task_description}

    Is this hypothesis relevant and testable for the given task?
    Respond with: RELEVANT or NOT_RELEVANT
    Reasoning: [brief explanation]

  # Adaptive refinement prompt - for HypoRefine
  adaptive_refinement: |
    Current hypothesis: {hypothesis}

    This hypothesis performed poorly on these challenging examples:
    {challenging_examples}

    Generate an improved hypothesis that addresses these failures while maintaining the core insight.
    Improved hypothesis: [statement]

# Inference configuration
inference:
  method: "voting"  # Options: "voting", "weighted", "ensemble"
  confidence_threshold: 0.7
  max_samples: 1000  # Limit for large test sets

# Output configuration
output:
  directory: "output/"
  save_intermediate: true  # Save hypotheses after each iteration
  format: "json"  # Options: "json", "csv"
  verbose: true

# Custom label extraction (optional)
# Define a custom function in your code to parse specific output formats
label_extraction:
  pattern: "PREDICTION: {label}"  # Regex pattern for extracting predictions
  valid_labels: ["0", "1"]  # Expected label values

# Task-specific settings
task:
  name: "example_task"
  description: "Binary classification task for [describe your specific domain]"
  features:
    - name: "text_features_1"
      description: "Primary text content"
    - name: "text_features_2"
      description: "Additional contextual information"
  labels:
    - name: "0"
      description: "Negative class"
    - name: "1"
      description: "Positive class"

# Evaluation metrics
evaluation:
  metrics:
    - "accuracy"
    - "precision"
    - "recall"
    - "f1"
  cross_validation: false
  num_folds: 5

# Logging
loggi