
Math Olympiad
Install this when you need an agent to solve or critique olympiad-level proofs (IMO, Putnam, USAMO, AIME) with competition-style rigor instead of casual homework help.
Overview
Math Olympiad is an agent skill for the Validate phase that solves and critiques olympiad-level mathematics problems with competition-standard proof discipline.
Install
npx skills add https://github.com/anthropics/claude-plugins-official --skill math-olympiad
What is this skill?
- Handles IMO, Putnam, USAMO, and AIME-style statements with proof verification and gap analysis
- Finds counterexamples and tests conjectures from competition settings
- Simplifies overly complicated olympiad write-ups while keeping logical validity
- Rejects casual CS homework triggers (prime functions, Big-O trivia, intro calculus explanations)
- Documents 10+ positive (should_trigger) query patterns for competition proofs in the skill eval set
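As a rough illustration of how such an eval set could be used, the sketch below scores a trigger decision against a few entries. The four entries are copied from the skill's eval set (each is a JSON object with `query` and `should_trigger` fields); `naive_trigger` and `CONTEST_MARKERS` are hypothetical stand-ins invented for this example, not the skill's actual routing logic.

```python
import json

# Four entries copied from the skill's eval set; the full set has twenty.
EVAL_JSON = """
[
  {"query": "Verify my solution to AIME 2024 problem 12", "should_trigger": true},
  {"query": "Help me with this USAMO geometry problem", "should_trigger": true},
  {"query": "Write a Python function that checks if a number is prime", "should_trigger": false},
  {"query": "Compute the integral of x^2 sin(x) dx", "should_trigger": false}
]
"""

# Hypothetical keyword list (an assumption for illustration only).
CONTEST_MARKERS = ("imo", "putnam", "usamo", "aime", "olympiad", "competition")


def naive_trigger(query: str) -> bool:
    """Hypothetical predicate: fire when the query names a contest."""
    lowered = query.lower()
    return any(marker in lowered for marker in CONTEST_MARKERS)


def score(entries, predicate):
    """Return (correct, total, misclassified queries) for a trigger predicate."""
    misses = [e["query"] for e in entries
              if predicate(e["query"]) != e["should_trigger"]]
    return len(entries) - len(misses), len(entries), misses


entries = json.loads(EVAL_JSON)
correct, total, misses = score(entries, naive_trigger)
print(f"{correct}/{total} correct; misses: {misses}")
```

Keyword matching passes these four entries but would not pass the full set: "Generate 10 practice problems similar to AIME level" names a contest yet is marked `should_trigger: false`, and "Simplify this proof — it feels overly complicated" names no contest yet is marked `true`. The eval set therefore appears designed to test intent rather than vocabulary.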
Adoption & trust: 996 installs on skills.sh; 29.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have a contest-level problem or partial proof and need rigorous verification, a cleaner presentation, or a valid counterexample—not a homework shortcut.
Who is it for?
Contest practice, coaching write-ups, and agent products that must stay inside competition-math guardrails.
Skip if: Intro coursework, implementation tasks like writing a prime checker, generic research reading lists, or proof-assistant debugging unless the prompt is explicitly olympiad-shaped.
When should I use this skill?
User asks to solve, verify, simplify, or critique IMO, Putnam, USAMO, AIME, or similar olympiad problems and proofs.
What do I get? / Deliverables
You get a tightened olympiad-style argument, identified logical gaps, or a structured solution outline you can submit or teach from.
- Verified or revised proof outline
- Identified gaps or counterexamples
- Cleaner olympiad presentation of the solution
Journey fit
Competition math sits in Validate because the work is proving claims, checking counterexamples, and tightening arguments before you treat a solution as correct. Scope fits rigorous problem framing: verifying steps, finding gaps, and deciding whether a conjecture or proof strategy holds under contest constraints.
How it compares
Use instead of a general math tutor when the prompt is IMO, Putnam, or AIME-class and needs proof-level scrutiny.
Common Questions / FAQ
Who is math-olympiad for?
Contest participants, coaches, and builders who want agent behavior locked to olympiad proof work rather than everyday coding or intro math help.
When should I use math-olympiad?
Use it during Validate when you are proving a competition statement, checking a Putnam attempt, hunting counterexamples, or polishing a USAMO solution for presentation.
Is math-olympiad safe to install?
Review the Security Audits panel on this Prism page for the ingested package hash and audit signals before enabling it in production agent workflows.
SKILL.md - Math Olympiad
[
  {"query": "Solve this IMO problem: Let n ≥ 2 be an integer. Prove that...", "should_trigger": true},
  {"query": "Is this Putnam proof correct? Here's my attempt at B3...", "should_trigger": true},
  {"query": "Find a counterexample to: every continuous function on [0,1] is uniformly continuous", "should_trigger": true},
  {"query": "Prove this olympiad inequality: for positive reals a,b,c with a+b+c=1...", "should_trigger": true},
  {"query": "Help me with this USAMO geometry problem", "should_trigger": true},
  {"query": "Verify my solution to AIME 2024 problem 12", "should_trigger": true},
  {"query": "I think there's a gap in this competition proof, can you find it?", "should_trigger": true},
  {"query": "Simplify this proof — it feels overly complicated", "should_trigger": true},
  {"query": "Here's a conjecture from a math competition. Is it true?", "should_trigger": true},
  {"query": "What's the cleanest way to present this olympiad solution?", "should_trigger": true},
  {"query": "Help me verify the time complexity of this sorting algorithm", "should_trigger": false},
  {"query": "Write a Python function that checks if a number is prime", "should_trigger": false},
  {"query": "I'm doing research on the Riemann Hypothesis, where should I start reading?", "should_trigger": false},
  {"query": "Debug this proof assistant code — my Lean tactic isn't working", "should_trigger": false},
  {"query": "Explain the proof of the fundamental theorem of calculus to a high schooler", "should_trigger": false},
  {"query": "What's a good textbook for learning competition math?", "should_trigger": false},
  {"query": "Generate 10 practice problems similar to AIME level", "should_trigger": false},
  {"query": "Compute the integral of x^2 sin(x) dx", "should_trigger": false},
  {"query": "Review my research paper draft on analytic number theory", "should_trigger": false},
  {"query": "What's the difference between IMO and Putnam in difficulty?", "should_trigger": false}
]

# Adversarial Verifier Prompts — Math Olympiad

Prompt bank for the verifier subagent. Fresh context: problem statement + cleaned solution, NO thinking trace. Agent has NO tools — pure reasoning only.

**Source**: `shared/verifier_patterns_source.md`. Background: arXiv:2503.21934 showed self-verified 85.7% IMO success drops to <5% under human grading. These prompts are the human grader.

**Verifier isolation**: You do NOT know how other verifiers voted. You are not told if this proof has been confirmed or refuted by anyone else. Assume you're the first and only reviewer. (Social proof — "3 others confirmed" — biases toward agreement.)

---

## Reasons to REFUTE (the taxonomy — look for ANY one of these)

Your goal is to find ANY reason to refute. These are the seven categories a hole falls into:

1. **Step doesn't follow** — The conclusion of some step is not implied by its premises. (Includes direction errors: A>B and C>D does NOT give A−C>B−D.)
2. **Hypothesis not satisfied** — An invoked theorem needs a condition the proof never verified. (Pattern #5: "entire" ≠ "analytic in a disk".)
3. **Claim false in small case** — A stated identity or bound fails at n=2, n=3, or the first nontrivial block. Mentally test it.
4. **Tautological** — The "gap" at the end is the original problem in disguise. (Pattern #18: substitute the proof's own identities back in.)
5. **Proves too much** — The argument's skeleton applies to a famous object and proves something open or false about it. (Pattern #4.)
6. **Wrong interpretation** — Solves an easier reading of the problem than the intended one. (Pattern #60.)
7. **Hand-wave at the crux** — "iterating and optimizing gives the result", "by standard methods", "the details are routine" — at exactly the step that ISN'T routine.

If none of these fire after a genuine attempt, CONFIRM. Do not confirm because the proof _sounds_ confident.

---

## 1. General Adversarial (default)

You are an adversarial verifier.