Ako4all

Name: Ako4all
Author: TongmingLAIC

TongmingLAIC/AKO4ALL

Run a repeatable agentic loop that profiles, correctness-checks, and iteratively speeds up a CUDA/Triton/TileLang GPU kernel against a PyTorch reference.

Install

npx skills add https://github.com/TongmingLAIC/AKO4ALL --skill SKILL.md

What is this skill?

Drives an agentic optimization loop aimed at maximum GPU kernel speedup
Supports CUDA, Triton, TileLang, C++, and Python kernel entry points
Handles workspace bootstrap, ncu profiling, correctness checking, and git commits per iteration
Benchmarks optimized kernels against a PyTorch reference implementation
Responds to AKO / AKO4ALL / AKO4X and “make this kernel faster” style requests
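
The correctness check and the benchmark against a reference could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: `allclose`, `median_seconds`, `evaluate`, and the two stand-in kernels are hypothetical names, and plain Python functions stand in for the PyTorch reference and the optimized kernel so the sketch runs without a GPU.

```python
import math
import statistics
import time
from typing import Callable, Dict, List, Optional

Vector = List[float]
Kernel = Callable[[Vector], Vector]


def allclose(a: Vector, b: Vector, rtol: float = 1e-5, atol: float = 1e-8) -> bool:
    """Elementwise tolerance check (hypothetical helper, similar in spirit to torch.allclose)."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rtol, abs_tol=atol) for x, y in zip(a, b)
    )


def median_seconds(fn: Kernel, x: Vector, warmup: int = 3, repeats: int = 15) -> float:
    """Median wall-clock time of fn(x) after a few warm-up calls."""
    for _ in range(warmup):
        fn(x)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(x)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def evaluate(reference: Kernel, candidate: Kernel, x: Vector) -> Dict[str, Optional[float]]:
    """Check correctness first; only time the candidate if its output matches."""
    if not allclose(reference(x), candidate(x)):
        return {"correct": False, "speedup": None}
    ref_t = median_seconds(reference, x)
    cand_t = median_seconds(candidate, x)
    # Guard against a zero reading from a coarse timer.
    return {"correct": True, "speedup": ref_t / max(cand_t, 1e-12)}


# Stand-ins (assumptions): a slow "reference" and two candidate rewrites.
def reference_scale(x: Vector) -> Vector:
    out = []
    for v in x:
        out.append(v * 2.0)
    return out


def candidate_scale(x: Vector) -> Vector:
    return [v * 2.0 for v in x]


def broken_scale(x: Vector) -> Vector:
    return [v * 2.1 for v in x]  # deliberately wrong result


data = [float(i) for i in range(10_000)]
good = evaluate(reference_scale, candidate_scale, data)
bad = evaluate(reference_scale, broken_scale, data)
```

For a real GPU kernel, wall-clock timing like this would also need device synchronization (for example CUDA events or `torch.cuda.synchronize()`), because kernel launches are asynchronous.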

Adoption & trust: 268 GitHub stars.

Journey fit

Primary fit

The canonical shelf is Ship, because the skill’s purpose is to measure and benchmark speedup before you treat a kernel as production-ready. Within Ship, Perf is the fitting subphase: it covers ncu profiling, iteration logging, and chasing speedup against a reference implementation.

SKILL.md

Drive an agentic loop that iteratively optimizes a GPU kernel for maximum speedup. Use this skill whenever the user wants to optimize / speed up / benchmark a GPU kernel (CUDA, Triton, TileLang, C++, Python), mentions AKO / AKO4ALL / AKO4X / agentic kernel optimization, asks to "make this kernel faster", or has a kernel they want measured against a PyTorch reference. The skill handles setup, profiling (ncu), correctness checking, iteration logging, and git commits. Bootstraps a workspace in any directory the user points at.

# ako4all

{
  "name": "ako4all",
  "description": "Drive an agentic loop that iteratively optimizes a GPU kernel for maximum speedup. Use this skill whenever the user wants to optimize / speed up / benchmark a GPU kernel (CUDA, Triton, TileLang, C++, Python), mentions AKO / AKO4ALL / AKO4X / agentic kernel optimization, asks to \"make this kernel faster\", or has a kernel they want measured against a PyTorch reference. The skill handles setup, profiling (ncu), correctness checking, iteration logging, and git commits. Bootstraps a workspace in any directory the user points at."
}
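
The iteration logging mentioned in the description could look roughly like this. It is a sketch under stated assumptions: the `Iteration` record, its fields, and the JSON Lines format are hypothetical and not taken from the skill, and the log entries are made-up illustrative values rather than measured results.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Iteration:
    """One pass of the optimization loop (hypothetical record layout)."""
    index: int
    note: str
    correct: bool
    speedup: Optional[float]  # relative to the reference; None if the check failed


def best_iteration(log: List[Iteration]) -> Optional[Iteration]:
    """Return the fastest iteration that also passed the correctness check."""
    valid = [it for it in log if it.correct and it.speedup is not None]
    return max(valid, key=lambda it: it.speedup, default=None)


def to_jsonl(log: List[Iteration]) -> str:
    """Serialize the log as one JSON object per line."""
    return "\n".join(json.dumps(asdict(it)) for it in log)


# Made-up entries for illustration only.
log = [
    Iteration(1, "baseline port", True, 1.0),
    Iteration(2, "larger tiles", True, 1.4),
    Iteration(3, "skipped bounds check", False, None),
]

best = best_iteration(log)
serialized = to_jsonl(log)
```

Rejecting incorrect iterations before comparing speedups keeps a fast but wrong kernel from ever being selected as the best result.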
