
inference-sh/skills
29 skills · 18.8k installs · 14.8k stars
Install
npx skills add https://github.com/inference-sh/skills

Skills in this repo
1. Agent Tools
Agent-tools documents the Belt CLI workflows solo builders use to find AI-capable apps before they hard-code model IDs or wrong APIs. You learn how to list your own apps, page through the public store, narrow by media category, run keyword search, pull featured or newest listings, and drill into a single app such as falai/flux-dev-lora with --json for full schema detail. That schema-first discovery reduces guesswork when you are wiring Claude Code, Cursor, or Codex to image, video, audio, or text generation services. The skill also highlights well-known endpoints across image and video generation so you can shortlist candidates quickly. It is procedural reference material (not a deploy pipeline) aimed at anyone building agent automations who needs a repeatable catalog search ritual instead of hunting docs ad hoc.
935 installs

2. Ai Image Generation
AI Image Generation is an agent skill package for solo builders who need on-demand visuals without opening a separate design tool. It documents how to drive inference.sh’s belt CLI to run falai/flux-dev-lora and dozens of comparable apps with structured prompts and inputs. Typical moments include Validate-phase landing hero concepts, Build-phase UI mockups and in-app illustrations, and Launch-phase social and ad creatives. The skill emphasizes model selection (speed vs quality vs LoRA), capability flags (t2i, i2i, inpaint), and reproducible bash invocations a coding agent can paste into workflows. It is a procedural integration skill, not a hosted MCP server, so you keep API access inside your belt account and audit each command. Pair with copy and layout skills when you need full campaigns; use this when the bottleneck is generating or iterating images from text.
935 installs

3. Ai Video Generation
AI Video Generation is an agent skill that routes solo builders through inference.sh’s belt CLI to produce videos from text or images using a large catalog of hosted models.
It maps common go-to-market needs (short social clips, explainers, product demos, and avatar-style footage) to concrete app IDs and input patterns so you are not guessing which Runway-class alternative fits a given job. The workflow assumes you install and authenticate belt first, then invoke named apps such as google/veo-3-1-fast with JSON payloads. Capabilities span generation, reference-driven edits, lipsync, and auxiliary steps like upscale or foley when the platform exposes them. It complements code-heavy Build work by filling the asset gap many indies hit at Launch and Grow. Pair it with belt CLI install docs when agents lack the binary or credentials.
934 installs

4. Twitter Automation
twitter-automation documents how solo builders wire Claude Code, Cursor, Codex, or similar agents to X (Twitter) using the inference.sh belt command-line interface. After installing the belt CLI skill and logging in, you invoke named apps such as post-tweet for plain text, post-create when media is attached, and supporting engagement primitives including like, retweet, direct messages, and follows. The skill is aimed at repeatable distribution and light growth automation (scheduling announcements, liking community posts, retweeting launch threads, and sending DMs) without hand-rolling OAuth clients in your repo. It sits in the automation layer of your stack: the agent shells out to belt with structured JSON inputs rather than embedding unofficial scraping. You should treat credentials and rate limits as production concerns, keep bots aligned with X’s policies, and pair this with human review for brand voice.
Prism lists it for builders who already use inference.sh and want trigger phrases like twitter api, tweet automation, or post to twitter to resolve to a concrete CLI recipe during launch and lifecycle planning.
816 installs

5. Web Search
web-search teaches solo builders to query the live web through inference.sh’s belt CLI using Tavily and Exa apps instead of ad-hoc browser tabs or brittle scrapers. After `belt login`, you run structured jobs such as `tavily/search-assistant` for AI answers, `tavily/extract` and `exa/extract` for page content, and `exa/search` or `exa/answer` for smart retrieval, ideal when your agent needs citations, RAG context, or verification while scoping an idea, drafting launch copy, or enriching automations. The skill is an integration layer: it assumes the belt CLI skill is installed and points to official install docs. Pair it with your own prompt discipline so search results feed plans or specs rather than unchecked assumptions.
784 installs

6. Agent Browser
Agent Browser authentication patterns document how solo builders run headful or scripted browser sessions so Claude Code, Cursor, and similar agents can act inside logged-in web products. It walks through belt-style agent-browser commands: open URL, snapshot elements, fill and click by ref, wait for navigation, and verify success. It then extends to OAuth, SSO, two-factor authentication, session reuse, and cookie extraction. The material is reference prose for integration work when your agent must post to a dashboard, scrape a protected settings page, or regression-test a customer login path. It assumes you already use the inference-sh agent-browser stack rather than raw Playwright snippets alone.
Use during Build when connecting agents to SaaS tools; pair with session-management for persistence details and the parent SKILL.md for command quick start.
782 installs

7. Python Executor
Python Executor is an agent skill that delegates arbitrary Python snippets to inference.sh’s sandboxed `infsh/python-executor` app through the belt CLI. Solo builders use it when they want the agent to prove a script, crunch a CSV, scrape a page, or render media without installing heavy deps on their laptop. You pass JSON input with a `code` string; the platform returns stdout, errors, and artifacts within RAM and timeout limits. It fits the Build phase when you are validating integrations, prototyping ETL, or generating one-off assets during shipping prep. It is not a replacement for your production backend; think disposable compute for agent-driven experiments. Install with `npx skills add belt-sh/cli`, run `belt login`, then invoke with natural triggers like “execute python” or “run this script.”
744 installs

8. Remotion Render
remotion-render is an integration skill for solo builders who already think in React and want motion graphics or data-driven videos as files, not screen recordings. You pass Remotion component code to the inference.sh belt CLI, which runs the hosted remotion-render app and returns MP4 output, useful for launch trailers, changelog clips, social snippets, or automated visualizations from the same TSX you prototype in the repo. The skill documents allowed-tools Bash(belt *) and points to belt installation via npx skills add belt-sh/cli, so the agent’s job is to shape valid Remotion code and invoke belt with the right JSON payload rather than wiring FFmpeg by hand. Triggers in the frontmatter cover remotion, tsx-to-video, programmatic video, and motion design from code. Complexity is intermediate: you need basic Remotion mental models plus comfort with CLI auth. It is phase-specific to Build frontend work, with natural handoff to Launch distribution once MP4s exist.
Prism categorizes it under Generative Media as a remote render integration, distinct from editing timelines in a desktop NLE.
709 installs

9. Ai Avatar Video
AI Avatar Video is an agent skill that wires your coding agent to inference.sh so you can render AI presenters and talking-head clips from a still image and audio or text. Solo and indie builders use it when they need HeyGen- or Synthesia-style output but want a CLI-driven, scriptable workflow inside Claude Code, Cursor, or similar agents. Typical jobs include explainer narrations, virtual influencers, marketing spokespersons, dubbing with lipsync, and UGC-style ad creatives. The skill documents quick-start `belt app run` examples (including pruna/p-video-avatar), model alternatives, and TTS backends so you can trade off speed, cost, and language coverage. Install the belt CLI skill separately via the documented npx skills add flow before invoking avatar runs.
648 installs

10. Landing Page Design
Landing Page Design is an agent skill for solo and indie builders who need a credible marketing page during validation, not a full design system. It walks through conversion-oriented structure: what must appear above the fold, how hero imagery supports the promise, where social proof earns trust, and how mobile readers scan in an F-pattern. The skill ties those rules to practical execution by documenting inference.sh belt CLI flows for generating hero photography and researching high-converting SaaS examples, so your agent can both advise and produce assets in one session. Use it when you are scoping a startup landing page, refreshing a product page, or optimizing click-through before launch. It assumes you will install belt (`npx skills add belt-sh/cli`) and log in for generation and research commands.
The outcome is a page blueprint your agent can implement in your stack plus optional hero imagery aligned with your positioning.
640 installs

11. Image To Video
Image to Video is a procedural guide for solo builders who already use inference.sh and want reliable still-to-video pipelines without guessing which hosted model fits each creative brief. It centers the belt CLI: authenticate, generate a reference frame with something like falai/flux-dev-lora, then animate with falai/wan-2-5-i2v using motion-focused prompts that respect the source composition. A comparison-oriented model table helps you choose Wan 2.5 i2v versus Seedance, Fabric, or Grok Video based on motion style and subject matter: product loops, portraits, landscapes, or stylized scenes. The skill is integration-heavy rather than a local ComfyUI graph, which suits indie makers shipping landing page heroes, short ads, or social clips from one agent session. Pair strong static prompts with gentle kinetic instructions so faces and logos stay stable while backgrounds gain life.
631 installs

12. Youtube Thumbnail Design
Youtube-thumbnail-design teaches solo creators and indie marketers how to produce high click-through YouTube cover images using inference.sh’s belt CLI and AI image generation, not ad-hoc screenshots. It locks dimensions and file limits to YouTube’s requirements, emphasizes safe zones so titles survive mobile crops, and walks through prompt patterns for expressive faces, dramatic lighting, and high contrast palettes that read at small sizes. The skill fits builders who ship video as a growth channel but lack a dedicated designer: you log into belt, run Flux with explicit width and height, and iterate toward thumbnails worth A/B testing. Triggers align with searches for CTR optimization, video cover images, and thumbnail makers.
Pair it with your scripting workflow in Validate or Build docs phases only when you already have a video topic; the skill’s value peaks at Launch and Grow when distribution and lifecycle creatives must compete in crowded feeds.
629 installs

13. App Store Screenshots
App Store Screenshots is an inference.sh skills package that teaches agents to produce platform-correct store imagery using the belt CLI. Solo mobile founders use it at Launch when ASO deadlines loom and manually resizing frames for every iPhone class is error-prone. The skill encodes quick-start commands such as belt login and belt app run against falai/flux-dev-lora with portrait dimensions suited to phone mockups, plus specification tables for required iOS display sizes. It bridges marketing craft and agent automation: you still supply prompts describing the UI story, but the skill keeps agents inside Apple and Google constraints. Prism classifies it as phase-specific Launch ASO with Marketing & SEO category overlap to mobile distribution. Pair it with belt CLI installation before invoking generation steps.
621 installs

14. Product Photography
Product-photography is an agent skill for solo ecommerce founders who need listing-grade images without hiring a studio. It documents how to brief AI renders through the inference.sh belt CLI: clean white packshots, floating hero angles, lifestyle scenes, and detail macros aligned with common marketplace rules. Prompt templates emphasize soft studio light, subtle shadows, sharp focus, and resolution hints (for example 2K) so outputs feel commercial rather than generic illustration. You install the belt CLI skill, log in, and run app workflows such as bytedance/seedream-4-5 with structured JSON prompts. Use it when you are refreshing a shop catalog, A/B testing hero images, or preparing Amazon-primary assets during a launch push.
It is generative media workflow knowledge packaged for agents, not a physical lighting rig or Photoshop retouching course.
621 installs

15. Storyboard Creation
Storyboard Creation teaches agents the language of film planning (shot abbreviations, camera angles, movement, and continuity rules like the 180-degree line), then applies that vocabulary to generate consistent storyboard panels with inference.sh. Solo creators use it for ads, music videos, short films, and animation prep when they need a visual script before cameras roll or generative video bills rack up. The quick start shows belt commands for Flux LoRA stills at cinematic aspect ratios and stitching multiple PNG panels into a single board for review. It complements video-ad-specs when you move from narrative planning to platform delivery specs. Complexity is intermediate: you must understand basic cinematography terms and operate the belt CLI, but you do not need a professional storyboard artist on retainer.
619 installs

16. Video Ad Specs
Video Ad Specs is an agent skill for solo builders and small teams who run paid social without a full agency. It centralizes exact delivery requirements, including aspect ratios like 9:16 for TikTok, duration bands (for example 15–30 seconds recommended), MP4/MOV limits, and sound-on defaults, so creative briefs match what each network actually accepts. The skill pairs those reference tables with the inference.sh belt CLI for generating vertical ad footage, including prompts tuned for authentic UGC-style unboxing or product moments. Use it when you are producing reels, stories, in-feed video, LinkedIn sponsored video, or YouTube pre-roll and need agents to stop guessing pixels and codecs. It sits in the launch and growth parts of the journey but also helps validate creative direction before you burn budget on rejected uploads.
618 installs

17. Competitor Teardown
Competitor Teardown is an agent skill that structures competitive analysis for indie builders validating ideas or refining positioning.
It pairs a seven-layer research framework (product through positioning) with inference.sh belt commands for web research and competitor site screenshots. Use it when you need feature matrices, SWOT, pricing comparison, or visual evidence for a deck, not a one-off ChatGPT brainstorm. The skill expects the belt CLI (`npx skills add belt-sh/cli`) and logged-in belt access for Tavily search and browser capture. Solo founders can produce citable landscape summaries before committing to build, and reuse outputs in Validate pricing conversations or Launch distribution messaging. It is an integration-heavy workflow skill, not a passive checklist.
617 installs

18. Character Design Sheet
Character Design Sheet is an agent skill for solo builders who ship illustrated games, comics, or marketing characters and cannot get stable faces across generations. It teaches reference-sheet prompting (front turnarounds, clean lines on white backgrounds, concept-art framing) and points to LoRA-oriented Flux workflows through the inference.sh belt CLI. You use it when naive re-prompting wastes credits on inconsistent hair, eyes, and wardrobe. The skill is intermediate: you need belt login and comfort with JSON inputs to app runs, not deep ML training. It pairs procedural guidance (what to put on a sheet) with a concrete falai/flux-dev-lora example so agents can reproduce a baseline character block before you branch expressions or palettes. It does not replace a full rigging pipeline or legal clearance for likeness; it focuses on repeatable generative consistency for indie-scale content.
613 installs

19. Infsh Cli
infsh-cli documents the Belt CLI surface for discovering inference and generative apps. Solo and indie builders use it while connecting agents or scripts to hosted models: search your installed apps, page through the public store, narrow by media category, and pull schema details before calling an endpoint.
The readme emphasizes practical discovery flows (`belt app store search`, category flags, featured and new listings, and named high-traffic image generators) so you can compare options without manually browsing a web UI. It fits teams shipping content pipelines, agent tools, or SaaS features that depend on external inference. Complexity stays beginner-friendly because commands are copy-paste bash with optional JSON output for automation.
613 installs

20. Product Hunt Launch
Product Hunt Launch is an agent skill for solo builders and side-project makers who want launch day to look intentional, not improvised. It walks through Product Hunt listing specifications (how to size gallery heroes, tighten product names and taglines, and plan maker comments), then ties those specs to concrete belt CLI commands on inference.sh. You can generate a clean SaaS-style hero image with flux-dev-lora and run Tavily search-assistant queries to study recent top SaaS launches before you publish. The skill is built for triggers like product hunt launch, gallery optimization, and launch day tactics, so your agent knows when PH prep belongs in the workflow versus generic marketing copy. Install belt via the documented npx skills path, log in once, and reuse the command patterns as a checklist while you finalize timing and social amplification around the PH post.
613 installs

21. Agent Ui
Agent UI is an agent skill for solo builders who need a production-shaped chat and copilot interface in React or Next.js without assembling a dozen packages by hand. It documents the inference.sh batteries-included Agent block: install through shadcn, add the Inference SDK for a secure server-side proxy route, set `INFERENCE_API_KEY`, and mount the component with `proxyUrl` and `agentConfig` pointing at your chosen model app. Human-in-the-loop approvals, streaming, built-in tool wiring, and widget hooks address the gap between a raw API and something shippable in a SaaS dashboard.
Triggers match searches for agent components, shadcn agent blocks, and copilot UI patterns. Use it in the Build phase on the frontend subphase when the product story is an in-app assistant rather than a CLI-only agent.
562 installs

22. Chat Ui
chat-ui is an agent skill that documents Chat UI building blocks published on ui.inference.sh for React and Next.js projects. Solo and indie builders use it when they need a credible messaging layout without hand-rolling scroll regions, bubbles, and input bars from scratch. Installation follows the familiar shadcn CLI pattern against a remote registry manifest, then imports land under a registry/blocks/chat path with named exports for container, message rows, input, and auxiliary chrome like typing indicators and avatars. The skill emphasizes composition: wrap messages in ChatContainer, map history to ChatMessage with user versus assistant roles, and wire ChatInput onSubmit to your send or streaming pipeline with disabled state while the model responds. It pairs naturally with agent backends and inference.sh belt tooling mentioned in the header. Use during the build phase once routing and API keys exist; it does not replace auth, persistence, or SSE wiring but accelerates the visible chat shell product reviewers expect.
522 installs

23. Javascript Sdk
javascript-sdk documents common agent construction patterns for the Inference.sh JavaScript SDK. Solo builders shipping agent features in TypeScript learn how to register sub-agents as agentTool definitions, attach them to an orchestrator with a clear system prompt, and run sendMessage flows for tasks like blog generation from research plus writing. A second major pattern shows retrieval-augmented generation: wrap a search assistant app as appTool, pass queries and context into the agent, and keep generation grounded on fetched results.
The readme is pattern-oriented reference material rather than a full API dump, ideal when you already have an inf_ API key and want paste-ready structure for my-org/researcher@latest-style refs. Install when you are in active Build work on agent backends or automation services that must call hosted inference apps consistently.
521 installs

24. Python Sdk
Python SDK documents how solo builders use the inference.sh Python package to call hundreds of hosted AI applications from application code instead of hand-rolling HTTP for every model. After pip install inferencesh, you construct a client with an API key and invoke apps by slug with JSON inputs, which suits automation scripts, backend services, and agent prototypes alike. The skill text highlights synchronous and asynchronous modes, streaming responses, uploads, tool-builder flows, and human-in-the-loop approval patterns common when shipping agent features safely. It assumes Python 3.8+ and pairs with the belt CLI skill mentioned for broader CLI workflows. Use it when you are past prototype prompts in chat and need durable Python integration for RAG, image generation, or multi-step agents running on inference.sh infrastructure.
515 installs

25. Widgets Ui
widgets-ui is an agent skill for turning structured JSON into interactive React/Next.js UI using inference.sh’s widget renderer and shadcn registry blocks. Solo builders shipping chat or agent products use it when models return declarative layouts (forms for confirmations, cards for summaries, buttons for next steps) without generating fragile JSX on every turn. The skill documents Widget and WidgetAction types, basic WidgetRenderer usage, and layout primitives (row, col, box) so Codex, Cursor, or Claude Code can scaffold blocks consistently.
It fits the build phase when you already have a Next app and need a predictable bridge from agent structured output to shadcn-styled components, especially for dynamic forms and data display rather than one-off marketing pages.
515 installs

26. Tools Ui
tools-ui is a frontend skill package for solo builders shipping agent-powered apps on React or Next.js. It pulls prebuilt blocks from ui.inference.sh so you can show each tool invocation as a first-class UI moment: queued, executing, awaiting human approval, succeeded, or failed. That matters when you move from a terminal-only agent to a product customers trust: you need visible progress, explicit approve/deny gates, and readable results instead of raw JSON dumps. Installation follows the familiar shadcn add pattern, which keeps customization in your repo. Use it while building dashboards, copilots, or internal ops tools where Claude, Cursor-driven backends, or MCP tools call external functions and users must see the lifecycle clearly.
514 installs

27. Background Removal
Background Removal is an agent skill for solo builders who need production-ready cutouts without opening a desktop editor. It drives the inference.sh belt CLI: authenticate once, then run the infsh/birefnet app with a public image URL to strip backgrounds with BiRefNet, or use falai/reve with natural-language prompts to delete backgrounds or place subjects on new scenes. The skill targets e-commerce listings, portraits, marketing assets, and any workflow that needs transparent PNGs or consistent catalog imagery. It assumes you install belt (npx skills add belt-sh/cli is referenced) and are comfortable passing JSON inputs from bash. You can chain generation and edit steps when assets do not exist yet. This is a hosted inference integration (not local rembg), so plan for network access and inference.sh account setup.
Intermediate complexity reflects CLI JSON and multi-app workflows.
509 installs

28. Speech To Text
Speech-to-text is an agent skill package for solo builders who need reliable audio transcription inside Claude Code, Cursor, Codex, or similar agents. It documents how to use the inference.sh belt CLI to invoke ElevenLabs Scribe v2 for high-accuracy, diarized output across 90+ languages, or Whisper variants when speed or familiarity matters. Typical jobs include meeting transcripts, podcast show notes, SRT-style subtitle generation, and turning voice memos into editable text. You trigger it when searches or tasks mention speech to text, whisper, STT, or transcribe meeting, not when you only need live voice chat. It assumes you install the belt CLI and log in once; the skill itself is procedural knowledge plus model selection guidance, not a hosted API. Pair it with content or PM skills downstream once you have clean text to summarize or ship.
505 installs

29. Text To Speech
Text-to-Speech is an inference.sh agent skill that turns written copy into spoken audio using the belt CLI and a catalog of hosted synthesis apps. Solo and indie builders use it when they need believable narration, character dialogue, accessibility readouts, or podcast-style multi-speaker tracks without standing up their own TTS stack or juggling separate vendor SDKs. The skill documents login, model selection, and example invocations from quick one-liners through longer scripts with delivery control. It maps naturally to product work (wiring voice into agents and games during build) but also supports launch demos, marketing voiceovers, and grow-phase content like serialized podcasts. Triggers in the skill description emphasize voice generation, cloning, and provider names (ElevenLabs, Inworld), which helps discovery when you are searching for “AI voiceover” or “text to audio” inside Claude Code, Cursor, or similar agents. Install the belt CLI skill first, then run named apps with your text payload.
505 installs