Generate Image

Name: Generate Image
Author: gupsammy

Repository: gupsammy/claudest

Craft mode-specific image-generation prompts for photoreal scenes, product shots, logos, illustrations, and legible text overlays.

Overview

Generate Image is an agent skill for the Build phase that supplies mode-specific prompting patterns for photoreal, product, logo, illustration, and text-in-image outputs.

Install

npx skills add https://github.com/gupsammy/claudest --skill generate-image

What is this skill?

Five capability modes: photorealistic scenes, product photography, logos & text, stylized illustration, text rendering
Photoreal mode: lens, aperture, lighting direction, and mood spelled out like a photographer's brief
Product modes: isolation (e-commerce), lifestyle, and hero shots with text-safe framing
Logo and text: quoted strings, typography weight/style, placement, and iterative refinement
Nano Banana–oriented text rendering tips (quoted copy, font characteristics, placement)
Five documented capability pattern sections (photorealistic, product, logos & text, stylized illustration, text rendering)

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1 install on skills.sh; 253 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

Your image prompts are vague, so generations miss lens, lighting, typography, or format constraints you need for ship-ready visuals.

Who is it for?

Indie builders producing storefront, social, or app visuals who want repeatable prompt recipes per visual mode.

Skip if: you only need SVG/code logos, automated batch pipelines, or image APIs wired up without prompt craft.

When should I use this skill?

During step 2 of the prompt-crafting workflow, when you load mode-specific image prompting guidance.

What do I get? / Deliverables

You get structured, mode-aware prompts—camera, light, style, and quoted text—ready to paste into your image generator and refine.

Mode-specific image prompts
Iteration-ready text and typography briefs

Journey fit

Primary fit

Build: UI/UX & frontend

Visual assets for landing pages, apps, and marketing are produced during Build when UI and brand collateral are created. Prompt patterns target front-of-house visuals—hero images, stickers, e-commerce shots—not backend or ops work.

Also useful

Launch: Distribution & launch channels

How it compares

Prompt-pattern skill for generative art—not a hosted image API or design-system component library.

Common Questions / FAQ

Who is generate-image for?

Solo builders and designers using AI coding agents who need stronger image-generation prompts for marketing, product, and UI collateral during product build.

When should I use generate-image?

In the Build phase while crafting prompts for heroes, product shots, stickers, or branded text renders before launch creative goes live.

Is generate-image safe to install?

Check the Security Audits panel on this Prism page; the skill is instructional text only but your image tool may need network/API access separately.

SKILL.md

# Capability Patterns

Mode-specific prompting tips. Load the relevant section during prompt crafting (workflow step 2).

---

## Photorealistic Scenes

Think like a photographer: describe lens, light, moment.

- Specify camera (85mm portrait, 24mm wide), aperture (f/1.8 bokeh, f/11 sharp throughout)
- Describe lighting direction and quality (golden hour from camera-left, three-point softbox)
- Include mood and format (serene, vertical portrait)
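
The photographer-style details above can be collected into a small structure and joined into one prompt. This is a minimal sketch: `PhotorealBrief` and `build_photoreal_prompt` are hypothetical names introduced here for illustration and are not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class PhotorealBrief:
    """Hypothetical container for the photographer-style details (not part of the skill)."""
    subject: str
    lens: str          # e.g. "85mm portrait lens"
    aperture: str      # e.g. "f/1.8 with soft bokeh"
    lighting: str      # direction and quality of light
    mood: str          # e.g. "serene"
    image_format: str  # e.g. "vertical portrait"


def build_photoreal_prompt(brief: PhotorealBrief) -> str:
    """Assemble the brief into a single prompt string, one detail per sentence."""
    return " ".join([
        f"A photorealistic image of {brief.subject}.",
        f"Camera: {brief.lens}, {brief.aperture}.",
        f"Lighting: {brief.lighting}.",
        f"Mood: {brief.mood}.",
        f"Format: {brief.image_format}.",
    ])


example = PhotorealBrief(
    subject="a ceramicist shaping a bowl at her wheel",
    lens="85mm portrait lens",
    aperture="f/1.8 with soft bokeh",
    lighting="golden hour light from camera-left",
    mood="serene",
    image_format="vertical portrait",
)
print(build_photoreal_prompt(example))
```

Making every field required means a missing lens or lighting choice shows up as an error instead of being left to the model.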

## Product Photography

- Isolation: Clean white backdrop, soft even lighting, e-commerce ready
- Lifestyle: Product in use context, natural setting, aspirational but authentic
- Hero shots: Cinematic framing, dramatic lighting, space for text overlay
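
As a rough sketch, the three product modes can be kept as reusable phrase templates. The dictionary keys and the `product_prompt` helper are assumptions made for this example; only the descriptive phrases come from the list above.

```python
# Phrases taken from the three product modes above; the keys are illustrative.
PRODUCT_MODES = {
    "isolation": "clean white backdrop, soft even lighting, e-commerce ready",
    "lifestyle": "product in use, natural setting, aspirational but authentic",
    "hero": "cinematic framing, dramatic lighting, space for text overlay",
}


def product_prompt(product: str, mode: str) -> str:
    """Combine a product description with the phrase set for one mode."""
    if mode not in PRODUCT_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(PRODUCT_MODES)}")
    return f"{product}, {PRODUCT_MODES[mode]}"


print(product_prompt("A matte black insulated water bottle", "hero"))
```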

## Logos & Text

- Put text in quotes: `'Morning Brew Coffee Co'`
- Describe typography: "clean bold sans-serif with generous letter-spacing"
- Specify color scheme, shape constraints, design intent
- Iterate with follow-up edits for refinement

## Stylized Illustration

- Name the style: "kawaii-style sticker", "anime-influenced", "vintage travel poster"
- Describe design language: "bold outlines, flat colors, cel-shading"
- Include format constraints: "white background", "die-cut sticker format"

## Text Rendering

Nano Banana has advanced text rendering capabilities. For best results:
- Put all text in single quotes within the prompt
- Describe font characteristics: weight, style, size relative to the image
- Specify text placement: "centered at the top," "bottom-right corner"
- For multiple text elements, describe each separately with position
- Use `--thinking high` for complex multi-line text or precise typography
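
The quoting and placement tips above can be sketched as a helper that describes each text element separately. `TextElement` and `describe_text_elements` are hypothetical names for this example; the sentence wording is one possible phrasing, not a format the skill prescribes.

```python
from dataclasses import dataclass


@dataclass
class TextElement:
    """One piece of copy to render in the image (hypothetical structure)."""
    text: str       # the exact copy, rendered inside single quotes
    font: str       # weight, style, and relative size, e.g. "large bold sans-serif"
    placement: str  # e.g. "centered at the top"


def describe_text_elements(elements: list[TextElement]) -> str:
    """Describe each text element in its own sentence, with quoted copy and a position."""
    sentences = [
        f"Render the text '{el.text}' in a {el.font} font, {el.placement}."
        for el in elements
    ]
    return " ".join(sentences)


print(describe_text_elements([
    TextElement("Morning Brew Coffee Co", "large bold sans-serif", "centered at the top"),
    TextElement("Est. 2019", "small light italic", "in the bottom-right corner"),
]))
```

Copy that itself contains an apostrophe would clash with the single quotes, so it may need rephrasing before it goes into the prompt.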

## Google Search Grounding

Enable with the `--grounding` flag when real-time data helps (weather visualizations, current events infographics, real-world data charts).

**Image search grounding** (Nano Banana only): Add `--image-grounding` alongside `--grounding` to enable image search results as additional visual context. Useful when the model needs to reference real-world visuals (product designs, architectural styles, specific locations).
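
The relationship between the two flags can be sketched as follows. The flag names come from the text above; the `grounding_flags` helper is hypothetical, and because the command these flags attach to is not shown in this excerpt, the sketch only builds the flag list.

```python
def grounding_flags(use_search: bool, use_image_search: bool = False) -> list[str]:
    """Return the grounding flags to pass, keeping --image-grounding paired with --grounding."""
    if use_image_search and not use_search:
        # The text describes --image-grounding as used alongside --grounding.
        raise ValueError("--image-grounding should be used together with --grounding")
    flags = []
    if use_search:
        flags.append("--grounding")
    if use_image_search:
        flags.append("--image-grounding")
    return flags


print(grounding_flags(use_search=True, use_image_search=True))
```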

---

## Best Practices

### Hyper-Specificity

Vague prompts produce generic results. Every unspecified attribute becomes a random variable.

```
Vague:    "A woman in a park"
Specific: "A 30-year-old woman with shoulder-length auburn hair sits cross-legged
           on a green wool blanket in a sun-dappled oak grove, reading a hardcover
           book. Late afternoon golden hour, shallow depth of field at f/2.0."
```

Quantities, colors, materials, spatial positions, and named objects all reduce variance.

### Context & Intent

State what the image is for. Purpose shapes composition, mood, and framing decisions.

```
Generic:     "A flat white coffee on a marble counter"
With intent: "A hero image for an artisan coffee brand's homepage — a flat white
              in a handmade ceramic cup on a marble counter, steam rising, soft
              morning light from the left, negative space on the right for text overlay"
```

### Step-by-Step Instructions

Complex scenes benefit from sequential directives rather than a single compound sentence.

```
"Start with a wide establishing shot of a misty fjord at dawn.
 In the foreground, place a wooden dock extending from the lower left.
 A small red sailboat is moored at the dock's end.
 Mountains fill the background, their peaks just catching the first golden light.
 The water is perfectly still, creating mirror reflections."
```
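
One way to keep sequential directives tidy is to store them as an ordered list and join them just before sending. The `sequence_directives` helper below is a hypothetical illustration, not part of the skill.

```python
def sequence_directives(directives: list[str]) -> str:
    """Join ordered scene directives into one prompt, one sentence per directive."""
    sentences = []
    for directive in directives:
        sentence = directive.strip()
        if not sentence:
            continue  # skip empty entries
        if not sentence.endswith((".", "!", "?")):
            sentence += "."  # make sure each directive reads as its own sentence
        sentences.append(sentence)
    return " ".join(sentences)


print(sequence_directives([
    "Start with a wide establishing shot of a misty fjord at dawn",
    "In the foreground, place a wooden dock extending from the lower left",
    "A small red sailboat is moored at the dock's end",
]))
```

Keeping the directives in a list makes it easy to reorder, drop, or edit a single step between generations.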

### Positive Framing for Exclusions

Naming a concept under negation ("no X", "not X") biases the output toward X — diffusion models condition on tokens regardless of polarity. To exclude something, name a positive alternative that fills the same role, or scope the scene so the unwanted element is physically not there.
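
A simple check can flag negated phrasing before a prompt is sent. This is a minimal sketch: the `find_negations` helper and its word list are assumptions for illustration, and a regular expression like this only catches the most obvious cases.

```python
import re

# Matches a negation word followed by the next word, e.g. "no harsh".
NEGATION_PATTERN = re.compile(r"\b(?:no|not|without|never)\b\s+\w+", re.IGNORECASE)


def find_negations(prompt: str) -> list[str]:
    """Return each negated phrase found in the prompt, in order of appearance."""
    return [match.group(0) for match in NEGATION_PATTERN.finditer(prompt)]


draft = (
    "A professional headshot on a neutral gray backdrop. "
    "No distracting background elements, no visible logos or text, "
    "no harsh shadows on the face."
)
for phrase in find_negations(draft):
    print(f"Consider rephrasing positively: {phrase!r}")
```

Each flagged phrase is a candidate for rewriting as a positive description of what should appear instead.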

```
Bad:   "A professional headshot on a neutral gray backdrop.
        No distracting background elements, no visible logos or text,
        no harsh shadows on the face."

Good:  "A professional he
```

