Nano Banana Pro

Name: Nano Banana Pro
Author: intellectronica

intellectronica/agent-skills

2.9k installs
279 repo stars
Updated April 25, 2026

nano-banana-pro is an agent skill that generates and edits PNG images using Google's Gemini 3 Pro Image API via a uv-run Python script.

About

nano-banana-pro generates and edits images through Google's Nano Banana Pro API (Gemini 3 Pro Image), using a bundled Python script run with uv from the user's current working directory. Text-to-image mode takes a prompt and an output filename; image-to-image editing adds --input-image, and the agent does not read the source file into context first. The --resolution flag accepts 1K (the default), 2K, or 4K, and the skill maps user phrases such as "1080p" or "ultra" to the matching value. The API key comes from --api-key or, failing that, the GEMINI_API_KEY environment variable; the script exits with an error when neither is set. Filenames follow the pattern yyyy-mm-dd-hh-mm-ss-descriptive-name.png, with a concise hyphenated descriptor. The skill instructs agents not to read generated images back and to report only the saved file path. Editing prompts cover adding or removing elements, style changes, color adjustments, and background blur on user-supplied image paths.

Runs generate_image.py via uv for text-to-image and image-to-image editing.
Supports 1K, 2K, and 4K resolution values mapped from natural language requests.
Uses --input-image for edits without reading the source image into agent context.
Resolves the API key from the --api-key flag or the GEMINI_API_KEY environment variable, with an explicit error when both are missing.
Saves timestamped PNG filenames in the user's current working directory.

Nano Banana Pro by the numbers

2,859 all-time installs (skills.sh)
+58 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #136 of 1,340 Generative Media skills by installs in the Skillselion catalog
Security screen: HIGH risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

nano-banana-pro capabilities & compatibility

Capabilities: text to image generation via gemini 3 pro image · image to image editing with input image path · 1k 2k 4k resolution parameter mapping · timestamped png output filename generation
Works with: openai
Use cases: image generation
Pricing: Bring your own API key

From the docs

What nano-banana-pro says it does

DO NOT read the image file first - use this skill directly with the --input-image parameter.

SKILL.md

npx skills add https://github.com/intellectronica/agent-skills --skill nano-banana-pro

Installs	2.9k
repo stars	★ 279
Security audit	1 / 3 scanners passed
Last updated	April 25, 2026
Repository	intellectronica/agent-skills ↗

How do I generate or edit images with Gemini 3 Pro Image at 1K, 2K, or 4K without manually wiring the API?

Generate or edit PNG images via Google's Nano Banana Pro (Gemini 3 Pro Image) API, with 1K, 2K, or 4K resolution options.

Who is it for?

Developers who want agent-driven text-to-image or image-to-image edits through Gemini with resolution control.

Skip if: you need video, 3D assets, or a non-Gemini image provider.

When should I use this skill?

The user asks to generate, create, edit, modify, or change an image, including requests such as "modify this image" accompanied by a file path.

What you get

A PNG saved to the working directory with the script-printed full output path reported to the user.

PNG image file in working directory

By the numbers

Supports 3 resolution options: 1K, 2K, and 4K
Requires Python >=3.10 with google-genai>=1.0.0 and pillow>=10.0.0

Files

SKILL.md (Markdown)

Nano Banana Pro Image Generation & Editing

Generate new images or edit existing ones using Google's Nano Banana Pro API (Gemini 3 Pro Image).

Usage

Run the script using its absolute path (do NOT cd to the skill directory first):

Generate new image:

uv run ~/.claude/skills/nano-banana-pro/scripts/generate_image.py --prompt "your image description" --filename "output-name.png" [--resolution 1K|2K|4K] [--api-key KEY]

Edit existing image:

uv run ~/.claude/skills/nano-banana-pro/scripts/generate_image.py --prompt "editing instructions" --filename "output-name.png" --input-image "path/to/input.png" [--resolution 1K|2K|4K] [--api-key KEY]

Important: Always run from the user's current working directory so images are saved where the user is working, not in the skill directory.

Resolution Options

The Gemini 3 Pro Image API supports three resolutions (uppercase K required):

1K (default) - ~1024px resolution
2K - ~2048px resolution
4K - ~4096px resolution

Map user requests to API parameters:

No mention of resolution → 1K
"low resolution", "1080", "1080p", "1K" → 1K
"2K", "2048", "normal", "medium resolution" → 2K
"high resolution", "high-res", "hi-res", "4K", "ultra" → 4K

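The phrase mapping above could be sketched as a small lookup. This is only an illustration: the function name `resolution_from_request` and the matching strategy (whole-phrase, case-insensitive, highest resolution checked first) are assumptions, not part of the skill, which leaves the mapping to the agent. Phrases outside the listed ones fall through to 1K.

```python
import re

# Phrase lists copied from the mapping above; everything else is hypothetical.
_PHRASES = {
    "4K": ("high resolution", "high-res", "hi-res", "4k", "ultra"),
    "2K": ("2k", "2048", "normal", "medium resolution"),
    "1K": ("low resolution", "1080p", "1080", "1k"),
}


def resolution_from_request(text: str) -> str:
    """Return "1K", "2K", or "4K" for a free-form user request."""
    lowered = text.lower()
    for resolution, phrases in _PHRASES.items():
        for phrase in phrases:
            # Match whole phrases only, so "4k" does not match inside "14k".
            pattern = rf"(?<![\w-]){re.escape(phrase)}(?![\w-])"
            if re.search(pattern, lowered):
                return resolution
    return "1K"  # no mention of resolution


print(resolution_from_request("a hi-res poster of a lighthouse"))
```

A word like "normal" can also appear in an ordinary image description, so a real agent would likely apply this only to the part of the request that talks about output size.
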
API Key

The script checks for an API key in this order:

1. --api-key argument (use if the user provided a key in chat)
2. GEMINI_API_KEY environment variable

If neither is available, the script exits with an error message.

Filename Generation

Generate filenames with the pattern: yyyy-mm-dd-hh-mm-ss-name.png

Format: {timestamp}-{descriptive-name}.png

Timestamp: Current date/time in format yyyy-mm-dd-hh-mm-ss (24-hour format)
Name: Descriptive lowercase text with hyphens
Keep the descriptive part concise (1-5 words typically)
Use context from user's prompt or conversation
If unclear, use a random identifier (e.g., x9k2, a7b3)

Examples:

Prompt "A serene Japanese garden" → 2025-11-23-14-23-05-japanese-garden.png
Prompt "sunset over mountains" → 2025-11-23-15-30-12-sunset-mountains.png
Prompt "create an image of a robot" → 2025-11-23-16-45-33-robot.png
Unclear context → 2025-11-23-17-12-48-x9k2.png
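
A minimal sketch of the filename pattern, assuming the agent has already picked a short descriptor from the prompt. The helper name `build_filename` and its parameters are hypothetical; the skill itself only describes the pattern and does not ship such a function.

```python
import re
import secrets
from datetime import datetime
from typing import Optional


def build_filename(descriptor: Optional[str],
                   now: Optional[datetime] = None,
                   max_words: int = 5) -> str:
    """Build yyyy-mm-dd-hh-mm-ss-name.png from a short descriptor."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y-%m-%d-%H-%M-%S")  # 24-hour clock
    # Lowercase, keep only letters and digits, and join words with hyphens.
    words = re.findall(r"[a-z0-9]+", (descriptor or "").lower())[:max_words]
    # With no usable descriptor, fall back to a short random identifier.
    name = "-".join(words) if words else secrets.token_hex(2)
    return f"{timestamp}-{name}.png"


print(build_filename("japanese garden", datetime(2025, 11, 23, 14, 23, 5)))
# 2025-11-23-14-23-05-japanese-garden.png
```

The fallback here is four hexadecimal characters, which is one possible reading of "random identifier"; any short unique suffix would serve the same purpose.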

Image Editing

When the user wants to modify an existing image:

1. Check if they provide an image path or reference an image in the current directory
2. Use --input-image parameter with the path to the image
3. The prompt should contain editing instructions (e.g., "make the sky more dramatic", "remove the person", "change to cartoon style")
4. Common editing tasks: add/remove elements, change style, adjust colors, blur background, etc.

Prompt Handling

For generation: Pass user's image description as-is to --prompt. Only rework if clearly insufficient.

For editing: Pass editing instructions in --prompt (e.g., "add a rainbow in the sky", "make it look like a watercolor painting")

Preserve user's creative intent in both cases.

Output

Saves PNG to current directory (or specified path if filename includes directory)
Script outputs the full path to the generated image
Do not read the image back - just inform the user of the saved path
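
Since the script reports the result as a line beginning with `Image saved: `, a caller could pick the path out of captured stdout without opening the image. The sketch below assumes that prefix stays as it is in the bundled script; the helper name `saved_path_from_output` is hypothetical.

```python
from typing import Optional

SAVED_PREFIX = "Image saved: "  # prefix of the script's final status line


def saved_path_from_output(stdout: str) -> Optional[str]:
    """Return the path printed by the script, or None if no image was saved."""
    # Scan from the end because the status line is printed last.
    for line in reversed(stdout.splitlines()):
        if line.startswith(SAVED_PREFIX):
            return line[len(SAVED_PREFIX):].strip()
    return None


sample = (
    "Generating image with resolution 1K...\n"
    "\n"
    "Image saved: /home/user/2025-11-23-16-45-33-robot.png\n"
)
print(saved_path_from_output(sample))
# /home/user/2025-11-23-16-45-33-robot.png
```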

Examples

Generate new image:

uv run ~/.claude/skills/nano-banana-pro/scripts/generate_image.py --prompt "A serene Japanese garden with cherry blossoms" --filename "2025-11-23-14-23-05-japanese-garden.png" --resolution 4K

Edit existing image:

uv run ~/.claude/skills/nano-banana-pro/scripts/generate_image.py --prompt "make the sky more dramatic with storm clouds" --filename "2025-11-23-14-25-30-dramatic-sky.png" --input-image "original-photo.jpg" --resolution 2K
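
For callers that launch the script from Python rather than a shell, the two example invocations could be assembled as an argument list, which avoids quoting problems with prompts that contain spaces or quotes. This is a sketch only: `build_command` is a hypothetical helper, and the script path is the one used in the examples above.

```python
import os
from typing import List, Optional

# Script location as used in the examples above.
SCRIPT = os.path.expanduser(
    "~/.claude/skills/nano-banana-pro/scripts/generate_image.py"
)


def build_command(prompt: str,
                  filename: str,
                  input_image: Optional[str] = None,
                  resolution: Optional[str] = None) -> List[str]:
    """Assemble the uv invocation for generation or, with input_image, editing."""
    cmd = ["uv", "run", SCRIPT, "--prompt", prompt, "--filename", filename]
    if input_image:
        cmd += ["--input-image", input_image]
    if resolution:
        if resolution not in ("1K", "2K", "4K"):  # uppercase K required
            raise ValueError(f"unsupported resolution: {resolution}")
        cmd += ["--resolution", resolution]
    return cmd


cmd = build_command(
    "make the sky more dramatic with storm clouds",
    "2025-11-23-14-25-30-dramatic-sky.png",
    input_image="original-photo.jpg",
    resolution="2K",
)
print(cmd)
```

The list can be passed to `subprocess.run(cmd, check=True)` from the user's working directory so that the output lands where the user is working.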

#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
#     "pillow>=10.0.0",
# ]
# ///
"""
Generate images using Google's Nano Banana Pro (Gemini 3 Pro Image) API.

Usage:
    uv run generate_image.py --prompt "your image description" --filename "output.png" [--input-image PATH] [--resolution 1K|2K|4K] [--api-key KEY]
"""

import argparse
import os
import sys
from pathlib import Path


def get_api_key(provided_key: str | None) -> str | None:
    """Get API key from argument first, then environment."""
    if provided_key:
        return provided_key
    return os.environ.get("GEMINI_API_KEY")


def main():
    parser = argparse.ArgumentParser(
        description="Generate images using Nano Banana Pro (Gemini 3 Pro Image)"
    )
    parser.add_argument(
        "--prompt", "-p",
        required=True,
        help="Image description/prompt"
    )
    parser.add_argument(
        "--filename", "-f",
        required=True,
        help="Output filename (e.g., sunset-mountains.png)"
    )
    parser.add_argument(
        "--input-image", "-i",
        help="Optional input image path for editing/modification"
    )
    parser.add_argument(
        "--resolution", "-r",
        choices=["1K", "2K", "4K"],
        default=None,  # None means "not passed", so an explicit 1K is distinguishable
        help="Output resolution: 1K (default; auto-detected from the input image when editing), 2K, or 4K"
    )
    parser.add_argument(
        "--api-key", "-k",
        help="Gemini API key (overrides GEMINI_API_KEY env var)"
    )

    args = parser.parse_args()

    # Get API key
    api_key = get_api_key(args.api_key)
    if not api_key:
        print("Error: No API key provided.", file=sys.stderr)
        print("Please either:", file=sys.stderr)
        print("  1. Provide --api-key argument", file=sys.stderr)
        print("  2. Set GEMINI_API_KEY environment variable", file=sys.stderr)
        sys.exit(1)

    # Import here after checking API key to avoid slow import on error
    from google import genai
    from google.genai import types
    from PIL import Image as PILImage

    # Initialise client
    client = genai.Client(api_key=api_key)

    # Set up output path
    output_path = Path(args.filename)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    # Load input image if provided
    input_image = None
    output_resolution = args.resolution or "1K"  # fall back to the documented default
    if args.input_image:
        try:
            input_image = PILImage.open(args.input_image)
            print(f"Loaded input image: {args.input_image}")

            # Auto-detect resolution only when the user did not pass --resolution
            if args.resolution is None:
                # Map the input image's longest side to a resolution
                width, height = input_image.size
                max_dim = max(width, height)
                if max_dim >= 3000:
                    output_resolution = "4K"
                elif max_dim >= 1500:
                    output_resolution = "2K"
                else:
                    output_resolution = "1K"
                print(f"Auto-detected resolution: {output_resolution} (from input {width}x{height})")
        except Exception as e:
            print(f"Error loading input image: {e}", file=sys.stderr)
            sys.exit(1)

    # Build contents (image first if editing, prompt only if generating)
    if input_image:
        contents = [input_image, args.prompt]
        print(f"Editing image with resolution {output_resolution}...")
    else:
        contents = args.prompt
        print(f"Generating image with resolution {output_resolution}...")

    try:
        response = client.models.generate_content(
            model="gemini-3-pro-image-preview",
            contents=contents,
            config=types.GenerateContentConfig(
                response_modalities=["TEXT", "IMAGE"],
                image_config=types.ImageConfig(
                    image_size=output_resolution
                )
            )
        )
        
        # Process response and convert to PNG
        import base64
        from io import BytesIO

        image_saved = False
        for part in response.parts:
            if part.text is not None:
                print(f"Model response: {part.text}")
            elif part.inline_data is not None:
                # inline_data.data is normally raw bytes; decode only if a
                # base64 string is returned instead
                image_data = part.inline_data.data
                if isinstance(image_data, str):
                    image_data = base64.b64decode(image_data)

                image = PILImage.open(BytesIO(image_data))

                # Always save as RGB: RGBA is flattened onto a white background,
                # and any other mode is converted to RGB
                if image.mode == 'RGBA':
                    rgb_image = PILImage.new('RGB', image.size, (255, 255, 255))
                    rgb_image.paste(image, mask=image.split()[3])
                    rgb_image.save(str(output_path), 'PNG')
                elif image.mode == 'RGB':
                    image.save(str(output_path), 'PNG')
                else:
                    image.convert('RGB').save(str(output_path), 'PNG')
                image_saved = True
        
        if image_saved:
            full_path = output_path.resolve()
            print(f"\nImage saved: {full_path}")
        else:
            print("Error: No image was generated in the response.", file=sys.stderr)
            sys.exit(1)
            
    except Exception as e:
        print(f"Error generating image: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
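
The auto-detection branch in the script chooses an output resolution from the longest side of the input image. Pulled out as a pure function, the thresholds can be checked without calling the API. The function name `resolution_for_size` is hypothetical and not part of the script; the threshold values are the ones in the script above.

```python
def resolution_for_size(width: int, height: int) -> str:
    """Map an input image's dimensions to "1K", "2K", or "4K"."""
    max_dim = max(width, height)  # the longest side decides
    if max_dim >= 3000:
        return "4K"
    if max_dim >= 1500:
        return "2K"
    return "1K"


for size in [(1024, 768), (1920, 1080), (4000, 3000)]:
    print(size, resolution_for_size(*size))
# (1024, 768) 1K
# (1920, 1080) 2K
# (4000, 3000) 4K
```

Note that a 1920x1080 input is edited at 2K under these thresholds, whereas the request-phrase mapping in SKILL.md sends "1080p" to 1K; the two rules apply in different situations (input size versus explicit user wording).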