
Generate Image
Generate or edit images from the agent via OpenRouter image models (Gemini, Flux, and similar) using a documented Python CLI and .env API key setup.
Overview
generate-image is an agent skill most often used in Build (also Validate prototype assets, Launch distribution art) that runs OpenRouter-backed image generation and editing from a Python CLI.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill generate-imageWhat is this skill?
- Python CLI for OpenRouter image generation and editing
- Documents models such as google/gemini-3.1-flash-image-preview and black-forest-labs/flux.2-pro / flux.2-flex
- Input-image + prompt path for editing workflows with base64 loading
- Resolves OPENROUTER_API_KEY from .env in cwd or parent directories
- argparse-driven interface suitable for agent shell invocation
- Supports multiple OpenRouter image models including Gemini 3.1 flash image preview and Flux 2 pro/flex
Adoption & trust: 567 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need images from modern multimodal APIs but do not have a consistent, agent-invokable script for keys, models, and edit-vs-generate modes.
Who is it for?
Solo builders and researchers who already use OpenRouter and want agent-driven batch or one-off image generation and edits.
Skip if: Teams requiring on-device only generation, proprietary pipelines without network calls, or DALL·E-only workflows with no OpenRouter key.
When should I use this skill?
You need to generate or edit images via OpenRouter from an agent session using the bundled Python workflow and API key from .env.
What do I get? / Deliverables
Your agent can call the documented Python workflow with the right OpenRouter model and optional input image, yielding generated or edited image output on disk or stdout.
- Generated image files or API response payloads
- Documented model and edit-mode invocation
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Primary shelf is Build because the skill delivers runnable image-generation tooling agents invoke while assembling product assets or scientific visuals. It is agent-tooling: a scriptable capability wired to external multimodal APIs rather than in-app frontend layout work.
Where it fits
Agent runs the script to produce diagram assets for a scientific report or app README.
Quick hero image for a landing prototype before committing to a designer.
Generate social or blog thumbnails for a ship-week announcement.
How it compares
OpenRouter CLI skill—not an in-browser Canvas MCP or a hosted design SaaS.
Common Questions / FAQ
Who is generate-image for?
Solo builders and scientific agent users who want OpenRouter image models (Gemini, Flux, etc.) callable from Python with clear env and editing semantics.
When should I use generate-image?
Use it in Build when creating product or paper figures via agents; in Validate for quick prototype visuals; in Launch when you need social or landing images—always with OPENROUTER_API_KEY configured.
Is generate-image safe to install?
It expects network access and API secrets; check the Security Audits panel on this page and never commit .env keys before installing.
SKILL.md
READMESKILL.md - Generate Image
#!/usr/bin/env python3 """ Generate and edit images using OpenRouter API with various image generation models. Supports models like: - google/gemini-3.1-flash-image-preview (generation and editing) - black-forest-labs/flux.2-pro (generation and editing) - black-forest-labs/flux.2-flex (generation) - And more image generation models available on OpenRouter For image editing, provide an input image along with an editing prompt. """ import sys import json import base64 import argparse from pathlib import Path from typing import Optional def check_env_file() -> Optional[str]: """Check if .env file exists and contains OPENROUTER_API_KEY.""" # Look for .env in current directory and parent directories current_dir = Path.cwd() for parent in [current_dir] + list(current_dir.parents): env_file = parent / ".env" if env_file.exists(): with open(env_file, 'r') as f: for line in f: if line.startswith('OPENROUTER_API_KEY='): api_key = line.split('=', 1)[1].strip().strip('"').strip("'") if api_key: return api_key return None def load_image_as_base64(image_path: str) -> str: """Load an image file and return it as a base64 data URL.""" path = Path(image_path) if not path.exists(): print(f"❌ Error: Image file not found: {image_path}") sys.exit(1) # Determine MIME type from extension ext = path.suffix.lower() mime_types = { '.png': 'image/png', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg', '.gif': 'image/gif', '.webp': 'image/webp', } mime_type = mime_types.get(ext, 'image/png') with open(path, 'rb') as f: image_data = f.read() base64_data = base64.b64encode(image_data).decode('utf-8') return f"data:{mime_type};base64,{base64_data}" def save_base64_image(base64_data: str, output_path: str) -> None: """Save base64 encoded image to file.""" # Remove data URL prefix if present if ',' in base64_data: base64_data = base64_data.split(',', 1)[1] # Decode and save image_data = base64.b64decode(base64_data) with open(output_path, 'wb') as f: f.write(image_data) def generate_image( prompt: str, model: str = "google/gemini-3.1-flash-image-preview", output_path: str = "generated_image.png", api_key: Optional[str] = None, input_image: Optional[str] = None ) -> dict: """ Generate or edit an image using OpenRouter API. Args: prompt: Text description of the image to generate, or editing instructions model: OpenRouter model ID (default: google/gemini-3.1-flash-image-preview) output_path: Path to save the generated image api_key: OpenRouter API key (will check .env if not provided) input_image: Path to an input image for editing (optional) Returns: dict: Response from OpenRouter API """ try: import requests except ImportError: print("Error: 'requests' library not found. Install with: pip install requests") sys.exit(1) # Check for API key if not api_key: api_key = check_env_file() if not api_key: print("❌ Error: OPENROUTER_API_KEY not found!") print("\nPlease create a .env file in your project directory with:") print("OPENROUTER_API_KEY=your-api-key-here") print("\nOr set the environment variable:") print("export OPENROUTER_API_KEY=your-api-key-here") print("\nGet your API key from: https://openrouter.ai/keys") sys.exit(1) # Determine if this is generation or editing is_editing = input_image is not None if is_editing: print(f"✏️ Editing image with model: {model}") print(f"📷 Input image: {input_image}") print(f"📝 Edit prompt: {prompt}") # Load input image as base64 image_dat