
Capture Screen
Capture precise macOS application window screenshots by resolving window IDs, steering UI with AppleScript, and exporting PNGs for docs or visual QA.
Overview
Capture Screen is an agent skill most often used in Build (also Ship, Launch) that finds macOS window IDs, controls app views with AppleScript, and captures images with screencapture.
Install
npx skills add https://github.com/daymade/claude-code-skills --skill capture-screen
What is this skill?
- Resolves window IDs via Swift CGWindowListCopyWindowInfo with get_window_id.swift
- Documents an inline swift -e snippet that keyword-matches the window owner and window name
- Uses an AppleScript (osascript) step to set zoom, scroll position, and cell selection before capture
- Captures the targeted window as PNG or JPEG with screencapture -x -l WID, without full-desktop noise
- Explicit three-step workflow diagram: Find Window → Control View → Capture
- Recommends Swift CGWindowListCopyWindowInfo as the only reliable window ID method on macOS
Adoption & trust: 510 installs on skills.sh; 1.2k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need repeatable screenshots of specific app windows for docs or reviews, but manual cropping and wrong-window captures waste time.
Who is it for?
macOS-based solo builders automating UI documentation, tutorial assets, or multi-step visual capture chains tied to real apps like Excel.
Skip if: you need headless Linux CI screenshots or browser-only capture without native window IDs, or your team is unwilling to grant screen recording permissions on Mac.
When should I use this skill?
Use when automating screenshots, capturing application windows for documentation, or building multi-shot visual workflows on macOS.
What do I get? / Deliverables
You run a documented find-control-capture sequence that outputs accurate window-scoped PNG or JPEG files for documentation or visual workflows.
- PNG or JPEG window captures via screencapture -l
- Resolved numeric window IDs for target applications
- Prepared UI state (zoom/scroll/selection) before each shot
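The deliverables above hinge on turning one line of lookup output into a capture command. The following is a minimal sketch, assuming the `WID=12345 | App=... | Title=...` output format documented in SKILL.md. The function names `parse_wid` and `capture_command` are hypothetical and not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the capture-screen skill itself.

# Extract the numeric window ID from a line such as
# "WID=12345 | App=Microsoft Excel | Title=workbook.xlsx".
# Prints nothing when the line does not start with "WID=<digits>".
parse_wid() {
  local line="$1"
  printf '%s\n' "$line" | sed -n 's/^WID=\([0-9][0-9]*\).*/\1/p'
}

# Print (do not run) the screencapture call for a window ID and output path.
# Assumption: the output path contains no spaces, since it is not quoted here.
capture_command() {
  local wid="$1" out="$2"
  printf 'screencapture -x -l %s %s\n' "$wid" "$out"
}
```

For the sample line above, `capture_command "$(parse_wid "$line")" step1.png` prints `screencapture -x -l 12345 step1.png`. Running the printed command requires macOS with screen recording permission granted.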
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Screenshot automation most often ships alongside documentation and visual proof while the product is being built, even though the same flow supports launch assets later. The three-step window-find, view-control, screencapture pipeline is documentation infrastructure for agents publishing accurate UI references.
Where it fits
Capture each step of an Excel report after AppleScript positions the sheet for a README tutorial.
Attach window-scoped PNG evidence to a pre-release QA checklist for a desktop feature.
Regenerate App Store or landing-page screenshots from the live app window after a UI polish pass.
How it compares
Local macOS window capture with CoreGraphics IDs, not a cloud visual testing platform or generic desktop RPA suite.
Common Questions / FAQ
Who is capture-screen for?
Solo developers and technical writers on macOS who want agents to screenshot specific application windows for docs and visual regression notes.
When should I use capture-screen?
During Build while authoring README or in-app help visuals, during Ship for review evidence, and during Launch when refreshing marketing screenshots. In short, use it whenever automated macOS window captures must match a controlled UI state.
Is capture-screen safe to install?
Review the Security Audits panel on this page; screen capture and AppleScript require macOS permissions and you should audit bundled Swift scripts before execution.
SKILL.md
# Capture Screen

Programmatic screenshot capture on macOS: find windows, control views, capture images.

## Quick Start

```bash
# Find Excel window ID
swift scripts/get_window_id.swift Excel

# Capture that window (replace 12345 with actual WID)
screencapture -x -l 12345 output.png
```

## Overview

Three-step workflow:

```
1. Find Window   → Swift CGWindowListCopyWindowInfo → get numeric Window ID
2. Control View  → AppleScript (osascript)          → zoom, scroll, select
3. Capture       → screencapture -l <WID>           → PNG/JPEG output
```

## Step 1: Get Window ID (Swift)

Use Swift with CoreGraphics to enumerate windows. This is the **only reliable method** on macOS.

### Quick inline execution

```bash
swift -e '
import CoreGraphics
let keyword = "Excel"
let list = CGWindowListCopyWindowInfo(.optionOnScreenOnly, kCGNullWindowID) as? [[String: Any]] ?? []
for w in list {
    let owner = w[kCGWindowOwnerName as String] as? String ?? ""
    let name = w[kCGWindowName as String] as? String ?? ""
    let wid = w[kCGWindowNumber as String] as? Int ?? 0
    if owner.localizedCaseInsensitiveContains(keyword) || name.localizedCaseInsensitiveContains(keyword) {
        print("WID=\(wid) | App=\(owner) | Title=\(name)")
    }
}
'
```

### Using the bundled script

```bash
swift scripts/get_window_id.swift Excel
swift scripts/get_window_id.swift Chrome
swift scripts/get_window_id.swift          # List all windows
```

Output format: `WID=12345 | App=Microsoft Excel | Title=workbook.xlsx`

Parse the WID number for use with `screencapture -l`.

## Step 2: Control Window (AppleScript)

Verified commands for controlling application windows before capture.
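Because a control command can stall on an unresponsive app, and the UI needs a moment to redraw before capture, it can help to route every control command through one small wrapper. The sketch below is an illustration under stated assumptions, not part of the skill: `run_with_limit`, `control_then_wait`, and `SETTLE_SECONDS` are hypothetical names, and it assumes the `timeout` binary may be missing on a stock macOS install (GNU coreutils commonly provides it there as `gtimeout`).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; names and defaults are illustrative assumptions.

# Seconds to wait after a control command so the UI can finish redrawing.
SETTLE_SECONDS="${SETTLE_SECONDS:-1}"

# Run a command with a time limit when a timeout tool is available.
# Usage: run_with_limit <seconds> <command> [args...]
run_with_limit() {
  local limit="$1"; shift
  if command -v timeout >/dev/null 2>&1; then
    timeout "$limit" "$@"
  elif command -v gtimeout >/dev/null 2>&1; then
    gtimeout "$limit" "$@"
  else
    # No timeout tool found: run unguarded rather than fail outright.
    "$@"
  fi
}

# Run one control command with a 5-second limit, then pause before capture.
# Returns the command's exit status without pausing if the command fails.
control_then_wait() {
  run_with_limit 5 "$@" || return $?
  sleep "$SETTLE_SECONDS"
}
```

A call might look like `control_then_wait osascript -e 'tell application "Microsoft Excel" to activate'`. Skipping the pause on failure keeps a failed control step from being followed by a capture of the wrong UI state.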
### Microsoft Excel (full AppleScript support)

```bash
# Activate (bring to front)
osascript -e 'tell application "Microsoft Excel" to activate'

# Set zoom level (percentage)
osascript -e 'tell application "Microsoft Excel"
    set zoom of active window to 120
end tell'

# Scroll to specific row
osascript -e 'tell application "Microsoft Excel"
    set scroll row of active window to 45
end tell'

# Scroll to specific column
osascript -e 'tell application "Microsoft Excel"
    set scroll column of active window to 3
end tell'

# Select a cell range
osascript -e 'tell application "Microsoft Excel"
    select range "A1" of active sheet
end tell'

# Select a specific sheet
osascript -e 'tell application "Microsoft Excel"
    activate object sheet "DCF" of active workbook
end tell'

# Open a file
osascript -e 'tell application "Microsoft Excel"
    open POSIX file "/path/to/file.xlsx"
end tell'
```

### Any application (basic control)

```bash
# Activate any app
osascript -e 'tell application "Google Chrome" to activate'

# Bring specific window to front (by index)
osascript -e 'tell application "System Events"
    tell process "Google Chrome"
        perform action "AXRaise" of window 1
    end tell
end tell'
```

### Timing and Timeout

Always add `sleep 1` after AppleScript commands before capturing, to allow UI rendering to complete.

**IMPORTANT**: `osascript` hangs indefinitely if the target application is not running or not responding. Always wrap with `timeout`:

```bash
timeout 5 osascript -e 'tell application "Microsoft Excel" to activate'
```

## Step 3: Capture (screencapture)

```bash
# Capture specific window by ID
screencapture -l <WID> output.png

# Silent capture (no camera shutter sound)
screencapture -x -l <WID> output.png

# Capture as JPEG
screencapture -l <WID> -t jpg output.jpg

# Capture with delay (seconds)
screencapture -l <WID> -T 2 output.png

# Capture a screen region (interactive)
screencap