
Ui Test
Run browser-driven UI checks with diff context, structured pass markers, and adversarial clicks before you ship frontend changes.
Overview
ui-test is an agent skill for the Ship phase. It runs diff-aware browser UI checks using snapshots and clicks, and records structured STEP_PASS evidence for each step.
Install
npx skills add https://github.com/browserbase/skills --skill ui-test
What is this skill?
- Diff-driven flow: git diff scopes which component changed before browsing
- Full assertion protocol: before/after snapshots plus structured STEP_PASS|step|evidence markers
- Happy-path and adversarial cases such as rapid repeated clicks on the same control
- Browse automation commands (open, wait load, snapshot, click, stop) that target elements by ref IDs such as @0-8
- Documented examples end with a passed/total summary such as 3/3, backed by one STEP_PASS evidence line per step
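The STEP_PASS|step|evidence marker listed above is a pipe-delimited line, so it can be split with plain shell parameter expansion. The following is only an illustrative sketch and is not part of the skill: parse_marker and its variable names are hypothetical, and it assumes that only the evidence field may contain additional | characters.

```bash
# parse_marker is a hypothetical helper, not part of the ui-test skill.
# It splits one marker line of the form STATUS|step|evidence.
# Assumption: only the evidence field may contain extra "|" characters.
parse_marker() {
  marker_line=$1
  marker_status=${marker_line%%|*}   # text before the first "|"
  marker_rest=${marker_line#*|}      # text after the first "|"
  marker_step=${marker_rest%%|*}     # text before the second "|"
  marker_evidence=${marker_rest#*|}  # everything after the second "|"
  printf 'status=%s\nstep=%s\nevidence=%s\n' \
    "$marker_status" "$marker_step" "$marker_evidence"
}

# Example call using a marker line from the skill's documented examples.
parse_marker 'STEP_PASS|cta-text|button "Start Free Trial" found at @0-8'
```

Keeping the evidence field last means free-form text, including further pipes, does not disturb the status and step fields.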
Adoption & trust: 2k installs on skills.sh; 3.5k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You changed a button or hero section and have no fast, documented proof the UI still works under normal and sloppy clicks.
Who is it for?
Solo builders iterating on localhost UI who want agent-guided browse tests tied to the latest git diff.
Skip if: your changes are backend-only, you need load testing at scale, or your team requires checked-in Playwright/Cypress suites as the sole source of truth.
When should I use this skill?
Use it when a user asks to test a UI change, verify a component after a diff, or validate clicks and stability in the browser.
What do I get? / Deliverables
You come away with a Markdown-friendly test result block containing per-step STEP_PASS lines and a passed/total summary that you or the agent can attach to a PR.
- UI Test Results section with STEP_PASS|step|evidence lines
- Summary line with passed/total counts
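The passed/total summary can be derived mechanically from the marker lines. Below is a minimal sketch under stated assumptions: results arrive on standard input in the format the skill's examples use, summarize_steps is a hypothetical name, and counting any other STEP_* marker as a non-pass is an assumption, because the documented examples only show STEP_PASS.

```bash
# summarize_steps is a hypothetical helper, not part of the ui-test skill.
# It reads result lines on stdin and prints a passed/total summary line.
summarize_steps() {
  passed=0
  total=0
  # The "|| [ -n ... ]" clause keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # Drop the "### " prefix used in the example result block, if present.
    line=${line#"### "}
    case $line in
      STEP_PASS\|*)
        passed=$((passed + 1))
        total=$((total + 1))
        ;;
      STEP_*\|*)
        # Assumption: any other STEP_* marker counts as a step that did not pass.
        total=$((total + 1))
        ;;
    esac
  done
  printf '**Summary: %d/%d passed**\n' "$passed" "$total"
}

# Example call with two marker lines taken from the documented examples.
printf '%s\n' \
  '### STEP_PASS|cta-text|button "Start Free Trial" found at @0-8' \
  '### STEP_PASS|cta-click|clicked @0-8, page intact' | summarize_steps
```

Lines that are not markers, such as the results heading, are ignored, so the whole result block can be piped in unchanged.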
Journey fit
Automated UI verification with explicit evidence belongs in Ship under testing, right before release or PR sign-off. Testing is the canonical home for snapshot-compare and click flows against localhost or staged URLs.
How it compares
It is an agent-driven browse assertion workflow, not a replacement for a full CI browser farm or a visual baseline service such as Percy.
Common Questions / FAQ
Who is ui-test for?
Solo and indie frontend builders who use browser automation from the agent and want structured UI regression evidence after small component changes.
When should I use ui-test?
Use it in Ship when a git diff touches UI, before merging a CTA or layout change, or when you need adversarial click checks on localhost after a dev-server refresh.
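One way to decide whether a diff touches UI is to filter the changed paths by file extension. This is a rough sketch rather than the skill's own logic: filter_ui_files is a hypothetical name, and the extension list is an assumption that should be adjusted to the project.

```bash
# filter_ui_files is a hypothetical helper, not part of the ui-test skill.
# It reads file paths on stdin and keeps only those that look like frontend files.
# Assumption: the extension list below is illustrative, not exhaustive.
filter_ui_files() {
  # "|| true" keeps the exit status zero when no path matches.
  grep -E '\.(tsx|jsx|vue|svelte|css|scss|html)$' || true
}

# Example call with fixed input; in a repository the paths would come from:
#   git diff --name-only HEAD~1 | filter_ui_files
printf '%s\n' 'src/components/HeroSection.tsx' 'server/api.go' | filter_ui_files
```

An empty result suggests the change is backend-only and a browser run can be skipped.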
Is ui-test safe to install?
It drives a browser against URLs you specify (often localhost); review the Security Audits panel on this Prism page and avoid aiming it at production admin surfaces without scoping.
SKILL.md
# UI Test Examples

Each example demonstrates the full assertion protocol: before/after comparison, structured markers, and adversarial testing.

## Example 1: Diff-Driven Component Test (Happy + Adversarial)

**User request**: "I updated the CTA button text. Test it."

```bash
# Analyze diff
git diff --name-only HEAD~1
# Output: src/components/HeroSection.tsx

git diff HEAD~1 -- src/components/HeroSection.tsx
# Shows: "Get Started" changed to "Start Free Trial"

# Setup
browse open http://localhost:3000/ --local
browse wait load

# BEFORE snapshot
browse snapshot
# Tree: @0-8 button "Start Free Trial"
# Evidence: button exists with new text

# Happy path: button is clickable
browse click @0-8
browse snapshot
# AFTER: check no crash, page still functional
# STEP_PASS|cta-text|button "Start Free Trial" found at @0-8
# STEP_PASS|cta-click|button click succeeded, page intact after click

# Adversarial: rapid click
browse open http://localhost:3000/
browse wait load
browse snapshot
browse click @0-8
browse click @0-8
browse click @0-8
browse snapshot
# Check: no duplicate dialogs, no console errors, page still stable
# STEP_PASS|cta-rapid-click|3 rapid clicks, page remains stable, no duplicate side effects

browse stop
```

**Result**:
```
## UI Test Results

### STEP_PASS|cta-text|button "Start Free Trial" found at @0-8
### STEP_PASS|cta-click|clicked @0-8, page intact
### STEP_PASS|cta-rapid-click|3 rapid clicks, no duplicate effects

**Summary: 3/3 passed**
```

## Example 2: Form Validation (Happy Path, Errors, and Adversarial)

**User request**: "I added email validation to the signup form. Test it thoroughly."

```bash
browse open http://localhost:3000/signup --local
browse wait load

# ---- Test 1: Invalid email → error ----
# BEFORE
browse snapshot
# @0-3 textbox "Email", @0-7 button "Sign Up"

# ACT
browse fill "input[name=email]" "not-an-email"
browse click @0-7

# AFTER
browse snapshot
# @0-9 alert "Please enter a valid email"
# STEP_PASS|invalid-email|alert "Please enter a valid email" appeared at @0-9

# ---- Test 2: Valid email → success ----
browse open http://localhost:3000/signup
browse wait load

# BEFORE
browse snapshot

# ACT
browse fill "input[name=email]" "user@example.com"
browse click @0-7
browse wait load

# AFTER
browse snapshot
# heading "Welcome! Check your email." appeared, form gone
# STEP_PASS|valid-email|heading "Welcome!" appeared, form removed from tree

# ---- Test 3: Empty submission ----
browse open http://localhost:3000/signup
browse wait load

# BEFORE
browse snapshot

# ACT: submit with nothing filled
browse click @0-7

# AFTER
browse snapshot
# Check: error message? Or silent failure? Or crash?
# STEP_PASS|empty-submit|alert "Please enter a valid email" appeared, form handles empty input

# ---- Test 4: XSS in email field ----
browse open http://localhost:3000/signup
browse wait load
browse fill "input[name=email]" "<script>alert('xss')</script>"
browse click @0-7
browse snapshot
# Note: the XSS payload WILL appear in the snapshot as StaticText inside the input
# field. That's just the input value, not rendered HTML. The real checks are:
# 1. Is there a validation error? (email format rejected)
# 2. Is the payload rendered as HTML outside the input? (check for script execution)
browse eval "document.querySelector('[role=alert]')?.textContent || 'no alert'"
# Result: "Please enter a valid email"
browse eval "document.querySelector('input[name=email]')?.value"
# Result: "<script>alert('xss')</script>" (payload stays as text in the input, not rendered as HTML)
# STEP_PASS|xss-email|XSS payload rejected by validation, no inline script injection detected

# ---- Test 5: Very long email ----
browse open http://localhost:3000/signup
browse wait load
browse fill "input[name=email]" "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@test.com"
browse snapshot
# Check: does the input overflow its container? Is layout broken?
browse screenshot --path /tmp/long-email.png  # Visual check