E2e Tests Studio

Name: E2e Tests Studio
Author: mastra-ai

mastra-ai/mastra

Generate Playwright E2E tests for Mastra Studio playground UI changes that prove product behavior, not superficial UI clicks.

Overview

E2E Tests Studio is an agent skill, used most often in the Ship phase and also in Build, that generates Playwright E2E tests validating Mastra Studio product behavior rather than cosmetic UI states.

Install

npx skills add https://github.com/mastra-ai/mastra --skill e2e-tests-studio

What is this skill?

REQUIRED trigger when creating or changing files in packages/playground-ui or packages/playground
Core rule: test product behavior (persistence, streaming, tool output) instead of dropdown/modal UI states
Step 1 questionnaire: feature intent, user problem, and success criteria before writing tests
Playwright MCP prerequisite via claude mcp add playwright -- npx @playwright/mcp@latest
Optimized for Claude Opus 4.5 per skill metadata for studio UI work
Explicit do-not-test vs must-test behavior lists for Studio E2E

Compatible agents: Claude Code, Cursor, Codex

Adoption & trust: 946 installs on skills.sh; 24.9k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You changed Mastra playground UI but only have brittle tests that prove modals open—not that agents, tools, chat, or workflows actually work.

Who is it for?

Mastra contributors or indie builders forking playground-ui who need mandatory behavior E2E coverage on every studio feature or fix.

Skip if: Backend-only API changes with no playground impact, or teams unwilling to install Playwright MCP for agent-driven test authoring.

When should I use this skill?

REQUIRED when modifying any file in packages/playground-ui or packages/playground—React component work, UI changes, new playground features, or studio bug fixes.

What do I get? / Deliverables

You ship Playwright E2E specs that assert real Studio behaviors (persistence, streaming, ordered workflow execution) before merging playground changes.

Playwright E2E test files asserting Studio product behavior
Feature-intent notes answered before test authoring

Recommended Skills

Agent Browser (vercel-labs/open-agents)

agent-browser is a Vercel Open Agents skill that wraps a CLI for programmatic browser control, ideal when solo builders n…
404k installs · 5.6k stars

Tdd (mattpocock/skills)

TDD is an agent skill that coaches test-driven development using the red-green-refactor loop for solo and indie builders…
214k installs · 121k stars

Use My Browser (xixu-me/skills)

Use My Browser skill forces agents to classify tasks as static-capable or browser-required before choosing tools, staying…
198k installs · 61 stars

Test Driven Development (obra/superpowers)

Test-Driven Development is an agent skill from obra/superpowers that forces a test-first implementation ritual: write a …
118k installs · 221k stars

Verification Before Completion (obra/superpowers)

Verification Before Completion is an agent skill from the Superpowers lineage that blocks premature success claims durin…
100k installs · 221k stars

Webapp Testing (anthropics/skills)

webapp-testing is an agent skill for solo builders who need to prove that a local web application actually works, not jus…
90.9k installs · 148k stars

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

End-to-end behavior validation is canonically a Ship/testing concern before releases, even though the code lives in Build packages. The Testing subphase is where product-behavior E2E suites belong, since they verify that agents, tools, chat streaming, and workflows actually work.

Also useful

Build: UI/UX & frontend

Where it fits

Example use

Build: UI/UX & frontend

After refactoring the agents list UI, author an E2E test proving that a newly created agent persists and appears in the list.

Example use

Ship: Testing & QA

Before release, run behavior tests confirming that tool invocations with parameters return the expected outputs in Studio.

Example use

Ship: Testing & QA

Gate a bug fix by proving chat messages stream and retain conversation context end-to-end.

Example use

Build: Agent skills & templates

When adding workflow editor controls, test that execution triggers tools in the documented order.

How it compares

Behavior-first Playwright playbook for Mastra Studio, not generic snapshot or component-only unit testing.

Common Questions / FAQ

Who is e2e-tests-studio for?

Developers modifying Mastra packages/playground or packages/playground-ui who want agent-assisted Playwright tests tied to product outcomes.

When should I use e2e-tests-studio?

In Build/frontend when adding studio components, and in Ship/testing before release whenever UI changes affect agents, tools, chat streaming, or workflow execution.

Is e2e-tests-studio safe to install?

It instructs adding Playwright MCP and running browsers locally—review the Security Audits panel on this page and scope MCP permissions to your dev environment.

SKILL.md


# E2E Behavior Validation for Frontend Modifications

## Core Principle: Test Product Behavior, Not UI States

**CRITICAL**: Tests must verify that product features WORK correctly, not just that UI elements render.

### What NOT to test (UI States):

- ❌ "Dropdown opens when clicked"
- ❌ "Modal appears after button click"
- ❌ "Loading spinner shows during request"
- ❌ "Form fields are visible"
- ❌ "Sidebar collapses"

### What TO test (Product Behavior):

- ✅ "Selecting an LLM provider configures the agent to use that provider"
- ✅ "Creating a new agent persists it and shows in the agents list"
- ✅ "Running a tool with parameters returns the expected output"
- ✅ "Chat messages stream correctly and maintain conversation context"
- ✅ "Workflow execution triggers tools in the correct order"
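
As one illustration of the behavior-first rule, the sketch below checks the "persists it and shows in the agents list" behavior by reloading and then waiting for the item to reappear. `PageLike`, `survivesReload`, and `fakePage` are hypothetical names introduced here, not part of the skill. `PageLike` is assumed to be a small structural subset of Playwright's `Page` (`reload()` and `getByText(...).waitFor()`), so the helper can be tried without a browser.

```typescript
// Minimal structural subset of Playwright's Page used by this sketch.
// (Hypothetical interface; a real Playwright `Page` is expected to satisfy it.)
interface PageLike {
  reload(): Promise<unknown>;
  getByText(text: string): { waitFor(): Promise<void> };
}

// Returns true when `itemName` is still rendered after a full page reload,
// i.e. the item was persisted rather than held only in client-side state.
async function survivesReload(page: PageLike, itemName: string): Promise<boolean> {
  await page.reload();
  try {
    // With Playwright, locator.waitFor() rejects if the element never appears.
    await page.getByText(itemName).waitFor();
    return true;
  } catch {
    return false;
  }
}

// In-memory stand-in for a browser page, for trying the helper without Playwright.
// `persisted` models server-side storage; `rendered` models what the UI shows.
function fakePage(persisted: string[], rendered: string[]): PageLike {
  return {
    async reload() {
      // After a reload, only persisted items are rendered again.
      rendered.splice(0, rendered.length, ...persisted);
    },
    getByText(text: string) {
      return {
        async waitFor() {
          if (!rendered.includes(text)) throw new Error(`"${text}" not found`);
        },
      };
    },
  };
}
```

In a spec, this might be used as `expect(await survivesReload(page, agentName)).toBe(true)` after creating the agent through the UI. A test that only asserts the creation modal opened would pass even if nothing was saved.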

## Prerequisites

Requires the Playwright MCP server. If the `browser_navigate` tool is unavailable, instruct the user to add the server:

```sh
claude mcp add playwright -- npx @playwright/mcp@latest
```

## Step 1: Understand the Feature Intent

Before writing ANY test, answer these questions:

1. **What user problem does this feature solve?**
2. **What is the expected outcome when the feature works correctly?**
3. **What data flows through the system?** (user input → API → state → UI)
4. **What should persist after page reload?**
5. **What downstream effects should this action have?**

Document these answers as comments in your test file.
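
One way to keep those answers next to the tests is to render them as a comment header. The sketch below is illustrative only: `FeatureIntent`, its field names, and `renderIntentHeader` are hypothetical and simply mirror the five questions above.

```typescript
// Answers to the Step 1 questions (field names are illustrative, not from the skill).
interface FeatureIntent {
  feature: string;
  userProblem: string;
  expectedOutcome: string;
  dataFlow: string;
  persistsAfterReload: string;
  downstreamEffects: string;
}

// Renders the answers as a block comment to paste at the top of a spec file.
function renderIntentHeader(intent: FeatureIntent): string {
  const lines = [
    `FEATURE: ${intent.feature}`,
    `USER PROBLEM: ${intent.userProblem}`,
    `EXPECTED OUTCOME: ${intent.expectedOutcome}`,
    `DATA FLOW: ${intent.dataFlow}`,
    `PERSISTS AFTER RELOAD: ${intent.persistsAfterReload}`,
    `DOWNSTREAM EFFECTS: ${intent.downstreamEffects}`,
  ];
  return ['/**', ...lines.map((line) => ` * ${line}`), ' */'].join('\n');
}

// Example answers for a made-up feature.
const header = renderIntentHeader({
  feature: 'Agent creation',
  userProblem: 'Users need to define a new agent from Studio',
  expectedOutcome: 'The agent is saved and listed',
  dataFlow: 'form input -> API -> state -> agents list',
  persistsAfterReload: 'The new agent entry',
  downstreamEffects: 'The agent can be selected in chat',
});
console.log(header);
```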

## Step 2: Build and Start

```sh
pnpm build:cli
cd packages/playground/e2e/kitchen-sink && pnpm dev
```

Verify that the server is running at http://localhost:4111.

## Step 3: Map Feature to Behavior Tests

### Feature-to-Test Mapping Guide

| Feature Category           | What to Test                                      | Example Assertion                                            |
| -------------------------- | ------------------------------------------------- | ------------------------------------------------------------ |
| **Agent Configuration**    | Config changes affect agent behavior              | Send message → verify response uses selected model           |
| **LLM Provider Selection** | Selected provider is used in requests             | Intercept API call → verify provider in request payload      |
| **Tool Execution**         | Tool runs with correct params & returns result    | Execute tool → verify output matches expected transformation |
| **Workflow Execution**     | Steps execute in order, data flows between steps  | Run workflow → verify each step's output feeds next step     |
| **Chat/Streaming**         | Messages persist, context maintained across turns | Multi-turn conversation → verify context awareness           |
| **MCP Server Tools**       | Server tools are callable and return data         | Call MCP tool → verify response structure and content        |
| **Memory/Persistence**     | Data survives page reload                         | Create item → reload → verify item exists                    |
| **Error Handling**         | Errors surface correctly to user                  | Trigger error condition → verify error message + recovery    |
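
Two rows of the table lend themselves to small pure helpers that a spec could call from its assertions. The sketch below is a rough illustration: `providerFromPayload` and `ranInOrder` are hypothetical names, and the assumption that the request body carries a top-level `provider` string is not confirmed by the source. Check the real payload shape before relying on it.

```typescript
// Reads the provider from an intercepted request body.
// ASSUMPTION: the payload has a top-level string `provider` field; the real shape may differ.
function providerFromPayload(payload: unknown): string | undefined {
  if (typeof payload !== 'object' || payload === null) return undefined;
  const value = (payload as Record<string, unknown>)['provider'];
  return typeof value === 'string' ? value : undefined;
}

// True when every expected step appears in `observed` in the given relative order.
// Other steps may be interleaved between the expected ones.
function ranInOrder(observed: string[], expected: string[]): boolean {
  let next = 0; // index of the next expected step still to be seen
  for (const step of observed) {
    if (next < expected.length && step === expected[next]) next++;
  }
  return next === expected.length;
}
```

In a Playwright spec, the payload could come from `page.waitForRequest(...)` followed by `request.postDataJSON()`, and the observed step names from whatever the workflow run surfaces in the UI or API response. Keeping the comparison logic in pure functions makes it easy to check separately from the browser.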

## Step 4: Write Behavior-Focused Tests

### Test Structure Template

```ts
import { test, expect, Page } from '@playwright/test';
import { resetStorage } from '../__utils__/reset-storage';
import { selectFixture } from '../__utils__/select-fixture';
import { nanoid } from 'nanoid';

/**
 * FEATURE: [Name of feature]
 * USER STORY: As a user, I want to [action] so that [outcome]
 * BEHAVIOR UNDER TEST: [Specific behavior being validated]
 */

test.describe('[Feature Name] - Behavior Tests', () => {
  // … (template body truncated)
});
```

