
Apify Generate Output Schema
Generate Apify Actor dataset, output, and key-value store schema JSON from real source code so Console displays run results correctly.
Overview
Apify Generate Output Schema is an agent skill for the Build phase that generates and updates Apify Actor output schema files by analyzing source code.
Install
npx skills add https://github.com/apify/agent-skills --skill apify-generate-output-schema
What is this skill?
- Produces dataset_schema.json, output_schema.json, and key_value_store_schema.json when applicable
- Code-first analysis: infer fields from what the Actor actually pushes, never guess
- Sets nullable: true on every field, because web and API outputs are unpredictable
- Cross-checks TypeScript types against runtime push code and reuses repo schema patterns
- Updates actor.json to register schemas for Apify Console
- Phase 1 workflow step: discover Actor structure before generating schemas
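The nullable rule above can be illustrated with a minimal Python sketch. It is not part of the skill itself: the helper name `mark_all_nullable` and the sample fields are hypothetical, and recursing into nested objects is an assumption about how "every field" might be interpreted.

```python
import copy
import json


def mark_all_nullable(properties: dict) -> dict:
    """Return a copy of a schema `properties` mapping with nullable set on every field."""
    result = copy.deepcopy(properties)
    for field_schema in result.values():
        field_schema["nullable"] = True
        # Assumption: nested object fields should be nullable as well.
        nested = field_schema.get("properties")
        if isinstance(nested, dict):
            field_schema["properties"] = mark_all_nullable(nested)
    return result


# Hypothetical, anonymized fields for illustration only;
# real fields must come from what the Actor's code actually pushes.
draft_fields = {
    "title": {"type": "string", "example": "Example product"},
    "price": {"type": "number", "example": 19.99},
    "seller": {
        "type": "object",
        "properties": {"name": {"type": "string", "example": "Example Store"}},
    },
}

nullable_fields = mark_all_nullable(draft_fields)
print(json.dumps(nullable_fields, indent=2))
```

The helper works on a copy, so a draft field list can be compared against the final schema before anything is written to disk.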
Adoption & trust: 5.2k installs on skills.sh; 2.1k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your Apify Actor runs but Console cannot display results well because dataset and output schemas are missing, stale, or disconnected from what the code actually stores.
Who is it for?
Solo builders shipping or maintaining Apify Actors who want schemas derived from code rather than hand-waved field lists.
Skip if: your project has no Apify Actor, or you only need generic JSON Schema for non-Apify APIs.
When should I use this skill?
Creating or updating Actor output schemas for Apify Console display.
What do I get? / Deliverables
Schema JSON files and an updated actor.json that reflect the fields the Actor actually pushes, with every field marked nullable and all examples anonymized, so Apify Console can present runs accurately.
- dataset_schema.json and output_schema.json aligned to code
- key_value_store_schema.json when applicable
- Updated actor.json registering output schemas
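As a rough sketch of the last deliverable, the Python below shows one way schema files could be registered in an actor.json document. The `storages.dataset` and `storages.keyValueStore` keys are named in the skill text; the relative file paths, the top-level `output` key, and the helper name `register_schemas` are assumptions made for illustration and should be checked against the Apify documentation.

```python
import copy
import json


def register_schemas(actor_config: dict, uses_key_value_store: bool) -> dict:
    """Return a copy of an actor.json mapping that points at separate schema files."""
    updated = copy.deepcopy(actor_config)
    storages = updated.setdefault("storages", {})
    # Assumption: schemas are referenced by relative path from the .actor/ directory.
    storages["dataset"] = "./dataset_schema.json"
    if uses_key_value_store:
        storages["keyValueStore"] = "./key_value_store_schema.json"
    # Assumption: the output schema is registered under a top-level "output" key.
    updated["output"] = "./output_schema.json"
    return updated


# Hypothetical minimal actor.json content.
actor_config = {"actorSpecification": 1, "name": "example-actor", "version": "0.1"}

updated_config = register_schemas(actor_config, uses_key_value_store=False)
print(json.dumps(updated_config, indent=2))
```

The key-value store entry is only added when the Actor actually writes to a key-value store, matching the "when applicable" wording above.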
Journey fit
Output schemas are created while building or updating Apify Actors, which is integration work on the scraping/automation product itself. The skill wires Apify Console presentation to code-derived dataset and store shapes—classic third-party platform integration during Build.
How it compares
Use as an Apify-specific schema generator tied to actor.json, not as a general-purpose OpenAPI or Prisma schema skill.
Common Questions / FAQ
Who is apify-generate-output-schema for?
Developers building Apify Actors who need Console-ready output schemas synchronized with their scraping or automation code.
When should I use apify-generate-output-schema?
When creating a new Actor’s output schemas, or when updating them after you change dataset or key-value store writes during Build-phase integration work.
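To see which writes changed, the call names that the skill's Phase 1 searches for can be scanned with a few lines of Python. This is only a sketch: the function name `find_write_calls` and the sample source are hypothetical, and plain substring matching will miss aliased or wrapped calls.

```python
# Call names taken from the skill's Phase 1 search list.
DATASET_CALLS = (
    "Actor.pushData(", "dataset.pushData(", "Dataset.pushData(",
    "Actor.push_data(", "dataset.push_data(", "Dataset.push_data(",
)
KEY_VALUE_CALLS = (
    "Actor.setValue(", "keyValueStore.setValue(", "KeyValueStore.setValue(",
    "Actor.set_value(", "key_value_store.set_value(", "KeyValueStore.set_value(",
)


def find_write_calls(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, storage kind, matched call) for each storage write found."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for kind, calls in (("dataset", DATASET_CALLS), ("key-value store", KEY_VALUE_CALLS)):
            for call in calls:
                if call in line:
                    hits.append((line_number, kind, call))
    return hits


# Hypothetical Actor source used only to exercise the scanner.
sample_source = """from apify import Actor

async def main():
    async with Actor:
        await Actor.push_data({"title": "Example"})
        await Actor.set_value("SUMMARY", {"count": 1})
"""

hits = find_write_calls(sample_source)
for line_number, kind, call in hits:
    print(f"line {line_number}: {kind} write via {call}")
```

Each hit is a starting point for reading the surrounding code, since the pushed object's shape still has to be inferred from the source rather than from the call name.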
Is apify-generate-output-schema safe to install?
The skill reads Actor source locally; review Security Audits on this Prism page and avoid putting real PII in schema examples per the skill’s anonymization rules.
SKILL.md - Apify Generate Output Schema
# Generate Actor output schema

You are generating output schema files for an Apify Actor. The output schema tells Apify Console how to display run results. You will analyze the Actor's source code, create `dataset_schema.json`, `output_schema.json`, and `key_value_store_schema.json` (if the Actor uses key-value store), and update `actor.json`.

## Core principles

- **Analyze code first**: Read the Actor's source to understand what data it actually pushes to the dataset; never guess
- **Every field is nullable**: APIs and websites are unpredictable, so always set `"nullable": true`
- **Anonymize examples**: Never use real user IDs, usernames, or personal data in examples
- **Verify against code**: If TypeScript types exist, cross-check the schema against both the type definition AND the code that produces the values
- **Reuse existing patterns**: Before generating schemas, check if other Actors in the same repository already have output schemas, and match their structure, naming conventions, description style, and formatting
- **Don't reinvent the wheel**: Reuse existing type definitions, interfaces, and utilities from the codebase instead of creating duplicate definitions

---

## Phase 1: Discover Actor structure

**Goal**: Locate the Actor and understand its output

Initial request: $ARGUMENTS

**Actions**:

1. Create todo list with all phases
2. Find the `.actor/` directory containing `actor.json`
3. Read `actor.json` to understand the Actor's configuration
4. Check if `dataset_schema.json`, `output_schema.json`, and `key_value_store_schema.json` already exist
5. **Search for existing schemas in the repository**: Look for other `.actor/` directories or schema files (e.g., `**/dataset_schema.json`, `**/output_schema.json`, `**/key_value_store_schema.json`) to learn the repo's conventions, and match their description style, field naming, example formatting, and overall structure
6. Find all places where data is pushed to the dataset:
   - **JavaScript/TypeScript**: Search for `Actor.pushData(`, `dataset.pushData(`, `Dataset.pushData(`
   - **Python**: Search for `Actor.push_data(`, `dataset.push_data(`, `Dataset.push_data(`
7. Find all places where data is stored in the key-value store:
   - **JavaScript/TypeScript**: Search for `Actor.setValue(`, `keyValueStore.setValue(`, `KeyValueStore.setValue(`
   - **Python**: Search for `Actor.set_value(`, `key_value_store.set_value(`, `KeyValueStore.set_value(`
8. Find output type definitions and **reuse them directly** instead of recreating from scratch:
   - **TypeScript**: Look for output type interfaces/types (e.g., in `src/types/`, `src/types/output.ts`). If an interface or type already defines the output shape, derive the schema fields from it; do not create a parallel definition
   - **Python**: Look for TypedDict, dataclass, or Pydantic model definitions. Use the existing field names, types, and docstrings as the source of truth
9. Check for existing shared schema utilities or helper functions in the codebase that handle schema generation or validation, and reuse them rather than creating new logic
10. If inline `storages.dataset` or `storages.keyValueStore` config exists in `actor.json`, note it for migration

Present findings to user: list all discovered dataset output fields, key-value store keys, their types, and where they come from.

---

## Phase 2: Generate `dataset_schema.json`

**Goal**: Create a complete dataset schema with field definitions and display views

### File structure

```json
{
    "actorSpecification": 1,
    "fields": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            // ALL output fields here: every field the Actor can produce,