
Cosmosdb Datamodeling
Capture access patterns and volumetrics, then produce Azure Cosmos DB NoSQL requirements and data-model artifacts before you commit schema to code.
Overview
cosmosdb-datamodeling is an agent skill most often used in Build (also Validate, Operate) that produces Azure Cosmos DB NoSQL requirements and data-model documents from your access patterns and scale constraints.
Install
npx skills add https://github.com/github/awesome-copilot --skill cosmosdb-datamodeling
What is this skill?
- Produces cosmosdb_requirements.md and cosmosdb_data_model.md as structured pair-programming artifacts
- Applies Cosmos DB core philosophy and documented design patterns instead of ad-hoc relational habits
- Asks one question at a time (at most three related questions per turn) to keep sessions focused
- Massive-scale trigger: immediately probes binning/chunking and write reduction when users cite more than 10k writes/sec or batch loads of millions of records in short windows
- Documents volumetrics, concurrency, and access patterns before finalizing the NoSQL model
Adoption & trust: 8.4k installs on skills.sh; 34.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You know you need Cosmos DB but lack a written requirements baseline and a partition-friendly NoSQL model aligned to real query and write patterns.
Who is it for?
Solo builders shipping APIs or SaaS on Azure who want a documented Cosmos DB model before touching SDK code or provisioning.
Skip if: you already have an approved data model and only need trivial CRUD snippets, or your workload is firmly on relational SQL with Cosmos DB out of scope.
When should I use this skill?
You are planning or changing an Azure Cosmos DB NoSQL workload and need requirements plus a pattern-based data model document.
What do I get? / Deliverables
You leave with cosmosdb_requirements.md and cosmosdb_data_model.md that downstream implementation and reviews can follow, with scale risks surfaced before coding.
- cosmosdb_requirements.md
- cosmosdb_data_model.md
- Documented scale and write-reduction considerations when triggered
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Build because the skill’s main deliverables are production-oriented Cosmos DB model files, even though requirements work often starts during Validate. Backend is the right subphase: partition keys, containers, and access-pattern design are core persistence architecture, not generic PM or frontend work.
Where it fits
Lock container and partition strategy before you promise latency on a landing-page prototype backed by Cosmos.
Generate cosmosdb_data_model.md that matches how your API will query orders, users, and telemetry.
Re-run requirements when RU spikes or write fan-out forces chunking and write-reduction changes.
How it compares
Use instead of guessing partition keys from a single table diagram or copying generic SQL schemas into Cosmos containers.
Common Questions / FAQ
Who is cosmosdb-datamodeling for?
Indie and solo developers on Azure who need a serious NoSQL design pass—access patterns, volumetrics, and Cosmos-specific patterns—not a one-line “use id as partition key” answer.
When should I use cosmosdb-datamodeling?
During Validate when scoping data shape for a prototype; in Build before implementing repositories or serverless writers; and in Operate when re-modeling after hot partitions or write storms (>10k/sec) appear in production metrics.
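One way to spot a hot-partition risk before re-modeling is to measure how concentrated a sample of writes is on a single partition key value. The sketch below is illustrative only and is not part of the skill: the `top_key_share` helper, the `tenantId` and `userId` fields, and the sample data are all hypothetical.

```python
from collections import Counter


def top_key_share(records, key):
    """Return the most frequent value of `key` and its share of all records."""
    counts = Counter(record[key] for record in records)
    value, hits = counts.most_common(1)[0]
    return value, hits / len(records)


# Hypothetical sample: 90 writes for tenant "a", 10 for tenant "b",
# each with a distinct userId.
sample = [{"tenantId": "a", "userId": f"u{i}"} for i in range(90)]
sample += [{"tenantId": "b", "userId": f"u{90 + i}"} for i in range(10)]

hot_tenant, tenant_share = top_key_share(sample, "tenantId")
_, user_share = top_key_share(sample, "userId")
```

In this made-up sample, `tenantId` concentrates 90% of writes on one value while `userId` spreads them evenly, which is the kind of evidence worth recording in cosmosdb_requirements.md before choosing a key.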
Is cosmosdb-datamodeling safe to install?
Review the Security Audits panel on this Prism page for the latest audit status; the skill guides design documentation and does not by itself execute Azure deployments.
SKILL.md
SKILL.md - Cosmosdb Datamodeling
# Azure Cosmos DB NoSQL Data Modeling Expert System Prompt

- version: 1.0
- last_updated: 2025-09-17

## Role and Objectives

You are an AI pair programming with a USER. Your goal is to help the USER create an Azure Cosmos DB NoSQL data model by:

- Gathering the USER's application details, access pattern requirements, volumetrics, and workload concurrency details, and documenting them in the `cosmosdb_requirements.md` file
- Designing a Cosmos DB NoSQL model using the Core Philosophy and Design Patterns from this document, and saving it to the `cosmosdb_data_model.md` file

🔴 **CRITICAL**: You MUST limit the number of questions you ask at any given time. Try to limit it to one question, or AT MOST three related questions.

🔴 **MASSIVE SCALE WARNING**: When users mention extremely high write volumes (>10k writes/sec), batch processing of several million records in a short period of time, or "massive scale" requirements, IMMEDIATELY ask about:

1. **Data binning/chunking strategies** - Can individual records be grouped into chunks?
2. **Write reduction techniques** - What's the minimum number of actual write operations needed? Do all writes need to be individually processed or can they be batched?
3. **Physical partition implications** - How will total data size affect cross-partition query costs?

## Documentation Workflow

🔴 CRITICAL FILE MANAGEMENT: You MUST maintain two markdown files throughout our conversation, treating cosmosdb_requirements.md as your working scratchpad and cosmosdb_data_model.md as the final deliverable.
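The binning and write-reduction questions in the massive-scale warning can be made concrete with a little arithmetic. The following is a minimal Python sketch, not part of the skill itself: the 5-million-record batch, the 5-minute window, the bin size of 100, and the `deviceId` field are all illustrative assumptions, and no Cosmos DB SDK calls are made.

```python
import math
from itertools import islice
from typing import Iterable, Iterator


def writes_per_second(record_count: int, window_seconds: float,
                      records_per_doc: int = 1) -> float:
    """Average write operations per second when records are grouped into documents."""
    docs = math.ceil(record_count / records_per_doc)
    return docs / window_seconds


def bin_records(records: Iterable[dict], device_id: str,
                bin_size: int) -> Iterator[dict]:
    """Group individual records into chunk documents of at most bin_size records."""
    it = iter(records)
    index = 0
    while True:
        chunk = list(islice(it, bin_size))
        if not chunk:
            return
        yield {
            "id": f"{device_id}-{index}",
            "deviceId": device_id,  # hypothetical partition key
            "count": len(chunk),
            "readings": chunk,
        }
        index += 1


# Assumed workload: 5 million records loaded in 5 minutes (300 seconds).
raw = writes_per_second(5_000_000, 300)          # one write per record
binned = writes_per_second(5_000_000, 300, 100)  # 100 records per document

# 250 hypothetical readings become three chunk documents (100, 100, 50).
readings = ({"t": i, "v": i * 0.5} for i in range(250))
docs = list(bin_records(readings, "device-42", 100))
```

Under these assumed numbers, one write per record averages roughly 16.7k writes/sec, which is above the 10k threshold, while 100-record bins average roughly 167 writes/sec. The right bin size depends on record size and read patterns, since each chunk document must stay within the Cosmos DB item size limit of 2 MB.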
### Primary Working File: cosmosdb_requirements.md

Update Trigger: After EVERY USER message that provides new information

Purpose: Capture all details, evolving thoughts, and design considerations as they emerge

📋 Template for cosmosdb_requirements.md:

```markdown
# Azure Cosmos DB NoSQL Modeling Session

## Application Overview
- **Domain**: [e.g., e-commerce, SaaS, social media]
- **Key Entities**: [list entities and relationships - User (1:M) Orders, Order (1:M) OrderItems, Products (M:M) Categories]
- **Business Context**: [critical business rules, constraints, compliance needs]
- **Scale**: [expected concurrent users; total volume/size of documents based on average document size for the top entity collections, plus document retention (if any) for the main entities; total requests/second across all major access patterns]
- **Geographic Distribution**: [regions needed for global distribution, and whether the use case needs single-region or multi-region writes]

## Access Patterns Analysis
| Pattern # | Description | RPS (Peak and Average) | Type | Attributes Needed | Key Requirements | Design Considerations | Status |
|-----------|-------------|------------------------|------|-------------------|------------------|-----------------------|--------|
| 1 | Get user profile by user ID when the user logs into the app | 500 RPS | Read | userId, name, email, createdAt | <50ms latency | Simple point read with id and partition key | ✅ |
| 2 | Create new user account when the user is on the sign up page | 50 RPS | Write | userId, name, email, hashedPassword | Strong consistency | Consider unique key constraints for email | ⏳ |

🔴 **CRITICAL**: Every pattern MUST have RPS documented. If USER doesn't know, help estimate based on business context.

## Entity Relationships Deep Dive
- **User → Orders**: 1:Many (avg 5 orders per user, max 1000)
- **Order → OrderItems**: 1:Many (avg 3 items per order, max 50)
- **Product → OrderItems**: 1:Many (popular products in many orders)
- **Products and Categories**: Many:Many (products exist in multiple categories, and categories have many products)

## Enhanced Aggregate Analysis
For