
Software Architecture Design
Apply a system scalability and high-availability checklist when designing services, databases, and infra for growth.
Install
npx skills add https://github.com/vasilyu1983/ai-agents-public --skill software-architecture-design
What is this skill?
- Load estimation template: current, peak, growth projection, and target capacity multipliers
- Horizontal vs vertical scaling checklist with Redis sessions, object storage, and auto-scaling
- Database read/write scaling: replicas, read/write split, pooling, and sharding strategy placeholders
- Actionable checkboxes for Layer 7 load balancing and query latency targets
Adoption & trust: 618 installs on skills.sh; 61 GitHub stars; 3/3 security scanners passed (skills.sh audits).
Journey fit
Primary fit
Architecture and scaling decisions come up while you are building backend systems, so the checklist is filed under Build backend as the primary implementation reference. The Backend subphase covers stateless services, read replicas, sharding, and connection pooling, which are the core of this document.
Common Questions / FAQ
Is Software Architecture Design safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md - Software Architecture Design
# System Scalability Checklist

Use this checklist when designing for horizontal scalability and high availability.

## Load Estimation

- [ ] **Current load:** [N] requests/second, [N] concurrent users
- [ ] **Peak load:** [N] requests/second (expected during [event/time])
- [ ] **Growth projection:** [X]% yearly growth
- [ ] **Target capacity:** Support [N]x current load

## Scalability Dimensions

### Horizontal Scaling (Preferred)

- [ ] **Stateless services:** All application servers are stateless
- [ ] **Session storage:** Use Redis/Memcached for distributed sessions
- [ ] **File storage:** Use object storage (S3/GCS) instead of local filesystem
- [ ] **Auto-scaling:** Configure based on CPU/memory/RPS metrics
- [ ] **Load balancer:** Layer 7 (application-aware) load balancing

### Vertical Scaling (Limited)

- [ ] **Database instance:** Right-sized for current + 6 months growth
- [ ] **Cache instance:** Sized for working set + 20% headroom
- [ ] **Max capacity:** Identified vertical scaling limit

## Database Scalability

### Read Scaling

- [ ] **Read replicas:** [N] replicas for read-heavy workloads
- [ ] **Read/write splitting:** Route reads to replicas, writes to primary
- [ ] **Connection pooling:** PgBouncer/ProxySQL to limit connections
- [ ] **Query optimization:** Queries < [10ms] with proper indexing

### Write Scaling

- [ ] **Sharding strategy:** [Hash-based / Range-based / Geographic]
  - Shard key: [user_id / tenant_id / region]
  - Number of shards: [N] (plan for [10x] growth)
- [ ] **Write-ahead logging:** Asynchronous replication for replicas
- [ ] **Bulk operations:** Batch inserts/updates to reduce round trips

### Caching Strategy

- [ ] **Cache layers:**
  - L1: In-memory application cache ([Caffeine / Guava])
  - L2: Distributed cache ([Redis / Memcached])
  - L3: CDN for static assets ([CloudFront / Fastly])
- [ ] **Cache hit ratio:** Target > [90]%
- [ ] **TTL strategy:** Balance freshness vs load ([5min] for hot data, [1h] for warm)
- [ ] **Cache eviction:** LRU/LFU policy configured
- [ ] **Cache warming:** Pre-populate on deployment
- [ ] **Cache invalidation:** Event-driven invalidation for updates

## API Gateway

- [ ] **Rate limiting:**
  - Per-user: [N] req/min
  - Per-IP: [N] req/min
  - Burst allowance: [N] requests
- [ ] **Request throttling:** Queue requests during spikes
- [ ] **Response compression:** Gzip/Brotli enabled
- [ ] **API versioning:** Support [N] concurrent versions

## Asynchronous Processing

- [ ] **Message queue:** [Kafka / RabbitMQ / AWS SQS]
  - Throughput: [N] messages/second
  - Retention: [X] days
- [ ] **Worker pools:** [N] workers per queue
- [ ] **Backpressure:** Reject requests when queue length > [N]
- [ ] **Dead letter queue:** For failed message handling

## Content Delivery

- [ ] **CDN:** CloudFront/Fastly for static assets
- [ ] **Edge caching:** Cache-Control headers configured
- [ ] **Image optimization:** WebP format, lazy loading
- [ ] **Asset bundling:** Minified and bundled CSS/JS

## Data Storage Patterns

### Hot/Warm/Cold Data

- [ ] **Hot data:** Last [7] days in primary DB (fast access)
- [ ] **Warm data:** Last [30] days in read replicas
- [ ] **Cold data:** Older than [30] days archived to S3/Glacier
- [ ] **Archival strategy:** Automated data lifecycle policies

### Data Partitioning

- [ ] **Time-based partitioning:** Partition by month/year for time-series data
- [ ] **Hash partitioning:** Distribute by hash(user_id) for even distribution
- [ ] **List partitioning:** Partition by region/tenant for isolation

## Connection Management

- [ ] **Database connection pool:**
  - Min connections: [N]
  - Max connections: [N]
  - Connection timeout: [Xms]
- [ ] **HTTP keep-alive:** Reuse connections to upstream services
- [ ] **Circuit breaker:** Prevent cascade failures

## Observability for Scalability

### Key Metrics

- [ ] **Golden signals:**
  - Latency: p50, p95, p99 response times
  - Traffic: Requests per second
  - Errors: Error rate
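
The golden-signal figures listed above (latency percentiles, requests per second, error rate) can be derived from raw request samples. The following is a minimal sketch, not a prescribed implementation: the function names `percentile` and `golden_signals`, the choice of the nearest-rank percentile method, and the fixed measurement window are all assumptions made for illustration. A production system would typically rely on its metrics backend for this.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def golden_signals(latencies_ms, error_count, window_seconds):
    """Summarize one measurement window (hypothetical helper).

    latencies_ms   -- one latency sample per request, in milliseconds
    error_count    -- number of those requests that failed
    window_seconds -- length of the measurement window
    """
    total = len(latencies_ms)
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "rps": total / window_seconds,
        "error_rate": error_count / total,
    }


if __name__ == "__main__":
    # Synthetic data: 100 requests with latencies of 1..100 ms over 10 seconds.
    report = golden_signals(list(range(1, 101)), error_count=2, window_seconds=10)
    for name, value in report.items():
        print(f"{name}: {value}")
```

Nearest-rank is used here because it always returns an observed sample and needs no interpolation. Monitoring systems often estimate percentiles from histograms instead, so their values can differ slightly.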