
M13 Domain Error
Design Rust domain error types with clear audiences, recovery paths, and operational behavior before you implement handlers.
Overview
m13-domain-error is an agent skill most often used in Ship (also Build, Operate) that guides solo builders through Rust domain error categorization, recovery strategies, and resilience patterns before implementation hard
Install
npx skills add https://github.com/actionbook/rust-skills --skill m13-domain-error
What is this skill?
- Five-way error categorization table: user-facing, internal, system, transient, and permanent
- Recovery playbook: retry with backoff, fallbacks, circuit breaker, and graceful degradation
- Thinking prompts for audience (user, developer, ops) before designing each error variant
- Guidance on error codes, context for debugging, and transient vs permanent handling
- Layer 2 design-choice framing: who handles the error and how they recover
- 5 error categorization types in the core table
- Layer 2 design-choices framing in SKILL.md
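The five-way categorization can be made concrete as a small lookup type. The following is a minimal sketch, not code from the skill: the `ErrorCategory` enum and its method names are hypothetical, and the audience and recovery strings are copied from the categorization table in SKILL.md.

```rust
/// Hypothetical enum mirroring the five categories in the skill's core table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCategory {
    UserFacing,
    Internal,
    System,
    Transient,
    Permanent,
}

impl ErrorCategory {
    /// Who is expected to act on an error in this category.
    pub fn audience(self) -> &'static str {
        match self {
            Self::UserFacing => "end users",
            Self::Internal => "developers",
            Self::System => "ops/SRE",
            Self::Transient => "automation",
            Self::Permanent => "human",
        }
    }

    /// The recovery path the table assigns to this category.
    pub fn recovery(self) -> &'static str {
        match self {
            Self::UserFacing => "guide action",
            Self::Internal => "debug info",
            Self::System => "monitor/alert",
            Self::Transient => "retry",
            Self::Permanent => "investigate",
        }
    }

    /// Only transient failures are worth retrying automatically.
    pub fn is_retryable(self) -> bool {
        matches!(self, Self::Transient)
    }
}
```

Calling `ErrorCategory::Transient.recovery()` yields `"retry"`, which gives a single place to ask "who handles this, and how do they recover?" for every variant you design.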
Adoption & trust: 916 installs on skills.sh; 1.2k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are adding Rust error types ad hoc and cannot tell which failures users should see, which deserve retries, or what ops must alert on.
Who is it for?
Solo builders shipping Rust APIs or services who want structured errors before writing `match` arms and HTTP mappings.
Skip if: Teams that only need a single generic `anyhow` wrapper with no user-facing or operational taxonomy.
When should I use this skill?
Use when designing domain error handling—error categorization, recovery strategy, retry, fallback, hierarchy, user-facing vs internal.
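To illustrate how retry and fallback differ by error class, here is a minimal, synchronous sketch using only the standard library. The `Failure` enum and the `retry_with_backoff` and `or_fallback` helpers are hypothetical names for this example; the skill itself points to crates such as tokio-retry for async code.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical failure classes: only `Transient` is worth retrying.
#[derive(Debug, PartialEq)]
pub enum Failure {
    Transient(String),
    Permanent(String),
}

/// Runs `op` at least once. Transient failures are retried with a doubling
/// delay until `max_attempts` is reached; permanent failures return at once.
pub fn retry_with_backoff<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, Failure>,
) -> Result<T, Failure> {
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(Failure::Transient(_)) if attempt < max_attempts => {
                sleep(delay);
                delay = delay.saturating_mul(2); // exponential backoff
                attempt += 1;
            }
            // Permanent failure, or transient with no attempts left.
            Err(err) => return Err(err),
        }
    }
}

/// Fallback for degraded mode: substitute a cached or default value when the
/// failure is transient, but keep permanent failures visible.
pub fn or_fallback<T>(result: Result<T, Failure>, cached: T) -> Result<T, Failure> {
    match result {
        Err(Failure::Transient(_)) => Ok(cached),
        other => other,
    }
}
```

A call such as `or_fallback(retry_with_backoff(3, Duration::from_millis(100), fetch), cached)` (with a hypothetical `fetch` closure) retries first and degrades only after the retries are exhausted. A production version would typically add jitter and a cap on the delay.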
What do I get? / Deliverables
You leave with a domain error strategy—audience, recovery, and context rules—that your agent can implement as a coherent hierarchy in the next coding pass.
- Documented error categories with audience and recovery per type
- Hierarchy or naming rules for user vs internal vs system errors
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Error taxonomy and recovery strategy belong on the ship shelf because they govern how failures surface in review, security posture, and production resilience—not just type names in build. Review is where solo builders decide user-facing vs internal errors, retry/backoff, and degradation before code merges and goes live.
Where it fits
Define `InvalidEmail` vs `DatabaseError` variants before wiring Actix or Axum handlers.
Check that HTTP status codes and messages match the user-facing vs internal split in your error enum.
Map `ConnectionTimeout` and `RateLimited` to retry policies and paging rules for on-call.
Ensure internal parse and DB errors never leak stack details to end users.
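The four checks above can be sketched in one framework-neutral enum. This is an illustrative sketch under stated assumptions, not the skill's code: `AppError`, `PublicResponse`, `to_public`, and `should_page` are hypothetical names, the variant names are borrowed from the examples above, and the status codes and paging choices are one plausible mapping rather than a prescribed one.

```rust
/// Hypothetical domain error enum using the variant names from the examples.
#[derive(Debug)]
pub enum AppError {
    InvalidEmail(String),
    NotFound,
    RateLimited { retry_after_secs: u64 },
    ConnectionTimeout,
    /// Internal detail (driver message, query text); for logs only.
    DatabaseError(String),
}

/// What a handler would return to the client: a status code and a safe message.
#[derive(Debug)]
pub struct PublicResponse {
    pub status: u16,
    pub message: String,
}

impl AppError {
    /// Maps each variant to a user-safe response. Internal details are dropped.
    pub fn to_public(&self) -> PublicResponse {
        let (status, message) = match self {
            Self::InvalidEmail(addr) => (400, format!("'{addr}' is not a valid email address")),
            Self::NotFound => (404, "resource not found".to_string()),
            Self::RateLimited { retry_after_secs } => {
                (429, format!("too many requests, retry in {retry_after_secs}s"))
            }
            Self::ConnectionTimeout => (503, "service temporarily unavailable".to_string()),
            // Never echo the inner string: it may contain hosts, SQL, or stack details.
            Self::DatabaseError(_) => (500, "internal error".to_string()),
        };
        PublicResponse { status, message }
    }

    /// Assumed retry policy: only rate limits and timeouts are retried.
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::RateLimited { .. } | Self::ConnectionTimeout)
    }

    /// Assumed paging rule: wake on-call for infrastructure failures only.
    pub fn should_page(&self) -> bool {
        matches!(self, Self::ConnectionTimeout | Self::DatabaseError(_))
    }
}
```

In an Actix or Axum handler, the `status` and `message` fields would feed the framework's response type, while the full `Debug` output of the error goes to structured logs.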
How it compares
Use for error *design* and recovery policy upfront, not as a linter that scans existing code for missing variants.
Common Questions / FAQ
Who is m13-domain-error for?
Indie and solo Rust builders designing services, CLIs, or backends who need clear boundaries between user, developer, and ops-facing failures.
When should I use m13-domain-error?
During build when modeling domain failures, in ship review before merging error-handling changes, and in operate when aligning retries, fallbacks, and alerts with error classes.
Is m13-domain-error safe to install?
It is documentation-style procedural guidance with no runtime hooks; review the Security Audits panel on this Prism page before adding any third-party skill repo to your agent.
SKILL.md
# Domain Error Strategy

> **Layer 2: Design Choices**

## Core Question

**Who needs to handle this error, and how should they recover?**

Before designing error types:
- Is this user-facing or internal?
- Is recovery possible?
- What context is needed for debugging?

---

## Error Categorization

| Error Type | Audience | Recovery | Example |
|------------|----------|----------|---------|
| User-facing | End users | Guide action | `InvalidEmail`, `NotFound` |
| Internal | Developers | Debug info | `DatabaseError`, `ParseError` |
| System | Ops/SRE | Monitor/alert | `ConnectionTimeout`, `RateLimited` |
| Transient | Automation | Retry | `NetworkError`, `ServiceUnavailable` |
| Permanent | Human | Investigate | `ConfigInvalid`, `DataCorrupted` |

---

## Thinking Prompt

Before designing error types:

1. **Who sees this error?**
   - End user → friendly message, actionable
   - Developer → detailed, debuggable
   - Ops → structured, alertable

2. **Can we recover?**
   - Transient → retry with backoff
   - Degradable → fallback value
   - Permanent → fail fast, alert

3. **What context is needed?**
   - Call chain → anyhow::Context
   - Request ID → structured logging
   - Input data → error payload

---

## Trace Up ↑

To domain constraints (Layer 3):

```
"How should I handle payment failures?"
    ↑ Ask: What are the business rules for retries?
    ↑ Check: domain-fintech (transaction requirements)
    ↑ Check: SLA (availability requirements)
```

| Question | Trace To | Ask |
|----------|----------|-----|
| Retry policy | domain-* | What's acceptable latency for retry? |
| User experience | domain-* | What message should users see? |
| Compliance | domain-* | What must be logged for audit? |

---

## Trace Down ↓

To implementation (Layer 1):

```
"Need typed errors"
    ↓ m06-error-handling: thiserror for library
    ↓ m04-zero-cost: Error enum design

"Need error context"
    ↓ m06-error-handling: anyhow::Context
    ↓ Logging: tracing with fields

"Need retry logic"
    ↓ m07-concurrency: async retry patterns
    ↓ Crates: tokio-retry, backoff
```

---

## Quick Reference

| Recovery Pattern | When | Implementation |
|------------------|------|----------------|
| Retry | Transient failures | exponential backoff |
| Fallback | Degraded mode | cached/default value |
| Circuit Breaker | Cascading failures | failsafe-rs |
| Timeout | Slow operations | `tokio::time::timeout` |
| Bulkhead | Isolation | separate thread pools |

## Error Hierarchy

```rust
#[derive(thiserror::Error, Debug)]
pub enum AppError {
    // User-facing: the message is safe to show as-is
    #[error("Invalid input: {0}")]
    Validation(String),

    // Transient (retryable): the cause is kept as `source`, not in the message
    #[error("Service temporarily unavailable")]
    ServiceUnavailable(#[source] reqwest::Error),

    // Internal (log details, show generic)
    #[error("Internal error")]
    Internal(#[source] anyhow::Error),
}

impl AppError {
    /// Only transient failures should be retried.
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::ServiceUnavailable(_))
    }
}
```

## Retry Pattern

```rust
use std::future::Future;
use std::time::Duration;

use tokio_retry::{strategy::ExponentialBackoff, Retry};

// `impl Trait` is not allowed in the return position of an `Fn` bound,
// so the future type gets its own generic parameter `Fut`.
async fn with_retry<F, Fut, T, E>(f: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    // Delays are capped at 10 s; `take(5)` allows at most 5 retries.
    let strategy = ExponentialBackoff::from_millis(100)
        .max_delay(Duration::from_secs(10))
        .take(5);

    Retry::spawn(strategy, f).await
}
```

---

## Common Mistakes

| Mistake | Why Wrong | Better |
|---------|-----------|--------|
| Same error for all | No actionability | Categorize by audience |
| Retry everythi