
Domain Cloud Native
Guide Rust services toward 12-factor, stateless, observable patterns that survive Kubernetes restarts and horizontal scale.
Overview
domain-cloud-native is an agent skill most often used in Build (also Ship, Operate) that encodes Kubernetes-ready Rust service constraints for stateless design, health checks, tracing, and graceful shutdown.
Install
npx skills add https://github.com/zhanghandong/rust-skills --skill domain-cloud-native
What is this skill?
- Maps domain rules (12-factor, observability, health, graceful shutdown) to explicit Rust design choices
- Enforces stateless pods: external Redis/DB state, no local persistence or static mut
- Graceful shutdown via tokio::signal and connection draining for zero-downtime deploys
- Distributed tracing with tracing spans and OpenTelemetry export on every request
- Container-friendly release binaries and Layer-3 domain constraints that trace to lifecycle patterns
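As a rough illustration of the 12-factor rule in the list above, configuration can be read from environment variables at startup instead of from files baked into the image. The sketch below uses only the standard library; the `Config` struct, its fields, and the variable names `PORT`, `REDIS_URL`, and `SHUTDOWN_GRACE_SECS` are assumptions made for this example, not names defined by the skill.

```rust
use std::time::Duration;

/// Hypothetical settings for a small service; the field names are illustrative.
#[derive(Debug, PartialEq)]
struct Config {
    port: u16,
    redis_url: String,
    shutdown_grace: Duration,
}

impl Config {
    /// Build the config from any key lookup, so tests do not have to
    /// mutate the real process environment.
    fn from_lookup<F>(get: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        // Optional value with a default.
        let port = match get("PORT") {
            Some(raw) => raw.parse::<u16>().map_err(|e| format!("PORT: {e}"))?,
            None => 8080,
        };
        // Required value: state lives in an external store, so its address must be given.
        let redis_url = get("REDIS_URL").ok_or_else(|| "REDIS_URL is required".to_string())?;
        let grace_secs = match get("SHUTDOWN_GRACE_SECS") {
            Some(raw) => raw
                .parse::<u64>()
                .map_err(|e| format!("SHUTDOWN_GRACE_SECS: {e}"))?,
            None => 30,
        };
        Ok(Config {
            port,
            redis_url,
            shutdown_grace: Duration::from_secs(grace_secs),
        })
    }

    /// Read the same keys from the process environment.
    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

In a real service `Config::from_env()` would be called once at startup, and a missing `REDIS_URL` would abort the process before it reports ready. The skill's own crate table lists `config` and `figment` for this job; the closure-based lookup here only keeps the sketch dependency-free and easy to test.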
Adoption & trust: 557 installs on skills.sh; 1.2k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are building Rust microservices for containers but lack a concise map from cloud-native rules to code-level decisions your agent keeps getting wrong.
Who is it for?
Indie backend developers targeting k8s/Docker with tonic or async Rust who want agent output checked against production cluster realities.
Skip if: Pure client-side Rust, embedded targets, or teams that only need a hello-world binary with no orchestration or SRE expectations.
When should I use this skill?
Building cloud-native apps with Kubernetes, Docker, gRPC, microservices, observability, tracing, metrics, or health checks.
What do I get? / Deliverables
Your service design aligns with 12-factor, observable, restart-safe patterns so deploys and on-call debugging match what clusters actually do.
- Architecture-aligned constraint checklist applied to service design
- Observability and shutdown patterns specified for implementation
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Cloud-native constraint knowledge sits on the canonical Build shelf because most teams invoke it while designing backends and deployment-facing services. Backend is the primary shelf for service design, graceful shutdown, health checks, and externalized state before ship and operate hardening.
Where it fits
Sketch a tonic gRPC service with externalized session state in Redis instead of in-memory static mut.
Add liveness/readiness routes and shutdown draining before the first production rollout.
Ensure every inbound request carries trace context exportable via OpenTelemetry when triaging latency spikes.
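For the last scenario, the trace context on an inbound request typically travels in a W3C Trace Context `traceparent` header. In practice the `opentelemetry` crates would handle extraction and propagation; the standard-library sketch below only illustrates what that header carries. The `TraceParent` struct and the `parse_traceparent` function are hypothetical names for this example.

```rust
/// Parsed fields of a version-00 W3C `traceparent` header.
#[derive(Debug, PartialEq)]
struct TraceParent {
    trace_id: String,
    parent_id: String,
    sampled: bool,
}

/// True when `s` is exactly `len` lowercase hexadecimal characters.
fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parse `00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`.
/// Returns `None` for anything that does not match that shape.
fn parse_traceparent(header: &str) -> Option<TraceParent> {
    let parts: Vec<&str> = header.trim().split('-').collect();
    if parts.len() != 4 || parts[0] != "00" {
        return None;
    }
    let (trace_id, parent_id, flags) = (parts[1], parts[2], parts[3]);
    if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
        return None;
    }
    // All-zero identifiers are invalid in the Trace Context format.
    if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
        return None;
    }
    let flags = u8::from_str_radix(flags, 16).ok()?;
    Some(TraceParent {
        trace_id: trace_id.to_string(),
        parent_id: parent_id.to_string(),
        // Bit 0 of the flags byte is the "sampled" flag.
        sampled: flags & 0x01 == 0x01,
    })
}
```

A handler could attach `trace_id` to its request span so that logs from different services line up under one trace when a latency spike is being triaged.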
How it compares
Use it as a domain constraint reference layered on generic Rust skills, not as a deploy pipeline or infrastructure-as-code generator.
Common Questions / FAQ
Who is domain-cloud-native for?
Solo and small-team Rust builders shipping APIs or microservices to Kubernetes or Docker who need agents to respect statelessness, probes, tracing, and SIGTERM handling.
When should I use domain-cloud-native?
During Build when shaping backends and gRPC services; during Ship when adding health checks and shutdown hooks; during Operate when tightening metrics and distributed tracing before incidents.
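One way the Ship-phase health checks and shutdown hooks can fit together is a shared readiness flag: the readiness probe starts failing as soon as shutdown begins, so Kubernetes stops routing new traffic while in-flight requests drain, and liveness keeps passing. This is a minimal standard-library sketch; `Probes` and its methods are hypothetical names, and the numeric status codes stand in for whatever HTTP framework the service uses.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical probe state shared between request handlers and the shutdown path.
#[derive(Clone)]
struct Probes {
    ready: Arc<AtomicBool>,
}

impl Probes {
    /// A new service starts as "not ready" until its dependencies are connected.
    fn new() -> Self {
        Probes {
            ready: Arc::new(AtomicBool::new(false)),
        }
    }

    /// Call once startup work (for example connecting to Redis or the DB) has finished.
    fn mark_ready(&self) {
        self.ready.store(true, Ordering::SeqCst);
    }

    /// Call when the shutdown signal arrives, before draining connections.
    fn begin_shutdown(&self) {
        self.ready.store(false, Ordering::SeqCst);
    }

    /// Liveness: the process is up and able to answer, so this is always 200.
    fn liveness_status(&self) -> u16 {
        200
    }

    /// Readiness: 200 only while the service should receive traffic, otherwise 503.
    fn readiness_status(&self) -> u16 {
        if self.ready.load(Ordering::SeqCst) {
            200
        } else {
            503
        }
    }
}
```

Because the flag lives behind an `Arc`, cloning `Probes` into each handler is cheap and every clone observes the same state.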
Is domain-cloud-native safe to install?
It is procedural guidance without shell or network side effects; review the Security Audits panel on this page before adding any skill pack to your agent workflow.
SKILL.md - Domain Cloud Native
# Cloud-Native Domain

> **Layer 3: Domain Constraints**

## Domain Constraints → Design Implications

| Domain Rule | Design Constraint | Rust Implication |
|-------------|-------------------|------------------|
| 12-Factor | Config from env | Environment-based config |
| Observability | Metrics + traces | tracing + opentelemetry |
| Health checks | Liveness/readiness | Dedicated endpoints |
| Graceful shutdown | Clean termination | Signal handling |
| Horizontal scale | Stateless design | No local state |
| Container-friendly | Small binaries | Release optimization |

---

## Critical Constraints

### Stateless Design

```
RULE: No local persistent state
WHY: Pods can be killed/rescheduled anytime
RUST: External state (Redis, DB), no static mut
```

### Graceful Shutdown

```
RULE: Handle SIGTERM, drain connections
WHY: Zero-downtime deployments
RUST: tokio::signal + graceful shutdown
```

### Observability

```
RULE: Every request must be traceable
WHY: Debugging distributed systems
RUST: tracing spans, opentelemetry export
```

---

## Trace Down ↓

From constraints to design (Layer 2):

```
"Need distributed tracing"
    ↓ m12-lifecycle: Span lifecycle
    ↓ tracing + opentelemetry

"Need graceful shutdown"
    ↓ m07-concurrency: Signal handling
    ↓ m12-lifecycle: Connection draining

"Need health checks"
    ↓ domain-web: HTTP endpoints
    ↓ m06-error-handling: Health status
```

---

## Key Crates

| Purpose | Crate |
|---------|-------|
| gRPC | tonic |
| Kubernetes | kube, kube-runtime |
| Docker | bollard |
| Tracing | tracing, opentelemetry |
| Metrics | prometheus, metrics |
| Config | config, figment |
| Health | HTTP endpoints |

## Design Patterns

| Pattern | Purpose | Implementation |
|---------|---------|----------------|
| gRPC services | Service mesh | tonic + tower |
| K8s operators | Custom resources | kube-runtime Controller |
| Observability | Debugging | tracing + OTEL |
| Health checks | Orchestration | `/health`, `/ready` |
| Config | 12-factor | Env vars + secrets |

## Code Pattern: Graceful Shutdown

```rust
use std::{net::SocketAddr, sync::Arc};

use axum::{routing::get, Router};
use tokio::signal;

// Written against the axum 0.6 `axum::Server` API; axum 0.7 and later replace it with `axum::serve`.
// `DbPool`, `health`, and `ready` come from the health check pattern below.
async fn run_server(db: Arc<DbPool>) -> anyhow::Result<()> {
    let app = Router::new()
        .route("/health", get(health))
        .route("/ready", get(ready))
        // `ready` extracts `State<Arc<DbPool>>`, so the router needs that state.
        .with_state(db);

    let addr = SocketAddr::from(([0, 0, 0, 0], 8080));
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        // Stop accepting new connections and let in-flight requests finish.
        .with_graceful_shutdown(shutdown_signal())
        .await?;
    Ok(())
}

// Resolves on SIGTERM (sent by Kubernetes before it kills a pod) or Ctrl+C (local runs).
// Unix-only, which matches Linux containers.
async fn shutdown_signal() {
    let mut sigterm = signal::unix::signal(signal::unix::SignalKind::terminate())
        .expect("failed to install SIGTERM handler");
    tokio::select! {
        res = signal::ctrl_c() => res.expect("failed to listen for ctrl+c"),
        _ = sigterm.recv() => {},
    }
    tracing::info!("shutdown signal received");
}
```

## Health Check Pattern

```rust
use std::sync::Arc;

use axum::{extract::State, http::StatusCode};

// Liveness: the process is up and able to answer HTTP.
async fn health() -> StatusCode {
    StatusCode::OK
}

// Readiness: only accept traffic while the database is reachable.
// `DbPool` is the application's own pool type and is assumed to expose an async `ping()`.
async fn ready(State(db): State<Arc<DbPool>>) -> StatusCode {
    match db.ping().await {
        Ok(_) => StatusCode::OK,
        Err(_) => StatusCode::SERVICE_UNAVAILABLE,
    }
}
```

---

## Common Mistakes

| Mistake | Domain Violation | Fix |
|---------|-----------------|-----|
| Local file state | Not stateless | External storage |
| No SIGTERM handling | Hard kills | Graceful shutdown |
| No tracing | Can't debug | tracing spans |
| Static config | Not 12-factor | Env vars |

---

## Trace to Layer 1

| Constraint | Layer 2 Pattern | Layer 1 Implementation |
|------------|-----------------|------------------------|
| Stateless | External state | `Arc<Client>` for external |
| Graceful shutdown | Signal handling | tokio::signal |
| Tracing | Span lifecycle | tracing + OTEL |
| Health checks | HTTP endpoints | Dedicated routes |

---

## Related Skills

| When | See |
|------|-----|
| Async patterns | m07-concurrency |
| HTTP endpoints | domain-web |
| Error handling | m13-domain-error |
| Resource lifecycle