Domain Ml

Name: Domain Ml
Author: zhanghandong

zhanghandong/rust-skills

586 installs
1.3k repo stars
Updated May 24, 2026

This is a copy of domain-ml by actionbook - installs and ranking accrue to the original listing.

Domain ML is a Rust skills layer that applies machine learning domain constraints and patterns for efficient inference and training code using crates like candle, tch-rs, burn, and ndarray.

About

Domain ML is a Layer 3 domain-constraints skill from zhanghandong/rust-skills for building ML and AI applications in Rust. It activates on keywords including machine learning, tensor, model inference, neural networks, deep learning, ndarray, tch-rs, burn, and candle. The skill maps domain rules to Rust design implications: large data requires zero-copy and streaming, GPU acceleration maps to candle and tch-rs, model portability uses ONNX, and batch processing favors throughput through batched inference. Developers reach for Domain ML when they are writing Rust inference or training code and need framework-specific memory, GPU, and portability patterns rather than Python-centric ML guidance.

Translates 6 core ML domain rules into concrete Rust design constraints
Guides zero-copy memory handling, batched GPU inference, and ONNX portability
Maps high-level constraints to specific crates: ndarray, tch-rs, burn, candle, polars
Includes critical rules for memory efficiency, numerical precision, and reproducibility
Provides traceable Layer 3 → Layer 2 decision paths for ML application architecture

Domain ML by the numbers

586 all-time installs (skills.sh)
+5 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

npx skills add https://github.com/zhanghandong/rust-skills --skill domain-ml

Installs	586
repo stars	★ 1.3k
Security audit	3 / 3 scanners passed
Last updated	May 24, 2026
Repository	zhanghandong/rust-skills ↗

How do you build efficient ML inference code in Rust?

Apply Rust-specific machine learning constraints and patterns when creating efficient inference or training code.

Who is it for?

Rust developers building ML inference or training pipelines who need domain-specific memory, GPU, and ONNX patterns instead of Python ML defaults.

Skip if: Python PyTorch or TensorFlow projects, frontend-only ML UIs, or general Rust code with no tensor, model, or inference requirements.

When should I use this skill?

Use it when a developer is building ML or AI features in Rust and mentions tensors, inference, candle, tch-rs, burn, ndarray, or ONNX model loading.

What you get

Rust ML code with zero-copy patterns, GPU-backed inference setup, ONNX integration, and batched throughput-oriented designs.

Rust inference code
GPU acceleration setup

By the numbers

References 4 Rust ML crates: candle, tch-rs, burn, and ndarray

Files

SKILL.md (Markdown)

Machine Learning Domain

Layer 3: Domain Constraints

Domain Constraints → Design Implications

Domain Rule	Design Constraint	Rust Implication
Large data	Efficient memory	Zero-copy, streaming
GPU acceleration	CUDA/Metal support	candle, tch-rs
Model portability	Standard formats	ONNX
Batch processing	Throughput over latency	Batched inference
Numerical precision	Float handling	ndarray, careful f32/f64
Reproducibility	Deterministic	Seeded random, versioning
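
The "Numerical precision" row can be made concrete with a minimal sketch using only the standard library. The function names `sum_f32` and `sum_f64_acc` are hypothetical and not part of the skill; the point is that f32 data can stay f32 in memory while the reduction accumulates in f64.

```rust
/// Sums with an f32 accumulator: every addition rounds to f32,
/// so error grows as the running total gets large.
fn sum_f32(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// Sums the same f32 data but accumulates in f64.
/// The input stays f32 in memory; only the accumulator is widened.
fn sum_f64_acc(values: &[f32]) -> f64 {
    values.iter().map(|&v| f64::from(v)).sum()
}
```

Summing one million copies of `0.1_f32` with `sum_f32` drifts visibly away from 100000, while `sum_f64_acc` stays close to it. Whether the extra precision matters depends on the reduction size and the model's tolerance.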

---

Critical Constraints

Memory Efficiency

RULE: Avoid copying large tensors
WHY: Memory bandwidth is the bottleneck
RUST: References, views, in-place ops
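
As an illustration of "references, views, in-place ops", here is a sketch in plain Rust with no tensor crate. `MatrixView` and `scale_in_place` are hypothetical names introduced for this example; a real project would more likely use the view types of ndarray or the chosen framework.

```rust
/// A borrowed, row-major 2-D view over a flat buffer.
/// Hypothetical type for illustration; it owns no data.
struct MatrixView<'a> {
    data: &'a [f32],
    cols: usize,
}

impl<'a> MatrixView<'a> {
    /// Returns `None` if the buffer length is not a multiple of `cols`.
    fn new(data: &'a [f32], cols: usize) -> Option<Self> {
        if cols == 0 || data.len() % cols != 0 {
            return None;
        }
        Some(Self { data, cols })
    }

    fn rows(&self) -> usize {
        self.data.len() / self.cols
    }

    /// Borrows one row without copying. Panics if `i` is out of range.
    fn row(&self, i: usize) -> &'a [f32] {
        &self.data[i * self.cols..(i + 1) * self.cols]
    }
}

/// Scales the buffer in place, so no second allocation is needed.
fn scale_in_place(data: &mut [f32], factor: f32) {
    for v in data.iter_mut() {
        *v *= factor;
    }
}
```

The row slices point into the original buffer, so reading a row costs no allocation and no memory traffic beyond the read itself.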

GPU Utilization

RULE: Batch operations for GPU efficiency
WHY: Each GPU kernel launch carries fixed overhead
RUST: Batch sizes, async data loading

Model Portability

RULE: Use standard model formats
WHY: Train in Python, deploy in Rust
RUST: ONNX via tract or candle

---

Trace Down ↓

From constraints to design (Layer 2):

"Need efficient data pipelines"
    ↓ m10-performance: Streaming, batching
    ↓ polars: Lazy evaluation

"Need GPU inference"
    ↓ m07-concurrency: Async data loading
    ↓ candle/tch-rs: CUDA backend

"Need model loading"
    ↓ m12-lifecycle: Lazy init, caching
    ↓ tract: ONNX runtime

---

Use Case → Framework

Use Case	Recommended	Why
Inference only	tract (ONNX)	Lightweight, portable
Training + inference	candle, burn	Pure Rust, GPU
PyTorch models	tch-rs	Direct bindings
Data pipelines	polars	Fast, lazy eval

Key Crates

Purpose	Crate
Tensors	ndarray
ONNX inference	tract
ML framework	candle, burn
PyTorch bindings	tch-rs
Data processing	polars
Embeddings	fastembed

Design Patterns

Pattern	Purpose	Implementation
Model loading	Once, reuse	`OnceLock<Model>`
Batching	Throughput	Collect then process
Streaming	Large data	Iterator-based
GPU async	Parallelism	Data loading parallel to compute
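
The "Batching" and "Streaming" rows combine naturally: an iterator adaptor can group a stream into batches without first collecting the whole dataset. The sketch below uses only the standard library; `Batches` and `batches` are hypothetical names, not part of the skill or of any crate it lists.

```rust
/// Lazily groups the items of any iterator into `Vec` batches
/// of at most `size` items. Hypothetical helper for illustration.
struct Batches<I: Iterator> {
    inner: I,
    size: usize,
}

impl<I: Iterator> Iterator for Batches<I> {
    type Item = Vec<I::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        // Pull up to `size` items from the underlying stream.
        let batch: Vec<I::Item> = self.inner.by_ref().take(self.size).collect();
        if batch.is_empty() {
            None
        } else {
            Some(batch)
        }
    }
}

/// Wraps `inner` so it yields batches. Panics if `size` is zero.
fn batches<I: Iterator>(inner: I, size: usize) -> Batches<I> {
    assert!(size > 0, "batch size must be non-zero");
    Batches { inner, size }
}
```

Only one batch is held in memory at a time, so the source can be a file reader or a network stream rather than a fully loaded `Vec`. The final batch may be shorter than `size`.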

Code Pattern: Inference Server

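use std::sync::OnceLock;
use tract_onnx::prelude::*;

// Alias for the runnable plan type so it can be named in signatures.
// Assumes a tract release in which `into_runnable()` returns this type by value.
type Model = SimplePlan<TypedFact, Box<dyn TypedOp>, Graph<TypedFact, Box<dyn TypedOp>>>;

static MODEL: OnceLock<Model> = OnceLock::new();

// Loads and optimizes the model on first call; later calls reuse the same plan.
fn get_model() -> &'static Model {
    MODEL.get_or_init(|| {
        tract_onnx::onnx()
            .model_for_path("model.onnx")
            .expect("failed to read model.onnx")
            .into_optimized()
            .expect("failed to optimize model")
            .into_runnable()
            .expect("failed to build runnable plan")
    })
}

async fn predict(input: Vec<f32>) -> anyhow::Result<Vec<f32>> {
    let model = get_model();
    // Reuse the Vec's allocation as a (1, n) array instead of copying it.
    let n = input.len();
    let tensor: Tensor = tract_ndarray::Array2::from_shape_vec((1, n), input)?.into();
    let result = model.run(tvec!(tensor.into()))?;
    Ok(result[0].to_array_view::<f32>()?.iter().copied().collect())
}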

Code Pattern: Batched Inference

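// Uses `get_model()` from the inference-server pattern above.
// Assumes the model accepts a variable batch dimension, every input row has the
// same length, and output 0 has shape (batch, outputs). Panics if `batch_size` is 0.
async fn batch_predict(inputs: Vec<Vec<f32>>, batch_size: usize) -> anyhow::Result<Vec<Vec<f32>>> {
    let model = get_model();
    let mut results = Vec::with_capacity(inputs.len());

    for batch in inputs.chunks(batch_size) {
        // Stack inputs into one (batch, features) tensor
        let features = batch[0].len();
        let flat: Vec<f32> = batch.iter().flatten().copied().collect();
        let stacked = tract_ndarray::Array2::from_shape_vec((batch.len(), features), flat)?;
        let batch_tensor: Tensor = stacked.into();

        // Run inference once per batch (tract's `run` is synchronous)
        let batch_output = model.run(tvec!(batch_tensor.into()))?;

        // Unstack results: one output row per input row
        let view = batch_output[0].to_array_view::<f32>()?;
        results.extend(view.outer_iter().map(|row| row.iter().copied().collect::<Vec<f32>>()));
    }

    Ok(results)
}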

---

Common Mistakes

Mistake	Domain Violation	Fix
Clone tensors	Memory waste	Use views
Single inference	GPU underutilized	Batch processing
Load model per request	Slow	Singleton pattern
Sync data loading	GPU idle	Async pipeline
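
The "Sync data loading" fix can be sketched without an async runtime. The example below swaps in a standard-library thread and a bounded channel rather than tokio, so that loading the next batch overlaps with computing on the current one. `run_pipeline` and its parameters are hypothetical names, and `compute` stands in for the model call.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Runs `load` on a background thread and `compute` on the caller's thread.
/// At most `depth` loaded batches wait in the channel, which bounds memory use.
fn run_pipeline<B, L, C, R>(num_batches: usize, depth: usize, load: L, mut compute: C) -> Vec<R>
where
    B: Send + 'static,
    L: Fn(usize) -> B + Send + 'static,
    C: FnMut(B) -> R,
{
    let (tx, rx) = sync_channel::<B>(depth);

    // Producer: loads batches ahead of the consumer, blocking when the channel is full.
    let loader = thread::spawn(move || {
        for i in 0..num_batches {
            if tx.send(load(i)).is_err() {
                break; // consumer went away
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });

    // Consumer: processes batches in the order they were loaded.
    let results: Vec<R> = rx.iter().map(|batch| compute(batch)).collect();
    loader.join().expect("loader thread panicked");
    results
}
```

A call such as `run_pipeline(4, 2, |i| vec![i as f32; 3], |b: Vec<f32>| b.iter().sum::<f32>())` loads up to two batches ahead while the closure standing in for inference runs. The same shape carries over to an async runtime by replacing the thread with a spawned task and the channel with the runtime's bounded channel.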

---

Trace to Layer 1

Constraint	Layer 2 Pattern	Layer 1 Implementation
Memory efficiency	Zero-copy	ndarray views
Model singleton	Lazy init	OnceLock<Model>
Batch processing	Chunked iteration	chunks() + parallel
GPU async	Concurrent loading	tokio::spawn + GPU
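
For the "chunks() + parallel" entry, one possible reading on CPU is scoped threads over borrowed chunks, sketched below with the standard library only. `infer_chunk` is a hypothetical stand-in for a model call (it just sums each row), and `parallel_infer` is likewise an illustrative name.

```rust
use std::thread;

/// Hypothetical stand-in for a model call: maps each row to the sum of its features.
fn infer_chunk(chunk: &[Vec<f32>]) -> Vec<f32> {
    chunk.iter().map(|row| row.iter().sum()).collect()
}

/// Splits `inputs` into chunks and processes each chunk on its own scoped thread.
/// Chunks are borrowed, not cloned. Output order matches input order.
fn parallel_infer(inputs: &[Vec<f32>], chunk_size: usize) -> Vec<f32> {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    thread::scope(|s| {
        // Spawn every worker first so the chunks run concurrently.
        let handles: Vec<_> = inputs
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || infer_chunk(chunk)))
            .collect();
        // Join in spawn order to keep results aligned with the inputs.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

This spawns one thread per chunk, which is only reasonable for a small number of large chunks; a thread pool would be the usual choice for many small ones.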

---

Related Skills

When	See
Performance	m10-performance
Lazy initialization	m12-lifecycle
Async patterns	m07-concurrency
Memory efficiency	m01-ownership

FAQ

Which Rust ML crates does Domain ML reference?

Domain ML references candle, tch-rs, burn, and ndarray for Rust machine learning work. GPU acceleration maps to candle and tch-rs, while model portability uses ONNX standard formats.

What Rust patterns does Domain ML enforce for large ML data?

Domain ML enforces zero-copy and streaming patterns for large ML datasets in Rust. Batch processing designs prioritize throughput over single-request latency using batched inference approaches.

Is Domain ML safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
