Ml Pipeline Workflow

Name: Ml Pipeline Workflow
Author: wshobson

wshobson/agents

8.5k installs
38.3k repo stars
Updated July 22, 2026

ml-pipeline-workflow is an agent skill for building end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use it when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

About

Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

Building new ML pipelines from scratch
Designing workflow orchestration for ML systems
Implementing data → model → deployment automation
Setting up reproducible training workflows
Creating DAG-based ML orchestration

Ml Pipeline Workflow by the numbers

8,482 all-time installs (skills.sh)
+168 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #84 of 1,041 Cloud & Infrastructure skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

ml-pipeline-workflow capabilities & compatibility

Capabilities: building new ML pipelines from scratch · designing workflow orchestration for ML systems · implementing data → model → deployment automation · setting up reproducible training workflows · creating DAG-based ML orchestration
Use cases: documentation

From the docs

What ml-pipeline-workflow says it does

Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment.

SKILL.md

Use when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

SKILL.md

Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.

SKILL.md

Pipeline Architecture: end-to-end workflow design; DAG orchestration patterns (Airflow, Dagster, Kubeflow); component dependencies and data flow; error handling and retry strategies.

SKILL.md

npx skills add https://github.com/wshobson/agents --skill ml-pipeline-workflow


Installs	8.5k
repo stars	★ 38.3k
Security audit	3 / 3 scanners passed
Last updated	July 22, 2026
Repository	wshobson/agents ↗

What problem does ml-pipeline-workflow solve for developers using this skill?

It provides guidance for building end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment, covering the full lifecycle of data ingestion, preparation, training, validation, deployment, and monitoring.

Who is it for?

Developers who need the MLOps pipeline patterns documented in this skill's SKILL.md.

Skip if: the docs are empty or the task is outside the skill's documented scope.

When should I use this skill?

Use it when creating ML pipelines, implementing MLOps practices, or automating model training and deployment workflows.

What you get

Actionable workflows and conventions from SKILL.md for ml-pipeline-workflow.

MLOps pipeline workflow design
staged training and deployment plan

Files

SKILL.md · Markdown · GitHub ↗

ML Pipeline Workflow

Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.

Overview

This skill provides comprehensive guidance for building production ML pipelines that handle the full lifecycle: data ingestion → preparation → training → validation → deployment → monitoring.

When to Use This Skill

Building new ML pipelines from scratch
Designing workflow orchestration for ML systems
Implementing data → model → deployment automation
Setting up reproducible training workflows
Creating DAG-based ML orchestration
Integrating ML components into production systems

What This Skill Provides

Core Capabilities

1. Pipeline Architecture

End-to-end workflow design
DAG orchestration patterns (Airflow, Dagster, Kubeflow)
Component dependencies and data flow
Error handling and retry strategies

2. Data Preparation

Data validation and quality checks
Feature engineering pipelines
Data versioning and lineage
Train/validation/test splitting strategies
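One way to make the splitting step reproducible is to derive each record's split from a hash of a stable ID, so the assignment does not change between runs. The sketch below assumes every record has such an ID; the function name `assign_split` and the 80/10/10 ratios are illustrative, not taken from the skill.

```python
import hashlib

def assign_split(record_id: str, train: float = 0.8, validation: float = 0.1) -> str:
    """Map a record ID to 'train', 'validation', or 'test' deterministically."""
    # Hash the ID and scale the first 8 bytes to a roughly uniform value in [0, 1].
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < train:
        return "train"
    if bucket < train + validation:
        return "validation"
    return "test"

# Hypothetical IDs; the same ID always lands in the same split.
splits = {rid: assign_split(rid) for rid in (f"user-{i}" for i in range(1000))}
```

Because the split depends only on the ID, adding new records later never moves existing records between splits.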

3. Model Training

Training job orchestration
Hyperparameter management
Experiment tracking integration
Distributed training patterns

4. Model Validation

Validation frameworks and metrics
A/B testing infrastructure
Performance regression detection
Model comparison workflows

5. Deployment Automation

Model serving patterns
Canary deployments
Blue-green deployment strategies
Rollback mechanisms

Reference Documentation

See the references/ directory for detailed guides:

data-preparation.md - Data cleaning, validation, and feature engineering
model-training.md - Training workflows and best practices
model-validation.md - Validation strategies and metrics
model-deployment.md - Deployment patterns and serving architectures

Assets and Templates

The assets/ directory contains:

pipeline-dag.yaml.template - DAG template for workflow orchestration
training-config.yaml - Training configuration template
validation-checklist.md - Pre-deployment validation checklist

Usage Patterns

Basic Pipeline Setup

# 1. Define pipeline stages
stages = [
    "data_ingestion",
    "data_validation",
    "feature_engineering",
    "model_training",
    "model_validation",
    "model_deployment"
]

# 2. Configure dependencies
# See assets/pipeline-dag.yaml.template for full example
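Step 2 can be sketched with the standard library alone. The dependency map below is an assumption (a simple linear chain over the stage names listed above), not the contents of the template file; `graphlib.TopologicalSorter` then derives a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each stage lists the stages it depends on.
dependencies = {
    "data_ingestion": [],
    "data_validation": ["data_ingestion"],
    "feature_engineering": ["data_validation"],
    "model_training": ["feature_engineering"],
    "model_validation": ["model_training"],
    "model_deployment": ["model_validation"],
}

# static_order() yields each stage only after all of its dependencies
# and raises graphlib.CycleError if the graph contains a cycle.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

An orchestrator such as Airflow or Dagster performs this ordering for you; the sketch only shows the underlying idea.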

Production Workflow

1. Data Preparation Phase

Ingest raw data from sources
Run data quality checks
Apply feature transformations
Version processed datasets

2. Training Phase

Load versioned training data
Execute training jobs
Track experiments and metrics
Save trained models

3. Validation Phase

Run validation test suite
Compare against baseline
Generate performance reports
Approve for deployment
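The "compare against baseline" and "approve for deployment" steps could be expressed as a simple gate. This is a minimal sketch assuming all metrics are higher-is-better; the function name `approve_for_deployment`, the metric values, and the 0.01 tolerance are hypothetical.

```python
def approve_for_deployment(candidate: dict, baseline: dict, tolerance: float = 0.01) -> tuple:
    """Return (approved, regressions) for higher-is-better metrics."""
    regressions = {}
    for name, baseline_value in baseline.items():
        candidate_value = candidate.get(name)
        # A missing metric or a drop larger than `tolerance` counts as a regression.
        if candidate_value is None or candidate_value < baseline_value - tolerance:
            regressions[name] = (baseline_value, candidate_value)
    return (not regressions, regressions)

# Illustrative numbers only.
baseline_metrics = {"accuracy": 0.91, "recall": 0.84}
candidate_metrics = {"accuracy": 0.92, "recall": 0.80}
approved, regressions = approve_for_deployment(candidate_metrics, baseline_metrics)
```

Here the candidate improves accuracy but loses recall beyond the tolerance, so the gate blocks it and reports which metric regressed.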

4. Deployment Phase

Package model artifacts
Deploy to serving infrastructure
Configure monitoring
Validate production traffic

Best Practices

Pipeline Design

Modularity: Each stage should be independently testable
Idempotency: Re-running stages should be safe
Observability: Log metrics at every stage
Versioning: Track data, code, and model versions
Failure Handling: Implement retry logic and alerting
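Idempotency and failure handling tend to work together: a stage that is safe to re-run can be retried without side effects. The sketch below is one possible shape, assuming an in-memory `completed` set stands in for a durable record of finished work; `run_with_retry` and `flaky_ingest` are hypothetical names.

```python
import time

def run_with_retry(stage, max_attempts: int = 3, base_delay: float = 0.01):
    """Call `stage()` and retry with exponential backoff on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return stage()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller alert on the error.
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            time.sleep(delay)

completed = set()  # Stands in for a durable record of finished work.
calls = {"count": 0}

def flaky_ingest():
    calls["count"] += 1
    if "ingest" in completed:
        return "skipped"  # Idempotent: re-running does no duplicate work.
    if calls["count"] < 3:
        raise RuntimeError("source unavailable")
    completed.add("ingest")
    return "done"

first = run_with_retry(flaky_ingest)   # fails twice, then succeeds
second = run_with_retry(flaky_ingest)  # already completed, so it is skipped
```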

Data Management

Use data validation libraries (Great Expectations, TFX)
Version datasets with DVC or similar tools
Document feature engineering transformations
Maintain data lineage tracking
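Tools such as DVC handle dataset versioning for real workloads. To illustrate only the underlying idea of a content-derived version identifier, here is a small sketch that does not use DVC's API; `dataset_version`, the sample rows, and the lineage fields are assumptions.

```python
import hashlib
import json

def dataset_version(rows: list) -> str:
    """Return a short content hash that changes whenever the rows change."""
    hasher = hashlib.sha256()
    for row in rows:
        # sort_keys makes the hash independent of dict key order.
        hasher.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        hasher.update(b"\n")
    return hasher.hexdigest()[:12]

rows_v1 = [{"id": 1, "label": 0}, {"id": 2, "label": 1}]
rows_v2 = [{"id": 1, "label": 0}, {"id": 2, "label": 0}]

# A minimal lineage record tying a dataset version to its origin.
lineage = {
    "dataset_version": dataset_version(rows_v1),
    "source": "raw/events.csv",  # hypothetical path
    "transform": "drop_nulls",   # hypothetical step name
}
```

Storing a record like `lineage` next to each trained model makes it possible to trace a model back to the exact data it saw.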

Model Operations

Separate training and serving infrastructure
Use model registries (MLflow, Weights & Biases)
Implement gradual rollouts for new models
Monitor model performance drift
Maintain rollback capabilities

Deployment Strategies

Start with shadow deployments
Use canary releases for validation
Implement A/B testing infrastructure
Set up automated rollback triggers
Monitor latency and throughput
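A canary release with an automated rollback trigger can be reduced to two small decisions: which requests go to the new model, and when its error rate is bad enough to roll back. The sketch below assumes hash-based routing on a request ID and an absolute error-rate margin; the function names and the 0.02 margin are illustrative.

```python
import hashlib

def route_request(request_id: str, canary_percent: int) -> str:
    """Send a stable `canary_percent` share of request IDs to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "stable"

def should_rollback(canary_error_rate: float, stable_error_rate: float,
                    max_increase: float = 0.02) -> bool:
    """Trigger a rollback when the canary errs noticeably more than stable."""
    return canary_error_rate > stable_error_rate + max_increase

# Route hypothetical requests with 10% of traffic on the canary.
routes = [route_request(f"req-{i}", canary_percent=10) for i in range(1000)]
canary_share = routes.count("canary") / len(routes)
```

Hashing the request ID keeps a given caller on the same model version for the duration of the rollout, which makes canary metrics easier to interpret.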

Integration Points

Orchestration Tools

Apache Airflow: DAG-based workflow orchestration
Dagster: Asset-based pipeline orchestration
Kubeflow Pipelines: Kubernetes-native ML workflows
Prefect: Modern dataflow automation

Experiment Tracking

MLflow for experiment tracking and model registry
Weights & Biases for visualization and collaboration
TensorBoard for training metrics

Deployment Platforms

AWS SageMaker for managed ML infrastructure
Google Vertex AI for GCP deployments
Azure ML for Azure cloud
OCI Data Science for Oracle Cloud Infrastructure deployments
Kubernetes + KServe for cloud-agnostic serving

Progressive Disclosure

Start with the basics and gradually add complexity:

1. Level 1: Simple linear pipeline (data → train → deploy)
2. Level 2: Add validation and monitoring stages
3. Level 3: Implement hyperparameter tuning
4. Level 4: Add A/B testing and gradual rollouts
5. Level 5: Multi-model pipelines with ensemble strategies

Common Patterns

Batch Training Pipeline

# See assets/pipeline-dag.yaml.template
stages:
  - name: data_preparation
    dependencies: []
  - name: model_training
    dependencies: [data_preparation]
  - name: model_evaluation
    dependencies: [model_training]
  - name: model_deployment
    dependencies: [model_evaluation]
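To show how a definition like this might drive execution, the sketch below mirrors the stages as Python data and runs a hypothetical task per stage, skipping anything downstream of a failure. It assumes stages are listed in dependency order; `run_pipeline` and the placeholder tasks are not part of the skill.

```python
stages = [
    {"name": "data_preparation", "dependencies": []},
    {"name": "model_training", "dependencies": ["data_preparation"]},
    {"name": "model_evaluation", "dependencies": ["model_training"]},
    {"name": "model_deployment", "dependencies": ["model_evaluation"]},
]

def run_pipeline(stages: list, tasks: dict) -> dict:
    """Run stages in listed order; skip any stage whose dependency did not succeed."""
    status = {}
    for stage in stages:
        name = stage["name"]
        if any(status.get(dep) != "success" for dep in stage["dependencies"]):
            status[name] = "skipped"
            continue
        try:
            tasks[name]()
            status[name] = "success"
        except Exception:
            status[name] = "failed"
    return status

def fail():
    raise ValueError("evaluation below threshold")

# Placeholder tasks; evaluation fails to show that deployment is skipped.
tasks = {
    "data_preparation": lambda: None,
    "model_training": lambda: None,
    "model_evaluation": fail,
    "model_deployment": lambda: None,
}
status = run_pipeline(stages, tasks)
```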

Real-time Feature Pipeline

# Stream processing for real-time features
# Combined with batch training
# See references/data-preparation.md

Continuous Training

# Automated retraining on schedule
# Triggered by data drift detection
# See references/model-training.md
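A drift-triggered retraining check needs some numeric measure of drift. One common choice is the population stability index (PSI); the sketch below is a simplified version for a single numeric feature, and the 0.2 threshold, the sample data, and the function name are assumptions rather than anything specified by the skill.

```python
import math

def population_stability_index(expected: list, actual: list, bins: int = 10) -> float:
    """Compare two samples of one numeric feature; larger means more drift."""
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    exp, act = proportions(expected), proportions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp, act))

training = [i / 100 for i in range(100)]       # feature values seen at training time
shifted = [0.5 + i / 200 for i in range(100)]  # recent values, shifted upward
needs_retraining = population_stability_index(training, shifted) > 0.2  # assumed threshold
```

In a scheduled pipeline, a check like this would run before the training stage and decide whether the retraining job is worth launching.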

Troubleshooting

Common Issues

Pipeline failures: Check dependencies and data availability
Training instability: Review hyperparameters and data quality
Deployment issues: Validate model artifacts and serving config
Performance degradation: Monitor data drift and model metrics

Debugging Steps

1. Check pipeline logs for each stage
2. Validate input/output data at boundaries
3. Test components in isolation
4. Review experiment tracking metrics
5. Inspect model artifacts and metadata

Next Steps

After setting up your pipeline:

1. Explore hyperparameter-tuning skill for optimization
2. Learn experiment-tracking-setup for MLflow/W&B
3. Review model-deployment-patterns for serving strategies
4. Implement monitoring with observability tools

Related Skills

experiment-tracking-setup: MLflow and Weights & Biases integration
hyperparameter-tuning: Automated hyperparameter optimization
model-deployment-patterns: Advanced deployment strategies


How it compares

Pick ml-pipeline-workflow over single-model training skills when you need full lifecycle MLOps structure from data prep through monitored deployment.
