
Machine Learning
Turn notebook experiments into reproducible PyTorch and scikit-learn pipelines with splits, tuning, tracking, and interpretability.
Overview
machine-learning is an agent skill, most often used in the Build phase (also Ship and Validate), that constructs reproducible PyTorch and scikit-learn pipelines with tuning, tracking, and interpretability.
Install
npx skills add https://github.com/itallstartedwithaidea/agent-skills --skill machine-learning
What is this skill?
- End-to-end ML pipelines with PyTorch and scikit-learn
- Train/validation/test splits, stratified cross-validation, and learning curves
- Hyperparameter optimization with experiment configuration tracking
- SHAP values, feature importance, and partial dependence for interpretability
- Reproducible workflows: versioned experiments, deterministic training, stored artifacts
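As a rough illustration of the split and determinism bullets above, the sketch below carves a dataset into stratified train, validation, and test parts with a fixed seed. The synthetic data, the 60/20/20 proportions, and the seed value are assumptions made for the example; the skill itself may choose differently.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

SEED = 42  # assumed seed; any fixed integer makes the split repeatable

# Synthetic, imbalanced stand-in for a real dataset (assumption for illustration).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=SEED
)

# Carve off a 20% test set first, then split the remainder 75/25 into
# train/validation, giving 60/20/20 overall. Stratifying on the labels keeps
# the class ratio similar in every part.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=SEED
)

for name, part in [("train", y_train), ("val", y_val), ("test", y_test)]:
    print(f"{name}: n={len(part)}, positive rate={part.mean():.3f}")
```

Re-running the script with the same seed reproduces the same three partitions, which is the property the reproducibility bullet depends on.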
Adoption & trust: 1 install on skills.sh; 18 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You have notebook prototypes but no structured splits, tracked experiments, or explainability, so you cannot trust or reproduce model results.
Who is it for?
Indie builders adding ML features to a product who want scikit-learn or PyTorch pipelines with experiment logs and SHAP-style explanations.
Skip if: you are doing pure LLM prompt tuning with no classical ML training, or you only need a one-off Kaggle notebook with no reproducibility requirements.
When should I use this skill?
When you need end-to-end ML pipeline construction with model selection, training, evaluation, interpretability, hyperparameter tuning, and experiment tracking.
What do I get? / Deliverables
You get a documented ML workflow with proper validation, hyperparameter search, tracked metrics and artifacts, and interpretability reports suitable for iteration or release review.
- Training and evaluation pipeline with documented splits
- Experiment log with configs, metrics, and artifacts
- Interpretability outputs (SHAP, importance, partial dependence)
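One way to picture the experiment log deliverable is a single JSON record per run that stores the configuration, metrics, and artifact paths together. The sketch below uses only the standard library; the function name `log_experiment`, the record fields, and the hash-based run id are hypothetical choices for illustration, not the skill's actual format.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def log_experiment(log_dir: Path, config: dict, metrics: dict, artifacts: list) -> Path:
    """Write one JSON record per run (hypothetical helper, not part of the skill)."""
    # Hash the key-sorted config so identical configurations map to the same run id.
    config_blob = json.dumps(config, sort_keys=True).encode("utf-8")
    run_id = hashlib.sha256(config_blob).hexdigest()[:12]
    record = {
        "run_id": run_id,
        "logged_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "config": config,
        "metrics": metrics,
        "artifacts": [str(p) for p in artifacts],
    }
    log_dir.mkdir(parents=True, exist_ok=True)
    out_path = log_dir / f"{run_id}.json"
    out_path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return out_path


# Usage with a throwaway directory; the metric is a placeholder, not a measured result.
with tempfile.TemporaryDirectory() as tmp:
    path = log_experiment(
        Path(tmp),
        config={"model": "mlp", "lr": 1e-3, "seed": 42},
        metrics={"val_auc": 0.5},
        artifacts=["model.pt"],
    )
    saved = json.loads(path.read_text())
    print(saved["run_id"], saved["config"])
```

Deriving the run id from the configuration means a repeated run with the same settings overwrites its own record instead of creating a near-duplicate. A dedicated tracking tool would replace this in a larger project.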
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Most ML pipeline work lands in build when you are implementing models and training code, even though evaluation habits matter in ship. Training, evaluation, and pipeline code are backend/data-layer concerns rather than UI or distribution.
Where it fits
Validate: Prototype a churn classifier with held-out test metrics before committing engineering time to a full feature.
Build: Implement training and inference modules with PyTorch or sklearn tied to your API.
Ship: Run stratified cross-validation and learning curves as evidence before shipping a model-backed endpoint.
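The cross-validation and learning-curve evidence mentioned above could be gathered with scikit-learn roughly as follows. This is a minimal sketch: the synthetic dataset, the logistic regression model, the ROC AUC metric, and the five folds are all assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, learning_curve

# Synthetic stand-in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# Stratified folds keep the class ratio similar in every train/validation split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One score per fold; the spread hints at how stable the estimate is.
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"ROC AUC: {scores.mean():.3f} +/- {scores.std():.3f}")

# Learning curve: refit on growing fractions of each training fold.
train_sizes, train_scores, val_scores = learning_curve(
    model, X, y, cv=cv, scoring="roc_auc", train_sizes=np.linspace(0.2, 1.0, 5)
)
for n, tr, va in zip(train_sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n}: train={tr:.3f}, val={va:.3f}")
```

A validation score that is still climbing at the largest training size suggests more data would help. A large, persistent gap between the training and validation scores points toward overfitting.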
How it compares
Use for disciplined training pipelines instead of ad-hoc notebook cells without splits, tracking, or interpretability gates.
Common Questions / FAQ
Who is machine-learning for?
Solo developers and small teams building predictive or classification features who use PyTorch or scikit-learn and need reproducible, explainable training workflows.
When should I use machine-learning?
In validate when scoping feasibility with proper evaluation design; in build when implementing training code; in ship when you need cross-validation and learning-curve evidence before launch.
Is machine-learning safe to install?
Check the Security Audits panel on this Prism page. ML skills often need shell and network access for package installs, and they should not be run on untrusted training data without review.
SKILL.md
# Machine Learning

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)

## Description

Machine Learning provides end-to-end ML pipeline construction with PyTorch and scikit-learn, covering model selection, training, evaluation, interpretability, hyperparameter tuning, and experiment tracking. The agent builds reproducible ML workflows that follow software engineering best practices: version-controlled experiments, deterministic training, and interpretable results.

The gap between a working notebook and a production ML pipeline is enormous. This skill bridges that gap by enforcing structured experiment management, proper train/validation/test splits, stratified cross-validation, learning curve analysis, and systematic hyperparameter optimization. The agent tracks every experiment with its configuration, metrics, and artifacts, making it possible to reproduce any result months later.

Model interpretability is treated as a first-class requirement, not an optional post-hoc analysis. Every model comes with SHAP values, feature importance rankings, and partial dependence plots that explain what the model learned and why it makes specific predictions. Black-box predictions without explanations are insufficient for scientific and business-critical applications.
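To make the feature importance and partial dependence outputs concrete, here is a minimal sketch using only `sklearn.inspection`. It deliberately swaps in permutation importance, a different model-agnostic technique, in place of SHAP so that the example depends on scikit-learn alone. The random forest model and synthetic data are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence, permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (assumption for illustration).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(X, y, stratify=y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time on held-out data and
# record how much the score drops. This is not SHAP, only a simpler stand-in.
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
ranking = result.importances_mean.argsort()[::-1]
for idx in ranking:
    print(
        f"feature {idx}: {result.importances_mean[idx]:.3f} "
        f"+/- {result.importances_std[idx]:.3f}"
    )

# Partial dependence of the prediction on the top-ranked feature,
# averaged over the validation samples.
pd_result = partial_dependence(
    model, X_val, features=[int(ranking[0])], kind="average"
)
curve = pd_result["average"][0]
print(f"partial dependence evaluated at {len(curve)} grid points")
```

Computing both on held-out data rather than the training set keeps the explanation tied to how the model generalizes, not to what it memorized.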
## Use When

- Building classification or regression models
- Tuning hyperparameters systematically
- Explaining model predictions with SHAP or feature importance
- Setting up experiment tracking for ML projects
- Evaluating model performance with proper cross-validation
- Training PyTorch models with structured training loops

## How It Works

```mermaid
graph TD
    A[Dataset] --> B[Train/Val/Test Split]
    B --> C[Feature Engineering]
    C --> D[Model Selection]
    D --> E[Hyperparameter Tuning: Optuna]
    E --> F[Cross-Validation]
    F --> G[Best Model Training]
    G --> H[Evaluation on Test Set]
    H --> I[Interpretability: SHAP]
    I --> J[Experiment Logging]
    J --> K[Model Registry]
```

The pipeline enforces a strict separation between tuning (using validation data) and final evaluation (using held-out test data). The test set is touched exactly once, preventing information leakage from repeated evaluation.

## Implementation

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import classification_report, roc_auc_score
import optuna
import shap
import numpy as np


class Classifier(nn.Module):
    """Two-hidden-layer MLP that emits a single logit per sample."""

    def __init__(self, input_dim: int, hidden_dim: int, dropout: float):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim // 2, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def train_epoch(model, loader, optimizer, criterion, device):
    """Run one training pass and return the mean loss per sample."""
    model.train()
    total_loss = 0.0
    for X_batch, y_batch in loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
        optimizer.zero_grad()
        # squeeze(-1) drops only the output dimension, so a batch of one
        # sample keeps its batch axis and still matches the target shape.
        pred = model(X_batch).squeeze(-1)
        loss = criterion(pred, y_batch.float())
        loss.backward()
        optimizer.step()
        # Weight by batch size so a smaller final batch does not skew the mean.
        total_loss += loss.item() * len(X_batch)
    return total_loss / len(loader.dataset)


def hyperparameter_search(X: np.ndarray, y: np.ndarray, n_trials: int = 50) -> dict:
    def objective(trial):
        hidden = trial.suggest_int("hidden_dim", 32, 256)
        lr = trial.su